Setting Up Server Monitoring (Zabbix) for Your Website
Zabbix is not "set and forget". It's a platform that requires thoughtful architecture: what to monitor, how frequently, which thresholds actually indicate real problems vs noise. A typical "follow the tutorial" setup produces 200 triggers per server, half of which fire constantly and get disabled after a week.
This guide describes a Zabbix setup approach that works in production: from deployment architecture to specific items and alert policies.
Deployment Architecture
For a single site with multiple servers, the standard pattern is: Zabbix Server + PostgreSQL on a dedicated VM, agents on each target host.
For more than 20 servers or geographically distributed infrastructure, add Zabbix Proxy in each zone. The proxy buffers data locally and sends it to the server in batches, reducing load on the central server and being resilient to connection failures.
[Web server] [DB server] [Cache server]
| | |
zabbix-agent zabbix-agent zabbix-agent
\ | /
\ | /
[Zabbix Proxy (optional)]
|
[Zabbix Server]
|
[PostgreSQL]
|
[Zabbix Frontend]
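If the proxy tier is used, a minimal active-proxy configuration per zone looks roughly like this (a sketch assuming the zabbix-proxy-sqlite3 package; the hostname and address are placeholders, and the proxy must be registered in the frontend under the same name):

```
# /etc/zabbix/zabbix_proxy.conf — minimal active proxy (sketch)
Server=<ZABBIX_SERVER_IP>                # central server the proxy reports to
Hostname=proxy-zone-1                    # must match the proxy name in the frontend
ProxyMode=0                              # 0 = active: the proxy initiates all connections
DBName=/var/lib/zabbix/zabbix_proxy.db   # SQLite is enough for a proxy
ProxyOfflineBuffer=24                    # hours of data to keep if the server is unreachable
```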
Minimal server requirements for ~10 hosts: 2 vCPU, 4 GB RAM, 50 GB SSD for the database (accounting for 90 days history retention).
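The 50 GB figure can be sanity-checked against Zabbix's own sizing guidance of roughly ~90 bytes per stored history value (a ballpark that varies by database and value type). A back-of-the-envelope estimate:

```python
# Back-of-the-envelope history sizing: values/sec * 86400 * days * bytes/value.
def history_gib(nvps: float, days: int, bytes_per_value: int = 90) -> float:
    """Estimated history table size in GiB; ~90 bytes/value is Zabbix's ballpark."""
    return nvps * 86400 * days * bytes_per_value / 1024**3

# ~10 hosts x ~100 items at a 60 s average interval ≈ 16.7 new values per second
print(round(history_gib(10 * 100 / 60, days=90), 1))  # -> 10.9 (GiB), before indexes and trends
```

Indexes, trends and the event tables roughly double that in practice, which is how you arrive at a 50 GB disk with headroom.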
Installing Zabbix Server
On Ubuntu 22.04:
wget https://repo.zabbix.com/zabbix/7.0/ubuntu/pool/main/z/zabbix-release/zabbix-release_7.0-2+ubuntu22.04_all.deb
dpkg -i zabbix-release_7.0-2+ubuntu22.04_all.deb
apt update
apt install -y zabbix-server-pgsql zabbix-frontend-php zabbix-apache-conf zabbix-sql-scripts zabbix-agent2
Creating the database:
sudo -u postgres createuser --pwprompt zabbix
sudo -u postgres createdb -O zabbix zabbix
zcat /usr/share/zabbix-sql-scripts/postgresql/server.sql.gz | sudo -u zabbix psql zabbix
Configuration in /etc/zabbix/zabbix_server.conf:
DBHost=localhost
DBName=zabbix
DBUser=zabbix
DBPassword=your_password
# Performance
StartPollers=10
StartPingers=5
StartTrappers=5
CacheSize=128M
HistoryCacheSize=64M
TrendCacheSize=32M
ValueCacheSize=256M
# History retention (days)
# Overridden at the item level
Installing the Agent on Target Servers
Zabbix Agent 2 (Go-based, more performant):
# On each monitored server
apt install -y zabbix-agent2
cat > /etc/zabbix/zabbix_agent2.conf << 'EOF'
Server=<ZABBIX_SERVER_IP>
ServerActive=<ZABBIX_SERVER_IP>
Hostname=web-server-01
# For active checks — the agent initiates connection
# Useful when server is behind NAT
# Allow custom parameters
AllowKey=system.run[*]
# Process monitoring
# system.process.num[nginx] etc.
EOF
systemctl enable --now zabbix-agent2
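Before linking the host in the frontend, it is worth confirming the agent answers passive checks. A sketch that talks to the agent directly (the IP is a placeholder; a plaintext newline-terminated request should be accepted by both agent and agent 2):

```python
# Ask the agent for a value directly, bypassing the Zabbix server.
import socket
import struct

def parse_zbx_response(raw: bytes) -> str:
    """Strip the ZBXD framing: 4-byte "ZBXD" signature plus a 0x01 flag byte,
    a 4-byte little-endian payload length, 4 reserved bytes, then the payload."""
    assert raw[:5] == b"ZBXD\x01", "not a Zabbix protocol response"
    (length,) = struct.unpack("<I", raw[5:9])
    return raw[13:13 + length].decode()

def agent_query(host: str, key: str = "agent.ping", port: int = 10050) -> str:
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(key.encode() + b"\n")  # plaintext passive request
        return parse_zbx_response(s.recv(4096))

# agent_query("10.0.0.5")  # "1" when the agent is up and this IP is in its Server= list
```

If the call times out, check that the Zabbix server's IP is listed in `Server=` and that port 10050 is open.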
Auto-adding hosts via API (for infrastructure-as-code):
import requests

ZABBIX_URL = "http://zabbix.example.com/api_jsonrpc.php"

def zabbix_login(user, password):
    resp = requests.post(ZABBIX_URL, json={
        "jsonrpc": "2.0",
        "method": "user.login",
        "params": {"username": user, "password": password},
        "id": 1
    })
    return resp.json()["result"]

def add_host(token, hostname, ip, group_id, template_ids):
    resp = requests.post(ZABBIX_URL, json={
        "jsonrpc": "2.0",
        "method": "host.create",
        "auth": token,
        "params": {
            "host": hostname,
            "interfaces": [{
                "type": 1,  # 1 = Zabbix agent interface
                "main": 1,
                "useip": 1,
                "ip": ip,
                "dns": "",
                "port": "10050"
            }],
            "groups": [{"groupid": group_id}],
            "templates": [{"templateid": tid} for tid in template_ids]
        },
        "id": 2
    })
    return resp.json()
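host.create wants numeric template IDs, which usually have to be looked up first. A sketch of that lookup via template.get (method and parameter names are from the Zabbix JSON-RPC API; the URL and token are as in the snippet above):

```python
# Resolve template display names to templateids before calling host.create.
ZABBIX_URL = "http://zabbix.example.com/api_jsonrpc.php"

def jsonrpc(method, params, token=None, req_id=1):
    """Build a Zabbix JSON-RPC 2.0 request body."""
    body = {"jsonrpc": "2.0", "method": method, "params": params, "id": req_id}
    if token:
        body["auth"] = token  # body "auth" is deprecated since 6.4 but still works in 7.0
    return body

def get_template_ids(token, names):
    import requests  # imported here so the payload builder above stays dependency-free
    resp = requests.post(ZABBIX_URL, json=jsonrpc(
        "template.get",
        {"filter": {"host": names}, "output": ["templateid"]},
        token,
    ))
    return [t["templateid"] for t in resp.json()["result"]]

# ids = get_template_ids(token, ["Linux by Zabbix agent", "Nginx by Zabbix agent"])
```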
Templates and Items
Zabbix comes with pre-built templates. For a web server:
- Linux by Zabbix agent — CPU, memory, disk, network, processes
- Nginx by Zabbix agent — active connections, requests/s, status codes
- PostgreSQL by Zabbix agent — connections, transactions, replication lag
- PHP-FPM by Zabbix agent — pool status, slow requests
Link templates to a host via the UI: Data collection → Hosts → select the host → Templates → Link new templates (in Zabbix 7.0 host configuration lives under Data collection, not the older Configuration menu).
Custom item for monitoring PHP-FPM via unix socket:
# /etc/zabbix/zabbix_agent2.d/php-fpm.conf
UserParameter=php-fpm.status[*],curl -s --unix-socket /run/php/php8.2-fpm.sock http://localhost/status?json 2>/dev/null
Item in Zabbix:
- Type: Zabbix agent
- Key: php-fpm.status[]
- Type of information: Numeric (unsigned) — the preprocessing step leaves a single number
- Preprocessing: JSONPath with expression $['active processes'] (bracket notation, because the key contains a space)
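For reference, the ?json status payload the item receives looks roughly like this (the field names are php-fpm's real ones, the values are invented), which is why the JSONPath needs bracket notation:

```python
import json

# Abridged php-fpm /status?json payload; note the spaces inside key names.
sample = '''{"pool": "www", "process manager": "dynamic",
"accepted conn": 20951, "listen queue": 0, "slow requests": 3,
"active processes": 2, "total processes": 7}'''

# Equivalent of the JSONPath preprocessing step $['active processes']:
print(json.loads(sample)["active processes"])  # -> 2
```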
Triggers — Real Thresholds
Don't blindly copy default triggers. Here are working thresholds for a production web server:
CPU:
Warning: avg(/hostname/system.cpu.util,5m) > 75
Critical: avg(/hostname/system.cpu.util,5m) > 90 and avg(/hostname/system.cpu.util,1m) > 90
Memory (accounting for cache):
Critical: last(/hostname/vm.memory.size[pavailable]) < 10
Disk — free space in percent (on very large volumes consider absolute-value thresholds instead, since 5% of a 4 TB disk is still 200 GB):
Warning: last(/hostname/vfs.fs.size[/,pfree]) < 15
Critical: last(/hostname/vfs.fs.size[/,pfree]) < 5
Nginx — request drop (anomaly). This assumes the nginx.requests item is preprocessed with "Change per second", so it stores requests/s rather than the raw counter:
# Fires if rps drops below 30% of the average for the past hour
last(/hostname/nginx.requests) < avg(/hostname/nginx.requests,1h) * 0.3
and avg(/hostname/nginx.requests,1h) > 10
Website availability:
last(/hostname/web.test.fail[site-check]) <> 0
Web Scenarios — HTTP Monitoring
Built-in availability checks via HTTP:
Data collection → Hosts → Web → Create web scenario:
Name: Site availability check
Update interval: 60s
Steps:
1. Homepage
URL: https://example.com/
Required status codes: 200
Required string: (something unique on the page)
Timeout: 15s
2. Health endpoint
URL: https://example.com/health
Required status codes: 200
Required string: ok
This automatically creates items keyed by scenario and step: web.test.fail[<scenario>], web.test.error[<scenario>], and per-step web.test.time[<scenario>,<step>,resp].
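Under the hood each step is just "fetch, match status code, match substring". A sketch of equivalent standalone logic, handy for re-checking the same URLs from cron or CI (URLs and criteria are the example ones from the scenario):

```python
# What a web-scenario step boils down to.
import urllib.request

def step_ok(status, body, required_codes=(200,), required_string=""):
    """Pass criteria of a single web-scenario step."""
    return status in required_codes and required_string in body

def check_step(url, timeout=15, **criteria):
    try:
        with urllib.request.urlopen(url, timeout=timeout) as r:
            return step_ok(r.status, r.read().decode(errors="replace"), **criteria)
    except Exception:
        return False  # a failed fetch counts as a failed step, like web.test.fail

# check_step("https://example.com/health", required_string="ok")
```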
Media and Notifications
Setting up Telegram notifications via Media Type:
Alerts → Media types → Telegram:
Bot token: <your_bot_token>
Chat ID: <your_chat_id>
Trigger action that sends the notification (Alerts → Actions → Trigger actions):
Name: Notify on PROBLEM
Conditions:
- Trigger severity >= Warning
- Maintenance status not in maintenance
Operations:
Send message to: Admin group
Via: Telegram
Recovery operations:
Send recovery message
A template message that's actually informative:
{TRIGGER.SEVERITY}: {TRIGGER.NAME}
Host: {HOST.NAME} ({HOST.IP})
Time: {EVENT.TIME} {EVENT.DATE}
Value: {ITEM.LASTVALUE}
{TRIGGER.URL}
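If the built-in media type is too rigid (custom formatting, buttons), the same message can be pushed through the Bot API directly from an alert script. sendMessage is the real endpoint; the token, chat ID and the helper below are illustrative:

```python
# Push an alert text to Telegram via the Bot API.
import urllib.parse
import urllib.request

def format_alert(severity, trigger, host, ip, value):
    """Abridged version of the message template above (severity, host, last value)."""
    return f"{severity}: {trigger}\nHost: {host} ({ip})\nValue: {value}"

def send_telegram(bot_token, chat_id, text):
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    urllib.request.urlopen(urllib.request.Request(url, data=data), timeout=10)

# send_telegram("<your_bot_token>", "<your_chat_id>",
#               format_alert("High", "CPU utilization > 90%", "web-server-01", "10.0.0.5", "93.1"))
```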
Dashboards and Visualization
A standard dashboard for the web team includes:
- Problem widget — active issues sorted by severity
- Graph widget — CPU/memory overlay for all web servers
- Top hosts — by CPU utilization
- Web monitoring — uptime for all web scenarios
Integration with Grafana via Zabbix datasource plugin (alexanderzobnin-zabbix-app):
grafana-cli plugins install alexanderzobnin-zabbix-app
Grafana provides more flexible visualization and is convenient for building custom dashboards, especially if data comes from multiple sources.
Data Retention and Performance
For large installations, Zabbix recommends TimescaleDB as a PostgreSQL extension:
-- Migrate existing database
SELECT create_hypertable('history', 'clock', chunk_time_interval => 86400, migrate_data => true);
SELECT create_hypertable('history_uint', 'clock', chunk_time_interval => 86400, migrate_data => true);
-- and for other history_* tables
TimescaleDB provides ~10:1 compression and significantly speeds up aggregate queries on time series.
Retention itself is managed by the built-in Housekeeping (Administration → Housekeeping):
- Trend storage: 365 days
- History storage: 90 days (or less for high-frequency metrics)
Typical Implementation Timeline
A basic setup monitoring 3-5 servers (agents, templates, main triggers, Telegram notifications): one working day.
Full production setup with custom items, web scenarios for all critical URLs, configured dashboards and AlertManager integration: 3-5 days.
Migration from existing monitoring or integration with Grafana as a unified frontend: additional 1-2 days.