RabbitMQ Cluster Setup for Web Applications
A standalone RabbitMQ node is a single point of failure. A three-node cluster with quorum queues tolerates the loss of one node without losing messages.
A RabbitMQ cluster shares metadata (exchanges, bindings, users) across all nodes, but queue contents are either local to one node (classic queues) or replicated (quorum queues and streams). For production, use quorum queues.
Cluster Architecture
              Load Balancer (HAProxy / Nginx)
                             |
             ┌───────────────┼───────────────┐
             ↓               ↓               ↓
      rabbit-1:5672    rabbit-2:5672    rabbit-3:5672
      rabbit-1:15672   rabbit-2:15672   rabbit-3:15672   (management)
Quorum queues replicate via the Raft consensus protocol. A write is confirmed once a majority of replicas has accepted it: 2 of 3 nodes in this cluster.
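The majority rule can be sketched as simple arithmetic. The function names below are illustrative helpers, not RabbitMQ commands; they only show why three nodes tolerate one failure and why an even member count adds no extra tolerance.

```shell
# Majority (quorum) size for a Raft group of N members.
quorum_size() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can fail while a majority is still reachable.
tolerated_failures() {
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 3 4 5; do
  echo "members=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
# members=3 quorum=2 tolerated_failures=1
# members=4 quorum=3 tolerated_failures=1
# members=5 quorum=3 tolerated_failures=2
```

A fourth node raises the quorum to 3 without raising the number of tolerated failures, which is why odd cluster sizes are the usual choice.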
Installation on Ubuntu 22.04
# Add Erlang repository (version matters—RabbitMQ 3.13 requires Erlang 26)
curl -1sLf 'https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-erlang/setup.deb.sh' | bash
apt install -y erlang-base erlang-asn1 erlang-crypto erlang-eldap erlang-ftp \
erlang-inets erlang-mnesia erlang-os-mon erlang-parsetools erlang-public-key \
erlang-runtime-tools erlang-snmp erlang-ssl erlang-syntax-tools erlang-tftp \
erlang-tools erlang-xmerl
# RabbitMQ
curl -1sLf 'https://dl.cloudsmith.io/public/rabbitmq/rabbitmq-server/setup.deb.sh' | bash
apt install -y rabbitmq-server=3.13.*
systemctl enable rabbitmq-server
Node Configuration
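/etc/rabbitmq/rabbitmq.conf is the same on all nodes. The node name is not a rabbitmq.conf key; set it per node in /etc/rabbitmq/rabbitmq-env.conf:
# /etc/rabbitmq/rabbitmq-env.conf on rabbit-1 (change only the name on other nodes)
NODENAME=rabbit@rabbit-1
# /etc/rabbitmq/rabbitmq.conf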
# Network
listeners.tcp.default = 5672
management.tcp.port = 15672
# Cluster—all nodes must know each other
cluster_formation.peer_discovery_backend = rabbit_peer_discovery_classic_config
cluster_formation.classic_config.nodes.1 = rabbit@rabbit-1
cluster_formation.classic_config.nodes.2 = rabbit@rabbit-2
cluster_formation.classic_config.nodes.3 = rabbit@rabbit-3
# Resource alarms: publishers are blocked when memory use exceeds the watermark or free disk drops below the limit
vm_memory_high_watermark.relative = 0.6
vm_memory_high_watermark_paging_ratio = 0.75
disk_free_limit.relative = 1.5
# TLS
# ssl_options.cacertfile = /etc/rabbitmq/ssl/ca.crt
# ssl_options.certfile = /etc/rabbitmq/ssl/rabbit-1.crt
# ssl_options.keyfile = /etc/rabbitmq/ssl/rabbit-1.key
# Heartbeat
heartbeat = 60
# Frame size
frame_max = 131072
# Logging
log.file.level = warning
log.console = true
log.console.level = warning
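To make the relative limits in the configuration above concrete, here is a rough sketch that converts them to absolute values. The 16 GiB host size is an assumption for illustration, and watermark_mib is a hypothetical helper; both relative settings are expressed as fractions of total RAM.

```shell
# Convert a relative limit into MiB for a host with the given amount of RAM.
# Percentages are used because POSIX shell arithmetic is integer-only.
watermark_mib() {   # $1 = total RAM in MiB, $2 = limit in percent of RAM
  echo $(( $1 * $2 / 100 ))
}

total_mib=16384                                  # assumed host size: 16 GiB
memory_alarm_mib=$(watermark_mib "$total_mib" 60)    # vm_memory_high_watermark.relative = 0.6
disk_limit_mib=$(watermark_mib "$total_mib" 150)     # disk_free_limit.relative = 1.5

echo "memory alarm at ${memory_alarm_mib} MiB used"
echo "disk alarm below ${disk_limit_mib} MiB free"
```

On such a host the disk alarm would fire with less than 24 GiB free, so the data volume needs to be sized well above that.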
Erlang cookie: must be identical on all nodes (used to authenticate nodes and CLI tools to each other):
# The package starts the service on install, so stop it on every node before replacing the cookie
systemctl stop rabbitmq-server
# Generate on first node
openssl rand -hex 32 | tr -d '\n' > /var/lib/rabbitmq/.erlang.cookie
chmod 400 /var/lib/rabbitmq/.erlang.cookie
chown rabbitmq:rabbitmq /var/lib/rabbitmq/.erlang.cookie
# Copy to other nodes, then repeat chmod and chown there (scp does not preserve ownership)
scp /var/lib/rabbitmq/.erlang.cookie rabbit-2:/var/lib/rabbitmq/.erlang.cookie
scp /var/lib/rabbitmq/.erlang.cookie rabbit-3:/var/lib/rabbitmq/.erlang.cookie
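A mismatched cookie is a common reason nodes refuse to cluster, so it can help to compare digests before starting the nodes. The following is a minimal sketch: file_digest and all_equal are hypothetical helpers, and the commented wiring assumes root SSH access to rabbit-2 and rabbit-3.

```shell
# Digest of a file, without the file name that sha256sum normally prints.
file_digest() {
  sha256sum < "$1" | cut -d' ' -f1
}

# Return 0 if every argument is the same string, 1 otherwise.
all_equal() {
  first=$1
  for value in "$@"; do
    [ "$value" = "$first" ] || return 1
  done
  return 0
}

# Example wiring (commented out because it needs SSH access to the nodes):
# cookie=/var/lib/rabbitmq/.erlang.cookie
# d1=$(file_digest "$cookie")
# d2=$(ssh rabbit-2 "sha256sum < $cookie" | cut -d' ' -f1)
# d3=$(ssh rabbit-3 "sha256sum < $cookie" | cut -d' ' -f1)
# all_equal "$d1" "$d2" "$d3" && echo "cookies match" || echo "cookie mismatch"
```

Comparing digests rather than the files themselves avoids copying the secret around a second time.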
Joining Nodes to Cluster
# Start all three nodes
systemctl start rabbitmq-server
# On rabbit-2 and rabbit-3:
rabbitmqctl stop_app
rabbitmqctl reset
rabbitmqctl join_cluster rabbit@rabbit-1
rabbitmqctl start_app
# Check cluster status
rabbitmqctl cluster_status
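When these steps are scripted, a node may not be ready the instant start_app returns. One possible approach is a small retry wrapper; retry is a hypothetical helper, and the commented usage assumes the rabbitmq-diagnostics check_running and rabbitmqctl await_online_nodes commands are available on the node.

```shell
# Run a command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$(( i + 1 ))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  return 1
}

# Example usage (commented out because it needs a running node):
# retry 10 3 rabbitmq-diagnostics -q check_running || exit 1
# rabbitmqctl await_online_nodes 3
```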
Configuring Quorum Queues
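Quorum queues are the right choice for production. Classic mirrored queues are deprecated and removed in RabbitMQ 4.0.
# The queue type cannot be set by a policy; it is fixed at declaration time via x-queue-type.
# A policy can still set quorum queue options such as the delivery limit and dead letter exchange.
rabbitmqctl set_policy quorum-queues "^quorum\." \
'{"delivery-limit":5,"dead-letter-exchange":"dlx"}' \
--priority 1 \
--apply-to queues
# Create a queue with an explicit type via the Management API (replace admin:password with real credentials)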
curl -u admin:password -X PUT http://rabbit-1:15672/api/queues/%2F/order-processing \
-H "Content-Type: application/json" \
-d '{
"durable": true,
"arguments": {
"x-queue-type": "quorum",
"x-quorum-initial-group-size": 3,
"x-delivery-limit": 5,
"x-dead-letter-exchange": "dlx",
"x-dead-letter-routing-key": "order-processing.failed"
}
}'
HAProxy for Load Balancing
# /etc/haproxy/haproxy.cfg
global
log /dev/log local0
maxconn 50000
defaults
mode tcp
log global
retries 3
timeout connect 5s
# Idle timeouts must be well above the AMQP heartbeat interval, or HAProxy cuts quiet connections
timeout client 3h
timeout server 3h
frontend rabbitmq-frontend
bind *:5672
default_backend rabbitmq-backend
backend rabbitmq-backend
balance roundrobin
option tcp-check
server rabbit-1 rabbit-1:5672 check inter 5s rise 2 fall 3
server rabbit-2 rabbit-2:5672 check inter 5s rise 2 fall 3
server rabbit-3 rabbit-3:5672 check inter 5s rise 2 fall 3
frontend rabbitmq-mgmt
bind *:15672
default_backend rabbitmq-mgmt-backend
backend rabbitmq-mgmt-backend
balance roundrobin
server rabbit-1 rabbit-1:15672 check
server rabbit-2 rabbit-2:15672 check
server rabbit-3 rabbit-3:15672 check
User Configuration and Permissions
# Delete default guest user
rabbitmqctl delete_user guest
# Create administrator (keep the generated password, e.g. in a secrets manager)
ADMIN_PASS=$(openssl rand -base64 32)
rabbitmqctl add_user admin "$ADMIN_PASS"
rabbitmqctl set_user_tags admin administrator
rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
# Application: limited permissions, no management tags
WEBAPP_PASS=$(openssl rand -base64 32)
rabbitmqctl add_user webapp "$WEBAPP_PASS"
rabbitmqctl set_user_tags webapp
# Only required exchanges and queues (configure, write, read); [.-] also covers order-processing
rabbitmqctl set_permissions -p / webapp \
"^(order|notification|user)[.-]" \
"^(order|notification|user)[.-]" \
"^(order|notification|user)[.-]"
# Monitoring: read only
MONITORING_PASS=$(openssl rand -base64 32)
rabbitmqctl add_user monitoring "$MONITORING_PASS"
rabbitmqctl set_user_tags monitoring monitoring
rabbitmqctl set_permissions -p / monitoring "^$" "^$" ".*"
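Permission patterns are regular expressions matched against exchange and queue names, so it is worth checking them against the names the application actually uses. The sketch below approximates that check with grep -E; permits is a hypothetical helper, and the pattern is an assumed variant that accepts either a dot or a hyphen after the prefix, so that a queue named order-processing is covered.

```shell
# Return 0 if resource name $2 matches permission regex $1.
# grep -E (POSIX ERE) is only an approximation of the broker's regex engine,
# but it behaves the same for simple patterns like this one.
permits() {
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

pattern='^(order|notification|user)[.-]'   # assumed pattern, adjust to your own

for name in order.created order-processing notification.email dlx orders; do
  if permits "$pattern" "$name"; then
    echo "$name: allowed"
  else
    echo "$name: denied"
  fi
done
```

With this pattern, dlx and orders are denied, which is the intended effect of anchoring the expression and requiring a separator after the prefix.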
Monitoring
Prometheus exporter is built-in since RabbitMQ 3.8:
# Enable plugins
rabbitmq-plugins enable rabbitmq_prometheus rabbitmq_management
# Metrics available on :15692/metrics
Key alerts:
- rabbitmq_queue_messages{queue="order-processing"} > 10000: queue growing (per-queue labels require per-object metrics, e.g. the /metrics/per-object endpoint)
- rabbitmq_process_resident_memory_bytes / rabbitmq_resident_memory_limit_bytes > 0.8: memory pressure
- rabbitmq_disk_space_available_bytes < 5368709120: low disk space (below 5 GiB)
- count(rabbitmq_build_info) < 3: cluster node down
Timeline
Day 1: install Erlang and RabbitMQ on 3 nodes, sync the Erlang cookie, add the node hostnames to /etc/hosts.
Day 2: form the cluster, create quorum queues, configure users, policy, and the dead letter exchange.
Day 3: configure HAProxy, enable the Prometheus exporter, import the Grafana dashboard (official: ID 10991), test fault tolerance by stopping one node.
Day 4: integrate with the application, run a performance test, configure alerts.







