Distributed background jobs with multiple workers


Setting Up Distributed Background Jobs (Multiple Workers)

A single worker is a single point of failure with limited throughput. Multiple workers across multiple servers give you horizontal scaling of processing and resilience to individual node failures. The implementation requires a centralized broker, correct configuration, and an understanding of the problems that arise with parallel processing.

Architecture

[App Server 1]  [App Server 2]  [App Server 3]
      ↓                ↓               ↓
   dispatch         dispatch        dispatch
      ↓                ↓               ↓
         ┌─────────────────────────────┐
         │     Redis / RabbitMQ        │  ← centralized broker
         └─────────────────────────────┘
              ↓           ↓         ↓
        [Worker 1]  [Worker 2]  [Worker 3]   ← can be on different servers

The broker is the only component that must be accessible to all servers. Other nodes don't communicate directly.
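In practice this means every app server and every worker server points at the same broker host. A minimal .env sketch (the host, password, and database values are placeholders, not recommendations):

```ini
# Identical on every app and worker server
QUEUE_CONNECTION=redis
REDIS_HOST=10.0.0.5      # the centralized broker, not localhost
REDIS_PASSWORD=secret
REDIS_PORT=6379
```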

Broker Requirements

Redis — the standard choice for Laravel. Requires the phpredis extension or the predis package. For high availability — Redis Sentinel or Redis Cluster.

RabbitMQ — suits complex routing scenarios (fanout, topic exchanges). Laravel supports it via the vladimir-yuldashev/laravel-queue-rabbitmq package.

Amazon SQS — a managed service with no maintenance overhead. A good fit for AWS-hosted infrastructure.

Minimal Redis production configuration: a separate server (not shared with the main DB), persistence enabled (appendonly yes), and a maxmemory-policy configured.
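A sketch of those settings in redis.conf (the memory limit is a placeholder to size per host; noeviction is the safe policy for queues, since eviction would silently drop jobs):

```
appendonly yes                # persist queued jobs across restarts
appendfsync everysec
maxmemory 2gb                 # placeholder — size to the host
maxmemory-policy noeviction   # never evict queue data under memory pressure
```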

Laravel Configuration for Distributed Workers

// config/queue.php
'connections' => [
    'redis' => [
        'driver'       => 'redis',
        'connection'   => 'queue',   // separate Redis connection for queues
        'queue'        => env('REDIS_QUEUE', 'default'),
        'retry_after'  => 90,        // seconds before stuck task retries
        'block_for'    => 5,         // blocking BLPOP instead of polling
        'after_commit' => true,      // dispatch only after DB commit
    ],
],
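The 'connection' => 'queue' entry above refers to a named Redis connection defined in config/database.php. A sketch of what that might look like (the database number split is an assumption — keeping queues apart from cache data):

```php
// config/database.php
'redis' => [
    'client' => env('REDIS_CLIENT', 'phpredis'),

    // Dedicated connection for queues, separate from the cache
    'queue' => [
        'host'     => env('REDIS_HOST', '127.0.0.1'),
        'password' => env('REDIS_PASSWORD'),
        'port'     => env('REDIS_PORT', 6379),
        'database' => 1, // assumption: cache uses database 0
    ],
],
```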

retry_after is the key parameter for distributed workers: if a worker crashes mid-task, the task becomes visible to other workers again after retry_after seconds. It must be greater than the job's timeout — otherwise a still-running job can be handed to a second worker.
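One way to keep that invariant visible is to declare the timeout on the Job class itself, below retry_after (the class name and 60-second value are illustrative, matching the 90-second retry_after above):

```php
class SendReportJob implements ShouldQueue
{
    // Must stay below retry_after (90s in config/queue.php),
    // otherwise a slow run can be picked up by a second worker
    public int $timeout = 60;

    public int $tries = 3; // give up after three attempts
}
```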

Horizontal Scaling via Horizon

Horizon supports running on multiple servers. Each server runs its own Horizon instance, they don't coordinate directly — Redis acts as the common registry.

Same Supervisor config runs on each server:

[program:horizon]
command=php /var/www/artisan horizon
autostart=true
autorestart=true
user=www-data
stdout_logfile=/var/log/horizon.log
stopwaitsecs=3600

Horizon automatically balances workers within a single server. There is no inter-server balancing — tune the process count on each server manually according to its capacity.
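That per-server tuning lives in each server's config/horizon.php. A sketch of one supervisor entry (the process counts are assumptions to adjust per host):

```php
// config/horizon.php — process counts differ per server capacity
'environments' => [
    'production' => [
        'supervisor-1' => [
            'connection'   => 'redis',
            'queue'        => ['critical', 'default'],
            'balance'      => 'auto',   // shift processes between queues by load
            'minProcesses' => 1,
            'maxProcesses' => 10,       // assumption: sized for this host
        ],
    ],
],
```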

Concurrent Access and Deduplication

With multiple workers, a task can be picked up twice if a worker hangs and never releases its reservation. The Redis pop operation is atomic, so each task is handed to exactly one worker — but "invisible" tasks (reserved yet never completed) return to the queue after retry_after seconds.

If a task must take effect exactly once, make it idempotent and guard it explicitly:

class ProcessPaymentJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(private string $paymentId) {}

    public function handle(): void
    {
        // Distributed lock via Redis — only one worker processes payment
        $lock = Cache::lock("payment:{$this->paymentId}", 120);

        if (!$lock->get()) {
            // Another worker is already processing
            $this->release(10); // return to queue after 10 seconds
            return;
        }

        try {
            $payment = Payment::find($this->paymentId);

            // Idempotency check
            if ($payment?->status !== 'pending') {
                return; // already processed
            }

            $this->processPayment($payment);
        } finally {
            $lock->release();
        }
    }
}

Cache::lock() uses Redis SET NX PX — atomic operation guaranteeing exactly one worker gets the lock.
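For the common case, Laravel's built-in WithoutOverlapping job middleware wraps this same lock-and-release pattern, so the manual Cache::lock() boilerplate above can be replaced with a middleware declaration (the 10-second release mirrors the example):

```php
use Illuminate\Queue\Middleware\WithoutOverlapping;

public function middleware(): array
{
    // One job per payment ID at a time; an overlapping job is
    // released back to the queue and retried after 10 seconds
    return [(new WithoutOverlapping($this->paymentId))->releaseAfter(10)];
}
```

The middleware only prevents concurrent execution — the idempotency check on the payment status inside handle() is still needed for the retry-after-crash case.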

Worker Pool Separation by Load Type

Different servers can run workers for different queues if tasks require specific resources:

[Server: API-1, API-2]     → workers for 'critical', 'default'
[Server: Media-1]          → workers for 'transcoding', 'media'
[Server: Worker-1]         → workers for 'batch', 'reports', 'low'

The media server has a GPU or a powerful CPU for FFmpeg; the API servers run fast workers with a short timeout.

Supervisor on Media server:

[program:media-worker]
command=php /var/www/artisan queue:work --queue=transcoding,media --timeout=3600 --max-jobs=1
numprocs=2
autostart=true
autorestart=true
user=www-data

--max-jobs=1 — the worker takes one job and then restarts, freeing memory after the heavy operation.
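Routing work to these pools happens at dispatch time via the standard onQueue() call (the job class names here are illustrative):

```php
// Heavy work goes to the media pool, reporting to the batch pool
TranscodeVideoJob::dispatch($video)->onQueue('transcoding');
GenerateReportJob::dispatch($month)->onQueue('reports');
```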

Graceful Shutdown

On deploy, wait for in-flight jobs to finish rather than killing workers abruptly:

php artisan queue:restart

This command sets a flag in Redis — each worker finishes its current job and stops; Supervisor then restarts them with the new code.

In Supervisor, stopwaitsecs must be at least the maximum Job timeout — otherwise Supervisor will SIGKILL a worker mid-job:

stopwaitsecs=3600   # for transcoding server
stopwaitsecs=120    # for standard workers

Monitoring Distributed State

Horizon aggregates metrics from all servers in one dashboard. Key indicators:

  • Throughput (tasks/minute) per queue
  • Wait time — average task wait in queue
  • Runtime — average execution time
  • Failed jobs — count of failed tasks
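Horizon can also fire a LongWaitDetected event when wait time crosses a threshold, which is a convenient alerting hook. A sketch of the config (the threshold values are assumptions):

```php
// config/horizon.php — seconds of queue wait before LongWaitDetected fires
'waits' => [
    'redis:critical' => 30,   // alert if critical jobs wait > 30s
    'redis:default'  => 60,
],
```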

Automatic worker scaling (if infrastructure on Kubernetes):

# HPA for scaling worker pods by queue depth metric
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-workers
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-worker
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: External
      external:
        metric:
          name: redis_queue_depth
          selector:
            matchLabels:
              queue: default
        target:
          type: AverageValue
          averageValue: "50"  # scale if > 50 tasks per worker

The custom redis_queue_depth metric is exported via the Prometheus Redis Exporter; exposing it to the HPA as an External metric additionally requires a metrics adapter (e.g. prometheus-adapter or KEDA).

RabbitMQ as Alternative

When complex routing is needed (different event types → different queues, fanout broadcast), RabbitMQ provides more flexibility:

// config/queue.php
'rabbitmq' => [
    'driver'   => 'rabbitmq',
    'dsn'      => env('RABBITMQ_DSN', 'amqp://user:pass@localhost:5672/'),
    'queue'    => env('RABBITMQ_QUEUE', 'default'),
    'options'  => [
        'exchange' => [
            'name' => 'app-exchange',
            'type' => 'direct',
        ],
        'queue' => [
            'durable'     => true,
            'exclusive'   => false,
            'auto_delete' => false,
        ],
    ],
],

RabbitMQ Management UI (port 15672) provides detailed monitoring: consumers, connections, channel load, message rates.

Timeline

Setting up Redis Sentinel/Cluster or RabbitMQ, configuring Horizon across multiple servers, and Supervisor — 1 working day. Distributed locks and idempotency checks in critical Jobs — 6–8 hours. Kubernetes HPA with Prometheus integration — a separate project of 1–2 days.