Background jobs error handling with retry dead letter and alerting

Configuring Error Handling in Background Jobs (retry, dead letter, alerting)

A job failed. What next? By default, Laravel simply marks the job as failed and forgets about it. Without retry logic, a dead letter queue, or notifications, failures are noticed only by chance. A sound architecture defines how many times to retry, at what intervals, what happens to permanently failed jobs, and who is notified.

Retry Parameters

In the Job class:

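use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class SendEmailJob implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public int $tries   = 5;        // maximum attempts
    public int $timeout = 30;       // timeout for one attempt (seconds)

    // Fixed pause between attempts (seconds). Use this property OR the
    // backoff() method below, not both on the same class.
    // public int $backoff = 60;

    // Growing delays instead of a fixed one
    public function backoff(): array
    {
        // wait 10s after attempt 1, 30s after 2, 60s after 3, 120s after 4;
        // with $tries = 5 the last value (300s) is never reached
        return [10, 30, 60, 120, 300];
    }
}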

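Define the delay either as a $backoff property (one fixed pause) or as a backoff() method, not both on the same class, so there is no ambiguity about which one the worker uses. Returning an array sets a different delay for each retry, so the pauses grow with every failure, in the spirit of exponential backoff. This matters most for external APIs: if a service is temporarily unavailable, you shouldn't hammer it every 10 seconds.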
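The hand-written array above can also be generated. The following is a minimal sketch in plain PHP, assuming a doubling delay with an upper cap; exponentialBackoff() and its parameters are hypothetical names, not part of Laravel.

```php
<?php
declare(strict_types=1);

/**
 * Build the list of delays (in seconds) between attempts.
 * Hypothetical helper for illustration, not a Laravel API.
 *
 * @return int[] one delay per retry, i.e. $tries - 1 entries
 */
function exponentialBackoff(int $tries, int $baseSeconds = 10, int $maxSeconds = 300): array
{
    $delays = [];

    // A job with N tries is retried at most N - 1 times.
    for ($retry = 0; $retry < $tries - 1; $retry++) {
        // Double the delay on every retry, but never exceed the cap.
        $delays[] = min($baseSeconds * (2 ** $retry), $maxSeconds);
    }

    return $delays;
}
```

In a job class, the backoff() method could return exponentialBackoff($this->tries) instead of a literal array. The cap keeps a long retry chain from producing delays of several hours.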

Differentiating Errors: Retryable vs Fatal

Not all errors are worth retrying. An invalid data format won't fix itself on the second attempt: that is a fatal error. An unavailable API might respond a minute later: that is a temporary one:

public function handle(): void
{
    try {
        $this->processData();
    } catch (ValidationException $e) {
        // Data is invalid — retry is pointless
        $this->fail($e);
        return;
    } catch (ModelNotFoundException $e) {
        // Record is deleted — retry won't help
        $this->fail($e);
        return;
    } catch (ConnectionException | TimeoutException $e) {
        // Temporary network error — retry
        throw $e; // let Queue handle retry
    } catch (\Throwable $e) {
        // Unknown error — retry too, but log it
        Log::warning("Unexpected error in SendEmailJob, attempt {$this->attempts()}: {$e->getMessage()}");
        throw $e;
    }
}

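$this->fail($e) immediately marks the job as failed and skips the remaining attempts (the method comes from the InteractsWithQueue trait). Rethrowing with throw $e hands the exception back to the worker, which releases the job for another attempt after the backoff delay, or marks it as failed if that was the last attempt.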
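When the list of fatal exceptions grows, the chain of catch blocks can be replaced by a single lookup. The sketch below is plain PHP and uses standard library exceptions as stand-ins; isRetryable() and the $fatal list are assumptions for illustration. In a Laravel job the list would hold ValidationException, ModelNotFoundException, and similar classes.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether a failure is worth another attempt.
 * Hypothetical helper; the list of fatal classes is supplied by the caller.
 *
 * @param string[] $fatalClasses fully qualified exception class names
 */
function isRetryable(Throwable $e, array $fatalClasses): bool
{
    foreach ($fatalClasses as $fatalClass) {
        if ($e instanceof $fatalClass) {
            // Deterministic failure: another attempt would give the same result.
            return false;
        }
    }

    // Unknown or transient failure: let the queue retry.
    return true;
}

// Stand-ins for "invalid data" style errors (assumption for this sketch).
$fatal = [InvalidArgumentException::class, DomainException::class];
```

Inside handle(), a single catch (\Throwable $e) could then call $this->fail($e) when isRetryable($e, $fatal) returns false and rethrow otherwise.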

The failed() Method

Called once the job has permanently failed, either after all attempts are exhausted or after an explicit $this->fail() call:

public function failed(\Throwable $e): void
{
    // Notify the user
    if ($this->userId) {
        $user = User::find($this->userId);
        $user?->notify(new JobFailedNotification($this->jobType, $e->getMessage()));
    }

    // Log with context
    Log::error('Job permanently failed', [
        'job'       => static::class,
        'payload'   => $this->getPayloadForLog(),
        'attempts'  => $this->attempts(),
        'exception' => [
            'class'   => get_class($e),
            'message' => $e->getMessage(),
            'file'    => $e->getFile() . ':' . $e->getLine(),
        ],
    ]);

    // Save to own table for audit
    FailedJobAudit::create([
        'job_class'   => static::class,
        'payload'     => json_encode($this->getPayloadForLog()),
        'error'       => $e->getMessage(),
        'failed_at'   => now(),
    ]);

    // Alert DevOps channel
    $this->alertSlack($e);
}

private function getPayloadForLog(): array
{
    // Return only safe data (without passwords, tokens)
    return ['user_id' => $this->userId, 'type' => $this->jobType];
}

Dead Letter Queue

A Dead Letter Queue (DLQ) is a separate queue for permanently failed jobs. It lets you analyze them and reprocess them later, manually or automatically.

Laravel doesn't implement DLQ out of the box, but the pattern is straightforward to build:

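// app/Jobs/Middleware/DeadLetterMiddleware.php
class DeadLetterMiddleware
{
    public function handle(object $job, callable $next): void
    {
        try {
            $next($job);
        } catch (\Throwable $e) {
            // Fall back to a single attempt if the job does not define $tries
            if ($job->attempts() >= ($job->tries ?? 1)) {
                // Last attempt: send to DLQ. Serialize a copy without the
                // underlying queue job ($job->job), which references the
                // connection and container and is not meant to be serialized.
                $copy = clone $job;
                $copy->job = null;

                dispatch(new DeadLetterJob(
                    originalClass:   get_class($job),
                    serializedJob:   serialize($copy),
                    errorMessage:    $e->getMessage(),
                    errorTrace:      $e->getTraceAsString(),
                ))->onQueue('dead-letter');
            }
            // Rethrow so the worker still records the failure or schedules a retry
            throw $e;
        }
    }
}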

Apply middleware to the Job:

public function middleware(): array
{
    return [new DeadLetterMiddleware()];
}

DeadLetterJob is a simple wrapper that stores the serialized task and allows later recovery:

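class DeadLetterJob implements ShouldQueue
{
    // Queueable provides onQueue(), which the middleware relies on
    use Dispatchable, InteractsWithQueue, Queueable;

    public int $tries = 1; // DLQ tasks don't retry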

    public function __construct(
        public string $originalClass,
        public string $serializedJob,
        public string $errorMessage,
        public string $errorTrace,
        public \Carbon\Carbon $failedAt = new \Carbon\Carbon(),
    ) {}

    public function handle(): void
    {
        // Simply save for audit
        DeadLetterRecord::create([
            'original_class'  => $this->originalClass,
            'serialized_job'  => $this->serializedJob,
            'error_message'   => $this->errorMessage,
            'failed_at'       => $this->failedAt,
        ]);
    }

    public function restore(): void
    {
        $originalJob = unserialize($this->serializedJob);
        dispatch($originalJob);
    }
}

Command to rerun tasks from DLQ:

// app/Console/Commands/RetryDeadLetterJobs.php
public function handle(): void
{
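    // eachById() pages by primary key, so updating retried_at inside the
    // callback does not shift the result set and skip records
    DeadLetterRecord::where('failed_at', '>=', now()->subDays(3))
        ->whereNull('retried_at')
        ->eachById(function (DeadLetterRecord $record) {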
            $job = unserialize($record->serialized_job);
            dispatch($job);
            $record->update(['retried_at' => now()]);
            $this->info("Retried: {$record->original_class} [{$record->id}]");
        });
}
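To see the dead-letter flow in isolation from Laravel, here is a minimal in-memory sketch in plain PHP. It retries a task up to a limit and, on the final failure, records it in a dead-letter list instead of dropping it. runWithDeadLetter() and the record shape are hypothetical and only mirror the idea of the middleware above.

```php
<?php
declare(strict_types=1);

/**
 * Run a task up to $tries times; on the final failure, append a record to
 * the dead-letter list. All names here are hypothetical.
 *
 * @param callable $task        receives the 1-based attempt number
 * @param array    $deadLetters list of ['name', 'error', 'attempts'] records
 */
function runWithDeadLetter(string $name, callable $task, int $tries, array &$deadLetters): bool
{
    for ($attempt = 1; $attempt <= $tries; $attempt++) {
        try {
            $task($attempt);

            return true; // success: nothing to record
        } catch (Throwable $e) {
            if ($attempt === $tries) {
                // Attempts exhausted: keep enough context to inspect or replay later.
                $deadLetters[] = [
                    'name'     => $name,
                    'error'    => $e->getMessage(),
                    'attempts' => $attempt,
                ];
            }
        }
    }

    return false;
}
```

A task that fails twice and then succeeds never reaches the list, while a task that fails on every attempt leaves exactly one record, which is what a recovery command would later read.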

Alerting

Notify Slack when a Job fails:

private function alertSlack(\Throwable $e): void
{
    $env     = config('app.env');
    $payload = [
        'text'        => null,
        'attachments' => [[
            'color'  => 'danger',
            'title'  => "Job Failed [{$env}]",
            'fields' => [
                ['title' => 'Job',     'value' => static::class,        'short' => true],
                ['title' => 'Error',   'value' => $e->getMessage(),     'short' => false],
                ['title' => 'Attempts','value' => (string)$this->attempts(), 'short' => true],
                ['title' => 'Time',    'value' => now()->toDateTimeString(), 'short' => true],
            ],
            'footer' => config('app.url'),
        ]],
    ];

    rescue(fn() => Http::post(config('services.slack.job_alerts_webhook'), $payload));
}

rescue() catches any exception thrown by the call, reports it to the exception handler, and returns null, so a failure in alerting cannot break failed() itself.
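The idea behind this wrapper can be shown in a few lines of plain PHP. The following is a simplified stand-in written for illustration, not Laravel's actual implementation; safely() and its $report callback are hypothetical names.

```php
<?php
declare(strict_types=1);

/**
 * Run a callback and swallow any exception it throws.
 * Simplified stand-in for a rescue-style helper (assumption for this sketch).
 *
 * @param callable      $callback the risky call, e.g. sending a webhook
 * @param mixed         $default  value returned when the callback throws
 * @param callable|null $report   optional hook that receives the exception
 */
function safely(callable $callback, mixed $default = null, ?callable $report = null): mixed
{
    try {
        return $callback();
    } catch (Throwable $e) {
        if ($report !== null) {
            // Surface the problem (log, error tracker) without rethrowing.
            $report($e);
        }

        return $default;
    }
}
```

The important property is that the exception is reported but never propagates, so the rest of failed() (logging, audit record) still runs when the webhook is unreachable.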

Monitoring Failed Jobs Count

A scheduled Artisan command checks the number of failed jobs periodically:

// app/Console/Commands/CheckFailedJobs.php
public function handle(): void
{
    $count     = DB::table('failed_jobs')
        ->where('failed_at', '>=', now()->subHour())
        ->count();

    $threshold = (int) config('queue.failed_jobs_alert_threshold', 10);

    if ($count >= $threshold) {
        Http::post(config('services.telegram.webhook_url'), [
            'chat_id' => config('services.telegram.admin_chat_id'),
            'text'    => "⚠️ {$count} jobs failed in the last hour",
        ]);
    }
}

The command is registered in the scheduler (this assumes its signature is queue:check-failed):

// routes/console.php
Schedule::command('queue:check-failed')->everyFiveMinutes();
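Because the check runs every five minutes over a one-hour window, the same burst of failures can trigger the alert repeatedly. One possible mitigation is a cooldown between alerts. This is a sketch in plain PHP under that assumption; shouldSendAlert() and the one-hour default are hypothetical, and the time of the last alert would have to be stored somewhere, for example in the cache.

```php
<?php
declare(strict_types=1);

/**
 * Decide whether to send a "too many failed jobs" alert.
 * Hypothetical helper: combines the threshold check with a cooldown.
 *
 * @param int|null $lastAlertAt Unix timestamp of the previous alert, null if none
 * @param int      $now         current Unix timestamp
 */
function shouldSendAlert(
    int $failedCount,
    int $threshold,
    ?int $lastAlertAt,
    int $now,
    int $cooldownSeconds = 3600
): bool {
    if ($failedCount < $threshold) {
        return false; // below the threshold: nothing to report
    }

    if ($lastAlertAt === null) {
        return true; // first alert for this incident
    }

    // Stay quiet until the cooldown since the previous alert has passed.
    return ($now - $lastAlertAt) >= $cooldownSeconds;
}
```

The command would call this before Http::post() and record the current time after a successful send.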

Timeline

Configuring the retry strategy, the failed() method, and alerting: 3–4 hours. Implementing a Dead Letter Queue with a recovery command: another 4–5 hours. Integration with Horizon and a monitoring dashboard: 2–3 hours.