Throttling Implementation for Web Application API

API Throttling for Web Applications

Throttling is the management of request processing speed on the server side, in contrast to rate limiting, which restricts the client. Throttling slows down or queues incoming requests to protect the backend from overload. The difference is fundamental: a rate limit says "you made too many requests", while throttling says "we process as much as we can".

Throttling vs Rate Limiting

| Aspect             | Rate Limiting           | Throttling                  |
|--------------------|-------------------------|-----------------------------|
| Subject            | Client (IP, user_id)    | Server (CPU, queue)         |
| Action on limit    | 429, request denied     | Request delayed or queued   |
| Purpose            | Protection from abuse   | Backend resource protection |
| Response to client | Immediate 429           | Delay or 503                |

In practice, both mechanisms are applied together.
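A minimal sketch of how the two combine, in plain TypeScript with no framework (class names and limits here are illustrative, not from any library):

```typescript
// Rate limiting: a per-client fixed window, answering "has THIS client sent too much?"
class RateLimiter {
  private windows = new Map<string, { count: number; start: number }>();

  constructor(private max: number, private windowMs: number) {}

  allow(clientId: string, now = Date.now()): boolean {
    const w = this.windows.get(clientId);
    if (!w || now - w.start >= this.windowMs) {
      this.windows.set(clientId, { count: 1, start: now });
      return true;
    }
    w.count += 1;
    return w.count <= this.max; // over the limit: respond 429
  }
}

// Throttling: a server-wide concurrency cap, answering "can the backend take more right now?"
class Throttle {
  private inFlight = 0;

  constructor(private maxConcurrent: number) {}

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.inFlight >= this.maxConcurrent) {
      throw new Error('overloaded'); // in practice: delay or queue instead of rejecting
    }
    this.inFlight += 1;
    try {
      return await task();
    } finally {
      this.inFlight -= 1;
    }
  }
}
```

A request handler would check `allow()` first (rejecting with 429) and only then wrap the actual work in `run()` (shedding or queuing with 503 and Retry-After).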

Throttling Heavy Operations

Some operations, such as report export, file processing, and bulk email sending, should not run with unlimited parallelism:

// BullMQ — throttle via concurrency + rateLimit
const queue = new Queue('reports', { connection: redis });

const worker = new Worker('reports', processReport, {
  connection: redis,
  concurrency: 5,            // maximum 5 parallel tasks
  limiter: {
    max: 10,                 // 10 tasks
    duration: 60_000,        // per 60 seconds
  },
});

// Add task with priority
await queue.add('generate-csv', { userId, filters }, {
  priority: user.plan === 'enterprise' ? 1 : 10,
  attempts: 3,
  backoff: { type: 'exponential', delay: 2000 },
});

Adaptive Throttling

Adaptive throttling reduces limits when latency or errors increase:

class AdaptiveThrottler {
  private limit = 100;
  private readonly minLimit = 10;
  private readonly maxLimit = 100;

  // The counter (e.g. a Redis-backed window counter) and getMetrics()
  // (p95 latency and error rate from your metrics store) are injected
  constructor(
    private counter: { increment(): Promise<number> },
    private getMetrics: () => Promise<{ p95Latency: number; errorRate: number }>,
  ) {}

  async check(): Promise<boolean> {
    const metrics = await this.getMetrics();

    // Lower the limit while p95 latency is high; raise it slowly when healthy
    if (metrics.p95Latency > 500) {
      this.limit = Math.max(this.minLimit, this.limit * 0.8);
    } else if (metrics.p95Latency < 200 && metrics.errorRate < 0.01) {
      this.limit = Math.min(this.maxLimit, this.limit * 1.1);
    }

    return (await this.counter.increment()) <= this.limit;
  }
}

Google uses a similar mechanism in its services (see "Client-Side Throttling" in the "Handling Overload" chapter of the SRE book).
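The SRE book's formula can be sketched directly: the client tracks total requests and accepted requests over a window, and locally rejects a new request with probability max(0, (requests − K·accepts) / (requests + 1)), where K = 2 is the multiplier the book suggests:

```typescript
// Client-side throttling probability from the SRE book ("Handling Overload").
// requests = all attempts in the window; accepts = those the backend accepted.
// K controls aggressiveness; the book suggests K = 2.
function rejectionProbability(requests: number, accepts: number, K = 2): number {
  return Math.max(0, (requests - K * accepts) / (requests + 1));
}
```

While the backend accepts most traffic the probability stays at 0; as accepts fall, the client starts shedding requests locally instead of hammering an already overloaded server.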

Circuit Breaker

Throttling outgoing requests to external APIs is handled with the Circuit Breaker pattern:

import CircuitBreaker from 'opossum';

const options = {
  timeout: 3000,          // request > 3 sec = fail
  errorThresholdPercentage: 50, // 50% errors → open
  resetTimeout: 30000,    // retry after 30 sec (half-open)
  volumeThreshold: 10,    // minimum 10 requests to calculate
};

const breaker = new CircuitBreaker(callExternalAPI, options);

breaker.on('open', () => logger.warn('Circuit breaker OPEN — external API unavailable'));
breaker.on('halfOpen', () => logger.info('Circuit breaker HALF-OPEN — testing'));
breaker.on('close', () => logger.info('Circuit breaker CLOSE — external API recovered'));

// Fallback when circuit open
breaker.fallback(() => ({ status: 'cached', data: getCachedData() }));

States: Closed (normal) → Open (too many errors, requests not sent) → Half-Open (trial request) → Closed (if successful).
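The same state machine can be hand-rolled in a few dozen lines. A simplified sketch (it counts consecutive failures rather than using opossum's percentage-over-window threshold):

```typescript
type State = 'closed' | 'open' | 'half-open';

class SimpleCircuitBreaker {
  private state: State = 'closed';
  private failures = 0;
  private openedAt = 0;

  constructor(private failureThreshold: number, private resetTimeoutMs: number) {}

  get currentState(): State {
    return this.state;
  }

  async exec<T>(fn: () => Promise<T>, now = Date.now()): Promise<T> {
    if (this.state === 'open') {
      if (now - this.openedAt >= this.resetTimeoutMs) {
        this.state = 'half-open'; // let one trial request through
      } else {
        throw new Error('circuit open'); // fail fast, do not call the backend
      }
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.state = 'closed'; // success closes the circuit again
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = now;
      }
      throw err;
    }
  }
}
```

The `now` parameter is only there to make the time-based transitions easy to test; in production code you would use the real clock.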

Throttling Incoming Webhooks

Partners can send thousands of webhooks at once (for example, on bulk status updates). The correct pattern is to accept quickly with 202 and process through a queue:

// WebhookController.php — immediate response
public function handle(Request $request)
{
    $payload = $request->all();
    $signature = $request->header('X-Signature');

    if (!$this->verifySignature($payload, $signature)) {
        return response()->json(['error' => 'Invalid signature'], 401);
    }

    // Put in the queue; the processing rate is capped by the worker config
    ProcessWebhook::dispatch($payload)->onQueue('webhooks');

    return response()->json(['accepted' => true], 202);
}
// config/horizon.php — supervisor for the webhooks queue
// (caps how many jobs are processed in parallel)
'webhooks' => [
    'connection' => 'redis',
    'queue' => ['webhooks'],
    'balance' => 'auto',
    'maxProcesses' => 10, // no more than 10 parallel
],

Throttling in Nginx Upstream

upstream backend {
    # max_conns caps concurrent connections per upstream server
    server app1:3000 max_conns=100;
    server app2:3000 max_conns=100;

    # keepalive is a pool of idle connections kept open to the upstream
    keepalive 32;
}

location /api/ {
    proxy_pass http://backend;
    proxy_connect_timeout 1s;
    proxy_read_timeout 30s;

    # If a backend errors out or times out, retry on the next upstream
    # (at most 2 tries) instead of hanging on one server
    proxy_next_upstream error timeout http_503;
    proxy_next_upstream_tries 2;
}

Monitoring Throttling

Metrics for the dashboard (here using the prom-client library):

// Prometheus metrics
import { Counter, Gauge, Histogram } from 'prom-client';

const throttleRejected = new Counter({ name: 'throttle_rejected_total', help: 'Requests rejected by throttling', labelNames: ['reason', 'endpoint'] });
const throttleDelayed = new Histogram({ name: 'throttle_delay_ms', help: 'Delay added by throttling, ms', labelNames: ['endpoint'] });
const queueDepth = new Gauge({ name: 'queue_depth', help: 'Current queue depth', labelNames: ['queue'] });

throttleRejected.inc({ reason: 'queue_full', endpoint: '/api/export' });
throttleDelayed.observe({ endpoint: '/api/export' }, delayMs);
queueDepth.set({ queue: 'reports' }, await queue.count());

Alert: queue depth > 1000 for 5 minutes → Scale up workers or notify on-call.
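That alert can be expressed as a Prometheus alerting rule; a sketch, assuming the `queue_depth` gauge and queue name from the snippet above:

```yaml
groups:
  - name: throttling
    rules:
      - alert: QueueDepthHigh
        expr: queue_depth{queue="reports"} > 1000
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Queue depth above 1000 for 5 minutes: scale up workers"
```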

Timeline

BullMQ with concurrency + rateLimit, a circuit breaker for external APIs, and a webhook queue: 3–5 days. With adaptive throttling, Prometheus metrics, a Grafana dashboard, and alerts: 1–2 weeks.