API Rate Limiting for Web Applications
Rate limiting restricts the number of requests a single source may make within a time window. It protects against brute force, credential stuffing, application-level DDoS, and uncontrolled resource consumption by partner integrations. Without rate limiting, a single buggy client (e.g. one stuck in an infinite retry loop) can bring down an entire application.
Algorithms
Fixed Window — a counter that resets every N seconds. Simple to implement, but vulnerable to bursts at the window boundary: 100 requests at the end of one window plus 100 at the start of the next is effectively 200 back-to-back.
Sliding Window — counts requests over a window that moves with time. Distributes load more evenly.
Token Bucket — tokens accumulate at a refill rate; each request costs one token. Allows bursts up to the bucket size, then throttles.
Leaky Bucket — a queue of requests drained at a fixed rate. Produces the most even load on the backend.
For most web applications, Sliding Window with Redis is sufficient.
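To make the Token Bucket behavior concrete, here is a minimal in-process sketch (illustrative only — the class name and parameters are not from any library; production setups would use the Redis-backed limiters shown below):

```typescript
// Minimal token-bucket sketch: tokens refill continuously at refillRate,
// each request consumes one token; capacity bounds the burst size.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,     // max burst size
    private refillRate: number,   // tokens added per second
    now: number = Date.now(),
  ) {
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Returns true if the request may proceed, false if it should be rejected.
  tryConsume(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillRate);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Note how the burst/steady-state trade-off falls out of two numbers: capacity controls how many requests can arrive at once, refillRate controls the sustained throughput.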
Implementation in Laravel
Laravel's built-in throttle middleware uses the configured cache store (e.g. Redis) out of the box:
// routes/api.php
Route::middleware(['auth:sanctum', 'throttle:api'])->group(function () {
    Route::apiResource('articles', ArticleController::class);
});
// Limits are defined via RateLimiter in
// app/Providers/RouteServiceProvider.php (boot method)
RateLimiter::for('api', function (Request $request) {
    return $request->user()
        ? Limit::perMinute(300)->by($request->user()->id)
        : Limit::perMinute(60)->by($request->ip());
});
// Different limits for different tiers
// (an alternative definition of the same 'api' limiter)
RateLimiter::for('api', function (Request $request) {
    $user = $request->user();
    if (!$user) {
        return Limit::perMinute(30)->by($request->ip());
    }
    return match ($user->plan) {
        'enterprise' => Limit::perMinute(1000)->by($user->id),
        'pro' => Limit::perMinute(300)->by($user->id),
        default => Limit::perMinute(60)->by($user->id),
    };
});
Laravel automatically adds the X-RateLimit-Limit and X-RateLimit-Remaining headers, plus Retry-After on throttled (429) responses.
Implementation in NestJS + Redis
import { APP_GUARD } from '@nestjs/core';
import { ThrottlerModule, ThrottlerGuard } from '@nestjs/throttler';
import { ThrottlerStorageRedisService } from 'nestjs-throttler-storage-redis';
// app.module.ts
ThrottlerModule.forRoot({
    throttlers: [
        { name: 'short', ttl: 1000, limit: 10 },     // 10 req/sec
        { name: 'medium', ttl: 60000, limit: 300 },  // 300 req/min
        { name: 'long', ttl: 3600000, limit: 5000 }, // 5000 req/hour
    ],
    storage: new ThrottlerStorageRedisService(redisClient),
}),
// Register the guard globally so the limits apply to every route:
// providers: [{ provide: APP_GUARD, useClass: ThrottlerGuard }]
// Decorator on specific endpoint
@Throttle({ short: { limit: 3, ttl: 60000 } }) // 3 requests per minute
@Post('auth/login')
async login(@Body() dto: LoginDto) { ... }
Rate Limiting at the Nginx Level
First line of defense before PHP/Node:
# limit_req_zone — define shared-memory zones (10 MB each)
limit_req_zone $binary_remote_addr zone=api_general:10m rate=10r/s;
limit_req_zone $binary_remote_addr zone=api_auth:10m rate=3r/m;

server {
    location /api/ {
        limit_req zone=api_general burst=20 nodelay;
        limit_req_status 429;
        proxy_pass http://backend;
    }
    location /api/auth/ {
        limit_req zone=api_auth burst=5 nodelay;
        limit_req_status 429;
        proxy_pass http://backend;
    }
}
burst=20 nodelay — up to 20 requests above the configured rate are admitted immediately without queueing; anything beyond the burst is rejected with 429.
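The burst + nodelay semantics can be sketched with nginx-style "excess" accounting (an illustrative model in TypeScript, not nginx source — class and parameter names are invented here):

```typescript
// Illustrative model of limit_req with nodelay: "excess" tracks how far
// ahead of the allowed rate the client is running; it drains over time
// and grows by one per request. Requests with excess above burst are rejected.
class BurstLimiter {
  private excess = 0;
  private last = 0;

  constructor(private ratePerSec: number, private burst: number) {}

  allow(nowMs: number): boolean {
    const elapsedSec = (nowMs - this.last) / 1000;
    this.last = nowMs;
    // excess drains at the configured rate...
    this.excess = Math.max(0, this.excess - elapsedSec * this.ratePerSec);
    // ...and exceeding the burst means immediate rejection (429)
    if (this.excess > this.burst) return false;
    this.excess += 1;
    return true;
  }
}
```

With rate=1r/s and burst=2, three requests pass instantly (one at the rate plus two of burst), the fourth is rejected, and capacity recovers at one request per second.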
Response Headers
Correct rate limit headers are part of the API contract:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1735689600
Retry-After: 47
{
"error": {
"code": "RATE_LIMIT_EXCEEDED",
"message": "Request limit exceeded. Retry after 47 seconds.",
"retry_after": 47
}
}
Retry-After contains either a delay in seconds or an HTTP-date (the Unix timestamp belongs in X-RateLimit-Reset, as above). The client must respect it and not retry earlier.
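A client-side helper for interpreting the header might look like this (a sketch; the function name and the 1-second fallback are illustrative choices, not a standard API):

```typescript
// Parse a Retry-After header value into a delay in milliseconds.
// Per HTTP semantics the value is either delta-seconds or an HTTP-date.
function retryAfterMs(header: string | null, now: number = Date.now()): number {
  if (!header) return 1000; // assumed fallback when the header is absent
  const seconds = Number(header);
  if (Number.isFinite(seconds)) return Math.max(0, seconds * 1000);
  const date = Date.parse(header); // HTTP-date form
  if (!Number.isNaN(date)) return Math.max(0, date - now);
  return 1000; // unparseable value: fall back conservatively
}
```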
Distributed Rate Limiting
With multiple application servers — counters must be stored centrally. Redis + Lua script for atomic Sliding Window:
-- sliding_window.lua
-- KEYS[1] = counter key; ARGV = now (ms), window (ms), limit
local key = KEYS[1]
local now = tonumber(ARGV[1])
local window = tonumber(ARGV[2])
local limit = tonumber(ARGV[3])

-- drop entries that have fallen out of the window
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
local count = redis.call('ZCARD', key)
if count < limit then
    -- random suffix keeps members unique when two requests share a timestamp
    redis.call('ZADD', key, now, now .. '-' .. math.random())
    -- EXPIRE takes whole seconds, so round the window up
    redis.call('EXPIRE', key, math.ceil(window / 1000))
    return 1
end
return 0
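For unit-testing the window semantics before wiring up Redis, the same logic can be mirrored in-process (a TypeScript sketch with invented names; it is not distributed and does not replace the Lua script):

```typescript
// In-process mirror of the sliding-window script: keep request timestamps,
// drop those older than the window, admit the request if under the limit.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private windowMs: number, private limit: number) {}

  allow(key: string, now: number = Date.now()): boolean {
    // equivalent of ZREMRANGEBYSCORE: discard timestamps outside the window
    const recent = (this.hits.get(key) ?? []).filter((t) => t > now - this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false; // equivalent of the script returning 0
    }
    recent.push(now); // equivalent of ZADD
    this.hits.set(key, recent);
    return true; // equivalent of the script returning 1
  }
}
```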
Bypass Strategies
Not all requests should consume the limit:
- Internal services (IP whitelist or a service token with an unlimited rate)
- Webhook endpoints (called by external services — cannot be limited)
- Health checks (/health) — do not limit
RateLimiter::for('api', function (Request $request) {
    if ($request->ip() === config('services.internal_ip')) {
        return Limit::none();
    }
    // ...
});
Timeline
Rate limiting with Redis (Sliding Window, per-tier limits, correct headers): 1–2 days. Adding Nginx-level limiting, the Lua script for distributed counting, and 429-response monitoring in Grafana: 3–4 days.