API Rate Limiting and Usage Tracking Implementation for SaaS Application
Without request limits, a single aggressive client can take down infrastructure or exhaust downstream service quotas in minutes. Usage tracking is not only a protection mechanism but also the foundation for billing: without accurate consumption data, pay-per-use pricing is impossible and plan limits cannot be verified.
Rate limiting layers
Restrictions are applied at multiple levels. At the IP level: protection from DDoS and scrapers. At the API key or JWT level: per-tenant quotas. At the endpoint level: separate limits for expensive operations (export, report generation, AI requests).
Algorithms:
| Algorithm | Characteristic | Application |
|---|---|---|
| Fixed Window | Simple, allows burst at window boundary | Basic plans |
| Sliding Window Log | Precise, expensive memory | Premium endpoints |
| Token Bucket | Allows burst within bucket-size | Most SaaS APIs |
| Leaky Bucket | Smooths spikes, strict output rate | Integrations with external APIs |
For most SaaS products, Token Bucket is optimal: a client can make a 10-request burst if it has accumulated tokens, but cannot exceed the average rate over time.
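A minimal in-memory sketch of the algorithm (illustrative only; in a real deployment the state lives in Redis so all application instances share one bucket per tenant):

```python
import time

class TokenBucket:
    """Token bucket: `capacity` bounds the burst size, `rate` is the
    sustained refill speed in tokens per second."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)   # start full: burst allowed immediately
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=10, rate=1.0)
results = [bucket.allow() for _ in range(12)]
# The first 10 calls drain the accumulated burst; further calls are
# rejected until tokens refill at the sustained rate.
```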
Middleware-level implementation
Node.js / Express — express-rate-limit library with Redis storage via rate-limit-redis:
```javascript
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';

const planLimits = { free: 100, pro: 1000, enterprise: 10000 };

const apiLimiter = rateLimit({
  windowMs: 60 * 1000,
  // Per-plan limits; unknown plans fall back to the free tier.
  limit: (req) => planLimits[req.tenant.plan] ?? 100,
  // One counter per tenant per endpoint.
  keyGenerator: (req) => `rl:${req.tenant.id}:${req.path}`,
  store: new RedisStore({
    // redisClient is an already-connected node-redis client.
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
  handler: (req, res) => {
    res.status(429).json({
      error: 'rate_limit_exceeded',
      retryAfter: res.getHeader('Retry-After'),
    });
  },
  standardHeaders: 'draft-7', // emit RateLimit-* headers
  legacyHeaders: false,       // suppress legacy X-RateLimit-* headers
});
```
The RateLimit-Limit, RateLimit-Remaining, and RateLimit-Reset headers (from the IETF draft-ietf-httpapi-ratelimit-headers specification, draft 7; the 429 status itself comes from RFC 6585) are effectively mandatory: SDK clients rely on them for back-off.
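On the client side, a hypothetical helper (the function name and priority order are illustrative) might turn those headers into a wait time like this:

```python
def backoff_seconds(headers: dict) -> float:
    """Decide how long a client should wait before retrying,
    based on the response's rate-limit headers."""
    # An explicit Retry-After (sent with 429 responses) wins if present.
    if "Retry-After" in headers:
        return float(headers["Retry-After"])
    # Otherwise wait for the window to reset once no requests remain.
    remaining = int(headers.get("RateLimit-Remaining", 1))
    if remaining <= 0:
        return float(headers.get("RateLimit-Reset", 1))
    return 0.0

print(backoff_seconds({"Retry-After": "30"}))  # 30.0
print(backoff_seconds({"RateLimit-Remaining": "0", "RateLimit-Reset": "12"}))  # 12.0
```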
Python / FastAPI — slowapi library on top of limits:
```python
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=lambda req: req.state.tenant_id,
                  storage_uri="redis://localhost:6379")
app = FastAPI()
app.state.limiter = limiter  # slowapi expects the limiter on app.state
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/reports")
@limiter.limit("10/minute")
async def generate_report(request: Request):
    ...
```
Nginx — at reverse proxy level for rough protection before reaching application:
```nginx
# http {} context: one shared zone keyed by API key, 100 req/min
limit_req_zone $http_x_api_key zone=api:10m rate=100r/m;

# server {} or location {} context
limit_req zone=api burst=20 nodelay;
limit_req_status 429;
```
Usage Tracking: what and how to count
Metrics fall into two groups. Billing metrics are what the customer pays for: API call count, data volume, active user count, compute time. Operational metrics are for monitoring: latency percentiles, error rate, top endpoints by load.
Data collection architecture:
A synchronous database write on every request is an antipattern. The right approach:
- In middleware, atomically increment a counter in Redis: `INCR usage:{tenant_id}:{date}:{endpoint}`
- A Celery/BullMQ job flushes the aggregates from Redis to PostgreSQL every 5 minutes
- The detailed request log is written asynchronously to ClickHouse or TimescaleDB for analytics
```sql
-- Aggregated usage table in PostgreSQL
CREATE TABLE api_usage_daily (
    tenant_id  UUID NOT NULL,
    date       DATE NOT NULL,
    endpoint   VARCHAR(200),
    plan       VARCHAR(50),
    requests   BIGINT DEFAULT 0,
    bytes_in   BIGINT DEFAULT 0,
    bytes_out  BIGINT DEFAULT 0,
    errors_4xx INT DEFAULT 0,
    errors_5xx INT DEFAULT 0,
    PRIMARY KEY (tenant_id, date, endpoint)
);
```
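The flush step can be sketched as a pure aggregation function. The `usage:{tenant_id}:{date}:{endpoint}` key shape matches the INCR above; the function name and the SCAN/GETDEL wiring mentioned in the docstring are assumptions:

```python
from collections import defaultdict

def aggregate_counters(counters: dict) -> list:
    """Turn raw Redis counter keys (usage:{tenant_id}:{date}:{endpoint})
    into rows for an upsert into api_usage_daily. In production the
    `counters` dict would be collected via SCAN + GETDEL against Redis."""
    rows = defaultdict(int)
    for key, count in counters.items():
        # maxsplit=3 keeps endpoint paths intact even if they contain ':'
        _, tenant_id, date, endpoint = key.split(":", 3)
        rows[(tenant_id, date, endpoint)] += count
    return [(t, d, e, c) for (t, d, e), c in sorted(rows.items())]

# The job would then run one parameterized statement per row, e.g.:
# INSERT INTO api_usage_daily (tenant_id, date, endpoint, requests)
# VALUES (%s, %s, %s, %s)
# ON CONFLICT (tenant_id, date, endpoint)
# DO UPDATE SET requests = api_usage_daily.requests + EXCLUDED.requests;
```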
Dashboard and alerts for customers
Customers should see their consumption in real time; this reduces unexpected blocks and support tickets. Minimum set: current usage vs. quota (progress bar), a 30-day graph, top-5 endpoints by call count.
Alerts: an email/webhook notification at 80% of quota, and a warning one day before the end of the billing period with an overage forecast.
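The 80% threshold check and the overage forecast can be sketched as two small helpers (the names and the linear run-rate projection model are assumptions, not part of any specific billing library):

```python
def quota_alerts(used: int, quota: int, threshold: float = 0.8) -> list:
    """Which notifications to fire for a tenant's current consumption."""
    if used >= quota:
        return ["quota_exceeded"]
    if used >= quota * threshold:
        return ["quota_warning_80"]
    return []

def overage_forecast(used: int, quota: int, day: int, period_days: int = 30) -> int:
    """Linear run-rate projection of end-of-period overage.
    `day` is the current day within the billing period (1-based)."""
    projected = used / day * period_days
    return max(0, round(projected - quota))

# A tenant at 900/1000 requests on day 20 is projected to end the
# 30-day period at 1350 requests, i.e. 350 over quota.
```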
Billing integration
With a pay-per-use model, usage data goes to Stripe via the Billing Meters API:
```javascript
// Meter events live under stripe.billing.meterEvents in stripe-node
await stripe.billing.meterEvents.create({
  event_name: 'api_requests',
  payload: {
    stripe_customer_id: tenant.stripeCustomerId,
    value: String(requestCount), // payload values must be strings
  },
  timestamp: Math.floor(Date.now() / 1000),
});
```
For fixed plans with overage, compare usage against the plan's included volume at the end of the billing period and issue an additional invoice for the excess.
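A sketch of that end-of-period calculation, assuming a per-1,000-requests overage price (the function name and the round-up-to-full-block convention are illustrative):

```python
def overage_invoice_cents(used: int, included: int, price_per_1k_cents: int) -> int:
    """Charge for requests beyond the plan's included volume,
    billed per started block of 1,000 requests."""
    over = max(0, used - included)
    blocks = -(-over // 1000)  # ceiling division: a started block is billed
    return blocks * price_per_1k_cents

# 123,456 requests on a plan including 100,000 at $0.50 per extra 1,000:
# 23,456 over -> 24 blocks -> 1200 cents ($12.00).
```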
Typical timeline
Basic rate limiting with Redis and the standard RateLimit headers: 2–3 days. Usage tracking with PostgreSQL aggregation and a customer dashboard: 5–7 days. Stripe Billing Meters integration and alerts: 3 more days.