API Rate Limiting and Usage Tracking for SaaS Application

Without request limits, a single aggressive client can take down infrastructure or exhaust downstream service quotas within minutes. Usage tracking is more than protection: it is the foundation for billing. Without accurate consumption data, pay-per-use pricing is impossible, and there is no way to verify that customers stay within the limits of their plan.

Rate limiting layers

Limits are applied at several levels. At the IP level they protect against DDoS attacks and scrapers. At the API key or JWT level they enforce per-tenant quotas. At the endpoint level they set separate limits for expensive operations (export, report generation, AI requests).
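
As a rough illustration of how the three levels combine, the sketch below builds one counter key per level for a single request. The key layout, the per-IP ceiling, and the per-endpoint numbers are assumptions made for the example, not recommended values.

```python
# All numbers below are illustrative assumptions, not recommended values.
IP_LIMIT_PER_MIN = 300
PLAN_LIMITS_PER_MIN = {"free": 100, "pro": 1000, "enterprise": 10000}
ENDPOINT_LIMITS_PER_MIN = {"/api/export": 5, "/api/reports": 10}


def limit_checks(ip: str, tenant_id: str, plan: str, endpoint: str) -> list[tuple[str, int]]:
    """Return (counter key, limit) pairs; a request passes only if every pair allows it."""
    checks = [
        (f"rl:ip:{ip}", IP_LIMIT_PER_MIN),
        # Unknown plans fall back to the free-plan limit.
        (f"rl:tenant:{tenant_id}", PLAN_LIMITS_PER_MIN.get(plan, PLAN_LIMITS_PER_MIN["free"])),
    ]
    if endpoint in ENDPOINT_LIMITS_PER_MIN:
        # Expensive endpoints get an extra, tighter per-tenant limit.
        checks.append((f"rl:tenant:{tenant_id}:{endpoint}", ENDPOINT_LIMITS_PER_MIN[endpoint]))
    return checks


for key, limit in limit_checks("203.0.113.7", "t-42", "pro", "/api/export"):
    print(key, limit)
```

Each pair would then be checked against whichever limiter algorithm is in use, and the request rejected as soon as one level is exhausted.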

Algorithms:

Algorithm          | Characteristic                               | Application
Fixed Window       | Simple; allows a burst at the window boundary | Basic plans
Sliding Window Log | Precise; memory-intensive                    | Premium endpoints
Token Bucket       | Allows bursts up to the bucket size          | Most SaaS APIs
Leaky Bucket       | Smooths spikes; strict output rate           | Integrations with external APIs
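
The boundary burst noted for Fixed Window is easy to see in a small simulation. This is an illustrative sketch only: the class name `FixedWindowCounter` is hypothetical, timestamps are passed in by hand, and state lives in memory rather than in Redis.

```python
from collections import defaultdict


class FixedWindowCounter:
    """In-memory fixed-window limiter (illustrative; hypothetical name)."""

    def __init__(self, limit: int, window_s: int) -> None:
        self.limit = limit
        self.window_s = window_s
        self.counts = defaultdict(int)  # (key, window index) -> request count

    def allow(self, key: str, now: float) -> bool:
        # All requests that fall into the same window share one counter.
        window = int(now // self.window_s)
        if self.counts[(key, window)] >= self.limit:
            return False
        self.counts[(key, window)] += 1
        return True


limiter = FixedWindowCounter(limit=100, window_s=60)
# 100 requests at t=59s (end of window 0), then 100 more at t=60s (start of window 1).
late = sum(limiter.allow("tenant-a", 59.0) for _ in range(100))
early = sum(limiter.allow("tenant-a", 60.0) for _ in range(100))
print(late + early)  # 200 requests accepted within about one second
```

The nominal limit is 100 per minute, yet twice that gets through around the boundary, which is why this algorithm suits only plans where such bursts are acceptable.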

For most SaaS APIs, Token Bucket is the best fit: a client that has accumulated tokens can send a short burst (for example, 10 requests at once), but its sustained rate cannot exceed the refill rate.
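
A minimal in-memory sketch of the token bucket idea follows. It assumes a single process and takes timestamps as arguments for clarity; a production limiter would keep this state in Redis and update it atomically. The class and parameter names are chosen for the example.

```python
class TokenBucket:
    """Token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a fresh client may burst
        self.updated = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=1.0, capacity=10)  # 1 request/s on average, burst of 10
burst = [bucket.allow(now=0.0) for _ in range(12)]
print(burst.count(True))      # 10: the burst is capped by the bucket size
print(bucket.allow(now=3.0))  # True: 3 tokens were refilled after 3 seconds
```

The `cost` parameter is one way to charge expensive endpoints more than one token per call instead of maintaining a separate limiter for them.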

Middleware-level implementation

Node.js / Express: the express-rate-limit library with Redis storage via rate-limit-redis:

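import { createClient } from 'redis';
import rateLimit from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis'; // named export, as in rate-limit-redis v4

// Shared Redis connection so limits hold across all app instances.
// REDIS_URL is an assumed environment variable.
const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

// Requests per minute for each plan
const planLimits = { free: 100, pro: 1000, enterprise: 10000 };

const apiLimiter = rateLimit({
  windowMs: 60 * 1000,
  // Assumes an earlier auth middleware has populated req.tenant
  limit: (req) => planLimits[req.tenant.plan] ?? planLimits.free,
  keyGenerator: (req) => `rl:${req.tenant.id}:${req.path}`,
  store: new RedisStore({
    // rate-limit-redis v3+ takes a sendCommand function rather than a client
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
  handler: (req, res) => {
    res.status(429).json({
      error: 'rate_limit_exceeded',
      retryAfter: res.getHeader('Retry-After'),
    });
  },
  standardHeaders: 'draft-7',
  legacyHeaders: false,
});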

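The 429 status code is defined in RFC 6585; the rate limit headers come from the IETF "RateLimit header fields for HTTP" draft. With standardHeaders set to 'draft-7', express-rate-limit sends a combined RateLimit header plus RateLimit-Policy, while 'draft-6' sends separate RateLimit-Limit, RateLimit-Remaining and RateLimit-Reset headers. Whichever variant is chosen, these headers and Retry-After should always be returned, because SDK clients rely on them for back-off.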
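
On the client side, back-off driven by these headers might look like the sketch below. The function name `retry_delay` and its defaults are assumptions for illustration; it handles only the delta-seconds form of Retry-After and falls back to capped exponential back-off otherwise.

```python
def retry_delay(headers: dict[str, str], attempt: int,
                base: float = 0.5, cap: float = 60.0) -> float:
    """Seconds to wait before retrying a request that received HTTP 429."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    retry_after = normalized.get("retry-after", "").strip()
    if retry_after.isdigit():
        # The server said exactly how long to wait: honor it.
        return float(retry_after)
    # Retry-After may also be an HTTP-date; this sketch does not parse that form
    # and uses exponential back-off (base * 2^attempt, capped) instead.
    return min(base * (2 ** attempt), cap)


print(retry_delay({"Retry-After": "30"}, attempt=0))  # 30.0
print(retry_delay({}, attempt=3))                     # 4.0
```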

Python / FastAPI: the slowapi library, built on top of limits:

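from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

# Key by tenant; assumes earlier auth middleware sets request.state.tenant_id
limiter = Limiter(key_func=lambda request: request.state.tenant_id,
                  storage_uri="redis://localhost:6379")

app = FastAPI()
app.state.limiter = limiter  # slowapi looks the limiter up on app.state
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/api/reports")
@limiter.limit("10/minute")
async def generate_report(request: Request):  # slowapi requires the request argument
    ...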

Nginx: coarse protection at the reverse-proxy level, before requests reach the application:

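# http context: one shared zone keyed by the X-API-Key header.
# Requests with an empty key (no header) are not counted by this zone.
limit_req_zone $http_x_api_key zone=api:10m rate=100r/m;
# server or location context
limit_req zone=api burst=20 nodelay;
limit_req_status 429;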

Usage Tracking: what and how to count

Metrics fall into two groups. Billing metrics are what the customer pays for: number of API calls, data volume, number of active users, compute time. Operational metrics are for monitoring: latency percentiles, error rate, top endpoints by load.
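
For the operational group, latency percentiles and error rate can be computed from raw samples roughly as follows. This sketch uses the nearest-rank percentile definition and made-up sample data; monitoring systems often use interpolated or approximate percentiles instead.

```python
import math


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def error_rate(total_requests: int, errors: int) -> float:
    """Share of requests that failed; 0.0 when there was no traffic."""
    return errors / total_requests if total_requests else 0.0


latencies_ms = [12, 15, 11, 240, 14, 13, 18, 16, 17, 19]  # made-up sample data
print(percentile(latencies_ms, 50))  # 15
print(percentile(latencies_ms, 95))  # 240
print(error_rate(200, 5))            # 0.025
```

The single 240 ms outlier barely moves the median but dominates p95, which is why percentiles are tracked rather than averages.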

Data collection architecture:

Writing to the database synchronously on every request is an antipattern. A better approach:

  1. In middleware, atomically increment a counter in Redis: INCR usage:{tenant_id}:{date}:{endpoint}
  2. A Celery or BullMQ job flushes the aggregates from Redis to PostgreSQL every 5 minutes
  3. The detailed request log is written asynchronously to ClickHouse or TimescaleDB for analytics

-- Aggregated usage table in PostgreSQL
CREATE TABLE api_usage_daily (
  tenant_id   UUID NOT NULL,
  date        DATE NOT NULL,
  endpoint    VARCHAR(200),
  plan        VARCHAR(50),
  requests    BIGINT DEFAULT 0,
  bytes_in    BIGINT DEFAULT 0,
  bytes_out   BIGINT DEFAULT 0,
  errors_4xx  INT DEFAULT 0,
  errors_5xx  INT DEFAULT 0,
  PRIMARY KEY (tenant_id, date, endpoint)
);
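
The flush step can be sketched as a pure transformation from counter keys to table rows. The snippet below assumes the key layout usage:{tenant_id}:{date}:{endpoint} and the api_usage_daily table; the helper names and the upsert statement are illustrative, and a plain dictionary stands in for Redis.

```python
from datetime import date


def usage_key(tenant_id: str, day: date, endpoint: str) -> str:
    """Build the counter key incremented by the middleware."""
    return f"usage:{tenant_id}:{day.isoformat()}:{endpoint}"


def rows_from_counters(counters: dict[str, int]) -> list[tuple[str, date, str, int]]:
    """Turn counter keys and values into (tenant_id, date, endpoint, requests) rows."""
    rows = []
    for key, count in counters.items():
        # maxsplit=3 keeps endpoints that themselves contain ':' intact.
        _, tenant_id, day, endpoint = key.split(":", 3)
        rows.append((tenant_id, date.fromisoformat(day), endpoint, int(count)))
    return rows


# Additive upsert: assumes each counter is reset once it has been read.
UPSERT_SQL = """
INSERT INTO api_usage_daily (tenant_id, date, endpoint, requests)
VALUES (%s, %s, %s, %s)
ON CONFLICT (tenant_id, date, endpoint)
DO UPDATE SET requests = api_usage_daily.requests + EXCLUDED.requests
"""

counters = {usage_key("t-42", date(2024, 5, 1), "/api/reports"): 3}
print(rows_from_counters(counters))
```

In a real flush job the counter must be read and reset atomically (for example with GETDEL on Redis 6.2 or later), otherwise increments that arrive during the flush can be lost or counted twice.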

Dashboard and alerts for customers

Customers should see their consumption in real time; this reduces unexpected blocks and support tickets. The minimum set is current usage versus quota (a progress bar), a 30-day graph, and the top 5 endpoints by call count.

Alerts: an email or webhook notification at 80% of quota, and a warning one day before the end of the billing period with an overage forecast.
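
Both alert conditions reduce to simple arithmetic. The sketch below is one possible approach: the function names are hypothetical, and the forecast is a plain linear projection of usage so far, which is an assumption rather than the only reasonable model.

```python
def quota_alert_due(used: int, quota: int, threshold_pct: int = 80) -> bool:
    """True once usage reaches threshold_pct percent of the quota."""
    # Integer arithmetic avoids floating-point surprises right at the threshold.
    return quota > 0 and used * 100 >= threshold_pct * quota


def forecast_overage(used: int, quota: int, days_elapsed: int, days_in_period: int) -> int:
    """Linearly project usage to the period end and return the units above quota."""
    if days_elapsed <= 0:
        return 0
    projected = used * days_in_period / days_elapsed
    return max(0, round(projected - quota))


print(quota_alert_due(used=8000, quota=10000))   # True
print(forecast_overage(used=12000, quota=10000,
                       days_elapsed=20, days_in_period=30))  # 8000
```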

Billing integration

With a pay-per-use model, usage data is sent to Stripe as meter events through the Billing Meters API:

// Report aggregated usage as a meter event
await stripe.billing.meterEvents.create({
  event_name: 'api_requests',
  payload: {
    stripe_customer_id: tenant.stripeCustomerId,
    value: String(requestCount), // payload values are sent as strings
  },
  timestamp: Math.floor(Date.now() / 1000),
});

For fixed plans with overage, compare usage against the plan allowance at the end of the billing period and issue an additional invoice for the excess.
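
The overage amount itself is a small calculation. In this sketch the function name and the unit price are invented for illustration, and Decimal is used so that money values are not subject to binary floating-point rounding.

```python
from decimal import Decimal


def overage_charge(used: int, included: int, unit_price: Decimal) -> Decimal:
    """Amount to invoice for usage above the plan allowance, rounded to cents."""
    extra_units = max(0, used - included)
    return (extra_units * unit_price).quantize(Decimal("0.01"))


# Hypothetical plan: 10,000 requests included, 0.002 per extra request.
print(overage_charge(used=12500, included=10000, unit_price=Decimal("0.002")))  # 5.00
print(overage_charge(used=9000, included=10000, unit_price=Decimal("0.002")))   # 0.00
```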

Typical timeline

Basic rate limiting with Redis, 429 responses per RFC 6585 and RateLimit headers: 2–3 days. Usage tracking with PostgreSQL aggregation and a customer dashboard: 5–7 days. Stripe Billing Meters integration and alerts: another 3 days.