Implementing Response Caching at API Gateway Level
Gateway-level caching intercepts requests before they reach the backend. Properly configured, it can shed 30–70% of the load from services without any code changes. Misconfigured, it serves one user's data to another or keeps stale responses around for days.
What to Cache and What Not
Safe to cache:
- GET requests with public data (catalog, references, articles)
- Responses with an explicit Cache-Control: public, max-age=N from upstream
- Endpoints that don't depend on a session ID
Don't cache:
- Any POST, PUT, PATCH, DELETE
- Requests with an Authorization: Bearer ... header, unless user-level isolation is configured
- Responses with 4xx/5xx status (except 404 for public resources, if intentional)
- Data with personal information
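The rules above can be condensed into a single gateway-side check. This is an illustrative Python sketch of the decision logic, not any real gateway's implementation; all names are hypothetical:

```python
# Hypothetical sketch of the cacheability rules above. The status set
# mirrors the response_code list in the Kong example in this article.
CACHEABLE_STATUSES = {200, 301, 404}

def is_cacheable(method: str, request_headers: dict, status: int) -> bool:
    """True only when every rule from the safe/unsafe lists passes."""
    if method.upper() not in {"GET", "HEAD"}:
        return False  # never cache POST/PUT/PATCH/DELETE
    if any(k.lower() == "authorization" for k in request_headers):
        return False  # per-user responses need user-level isolation first
    return status in CACHEABLE_STATUSES
```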
Kong Gateway: proxy-cache Plugin
```yaml
plugins:
  - name: proxy-cache
    config:
      response_code: [200, 301, 404]
      request_method: [GET, HEAD]
      content_type:
        - application/json
        - "application/json; charset=utf-8"
      cache_ttl: 300
      strategy: memory
      memory:
        dictionary_name: kong_db_cache
```
For production, use the redis strategy instead of memory (note: in Kong, a Redis-backed cache is provided by the proxy-cache-advanced plugin):
```yaml
strategy: redis
redis:
  host: redis.internal
  port: 6379
  timeout: 2000
  database: 0
  password: ${REDIS_PASSWORD}
```
Kong adds an X-Cache-Status header to each response: Hit, Miss, Bypass, or Refresh. Useful for debugging.
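Since the header appears on every response, cache hit rate can be computed from values pulled out of access logs. A small illustrative Python sketch (the extraction step is format-specific and omitted; the function name is hypothetical):

```python
# Sketch: hit rate from a sequence of X-Cache-Status values. Bypass
# responses are excluded, since they were never cache-eligible.
from collections import Counter

def hit_rate(statuses: list[str]) -> float:
    """Fraction of cache-eligible requests answered from cache."""
    counts = Counter(s.capitalize() for s in statuses)
    eligible = sum(counts.values()) - counts["Bypass"]
    return counts["Hit"] / eligible if eligible else 0.0
```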
Cache invalidation via Admin API:
```shell
# Delete a specific key
curl -X DELETE http://kong-admin:8001/proxy-cache/caches/{cache_key}

# Clear the entire cache
curl -X DELETE http://kong-admin:8001/proxy-cache/
```
AWS API Gateway + ElastiCache
AWS API Gateway has no native caching for HTTP APIs, but REST APIs support it:
```json
{
  "cacheClusterEnabled": true,
  "cacheClusterSize": "0.5",
  "methodSettings": {
    "~1products/GET": {
      "cachingEnabled": true,
      "cacheTtlInSeconds": 300,
      "cacheDataEncrypted": false,
      "requireAuthorizationForCacheControl": false
    }
  }
}
```
(In methodSettings the key is {resource_path}/{http_method}, with / in the path escaped as ~1.)
Terraform:
```hcl
resource "aws_api_gateway_stage" "main" {
  deployment_id         = aws_api_gateway_deployment.main.id
  rest_api_id           = aws_api_gateway_rest_api.main.id
  stage_name            = "v1"
  cache_cluster_enabled = true
  cache_cluster_size    = "0.5"
}

# Per-method settings live in a separate resource, not inside the stage
resource "aws_api_gateway_method_settings" "products" {
  rest_api_id = aws_api_gateway_rest_api.main.id
  stage_name  = aws_api_gateway_stage.main.stage_name
  method_path = "products/GET"

  settings {
    caching_enabled      = true
    cache_ttl_in_seconds = 300
  }
}
```
Invalidation: the client sends a request with a Cache-Control: max-age=0 header, provided it has the execute-api:InvalidateCache permission.
Nginx: proxy_cache
If gateway is on Nginx:
```nginx
proxy_cache_path /var/cache/nginx/api
                 levels=1:2
                 keys_zone=api_cache:10m
                 max_size=1g
                 inactive=10m
                 use_temp_path=off;
```
```nginx
server {
    location /api/v1/catalog/ {
        proxy_cache api_cache;
        proxy_cache_valid 200 5m;
        proxy_cache_valid 404 1m;
        proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
        proxy_cache_background_update on;
        proxy_cache_lock on;

        # Cache key without auth params
        proxy_cache_key "$scheme$request_method$host$uri$is_args$args";

        # Don't cache if the client sends a session cookie
        proxy_cache_bypass $cookie_session_id;
        proxy_no_cache $cookie_session_id;

        add_header X-Cache-Status $upstream_cache_status;
        proxy_pass http://backend;
    }
}
```
proxy_cache_use_stale updating together with proxy_cache_background_update on gives you the stale-while-revalidate pattern: the user gets the old response instantly, and the update happens in the background. Critical for heavy endpoints.
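Outside Nginx, the same pattern is a few lines of logic: return the expired entry immediately and refresh it off the request path. An illustrative in-process Python sketch (class and parameter names are hypothetical, and the toy locking is not production-grade):

```python
# Sketch of stale-while-revalidate: serve the stale value right away
# and refresh it on a background thread. In-process toy, not a gateway.
import threading
import time

class SWRCache:
    def __init__(self, ttl: float, fetch):
        self.ttl, self.fetch = ttl, fetch          # fetch() hits upstream
        self.lock = threading.Lock()
        self.value, self.expires, self.refreshing = None, 0.0, False

    def _refresh(self):
        value = self.fetch()                       # slow upstream call
        with self.lock:
            self.value, self.expires = value, time.time() + self.ttl
            self.refreshing = False

    def get(self):
        with self.lock:
            fresh = time.time() < self.expires
            if self.value is not None and (fresh or self.refreshing):
                return self.value                  # fresh, or stale while updating
            if self.value is not None:             # stale: kick off one refresh
                self.refreshing = True
                threading.Thread(target=self._refresh, daemon=True).start()
                return self.value                  # old response, instantly
        self._refresh()                            # cold cache: must block once
        return self.value
```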
Vary and Cache Spaces
If the API returns different responses based on headers, configure the cache key accordingly. A typical example is a multilingual API:
```nginx
# Split cache by language
proxy_cache_key "$scheme$request_method$host$uri$is_args$args$http_accept_language";
```
In Kong via vary_headers:
```yaml
config:
  vary_headers:
    - Accept-Language
    - Accept-Encoding
```
Without this, the first user with Accept-Language: en "claims" the cache, and everyone else gets English responses.
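Building the key by hand makes the failure mode obvious. A Python sketch of a vary-aware cache key (the function and its parameters are illustrative, not any gateway's API):

```python
# Sketch: a cache key that honours a list of vary headers. With an empty
# vary list, requests differing only in Accept-Language collide.
import hashlib

def cache_key(method: str, url: str, headers: dict, vary: list[str]) -> str:
    parts = [method.upper(), url]
    for name in vary:                       # e.g. ["Accept-Language"]
        parts.append(f"{name.lower()}={headers.get(name, '')}")
    return hashlib.sha256("|".join(parts).encode()).hexdigest()
```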
Cache Stampede
When a popular key's TTL expires, multiple workers hit upstream simultaneously (a "cache storm"). Protection:
- Mutex/lock: only one worker updates while the others wait (proxy_cache_lock on in Nginx)
- Probabilistic early expiration: refresh the cache slightly before TTL, with probability increasing as expiration approaches
- Stale responses: serve the stale response while the update is in progress
Timeline
Basic cache setup for 2–3 routes: 1 day. Full strategy with invalidation, Vary, hit rate monitoring, and stale-while-revalidate: 3–5 days.