Cache hierarchies: CDN → gateway → app → DB
A layered cache answers most reads close to the client, which cuts latency and leaves the slower layers behind it with more headroom. Start at the edge (CDN), then the gateway, then your app (in‑memory/Redis), and finally database query/result caches.
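To make the lookup order concrete, here is a minimal sketch of a layered read. It assumes each layer can be modelled as a plain dictionary; the names `Layer`, `layered_get`, and `origin` are hypothetical and stand in for whatever CDN, gateway, and app caches are actually in use.

```python
from typing import Callable, Dict, List, Optional


class Layer:
    """One cache layer, modelled as a dict plus hit/miss counters (an assumption)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[str]:
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        return None


def layered_get(layers: List[Layer], key: str, origin: Callable[[str], str]) -> str:
    """Check layers from the edge inward and backfill the layers that missed."""
    for depth, layer in enumerate(layers):
        value = layer.get(key)
        if value is not None:
            # Hit at this depth: copy the value into every layer closer to the client.
            for outer in layers[:depth]:
                outer.store[key] = value
            return value
    # Every layer missed: pay the full cost once, then fill all layers.
    value = origin(key)
    for layer in layers:
        layer.store[key] = value
    return value
```

Calling `layered_get([Layer("cdn"), Layer("gateway"), Layer("app")], "user:1", load_from_db)` twice should reach the hypothetical `load_from_db` only on the first call. The sketch ignores expiry and treats `None` as "not cached", so it cannot store `None` as a value.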
Design principles
- Make responses cacheable by default; add Cache-Control and ETag.
- Use read‑through caches for hot paths and background refresh for stability.
- Separate idempotent reads from writes; invalidate precisely with keys/scopes.
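The principles above can be sketched in a few lines. This is an illustration under stated assumptions, not a production design: `conditional_response`, `ReadThroughCache`, and `invalidate_scope` are hypothetical names, the ETag is just a truncated SHA-256 of the body, "scope" is taken to mean a key prefix, and background refresh is left out.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


def conditional_response(body: bytes, if_none_match: Optional[str],
                         max_age: int = 60) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) with Cache-Control and a strong ETag.

    Simplified: compares If-None-Match as one exact value and ignores
    lists, "*", and weak validators.
    """
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {"Cache-Control": f"public, max-age={max_age}", "ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""  # client copy is still valid, send no body
    return 200, headers, body


class ReadThroughCache:
    """On a miss or an expired entry, load the value and store it with a TTL."""

    def __init__(self, loader: Callable[[str], object], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str) -> object:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = self._loader(key)
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop exactly one key, for example after a write to that record."""
        self._entries.pop(key, None)

    def invalidate_scope(self, prefix: str) -> int:
        """Drop every key under a prefix such as "user:42:" and return the count."""
        doomed = [k for k in self._entries if k.startswith(prefix)]
        for k in doomed:
            del self._entries[k]
        return len(doomed)
```

A write path would call `invalidate("user:42:profile")` or `invalidate_scope("user:42:")` after committing, while reads only ever go through `get`. Keeping those two paths separate is what makes the invalidation precise.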
Metrics to watch
- TTFB, hit ratios at each layer, p95 latencies, DB QPS.
- Error amplification and thundering herd protection (jitter, request coalescing).
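As a rough illustration of the herd protections and the hit-ratio metric listed above, the sketch below shows TTL jitter, request coalescing (often called single-flight), and a per-layer hit ratio. All names (`jittered_ttl`, `SingleFlight`, `hit_ratio`) and the 10% default spread are assumptions chosen for the example.

```python
import random
import threading
from typing import Callable, Dict


def jittered_ttl(base_seconds: float, spread: float = 0.1, rng=random) -> float:
    """Scale the TTL by a random factor in [1 - spread, 1 + spread] so that
    entries written together do not all expire at the same instant."""
    return base_seconds * (1.0 + rng.uniform(-spread, spread))


def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of lookups served from the cache; 0.0 when there was no traffic."""
    total = hits + misses
    return hits / total if total else 0.0


class _Call:
    """Shared state for one in-flight load."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.value = None
        self.error = None


class SingleFlight:
    """Coalesce concurrent loads of the same key into a single loader call."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def do(self, key: str, loader: Callable[[], object]) -> object:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call
        if leader:
            try:
                call.value = loader()
            except Exception as exc:
                call.error = exc  # followers see the same failure
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()
        else:
            call.done.wait()  # follower: reuse the leader's result
        if call.error is not None:
            raise call.error
        return call.value
```

With this in place, a cache miss path could call `flight.do(key, lambda: load_from_db(key))`, where `load_from_db` is a hypothetical loader, so that a burst of misses on one hot key produces one database query rather than one per request. A failed load is shared with the waiting callers and is not retried here, which limits error amplification but means retry policy has to live elsewhere.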