Why your caching strategy is your performance strategy

Most performance problems are not compute problems. They are caching problems in disguise. The application is fast enough - it is just doing the same work over and over again.

A good caching strategy does not mean adding Redis and calling it done. It means understanding what data changes, how often, and who needs it - then designing a layered system that minimises unnecessary work at every tier.

The four caching layers

Every production web application has at least four places where data can be cached. Most teams use two of them.

1. Browser cache

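The fastest cache is the one closest to the user. HTTP caching headers control it: Cache-Control tells the browser how long it can reuse a response without hitting the network at all, while ETag and Last-Modified let it revalidate an expired copy with a conditional request that returns a small 304 Not Modified instead of the full body. A properly configured browser cache means returning visitors load most of your site from their own device in milliseconds.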

Most teams set these headers incorrectly, too conservatively, or not at all.
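As a rough sketch of what deliberate header choices can look like, the hypothetical helper below maps a kind of response to a Cache-Control value. The category names and the specific lifetimes are assumptions made for illustration, not defaults from any framework.

```typescript
// Hypothetical response categories; adjust them to your own application.
type ResponseKind = "hashed-asset" | "html-page" | "private-api";

// Returns a Cache-Control header value for each kind of response.
function cacheControlFor(kind: ResponseKind): string {
  switch (kind) {
    case "hashed-asset":
      // The file name changes when the content changes, so a year is safe.
      return "public, max-age=31536000, immutable";
    case "html-page":
      // May be stored, but must be revalidated (ETag / Last-Modified) before reuse.
      return "no-cache";
    case "private-api":
      // Per-user data: browser only, short-lived, never stored by shared caches.
      return "private, max-age=60";
  }
}

console.log(cacheControlFor("hashed-asset"));
```

The point of the sketch is that the decision is made per kind of response, rather than with one blanket header for the whole site.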

2. CDN / Edge cache

Your CDN sits between the internet and your origin server. It can serve entire pages, API responses, and static assets without touching your backend at all. Properly configured, a CDN can absorb the bulk of your traffic, as much as 90% on read-heavy sites, before it ever reaches your infrastructure.

The mistake here is treating the CDN as a static file server. Dynamic content can be cached at the edge too - you just need to be deliberate about invalidation.
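One common way to cache dynamic responses at the edge is to give the CDN its own lifetime with the standard s-maxage directive, separate from the browser's max-age. The helper below is a minimal sketch of that idea; the EdgePolicy shape and the example lifetimes are assumptions, and whether your CDN honours stale-while-revalidate is something to check in its documentation.

```typescript
// Hypothetical options: lifetimes in seconds for each tier.
interface EdgePolicy {
  browserSeconds: number; // how long the browser may reuse the response
  edgeSeconds: number;    // how long shared caches (the CDN) may reuse it
  staleSeconds?: number;  // optional window for serving stale while refetching
}

function edgeCacheControl(policy: EdgePolicy): string {
  const parts = [
    "public",
    `max-age=${policy.browserSeconds}`,
    // s-maxage applies only to shared caches and overrides max-age there.
    `s-maxage=${policy.edgeSeconds}`,
  ];
  if (policy.staleSeconds !== undefined) {
    parts.push(`stale-while-revalidate=${policy.staleSeconds}`);
  }
  return parts.join(", ");
}

// A product API response: browsers always revalidate, the edge keeps it for a minute.
console.log(edgeCacheControl({ browserSeconds: 0, edgeSeconds: 60, staleSeconds: 300 }));
// prints "public, max-age=0, s-maxage=60, stale-while-revalidate=300"
```

Keeping the edge lifetime short means the invalidation problem stays small: in this example the worst case is a response that is about a minute old.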

3. Application cache

This is where Redis lives. In-memory caches at the application layer are ideal for data that is expensive to compute, shared across requests, and changes on a predictable schedule, such as computed aggregates and feature flags. The same store also suits small, fast-changing state that every application server needs to see, such as session data and rate limit counters.

Application caches fail when they become a crutch for slow database queries rather than a deliberate design decision.

4. Database cache

PostgreSQL and other databases maintain their own buffer caches. If your hot data fits in memory, the database will serve it from there without touching disk. Understanding your working set size - the data accessed most frequently - is critical for sizing database instances correctly.

The cache invalidation problem

There is a well-known saying in software engineering, usually attributed to Phil Karlton: there are only two hard things in computer science - cache invalidation and naming things.

Cache invalidation is hard because it requires you to know, at write time, every cache entry that depends on the data you are changing. Miss one and you serve stale data. Invalidate too aggressively and you eliminate the performance benefit entirely.

Practical approaches that work:

TTL-based expiry - Set a time-to-live on every cache entry. Accept that some users will see data that is a few seconds or minutes old. Works well for read-heavy, eventually-consistent use cases like product listings, blog posts, and pricing data.
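TTL-based expiry can be sketched in a few lines. The class below is an in-memory stand-in for a shared cache such as Redis; TtlCache, getOrLoad, and the key name are hypothetical names chosen for this example, and the loader stands in for a real database query.

```typescript
interface CacheEntry<T> {
  value: T;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for a shared cache such as Redis (an assumption of this sketch).
class TtlCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Return a live cached value, otherwise load it, store it with a TTL, and return it.
  async getOrLoad(key: string, ttlMs: number, load: () => Promise<T>): Promise<T> {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = await load();
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
    return value;
  }
}

async function demo(): Promise<void> {
  let queries = 0;
  const listings = new TtlCache<string[]>();
  const loadListings = async () => {
    queries += 1; // stands in for a real database query
    return ["mug", "teapot"];
  };
  await listings.getOrLoad("listings:home", 60000, loadListings);
  await listings.getOrLoad("listings:home", 60000, loadListings);
  console.log(queries); // 1: the second call was served from the cache
}

void demo();
```

Note that this sketch does nothing to stop several concurrent misses from all running the loader at once; a production version would usually guard against that.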

Event-driven invalidation - When data changes, publish an event. Cache consumers listen for that event and invalidate their entries. More complex to implement, but gives you real-time accuracy when you need it.
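A minimal in-process sketch of event-driven invalidation follows. InvalidationBus and ProductCache are hypothetical names; in a real system the bus would typically be a message broker or a pub/sub channel rather than an array of callbacks.

```typescript
type InvalidationListener = (key: string) => void;

// Minimal in-process event bus, standing in for a real broker (assumption).
class InvalidationBus {
  private listeners: InvalidationListener[] = [];

  subscribe(listener: InvalidationListener): void {
    this.listeners.push(listener);
  }

  publish(key: string): void {
    for (const listener of this.listeners) {
      listener(key);
    }
  }
}

class ProductCache {
  private entries = new Map<string, string>();

  constructor(bus: InvalidationBus) {
    // Drop the entry as soon as a writer announces a change.
    bus.subscribe((key) => {
      this.entries.delete(key);
    });
  }

  get(key: string): string | undefined {
    return this.entries.get(key);
  }

  set(key: string, value: string): void {
    this.entries.set(key, value);
  }
}

const invalidations = new InvalidationBus();
const products = new ProductCache(invalidations);
products.set("product:42", "Blue mug, 12.00");
invalidations.publish("product:42"); // a writer just updated product 42
console.log(products.get("product:42")); // undefined: the next read must reload
```

Every cache that subscribes to the same bus drops its copy, which is where the real-time accuracy comes from. The cost is that the writer has to publish reliably on every change.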

Versioned cache keys - Embed a version number or hash in the cache key. When the underlying data changes, change the key. Old entries expire naturally via TTL. Simpler than event-driven invalidation for cases where you control both the writer and the reader.
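Versioned keys are easiest to see with a concrete key format. The versionedKey helper and the entity:id:vN layout below are assumptions made for this sketch; any scheme works as long as the writer and the reader derive the version the same way.

```typescript
// Hypothetical key builder: the version comes from the data itself,
// for example a revision counter, an updated-at timestamp, or a content hash.
function versionedKey(entity: string, id: number, version: string): string {
  return `${entity}:${id}:v${version}`;
}

// A Map stands in for the real cache in this sketch.
const store = new Map<string, string>();

// Version 7 of product 42 is cached under its own key.
store.set(versionedKey("product", 42, "7"), "Blue mug, 12.00");

// After an update, readers ask for version 8. That key misses, so they reload.
// The version 7 entry is never read again and is left to expire via TTL.
const oldEntry = store.get(versionedKey("product", 42, "7"));
const newEntry = store.get(versionedKey("product", 42, "8"));
console.log(oldEntry, newEntry);
```

Nothing is ever deleted explicitly here, which is what makes the approach simpler than publishing invalidation events.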

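Stale-while-revalidate - Serve the cached entry immediately, even if it is stale, then refresh it in the background. The user gets a fast response. A later request gets the fresh data once the refresh completes. This pattern is available in HTTP as a Cache-Control extension (for example Cache-Control: max-age=60, stale-while-revalidate=300) and is how time-based revalidation behaves in Next.js.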
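The same idea can be applied inside an application cache. The sketch below is one possible in-memory version, not the HTTP or Next.js implementation; SwrCache and its parameters are hypothetical, and unlike the HTTP directive it puts no upper bound on how stale a served value may be.

```typescript
interface SwrEntry<T> {
  value: T;
  fetchedAt: number; // epoch milliseconds
}

class SwrCache<T> {
  private entries = new Map<string, SwrEntry<T>>();
  private refreshing = new Set<string>();
  private maxAgeMs: number;
  private load: (key: string) => Promise<T>;
  private now: () => number;

  constructor(
    maxAgeMs: number,
    load: (key: string) => Promise<T>,
    now: () => number = () => Date.now(),
  ) {
    this.maxAgeMs = maxAgeMs;
    this.load = load;
    this.now = now;
  }

  async get(key: string): Promise<T> {
    const entry = this.entries.get(key);
    if (entry === undefined) {
      // Nothing cached yet: the first caller has to wait for the load.
      return this.refresh(key);
    }
    const isStale = this.now() - entry.fetchedAt > this.maxAgeMs;
    if (isStale && !this.refreshing.has(key)) {
      // Start one background refresh, but do not wait for it.
      this.refreshing.add(key);
      void this.refresh(key)
        .catch(() => undefined) // on failure, keep serving the stale value
        .then(() => {
          this.refreshing.delete(key);
        });
    }
    return entry.value;
  }

  private async refresh(key: string): Promise<T> {
    const value = await this.load(key);
    this.entries.set(key, { value, fetchedAt: this.now() });
    return value;
  }
}
```

A caller would construct it with a loader, for example new SwrCache(60000, loadProduct) where loadProduct is your own function, and call get(key) on every request. Only the very first request for a key pays the full load time.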

What to cache and what not to

Not all data benefits equally from caching. Some data should never be cached at all.

Cache these:

Product listings, category pages, and search results
Computed aggregates (total counts, averages, rankings)
External API responses with stable data
Rendered HTML for public, non-personalised pages
Static assets - fonts, images, scripts

Cache carefully:

User-specific data (session scope only, never shared)
Pricing data with active promotions
Inventory levels where accuracy affects purchase decisions

Do not cache:

Payment and checkout data
Authentication tokens and session secrets
Real-time data where stale values cause incorrect behaviour

The Next.js caching model

Next.js 15 has a multi-layer caching system built in - and it is worth understanding before adding a third-party solution.

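The full route cache stores the rendered HTML and React Server Component payload for static routes at build time, and again whenever a route is revalidated. Static pages are served from that stored output without re-rendering or touching your database.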

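The data cache persists fetch() responses across requests. You control it per-request with the cache and next.revalidate options. Note that in Next.js 15 fetch() responses are not cached by default, so you opt in with cache: 'force-cache' or a next.revalidate interval.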
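The per-request options can be sketched as a small helper that picks them by how fresh the data needs to be. The NextFetchOptions type below is a local stand-in declared only so the example type-checks outside a Next.js project, and the Freshness categories, the helper name, and the example URL are hypothetical.

```typescript
// Local stand-in for the caching options Next.js accepts on fetch().
interface NextFetchOptions {
  cache?: "force-cache" | "no-store";
  next?: { revalidate?: number | false; tags?: string[] };
}

// Hypothetical freshness levels for this example.
type Freshness = "static" | "timed" | "live";

function fetchOptionsFor(freshness: Freshness, seconds: number = 60): NextFetchOptions {
  switch (freshness) {
    case "static":
      // Keep the response in the data cache until it is explicitly revalidated.
      return { cache: "force-cache" };
    case "timed":
      // Reuse the cached response for up to `seconds`, then revalidate.
      return { next: { revalidate: seconds } };
    case "live":
      // Always go to the origin; nothing is stored.
      return { cache: "no-store" };
  }
}

// In a Next.js server component the result would be passed straight to fetch():
//   await fetch("https://example.com/api/products", fetchOptionsFor("timed", 300));
console.log(fetchOptionsFor("timed", 300));
```

Centralising the choice like this is a design preference, not something Next.js requires; the options can just as well be written inline at each fetch() call.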

The request memoization layer deduplicates identical fetch() calls (GET requests with the same URL and options) within a single render pass, so the same data fetch in multiple components only hits the network once.
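The idea behind request memoization can be illustrated with a map of in-flight promises. This is a sketch of the technique only, not how Next.js implements it internally; memoizePerRender is a hypothetical name, and the loader is a stand-in for a real network call.

```typescript
// Wraps a loader so that repeated calls with the same URL share one promise.
function memoizePerRender<T>(
  load: (url: string) => Promise<T>,
): (url: string) => Promise<T> {
  // One map per render pass: discard the returned function when the render ends.
  const inFlight = new Map<string, Promise<T>>();
  return (url: string) => {
    let pending = inFlight.get(url);
    if (pending === undefined) {
      pending = load(url);
      inFlight.set(url, pending);
    }
    return pending;
  };
}

let networkCalls = 0;
const getJson = memoizePerRender(async (url: string) => {
  networkCalls += 1; // stands in for a real network request
  return { url };
});

// Two components asking for the same data during one render.
void getJson("/api/user");
void getJson("/api/user");
console.log(networkCalls); // 1
```

Because the promise itself is stored, the second caller is deduplicated even while the first request is still in flight.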

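The router cache stores prefetched and visited route segments in the browser's memory, making client-side navigation feel instant. Note that in Next.js 15 page segments are no longer reused from this cache by default on navigation; shared layouts and back/forward navigation still are.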

Understanding which layer applies to your use case - and which combination to reach for - is the difference between a site that is fast and one that only appears fast in development.

A practical caching audit

Before adding infrastructure, run through this checklist:

Are your static assets cached at the CDN with long TTLs? Images, fonts, and scripts should be served with a max-age of one year (Cache-Control: public, max-age=31536000, immutable), with a content hash in the file name for busting.
Are your public API responses cached at the edge? If 1000 users fetch the same product data within one TTL window, your origin should see one request, not 1000.
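Are your database queries hitting the same rows repeatedly? In PostgreSQL, use the pg_stat_statements extension to find the queries that run most often, then EXPLAIN ANALYZE to see what each one costs. Check whether the data changes faster than it is read; if it does, caching it will not help.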
Do you have a warm-up strategy? Cold caches hurt. After deployments, prefetch your most critical routes before traffic hits.
Do you measure cache hit rates? You cannot improve what you do not measure. Track hit rate, eviction rate, and memory pressure per cache tier.
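For the last item on the checklist, hit rate is the simplest metric to start with. The sketch below counts hits and misses around a Map lookup; HitCounter and countedGet are hypothetical names, and eviction rate and memory pressure would normally come from the cache server's own statistics rather than from application code.

```typescript
class HitCounter {
  private hits = 0;
  private misses = 0;

  recordHit(): void {
    this.hits += 1;
  }

  recordMiss(): void {
    this.misses += 1;
  }

  // Fraction of lookups served from the cache; 0 when nothing has been recorded.
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

// Wraps a Map lookup so that every read is counted as a hit or a miss.
function countedGet<V>(cache: Map<string, V>, key: string, counter: HitCounter): V | undefined {
  if (cache.has(key)) {
    counter.recordHit();
    return cache.get(key);
  }
  counter.recordMiss();
  return undefined;
}

const counter = new HitCounter();
const pages = new Map<string, string>([["/home", "<html>...</html>"]]);
countedGet(pages, "/home", counter);
countedGet(pages, "/home", counter);
countedGet(pages, "/home", counter);
countedGet(pages, "/pricing", counter);
console.log(counter.hitRate()); // 0.75: three hits, one miss
```

Keeping one counter per cache tier makes it clear which layer is actually doing the work.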

The compounding return

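Caching has a compounding effect. A 50ms database query that runs 10,000 times a minute costs 500 seconds of database time every minute (50ms x 10,000). Cache the result for 60 seconds and the query runs roughly once a minute instead, while those 10,000 requests are served from memory - at microsecond latency in-process, or about a millisecond from a networked cache such as Redis.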
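The arithmetic is worth writing down, because it makes it easy to try your own numbers. The function below is a back-of-the-envelope sketch that assumes a single cache entry and exactly one refresh per TTL window; dbSecondsPerMinute is a hypothetical name.

```typescript
// Database time spent per minute on one query, with or without a cache.
function dbSecondsPerMinute(
  queryMs: number,
  requestsPerMinute: number,
  ttlSeconds?: number,
): number {
  // Without a cache, every request runs the query.
  // With a cache, the query runs once per TTL window, and never more than once per request.
  const runsPerMinute =
    ttlSeconds === undefined
      ? requestsPerMinute
      : Math.min(requestsPerMinute, 60 / ttlSeconds);
  return (queryMs * runsPerMinute) / 1000;
}

console.log(dbSecondsPerMinute(50, 10000));     // 500
console.log(dbSecondsPerMinute(50, 10000, 60)); // 0.05
```

With these inputs the cached version spends one ten-thousandth of the database time of the uncached one.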

Performance is not about making your slowest path faster. It is about making your most common path essentially free. A well-designed caching strategy does exactly that.

If you are working on a performance problem and want a second opinion on your caching architecture, get in touch.