Caching strategies for your website: SSG, SSR, and CDN

When choosing the tech stack for a project, we often ask ourselves what kind of website we'll be building to better assess the right options for the project.

Every option has its own tradeoffs, so choosing the one that best fits the project requirements is a must. That decision is hard to get right, though, because requirements change over time, and what looks like a good solution now may well not be one later.

In this article, we'll go through one of the most complex problems web developers face: caching.

Caching

Caching decisions are often underestimated, and getting them wrong can lead to unintended behavior for our users. Before we can make those decisions, we should know what a cache is and what kinds of caches are available to us.

What is a cache?

A cache is a place to keep data temporarily so you don't have to fetch it from its source every time you need it. It speeds up access to recently or frequently used data, depending on your context.
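To make the idea concrete, here is a minimal sketch of an in-memory cache with a time limit. The names (`TtlCache`, `getOrLoad`, `slowLookup`) are made up for illustration and do not come from any particular library.

```typescript
// A minimal in-memory cache: keeps a value for a limited time so the
// expensive lookup does not have to run on every access.
// All names here are illustrative, not a real library API.
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so the clock can be controlled in tests.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value if it is still fresh; otherwise call `load`,
  // store the result, and return it.
  getOrLoad(key: string, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

// Usage: the second read is served from the cache, so `slowLookup` runs once.
let lookups = 0;
const slowLookup = (): string => {
  lookups += 1;
  return "profile data";
};
const cache = new TtlCache<string>(60 * 1000);
const first = cache.getOrLoad("user:1", slowLookup);
const second = cache.getOrLoad("user:1", slowLookup);
```

Browser caches and CDNs are far more elaborate, but the core trade is the same: a stored copy is reused until it is considered too old.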

Types of caches

There are two main types of caches in the HTTP Caching specification: private caches and shared caches.

Private caches

A private cache is a cache tied to a specific client, typically a browser cache. Since the response isn't shared with other clients, a private cache can store a personalized response for that user.

If a response contains personalized content and you want it stored only in a private cache, you must specify the private directive in the Cache-Control header.

https://gist.github.com/pixelmattersdev/889ea3b3b1758658d7e129d7f614d7e6?file=1-cache-control-private.yml
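On the server side, choosing the directive could look roughly like the sketch below. The helper `cacheControlFor` and the `accountPageHeaders` object are hypothetical; they only illustrate picking `private` for personalized responses.

```typescript
// Hypothetical helper: picks a Cache-Control value based on whether the
// response body is personalized for a single user.
function cacheControlFor(isPersonalized: boolean, maxAgeSeconds: number): string {
  const scope = isPersonalized ? "private" : "public";
  return `${scope}, max-age=${maxAgeSeconds}`;
}

// Headers for a personalized page, e.g. an account page.
const accountPageHeaders: Record<string, string> = {
  "Content-Type": "text/html; charset=utf-8",
  "Cache-Control": cacheControlFor(true, 300), // "private, max-age=300"
};

// Headers for a page that looks the same for everyone.
const landingPageHeaders: Record<string, string> = {
  "Content-Type": "text/html; charset=utf-8",
  "Cache-Control": cacheControlFor(false, 300), // "public, max-age=300"
};
```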

If personalized content is stored in a cache other than a private cache, other users may be able to retrieve it, which can leak information unintentionally.

Note that if the request carries an Authorization header, a shared cache must not reuse the response unless a directive such as public, s-maxage, or must-revalidate explicitly allows it.

Shared caches

A shared cache sits between the client and the server and can store responses that are shared among users. Shared caches can be further sub-classified into proxy caches and managed caches.

https://gist.github.com/pixelmattersdev/889ea3b3b1758658d7e129d7f614d7e6?file=2-cache-control-no-store.yml

Next, we’ll go over the different caching strategies so you can better understand which one best suits your website.

Static Site Generation

Say you have a website and run the build script to pre-render every page into a static HTML file. Once all pages are built, you upload your static assets to a CDN.

Steps for generating and deploying pages with the static site generation approach.

What's really great about static site generation is that all those documents are now pre-rendered. They're static, sitting on the CDN, waiting for somebody to visit your website and request one of them.

The user makes a request, and the CDN doesn't have to do any work at all. It doesn't have to build the page and doesn't have to render anything. It can just send it right to the user, and the user is super happy because it was a fast and cached response resulting in a snappy experience.

Now let's say you edit some of the data. What does that mean for static site generation? Something in the database/CMS changed, but your CDN still has all the static assets from the last deploy. So, if a user visits the page, they will get a fast response, but it will be stale: it doesn't include the new data yet. Sometimes that's fine.

To make the latest data available to your users, you need to rebuild every single page of the website, even if you only changed the data for one page.

You build every page on every deploy and on any edit to your data.
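The build-everything-up-front flow can be sketched as follows. This is a simplified model, not a real static site generator: `renderPage`, `buildSite`, and the page data are assumptions made for the example.

```typescript
type Page = { slug: string; title: string };

// Hypothetical renderer: turns page data into an HTML string.
function renderPage(page: Page): string {
  return `<html><body><h1>${page.title}</h1></body></html>`;
}

// Static site generation: render every page up front, keyed by output path.
// The resulting files are what would be uploaded to the CDN.
function buildSite(pages: Page[]): Map<string, string> {
  const files = new Map<string, string>();
  for (const page of pages) {
    files.set(`/${page.slug}/index.html`, renderPage(page));
  }
  return files;
}

const about: Page = { slug: "about", title: "About" };
const pricing: Page = { slug: "pricing", title: "Pricing" };
let deployedFiles = buildSite([about, pricing]);

// Editing one page's data does not touch the deployed files;
// they stay stale until the whole site is built again.
pricing.title = "New pricing";
const staleHtml = deployedFiles.get("/pricing/index.html");

// A full rebuild renders every page again, including the unchanged one.
deployedFiles = buildSite([about, pricing]);
const freshHtml = deployedFiles.get("/pricing/index.html");
```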

Server Side Rendering

In SSR, you don't have a big build step; you upload your website to the internet, and then build the pages on demand. So, when somebody asks for a page, you build the page on your server, and then you send it back.

Steps for all visitors for SSR without CDN.

The user has to wait for the page to build. That's what is nice about static sites: every page was built ahead of time, so users never have to wait for a build.

This means that every visit to a page rebuilds that page, but only visited pages are ever built. If some pages are never visited, you never build them.

In a nutshell, with a server, you only build the pages that people visit as opposed to static site generation, where you build every single page whether people visit them or not.
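A rough model of this on-demand behavior, with a hypothetical `handleRequest` function and an in-memory `content` map standing in for the database/CMS:

```typescript
type PageData = { title: string };

// Stand-in for a database or CMS lookup (illustrative data).
const content = new Map<string, PageData>([
  ["/about", { title: "About" }],
  ["/pricing", { title: "Pricing" }],
  ["/careers", { title: "Careers" }],
]);

// How many times each page has been rendered.
const renderCounts = new Map<string, number>();

// Server-side rendering: the page is built at request time, on every request.
function handleRequest(path: string): { status: number; body: string } {
  const data = content.get(path);
  if (data === undefined) {
    return { status: 404, body: "Not found" };
  }
  renderCounts.set(path, (renderCounts.get(path) ?? 0) + 1);
  return { status: 200, body: `<h1>${data.title}</h1>` };
}

// Two visits to /about, one to /pricing, none to /careers.
handleRequest("/about");
handleRequest("/about");
const pricingResponse = handleRequest("/pricing");
// "/about" was rendered twice, "/pricing" once, "/careers" never.
```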

Now let's put a CDN in front of the server. The first visitor shows up, and their request goes to the CDN, not to your server. The CDN doesn't have the document yet, so it goes to the origin server (your actual web server). The origin server builds the page and sends it to the CDN; the CDN caches the page and then sends it to the user. The user won't be super happy because they had to wait for the full cycle to finish.

Steps for the first visitor for SSR + CDN.

However, the second visitor also requests the page from the CDN, and this time the CDN already has the document, so it can leave the origin server alone and respond to the user directly. The user is happy because the response was fast, cached, and fresh/accurate. And this isn't just the second visitor: it's the 3rd, the 4th, the 100th, and so on.

Steps for the second, third, etc. visitors for SSR + CDN.
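The miss-then-hit flow for the first and later visitors can be sketched like this. `EdgeCache` and `originRender` are toy stand-ins for a CDN edge node and the origin server, not a real CDN API, and expiry is left out on purpose.

```typescript
let originBuilds = 0;

// Stand-in for the origin server: builds the page on demand.
function originRender(path: string): string {
  originBuilds += 1;
  return `<h1>Page for ${path}</h1>`;
}

// A toy CDN edge: serves from its cache when it can, otherwise asks the origin.
class EdgeCache {
  private store = new Map<string, string>();

  fetch(path: string): { body: string; cache: "HIT" | "MISS" } {
    const cached = this.store.get(path);
    if (cached !== undefined) {
      return { body: cached, cache: "HIT" }; // fast path, origin untouched
    }
    const body = originRender(path); // slow path: full trip to the origin
    this.store.set(path, body);
    return { body, cache: "MISS" };
  }
}

const cdn = new EdgeCache();
const firstVisit = cdn.fetch("/blog"); // MISS: the origin builds the page
const secondVisit = cdn.fetch("/blog"); // HIT: served straight from the CDN
const hundredthVisit = cdn.fetch("/blog"); // still a HIT
```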

When using a CDN, you control how long a response stays cached through the Cache-Control HTTP header and its max-age directive. The value of max-age is in seconds, so you can tell the CDN to cache a page for 60 seconds, for a day, for a month, and so on.
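Since max-age is expressed in seconds, it helps to spell out the arithmetic. The constants and the `maxAge` helper below are illustrative, and "a month" is assumed to be 30 days.

```typescript
const MINUTE = 60;
const HOUR = 60 * MINUTE; // 3600 seconds
const DAY = 24 * HOUR; // 86400 seconds

// Hypothetical helper that formats a max-age directive from a number of seconds.
function maxAge(seconds: number): string {
  if (!Number.isInteger(seconds) || seconds < 0) {
    throw new RangeError("max-age must be a non-negative whole number of seconds");
  }
  return `max-age=${seconds}`;
}

const oneMinute = maxAge(MINUTE); // "max-age=60"
const oneDay = maxAge(DAY); // "max-age=86400"
const thirtyDays = maxAge(30 * DAY); // "max-age=2592000"
```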

Let's say you set max-age to 60 (1 minute) on a page. The CDN caches the page, and the cached copy expires once 60 seconds have passed. On the first request after it expires, the CDN requests a fresh page from the origin server, stores it in the cache, and sends it to the user.

Any visitor who asks for the same page within those 60 seconds receives it from the CDN cache. After 60 seconds, the same process repeats when a new request comes in. So we build the page at most once a minute, and only when someone actually requests it.
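The 60-second example can be simulated with a controllable clock. `MaxAgeCache` is a hypothetical single-page model of the CDN behavior described above; real CDNs track age per URL and handle many more cases.

```typescript
// Toy model of a CDN caching one page with a max-age, in seconds.
class MaxAgeCache {
  private entry: { body: string; storedAt: number } | undefined;
  private maxAgeSeconds: number;
  private render: () => string;
  private nowSeconds: () => number;

  constructor(maxAgeSeconds: number, render: () => string, nowSeconds: () => number) {
    this.maxAgeSeconds = maxAgeSeconds;
    this.render = render;
    this.nowSeconds = nowSeconds;
  }

  get(): string {
    const now = this.nowSeconds();
    if (this.entry !== undefined && now - this.entry.storedAt < this.maxAgeSeconds) {
      return this.entry.body; // still fresh: the origin is not contacted
    }
    const body = this.render(); // missing or expired: ask the origin again
    this.entry = { body, storedAt: now };
    return body;
  }
}

let clock = 0; // fake clock, in seconds
let builds = 0;
const buildPage = (): string => {
  builds += 1;
  return `build #${builds}`;
};
const page = new MaxAgeCache(60, buildPage, () => clock);

const atZero = page.get(); // "build #1": the origin builds the page
clock = 30;
const atThirty = page.get(); // "build #1": served from the cache
clock = 61;
const atSixtyOne = page.get(); // "build #2": expired, the origin builds again
```

Note that nothing is rebuilt between requests: if nobody asks for the page after it expires, the origin never builds it again.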

Incremental Static Regeneration

With the stale-while-revalidate Cache-Control directive, when a user requests a page that is cached on the CDN but has expired, the CDN still returns the expired version of the page right away.

Then, in the background, it requests a fresh version of the same page from the origin server. Once the fresh page arrives, the CDN saves it in the cache and uses it for subsequent requests.
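Here is a simplified sketch of that behavior, assuming a header such as `Cache-Control: max-age=60, stale-while-revalidate=300`. `SwrCache` is a hypothetical model, and the background refresh is simulated synchronously so the example stays deterministic; a real CDN would revalidate asynchronously after responding.

```typescript
type SwrResult = { body: string; state: "miss" | "fresh" | "stale" };

// Toy model of one cached page with max-age and stale-while-revalidate.
class SwrCache {
  private entry: { body: string; storedAt: number } | undefined;
  private maxAge: number;
  private staleWindow: number;
  private render: () => string;
  private nowSeconds: () => number;

  constructor(maxAge: number, staleWindow: number, render: () => string, nowSeconds: () => number) {
    this.maxAge = maxAge;
    this.staleWindow = staleWindow;
    this.render = render;
    this.nowSeconds = nowSeconds;
  }

  // Fetch a new copy from the origin and store it.
  private refresh(now: number): string {
    const body = this.render();
    this.entry = { body, storedAt: now };
    return body;
  }

  get(): SwrResult {
    const now = this.nowSeconds();
    const entry = this.entry;
    if (entry === undefined) {
      return { body: this.refresh(now), state: "miss" };
    }
    const age = now - entry.storedAt;
    if (age < this.maxAge) {
      return { body: entry.body, state: "fresh" };
    }
    if (age < this.maxAge + this.staleWindow) {
      // Respond with the expired copy; the refresh stands in for the
      // background request to the origin.
      this.refresh(now);
      return { body: entry.body, state: "stale" };
    }
    // Too old even to serve stale: the user waits for the origin.
    return { body: this.refresh(now), state: "miss" };
  }
}

let swrClock = 0; // fake clock, in seconds
let version = 0;
const renderVersion = (): string => {
  version += 1;
  return `v${version}`;
};
const swr = new SwrCache(60, 300, renderVersion, () => swrClock);

const initial = swr.get(); // miss: "v1" is built and cached
swrClock = 90;
const expired = swr.get(); // stale: "v1" is returned, "v2" is fetched
swrClock = 91;
const afterRefresh = swr.get(); // fresh: "v2"
```

The visitor at 90 seconds gets a fast but outdated response, and everyone after that gets the refreshed page without ever waiting on the origin.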

How often do you want to rebuild?

At the end of the day, if you’re using SSR + CDN + cache headers, you should ask: how often do you want to rebuild? What kind of page is this? Is this a page with data that changes frequently? Does it matter if, after the data changes, we still show a stale version of it to the user for a bit? Answering those questions will help you understand the best caching strategy for your website.

Rui Saraiva
Front-End Developer