Luka builds things

Caching adventures with Caddy and Nix

If you followed along with my Create a static blog with Nix post, you might have noticed an issue when serving the files.

You update a blog post, deploy it to the server, and check it on your phone, but the changes are just not showing up! Only a hard refresh makes them appear. You investigate and notice that the server returns 304 Not Modified for every static file. To test it, you create a new blog post and get a 200 response with the content of the post, as expected. You refresh the page and get a 304, which is also fine; caching seems to be working. Then you update the blog post and expect a 200 response with the new content, but you still get a 304 Not Modified with no body, so the browser displays the old post. Weird. Something is not working as it should.

Let's investigate.

How browser caching works

Say we're serving a static file /index.html. When you first navigate to this page, the browser requests that file and displays it. It also saves the file to its own cache so that it does not have to fetch it every time you open the page, saving you time and bandwidth.

But if the file changes on the server, how does the browser know to display the new content?

When a browser navigates to a page for the second time (or on any subsequent visit), it includes an If-Modified-Since header in its request. This header contains the Last-Modified timestamp that the server sent with its earlier response.

Upon receiving this request, the server compares the header timestamp with the file's actual modification time. If the file has been modified since the indicated time, the server responds with the complete file. If no changes have occurred, it returns a 304 Not Modified status code.

When the browser receives the full file, it displays the content, updates its cache, and records the new timestamp for future requests. If it receives a 304 response, it simply displays the cached version of the page.
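The exchange above can be sketched in a few lines of Python. This is only a rough model of the server-side decision, not any real server's code; the `respond` function and the example dates are made up for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional


def respond(file_mtime: datetime, if_modified_since: Optional[str]) -> int:
    """Return the status code a server would send for a (conditional) GET."""
    if if_modified_since is None:
        return 200  # first visit: nothing cached yet, send the whole file
    cached = parsedate_to_datetime(if_modified_since)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    if file_mtime.replace(microsecond=0) > cached:
        return 200  # file changed after the cached copy was fetched
    return 304  # cached copy is still current


# Hypothetical file that was last modified on 2024-05-01.
mtime = datetime(2024, 5, 1, 12, 0, 0, tzinfo=timezone.utc)
last_modified = format_datetime(mtime, usegmt=True)

first_visit = respond(mtime, None)            # 200
second_visit = respond(mtime, last_modified)  # 304

# The file is edited a day later; the browser still sends the old timestamp.
edited = datetime(2024, 5, 2, 9, 30, 0, tzinfo=timezone.utc)
after_edit = respond(edited, last_modified)   # 200
```

The whole scheme hinges on `file_mtime` moving forward whenever the content changes.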

That works great if the server has proper modification timestamps on the files it serves. That is not the case for files in the Nix store.

Nix's peculiarities strike again

When Nix adds a file to the /nix/store, it strips away the file's modification time and replaces it with a fixed value.

Let's take a look at a file in the store:

❯ ls -l /nix/store/50n81h7r3mp2v8ybjc7qrjx5igrcghmc-just-1.42.3/bin/just
-r-xr-xr-x 1 root nixbld 3.7M Jan  1  1970 /nix/store/50n81h7r3mp2v8ybjc7qrjx5igrcghmc-just-1.42.3/bin/just*

❯ stat --format %y /nix/store/50n81h7r3mp2v8ybjc7qrjx5igrcghmc-just-1.42.3/bin/just
1970-01-01 01:00:01.000000000 +0100

❯ stat --format %Y /nix/store/50n81h7r3mp2v8ybjc7qrjx5igrcghmc-just-1.42.3/bin/just
1

❯ stat --help | rg "(format=|modif)"
  -c  --format=FORMAT   use the specified FORMAT instead of the default;
  %y   time of last data modification, human-readable
  %Y   time of last data modification, seconds since Epoch

You can see that the modification time is one second after the Unix epoch (01:00:01 at +0100 is 00:00:01 UTC).
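If you want to double-check how that timestamp reads in different forms, here is a small Python sketch. The constant name `NIX_STORE_MTIME` is mine; the value 1 is what `stat --format %Y` printed above.

```python
from datetime import datetime, timedelta, timezone
from email.utils import formatdate

NIX_STORE_MTIME = 1  # seconds since the Unix epoch, as printed by `stat --format %Y`

# The same instant rendered in a UTC+1 zone, matching the `+0100` stat output.
local = datetime.fromtimestamp(NIX_STORE_MTIME, tz=timezone(timedelta(hours=1)))
print(local.isoformat())  # 1970-01-01T01:00:01+01:00

# The HTTP-date form (always GMT) that a Last-Modified header would carry.
http_date = formatdate(NIX_STORE_MTIME, usegmt=True)
print(http_date)  # Thu, 01 Jan 1970 00:00:01 GMT
```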

This is clearly an issue for our Caddy server. Because the If-Modified-Since time the browser sends is never earlier than the file's modification time, Caddy consistently returns 304 Not Modified for requests containing the If-Modified-Since header.

This behavior is actually correct - Caddy is operating according to specification.
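To see why a redeploy never gets through, here is a minimal Python sketch of two deployments that both report the same store timestamp. The `status_for` helper is a simplified stand-in for the server's check, not Caddy's actual code.

```python
from email.utils import formatdate, parsedate_to_datetime

NIX_STORE_MTIME = 1  # every file in the Nix store reports this mtime


def status_for(file_mtime: int, if_modified_since: str) -> int:
    """Simplified Last-Modified check: 200 only if the file is newer."""
    cached = int(parsedate_to_datetime(if_modified_since).timestamp())
    return 200 if file_mtime > cached else 304


# Deploy 1: the browser caches the page together with this Last-Modified value.
last_modified = formatdate(NIX_STORE_MTIME, usegmt=True)

# Deploy 2: new content at a new store path, but the mtime is still 1.
status_after_update = status_for(NIX_STORE_MTIME, last_modified)
print(status_after_update)  # 304
```

The content changed, but the only value the server compares did not, so the answer is always 304.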

So what can we do?

Well, if the browser does not have a last modification time, it won't send the If-Modified-Since header, and Caddy will always return the whole file. That fixes the issue of people not seeing updates, but it also wastes bandwidth and increases load times. Not cool.

While searching for a solution I came across ETags. ETag stands for entity tag. You can think of an ETag as a file hash: if the file changes, so does its hash.

Browsers can use ETags for cache validation: the server sends the value in the ETag header, and the browser sends it back on later requests.

In my tests I found that Caddy would not generate ETags for my files. After some searching, I found the PR that confirmed it: fileserver: Don't set Etag if mtime is 0 or 1. Remember when we inspected the modification times in the Nix store? Yeah, it's 1.
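My reading of why skipping makes sense: if a server derives its automatic ETag from file metadata, a placeholder mtime tells it nothing about the content. The Python sketch below assumes a validator built from mtime and size; the `automatic_etag` function and its hex format are hypothetical and are not taken from Caddy's source. Only the "skip when mtime is 0 or 1" rule comes from the PR title.

```python
from typing import Optional


def automatic_etag(mtime_seconds: int, size: int) -> Optional[str]:
    """Hypothetical metadata-based ETag; None means 'send no ETag header'."""
    # Rule from the PR title: placeholder timestamps carry no information.
    if mtime_seconds in (0, 1):
        return None
    # Assumed format for illustration only: hex mtime and hex size, quoted.
    return f'"{mtime_seconds:x}-{size:x}"'


print(automatic_etag(1700000000, 1024))  # "6553f100-400"
print(automatic_etag(1, 1024))           # None, like a file from the Nix store
```

With such a scheme, two store files of the same size would get identical ETags even if their contents differ, which would be just as stale as the Last-Modified problem.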

So Caddy won't do it automatically, but can we somehow force it? We can!

The file_server module accepts etag_file_extensions, a list of file extensions. For each file it serves, Caddy looks for a sidecar file with the same name plus one of those extensions and reads the ETag from it. So for each file we just have to create an ETag file.
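Conceptually, the lookup works something like the Python sketch below. This is my mental model, not Caddy's implementation: the `read_sidecar_etag` function is hypothetical, and stripping surrounding whitespace is an assumption.

```python
from pathlib import Path
from typing import Optional, Sequence


def read_sidecar_etag(served_file: Path, extensions: Sequence[str]) -> Optional[str]:
    """Return the content of the first existing sidecar file, or None."""
    for ext in extensions:
        # index.html + ".etag" -> index.html.etag, next to the served file
        sidecar = served_file.with_name(served_file.name + ext)
        if sidecar.is_file():
            # Assumption: surrounding whitespace (e.g. a trailing newline) is dropped.
            return sidecar.read_text().strip()
    return None
```

For example, `read_sidecar_etag(Path("result/index.html"), [".etag"])` would return whatever is stored in `result/index.html.etag`, or `None` if that file does not exist.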

Alright, so we have a plan!

Implementation time

Let's first generate the ETag files.

Generating ETag files

We add some additional build inputs (fd for listing the files, and coreutils for md5sum and cut) and generate the files as part of our installPhase.

diff --git a/flake.nix b/flake.nix
index d8c6552..d1b602f 100644
--- a/flake.nix
+++ b/flake.nix
@@ -30,12 +30,22 @@
           pname = "my-static-site";
           version = "1.0.0";
           src = ./.;
-          buildInputs = [ eleventy ];
+          buildInputs = with pkgs; [
+            eleventy
+            fd
+            coreutils
+          ];
           buildPhase = "eleventy";
           installPhase = ''
             mkdir -p $out/
             echo $out
             cp -r _site/* $out/
+
+            # Generate .etag files for cache validation
+            for file in $(fd --type f . "$out"); do
+              hash=$(md5sum "$file" | cut -d" " -f1)
+              echo "\"$hash\"" > "$file.etag"
+            done;
           '';
         };
       in

If you want, you can swap md5sum for a different hash function; in the end, the hash just has to be unique enough.

Alright, let's test it out with nix build and inspect the results:
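To illustrate that the hash function is interchangeable, here is the same generation step as a Python sketch using `hashlib`. It is not part of the flake; `write_etag_files` and its `algorithm` parameter are names I made up.

```python
import hashlib
from pathlib import Path


def write_etag_files(root: Path, algorithm: str = "md5") -> int:
    """Write a quoted-hash `.etag` sidecar next to every file under `root`."""
    # Collect the list first so freshly written sidecars are not hashed too.
    files = [p for p in sorted(root.rglob("*")) if p.is_file() and p.suffix != ".etag"]
    for path in files:
        digest = hashlib.new(algorithm, path.read_bytes()).hexdigest()
        # The double quotes are part of the file content, like in the shell loop.
        path.with_name(path.name + ".etag").write_text(f'"{digest}"\n')
    return len(files)
```

Calling `write_etag_files(Path("_site"), "sha256")` would produce longer tags than the md5 default; either way the value only changes when the file content changes.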

❯ nix build
warning: Git tree '[...]/create-a-static-blog-with-nix' is dirty

❯ tree ./result
./result
├── index.html
└── index.html.etag

1 directory, 2 files

❯ cat ./result/index.html.etag
"5273e1f62e43bab55701f318492d7cad"

Important: The contents of the .etag file have to be wrapped in double quotes for this to work. I pulled my hair out until the kind folks at Caddy helped me solve the issue.
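The quotes are not a Caddy quirk: HTTP defines an entity tag as a double-quoted string, optionally prefixed with `W/` for weak tags. A small Python check like the one below could catch a missing quote before deploying. The `is_valid_etag` helper is my own; the grammar in the comment is from RFC 9110.

```python
import re

# entity-tag = [ "W/" ] DQUOTE *etagc DQUOTE   (RFC 9110, section 8.8.3)
# etagc      = %x21 / %x23-7E / obs-text       (anything printable except DQUOTE)
ENTITY_TAG = re.compile(r'(?:W/)?"[\x21\x23-\x7e\x80-\xff]*"')


def is_valid_etag(value: str) -> bool:
    """True if `value` is a syntactically valid ETag header value."""
    return ENTITY_TAG.fullmatch(value) is not None


print(is_valid_etag('"5273e1f62e43bab55701f318492d7cad"'))  # True
print(is_valid_etag("5273e1f62e43bab55701f318492d7cad"))    # False: no quotes
```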

Configuring Caddy

Now that we have the ETag files, let's tell Caddy to use them.

We need to do two things: remove the Last-Modified header, to force the browser to use ETags, and specify the file extension Caddy should use to find the ETags for our files.

diff --git a/nixos-configuration/my-static-site.nix b/nixos-configuration/my-static-site.nix
index 82737c4..650898a 100644
--- a/nixos-configuration/my-static-site.nix
+++ b/nixos-configuration/my-static-site.nix
@@ -7,7 +7,12 @@ in
     enable = true;
     # Needs `http://` prefix so that it does not try to request TLS certificates and redirect to 443
     virtualHosts."http://${site-url}".extraConfig = ''
-      file_server
+      header {
+        -Last-Modified
+      }
+      file_server {
+          etag_file_extensions .etag
+      }
       root * ${inputs.our-site.packages."${pkgs.system}".default}
       encode gzip
     '';

Now is a good time to commit this: Use ETags instead of Last-Modified for caching.

ETag browser caching process

With ETags properly configured, the browser caching process follows this sequence:

Initial Request: The server retrieves the ETag from the index.html.etag file and includes it in the response headers alongside the index.html content. The browser receives both the file and its corresponding ETag value, caching them together.

Subsequent Requests: When the browser requests the same file again, it automatically includes an If-None-Match header containing the previously cached ETag value. The server compares this ETag against the current value stored in index.html.etag. If they match, the server responds with a 304 Not Modified status, allowing the browser to use its cached version. If the ETags differ, the server sends the updated file with the new ETag.
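The sequence above can be modelled in Python like this. It is a simplified sketch of the If-None-Match check, not Caddy's code: the `respond` and `strip_weak` helpers are hypothetical, and splitting the header on commas is a shortcut that would break for tags that themselves contain commas.

```python
from typing import Optional


def strip_weak(tag: str) -> str:
    """Drop the W/ prefix; If-None-Match uses weak comparison."""
    return tag[2:] if tag.startswith("W/") else tag


def respond(current_etag: str, if_none_match: Optional[str]) -> int:
    """Return the status code for a GET with an optional If-None-Match header."""
    if if_none_match is None:
        return 200  # initial request: send the file plus its ETag
    if if_none_match.strip() == "*":
        return 304  # "*" matches any existing representation
    candidates = [t.strip() for t in if_none_match.split(",")]
    if any(strip_weak(c) == strip_weak(current_etag) for c in candidates):
        return 304  # the cached copy is still current
    return 200  # content changed: send the new file and the new ETag


old = '"5273e1f62e43bab55701f318492d7cad"'  # tag from the build above
new = '"00000000000000000000000000000000"'  # made-up tag after an edit

print(respond(old, None))  # 200
print(respond(old, old))   # 304
print(respond(new, old))   # 200
```

Unlike the timestamp check, this comparison depends only on the content hash, so the fixed mtime in the Nix store no longer matters.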

That's it, we fixed it!

This was one of those frustrating debugging sessions where everything seemed to be working correctly - Caddy was following HTTP specifications to the letter, browsers were caching as expected, and yet we couldn't see updates. The real culprit turned out to be Nix's timestamp normalization, a detail of how the Nix store works.

What made me miss this issue in the first place is that it only manifested for me after making some updates to my posts. But with some persistence we found a clean solution using ETags that, in my opinion, works better than modification times anyway.

Now our static site updates show up immediately for users, without sacrificing caching performance. Problem solved!