Nginx: Async Cache Refresh

A story of RFC 5861 and the proxy_cache_background_update directive
Photo by Robin Pierre on Unsplash


Sometimes you need to integrate a third-party service into your core site on the server side. And if you are in the media industry (or a similar domain), you might have to do this synchronously, before serving the request, for proper server-side rendering and SEO benefit.
Third parties are often slow, and this hurts the core latency of your service, making you look slow and, in the worst case, making you run out of worker processes and crash.
Random latency graph for some context — Yes we monitor things (at least some of them)
A solution to such a scenario is to serve stale content and refresh the cache asynchronously. And yes, there is an RFC for this (specifically, RFC 5861: https://tools.ietf.org/rfc/rfc5861) that defines the required HTTP behavior.
Thankfully, newer versions of nginx (1.11.10+) ship a directive, proxy_cache_background_update (http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_cache_background_update), that implements RFC 5861.
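For context, RFC 5861 works by extending the Cache-Control response header with two new directives, stale-while-revalidate and stale-if-error. An upstream response could declare, for example:

```
Cache-Control: max-age=20, stale-while-revalidate=60, stale-if-error=600
```

This tells a cache that the response is fresh for 20 seconds, may be served stale for a further 60 seconds while it is revalidated in the background, and may be served stale for up to 600 seconds if the origin errors out. (Per the nginx module documentation, nginx also honors these Cache-Control extensions from the upstream since 1.11.10.)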
Let us explore it further.


I will keep this post minimal. We will use micro as our server, but the approach is backend agnostic. Also, for brevity, I will run nginx in Docker, but this works equally well with a standalone installation of nginx.
We start with a tiny backend that slowly (10 seconds slow) generates a random number.
```js
const sleep = require('then-sleep');
const rn = require('random-number');

module.exports = async (req, res) => {
  await sleep(10000); // simulate a slow third-party backend
  const generator = rn.generator({ min: 0, max: 1000, integer: true });
  const randomNumber = generator().toString();
  res.end(`${randomNumber} `);
};
```
server.js — powered by micro
And this is the relevant part of package.json:
"devDependencies": { "micro-dev": "^1.3.0" }, "scripts": { "dev": "micro-dev" }
Our nginx.conf file looks like this (again, only the relevant parts):
```nginx
http {
    proxy_read_timeout 20;
    proxy_cache_path /var/cache/nginx/cache levels=1:2 keys_zone=cache:10m inactive=24h max_size=100m;
    proxy_cache cache;
    proxy_cache_valid 200 20s;
    proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504;
    proxy_cache_background_update on;

    server {
        listen 80;
        server_name localhost;

        location / {
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header HOST $http_host;
            proxy_pass http://<host-ip-use-ifconfig>:3000;
            proxy_redirect off;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            add_header X-Cache-Status $upstream_cache_status;
        }
    }
}
```
nginx.conf — powered by docker
And you can use the following docker command to run this nginx in a container
docker run --name nginx -p 80:80 -v <absolute-path>/nginx.conf:/etc/nginx/nginx.conf -d nginx
You may also want to add a corresponding entry to your /etc/hosts file.
The nginx.conf file is where all the magic happens. Let us break it down.
proxy_read_timeout 20;
Wait up to 20 seconds for the backend, then give up on it.
proxy_cache_use_stale updating
(Some values removed for brevity.) While a cache entry is being updated, serve the stale content from the cache.
proxy_cache_background_update on;
This is the main directive that implements RFC 5861: proxy_cache_background_update. It allows nginx to start a background subrequest to update an expired cache entry while serving the stale response to the client.
And that is all, please comment if you need clarification for any other directives/code.


The first N requests (when no stale cache is available) stall and hit the backend for a response.
In the logs, the first two requests, made when no cache is available, stall and wait for the backend to respond. Look at the logged responses to request #1 and request #2.
Subsequent requests are served the stale response from request #2, while request #3 (with its 10-second sleep) refreshes the cache in the background.
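To make the behavior concrete, here is a minimal in-memory sketch of the stale-while-revalidate idea in Node. This is an illustration only, not how nginx implements it; createSwrCache and its shape are made up for this example.

```javascript
// Sketch of stale-while-revalidate: serve the cached value immediately once
// present, and refresh it in the background when it is older than maxAgeMs.
function createSwrCache(fetcher, maxAgeMs) {
  let entry = null;      // { value, fetchedAt }
  let refreshing = false;

  return async function get() {
    const now = Date.now();
    if (entry === null) {
      // Cold cache: the caller must wait for the slow fetch (like request #1).
      entry = { value: await fetcher(), fetchedAt: Date.now() };
      return { value: entry.value, status: 'MISS' };
    }
    if (now - entry.fetchedAt > maxAgeMs && !refreshing) {
      refreshing = true;
      // Background update: do NOT await; serve the stale value right away.
      fetcher()
        .then((value) => { entry = { value, fetchedAt: Date.now() }; })
        .finally(() => { refreshing = false; });
      return { value: entry.value, status: 'STALE' };
    }
    return { value: entry.value, status: 'HIT' };
  };
}
```

The key line is the un-awaited fetcher() call: the caller gets the stale value immediately, and only the cache entry's next readers see the refreshed one, which is exactly what proxy_cache_background_update does for HTTP responses.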
This is a big win for backend scalability. However, if there are a lot of concurrent requests before the cache is first populated, we have a problem: all of them stall and hit the backend.
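One mitigation worth knowing about (not used in the config above) is nginx's proxy_cache_lock directive, which allows only one request at a time to populate a missing cache entry while the others wait for it:

```nginx
location / {
    proxy_cache cache;
    # Only one request is passed to the backend to populate a missing cache
    # element; other requests for the same key wait, up to the lock timeout.
    proxy_cache_lock on;
    proxy_cache_lock_timeout 15s;
    proxy_pass http://<host-ip-use-ifconfig>:3000;
}
```

With a 10-second backend, the waiters would still see the full latency of that first fetch, but the backend only does the work once.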


The nginx directive proxy_cache_background_update gives us a powerful tool: if our business logic permits serving stale data as a valid use case, we can serve stale content and asynchronously update the cache from a slow backend.
We can address the issue of the N initial concurrent requests (that hit an empty cache) by warming the cache up front (preemptively, at publish time or on some other trigger in the business workflow).
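A cache warmer can be as simple as requesting every important URL once after a publish. A hypothetical sketch in Node (warmCache and its parameters are made up for illustration; pass in your own fetch function that returns the HTTP status):

```javascript
// Hypothetical cache-warming helper: hit each URL once so nginx caches a
// fresh copy before real traffic arrives. fetchFn(url) should perform the
// HTTP request and resolve to a status code.
async function warmCache(urls, fetchFn, concurrency = 4) {
  const queue = [...urls];
  const results = [];

  async function worker() {
    while (queue.length > 0) {
      const url = queue.shift(); // shift is synchronous, so no double work
      try {
        const status = await fetchFn(url);
        results.push({ url, status });
      } catch (err) {
        results.push({ url, error: err.message });
      }
    }
  }

  // Run a few workers in parallel so warming a long URL list stays fast.
  await Promise.all(Array.from({ length: concurrency }, worker));
  return results;
}
```

Wired to a real fetch function and run at publish time, this ensures the first real visitors hit a warm cache instead of the slow backend.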
Happy scaling!