Building Better Teams

Fast Web Requests

Handling a web request is perhaps the most taken-for-granted feature of modern web development. However, if we dig a bit deeper we’ll find that every large website and framework makes some hidden compromise to keep things from falling over.

In this article we’ll explore the most common of these compromises, along with their solutions and trade-offs. We’ll start with the simplest concept (caching) and work step by step toward the more advanced ones, looking at the real solutions and issues common to most web frameworks, as well as some futuristic (but imperfect) new approaches.

Page Cache

One critical feature for any web technology is handling many simultaneous requests. Many major websites (I will use ebay-kleinanzeigen.de as an example) cache responses, so that many requests simply hit a cache, bypass the web application entirely, and still deliver a good-looking page. In fact, if you open their site and refresh a product page a few times you’ll see a timing difference between cache hits (~100ms) and actual lookups (250ms-800ms). If you flood them with enough requests you’ll even see a Varnish error page.

Note the one 429 “Too Many Requests” response, and the different response times due to cache hits/misses.

This works by isolating the parts of a page that can be cached from the parts that require fresh lookups. Many small fragments are then composed into the full page, reducing the total time needed to produce the generated HTML.

This strategy lets even some customer-specific pages be cached, but it comes with inherent problems. First of all, each cache entry must be short-lived: if it is held too long, customers will notice outdated pages. The website also needs a complex cache-invalidation mechanism that can invalidate specific pages whenever their underlying data changes, which means many update messages must be sent very quickly to clear caches across the infrastructure.
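
To make the fan-out concrete, here is a toy sketch of that invalidation mechanism: one data update broadcasts delete messages for every affected key to every cache node. The class names and key scheme are invented for illustration.

```python
class CacheNode:
    """One cache server in the fleet."""
    def __init__(self):
        self.store = {}

    def put(self, key, html):
        self.store[key] = html

    def invalidate(self, key):
        self.store.pop(key, None)  # deleting a missing key is a no-op

class Invalidator:
    """Broadcasts invalidation messages to every node."""
    def __init__(self, nodes):
        self.nodes = nodes

    def product_updated(self, product_id):
        # One data update fans out into many messages across the fleet.
        for key in (f"product:{product_id}", f"search-results:{product_id}"):
            for node in self.nodes:
                node.invalidate(key)
```

Note that a single product update already costs (keys × nodes) messages; at scale this is why invalidation traffic itself becomes a burden.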

The last burden of this technique is storage cost: every page (and page fragment) rendered needs to be stored somewhere. If you’re writing code for a project like this, you’ll need to be careful not to break page caching accidentally, which sounds simple but gets harder when using composition and component-based design (now ubiquitous in web development). To keep the site fast, you need special techniques to bust the cache of low-level, commonly-used components.
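
One common cache-busting technique is to embed a component version (for example, a template hash set at deploy time) in every cache key, so shipping a changed component implicitly abandons its old entries. The version table and component names below are assumptions made for illustration.

```python
# Bumped automatically on deploy (e.g. from a hash of each template file).
COMPONENT_VERSIONS = {"price_box": "v7", "nav": "v3"}

def cache_key(component, **params):
    """Build a cache key that changes whenever the component is redeployed."""
    version = COMPONENT_VERSIONS[component]
    param_part = ",".join(f"{k}={v}" for k, v in sorted(params.items()))
    return f"{component}:{version}:{param_part}"
```

The trade-off is storage: old-version entries linger until they expire or are evicted, which is part of the storage cost mentioned above.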

Data Layer Optimization

A lot of projects are afraid to use Varnish caching because of the high storage requirements, and some are afraid that caching rendered HTML makes it easier for developers to write bugs (because cache often finds its way into development environments over time). Whether you agree or disagree, one can easily imagine how a mature project whose template system was never designed around Varnish Server-Side-Includes can be very hard to migrate to Varnish. This leads projects to take another popular approach: cache the data, not the templates.

The idea is that concatenating HTML strings is not the slow part of rendering a web page; rather, all the data lookups and queries are the real issue.
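
A minimal sketch of "cache the data, not the templates": query results are cached with a short TTL, while the cheap string rendering always runs, so user-specific markup is never served stale. The function names and TTL are illustrative assumptions.

```python
import time

# Data-layer cache: product_id -> (expires_at, row).
_data_cache = {}

def get_product(product_id, fetch, ttl=30):
    """Return product data, calling `fetch` (the real query) only on a miss."""
    now = time.time()
    hit = _data_cache.get(product_id)
    if hit and hit[0] > now:
        return hit[1]
    row = fetch(product_id)                  # the expensive database round trip
    _data_cache[product_id] = (now + ttl, row)
    return row

def render_page(product_id, fetch):
    # Rendering is plain string concatenation and runs on every request.
    row = get_product(product_id, fetch)
    return f"<h1>{row['name']}</h1><p>{row['price']} EUR</p>"
```

Two requests for the same product within the TTL hit the database only once.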

The biggest hurdle is that caching only works for deterministic results: the same lookup must return the same data for every request. If one part of the page is truly dynamic, you cannot cache it. It is often easier to fill in the dynamic parts with JavaScript and XHR, though that brings its own drawbacks, such as additional latency and complexity.

General Advice

Tools like Varnish are a good choice in some contexts, but they are not a magic wand that will fix all your issues, and they may introduce a lot of technical debt if you need to update every frontend application to be aware of Server-Side-Includes (SSI).

Shared state is easy to miss. If your page must wait for a resource to be modified (by the same user, or by any user) before it can be shown, slowness at scale is guaranteed. Design your frontend applications so that they have no direct access to a database that can lock tables (such as PostgreSQL or MySQL).

String concatenation is usually fast, and data lookups are usually slow. Your queries should be fast, and they should be profiled. A single missed index on one page can bring down an entire application in some monolithic architectures.
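
Profiling can be as simple as timing each statement against a budget and recording offenders. This is a minimal sketch (the budget, list, and names are invented); a real application would feed a logger or metrics system instead.

```python
import time
from contextlib import contextmanager

# Queries that exceeded their time budget, as (sql, elapsed_ms) pairs.
slow_queries = []

@contextmanager
def profiled(sql, budget_ms=50):
    """Time the wrapped block and record it if it blows the budget."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        if elapsed_ms > budget_ms:
            slow_queries.append((sql, elapsed_ms))
```

Usage would look like `with profiled("SELECT ..."): cursor.execute(...)`, making a missed index show up as a recurring entry in `slow_queries` long before it takes the site down.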

Brian Graham