Andrew Welch · Insights · #craftcms #cache #performance


2016.11.30 · 5 min read

The Craft {% cache %} Tag In-Depth

Craft CMS has a {% cache %} tag that can help with performance, if used effectively. Here’s how it works.

The {% cache %} tag available in Craft CMS can help with performance, if you understand it and apply it correctly. However, as so eloquently stated on the Craft CMS Templating Docs page:

If you’re suffering from abnormal page load times, you may be experiencing a suboptimal hosting environment. Please consult a specialist before trying {% cache %}. {% cache %} is not a substitute for fast database connections, efficient templates, or moderate query counts. Possible side effects include stale content, excessively long-running background tasks, stuck tasks, and in rare cases, death. Ask your hosting provider if {% cache %} is right for you.

In other words, the {% cache %} tag is not something you should use to mask performance problems, be they from an anemic or poorly configured server, or templates that are written inefficiently. You can’t just write horribly inefficient templates, slap them up on GoDaddy shared hosting, and expect to “solve” the performance problems with the {% cache %} tag. We call that putting lipstick on a pig.

However, it can be an extremely useful tool once you understand what it is, and how to utilize it.

Why Do We Cache?

The general idea behind a cache is that we want to keep data that we access frequently around in a format that can quickly be retrieved. We do this in our daily lives, too. When you wake up in the morning and open up the medicine cabinet, everything you need to get ready is right there waiting for you.

Imagine how much time it would take you to get ready in the morning if your toiletries were scattered all over your house, and you had to hunt for them every morning…

We are preflighting our morning routine by caching the stuff we need all in one place, so we can get our job of getting ready done efficiently.

And that’s exactly why we cache data on our computers as well. Rather than compute some insanely complicated formula pulled from a massive dataset each time a webpage is loaded, we can cache it. We calculate the result at a non-critical time, and present this cached result to the user when they load the page.

Everyone’s happy.

What Exactly Does the {% cache %} Tag Do?

So now that we understand why we want to cache data, let’s have a look at exactly how the {% cache %} tag in Craft CMS works. Here’s a pretty standard chunk of Twig code:

{% for entry in craft.entries.section('blog').relatedTo(someCategory) %}
    <article>
        <h1>{{ entry.title }}</h1>
        <p>{{ entry.summary }}</p>
        <img src="{{ entry.image.first().getUrl(myTransform) }}">
    </article>
{% endfor %}

This code generates a number of database queries to pull in all of our blog entries that are related to someCategory and then loops through and displays them. The {{ entry.image.first().getUrl(myTransform) }} actually generates several database queries each time through the loop, because Assets are relations that need to be looked up, and it has to look up the Asset transform. This is called the N+1 query problem, and it basically just means that the more entries we have, the slower things are gonna get.

Do we really need to do all of these database queries every time someone loads our web page? Nope. We only post a new blog entry once a week at best; there’s no reason to do all of these database queries every single time a page is loaded.

So instead, we can wrap it in a {% cache %} tag:

{% cache %}
    {% for entry in craft.entries.section('blog').relatedTo(someCategory) %}
        <article>
            <h1>{{ entry.title }}</h1>
            <p>{{ entry.summary }}</p>
            <img src="{{ entry.image.first().getUrl(myTransform) }}">
        </article>
    {% endfor %}
{% endcache %}

All that the {% cache %} tag does is capture the parsed output of whatever it’s wrapped around, and store the result as text in the database. So instead of a bunch of queries for our entries, and several more for entry.image.first().getUrl(myTransform) each time through the loop, we only end up with one query for our cached result.

This is why we want to use {% cache %} tags around blocks of Twig code that are either compute-intensive, or blocks of Twig code that generate a large number of database queries. If you just wrap plain old HTML text in {% cache %} tags, you won’t see any gains. It might even be ever so slightly slower, because we’ve added a database query where before there was none.
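As a rough sketch of that advice, static markup can stay outside the tag so that only the query-heavy loop is cached. The heading text here is made up for illustration; the loop is the one from the example above.

```twig
{# Static markup: nothing to gain from caching it #}
<h1>Latest from the blog</h1>

{# Query-heavy loop: this is the part worth caching #}
{% cache %}
    {% for entry in craft.entries.section('blog').relatedTo(someCategory) %}
        <article>
            <h1>{{ entry.title }}</h1>
            <p>{{ entry.summary }}</p>
            <img src="{{ entry.image.first().getUrl(myTransform) }}">
        </article>
    {% endfor %}
{% endcache %}
```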

Tangent: Astute readers will note that the above example is a perfect candidate for Eager Loading, but that’s beyond the scope of this article.

If you have devMode enabled, you can view your JavaScript Console to see the number of database queries executed on a page. Here’s what it looks like for this blog page you’re reading right now, without {% cache %} tags:

Without {% cache %} tags

And here’s what the same blog page looks like with {% cache %} tags judiciously used:

With {% cache %} tags

We went from 102 database queries down to 43 database queries, and the time these queries took went from 1.84s down to just 0.49s! This is a huge gain, for very little work! And the number of queries for the page without {% cache %} tags will keep going up as we add more blog entries, while the number of queries for our cached page will stay constant. We have replaced many database queries with far fewer, which results in better performance.

The key take-away here is that what the {% cache %} tag is doing is storing the parsed output of the Twig code that’s inside of it. If you have Twig code such as {{ now|date("M d, Y") }} inside a {% cache %} tag, what’s saved in the cache is not the Twig code, but rather the parsed result (in this case, a date). So the date displayed on the webpage is not going to change as long as the data is cached.

So if our code is:

{% cache %}
    {{ now|date("M d, Y") }}
{% endcache %}

What’s actually saved in the database for this cached chunk of Twig code is just:

Nov 30, 2016

Tangent: The cacheDuration General Config Setting determines how long a template cache will be kept around. This setting defaults to P1D (1 day, or 24 hours), which means that your caches are going to be regenerated every 24 hours. While this is a sensible default setting, it is a bit conservative, given Craft’s automatic cache breaking. I typically set it to 'cacheDuration' => false in my craft/config/general.php file, which means that the template caches never expire on their own; they are only regenerated when content inside the {% cache %} tags changes. For a discussion of how to use multi-environment config settings, check out the Multi-Environment Config for Craft CMS article.
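The duration can also be set for a single block with the tag’s for parameter, which takes precedence over the global setting for that block. A minimal sketch, where the one-week duration and the five-entry limit are arbitrary choices for illustration:

```twig
{# Illustrative only: keep this one block for a week, regardless of cacheDuration #}
{% cache for 1 week %}
    {% for entry in craft.entries.section('blog').limit(5) %}
        <h2>{{ entry.title }}</h2>
    {% endfor %}
{% endcache %}
```

Craft’s automatic cache breaking still applies, so the block is regenerated sooner if one of the entries inside it changes.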

Cache-Busters!

That brings up an interesting conundrum. What happens if someone changes one of the blog entries that was inside of our {% cache %} tags? Craft is actually pretty clever about it: it keeps track of any elements (entries, assets, etc.) that are inside of your {% cache %} tags, and will mark the cache as being stale if they have changed. So it will automatically break the cache for you, causing it to regenerate and re-cache the data.

This is great, but we also have to do our job and be smart about what we cache, and how we cache it. For example, a typical blog page will have the blog entry itself, as well as a “blog archives” section that lists all of our older blog entries.

Now, we could just wrap the entire template in {% cache %} tags and call it a day. However, think about what happens when a new blog entry is published: it will invalidate every single blog page cache, because our archives changed with the addition of a new blog post! This seems a bit silly, because our other blog entries didn’t change at all. All that changed was our blog archives section, to list the new blog entry.

If you think about it from an abstract point of view, we really have two entities we want to cache in this case:

  1. The blog entry itself
  2. The blog archives

These two really are independent of each other. We might add a new blog post that causes the blog archives to change, but this doesn’t affect each individual blog entry.

Further, every single blog entry page is going to share the same exact blog archives section. So, we can do something like this:

{# -- Our blog entry -- #}
{% cache globally using key craft.request.path %}
    <article>
        <h1>{{ entry.title }}</h1>
        <p>{{ entry.body }}</p>
        <img src="{{ entry.image.first().getUrl(myTransform) }}">
    </article>
{% endcache %}

{# -- Our blog archives -- #}
{% cache globally using key "blog-archives" %}
    <section>
        {% for blogArchive in craft.entries.section('blog') %}
            <h1>{{ blogArchive.title }}</h1>
            <p>{{ blogArchive.summary }}</p>
        {% endfor %}
    </section>
{% endcache %}

Here we have two separate caches on one page. The first is for our actual blog entry; the second is a globally cached output of our blog archives, stored in the database as blog-archives. This means that there is a unique cache of each blog entry, but a single globally shared cache of our blog archives.

Which is great, because we don’t want every single blog page to have to be re-cached whenever we add a new blog entry. It’ll just re-cache the blog archives cache, which all of our blog pages share.

You’ll also notice that instead of just using the {% cache %} tag on its own, we’re actually using {% cache globally using key craft.request.path %}. We do this because the default behavior for the {% cache %} tag is to use the full path as a way of uniquely identifying our cache (in addition to a unique hash that Craft automatically generates, so we can have more than one {% cache %} tag per page). This full path includes the query string, which is anything after the ? in a URI, e.g.: ?utm_source=GHDJ14J.

But we don’t really want the query string to cause a new, unique cached item, otherwise we could potentially end up with hundreds or even thousands of entries in the craft_templatecaches database table. For instance, if we just use the {% cache %} tag on its own, requests to /blog and /blog?utm_source=F1GMAT refer to the same page, but would result in separate craft_templatecaches entries.

If you end up with a large number of entries in the craft_templatecaches database table, it can actually hinder performance rather than help it—which defeats the whole point of caching to begin with! Pixel & Tonic is actually changing the default behavior of the {% cache %} tag in Craft 3 to not include the query string, for this very reason.
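The flip side is that anything that really does change the output has to be part of the key. As a hedged sketch, assuming the cached block is a paginated listing and that craft.request.pageNum reports the current page number, the key might be built like this (the "blog-index-" prefix is a hypothetical name):

```twig
{# Assumption: this block renders a paginated list, so each page needs its own cache #}
{% cache globally using key "blog-index-" ~ craft.request.path ~ "-p" ~ craft.request.pageNum %}
    {# paginated listing goes here #}
{% endcache %}
```

Tracking parameters such as utm_source stay out of the key, so they still share a single cached copy.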

If you are using the Retour plugin, and you are implementing {% cache %} tags in your _layout.twig that other templates extend, you might consider using a pattern like this:

{% cache globally using key craft.request.path unless craft.retour.getHttpStatus != 200 %}

This will cause it to ignore the query string for the cache key, and also never cache anything that’s not a 200 OK HTTP status code. This is needed for proper error handling as described in the Handling Errors Gracefully in Craft CMS article. We don’t want to cache our error pages!

Keep in mind that if you use {% cache globally using key craft.request.path %} you can only have one {% cache %} tag per page, because the unique key will be the craft.request.path. So if you require more than one {% cache %} tag on a page, add a descriptive name to the key to make it unique, such as {% cache globally using key "header-block" ~ craft.request.path %}.
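Put together, a page with several cached blocks might look something like the sketch below. The block names "content-block" and "footer-block" are hypothetical; the footer key leaves out the path on the assumption that the footer is identical on every page, following the same reasoning as the blog archives above.

```twig
{# Varies per page: prefix plus path keeps the key unique #}
{% cache globally using key "header-block" ~ craft.request.path %}
    <header>...</header>
{% endcache %}

{# Also varies per page, so it gets its own prefix #}
{% cache globally using key "content-block" ~ craft.request.path %}
    <main>...</main>
{% endcache %}

{# Assumed identical everywhere: one shared cache, no path in the key #}
{% cache globally using key "footer-block" %}
    <footer>...</footer>
{% endcache %}
```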

Think about what is on your page, and how best to cache it.

Cache Exemptions

A reasonably common pattern is that you want to cache your entire webpage, but there are certain exceptions that you don’t want cached. Let’s call them cache exemptions.

For instance, you might want to cache all of your pages by putting a {% cache %} tag in your main _layout.twig file, as mentioned above. But we don’t want to cache pages with non-200 HTTP status codes, and maybe we don’t want to cache certain pages like search results pages, and so on.

A nice pattern you can use is something like this:

{# ##### Cache exemption ##### #}
{% set cacheExempt = false %}
{# Exempt certain pages #}
{% set cacheExemptSegments = [
    'dont-cache-me-bro',
] %}
{% if craft.request.getSegment(1) in cacheExemptSegments %}
    {% set cacheExempt = true %}
{% endif %}
{# Also exempt pages with non-200 OK status codes #}
{% if craft.retour.getHttpStatus != 200 %}
    {% set cacheExempt = true %}
{% endif %}
{# Add any cache exemption conditions #}
...

{% cache globally using key craft.request.path unless cacheExempt %}

Using this technique, you can add as many pages as you like to the cacheExemptSegments array, and any request whose first segment matches won’t be cached. You can leave this code in with an empty array so that no pages are exempted by first segment, which future-proofs things in case you need it down the road.

You can then add any other conditions you might have, which makes this pattern extensible, and it also keeps your actual {% cache %} tag fairly clean.
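For example, two extra conditions could be added in the same style. Both are assumptions for illustration rather than part of the original pattern: one skips the cache for logged-in users via Craft’s currentUser variable, the other skips it when a hypothetical q search parameter is present.

```twig
{% set cacheExempt = false %}

{# Hypothetical condition: never serve cached pages to logged-in users #}
{% if currentUser %}
    {% set cacheExempt = true %}
{% endif %}

{# Hypothetical condition: pages driven by a "q" search parameter #}
{% if craft.request.getParam('q') %}
    {% set cacheExempt = true %}
{% endif %}

{% cache globally using key craft.request.path unless cacheExempt %}
    ...
{% endcache %}
```

Each new rule only touches the cacheExempt flag, so the {% cache %} tag itself never has to change.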

A Peek Under the Hood

So let’s get down and dirty, and have a look at what the entries in the craft_templatecaches table actually look like. Here’s the table schema for the craft_templatecaches table:

MariaDB [nystudio]> describe craft_templatecaches;
+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| cacheKey   | varchar(255) | NO   |     | NULL    |                |
| locale     | char(12)     | NO   | MUL | NULL    |                |
| path       | varchar(255) | YES  |     | NULL    |                |
| expiryDate | datetime     | NO   |     | NULL    |                |
| body       | mediumtext   | NO   |     | NULL    |                |
+------------+--------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)

The cache entries are sensibly done on a per-locale basis, and have a cacheKey and a path that uniquely identify them. If we just use the regular old {% cache %} tag for our pages, the database entries in the table might look something like this:

MariaDB [nystudio]> select cacheKey,path from craft_templatecaches;
+--------------------------------------+--------------------------------------------------+
| cacheKey                             | path                                             |
+--------------------------------------+--------------------------------------------------+
| blog-archives                        | NULL                                             |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog                                        |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog/creating-optimized-images-in-craft-cms |
+--------------------------------------+--------------------------------------------------+

As you can see, Craft stores both a cacheKey and a path for each entry. These two in combination are what describe a unique entry in the craft_templatecaches table. The cacheKey is what we asked it to be for our global blog-archives, and is a random hash for places where we just used the regular old {% cache %} tag.

If we instead use the {% cache globally using key craft.request.path %} pattern, our craft_templatecaches ends up looking like this:

MariaDB [nystudio]> select cacheKey,path from craft_templatecaches;
+---------------------------------------------+----------------+
| cacheKey                                    | path           |
+---------------------------------------------+----------------+
| blog-archives                               | NULL           |
| blog                                        | NULL           |
| blog/creating-optimized-images-in-craft-cms | NULL           |
+---------------------------------------------+----------------+

As you can see, it’s using the craft.request.path for the key, and there’s nothing stored in the path. If we don’t strip the query string out via {% cache globally using key craft.request.path %}, we could end up with a craft_templatecaches table that looks like this:

MariaDB [nystudio]> select cacheKey,path from craft_templatecaches;
+--------------------------------------+--------------------------------------------------+
| cacheKey                             | path                                             |
+--------------------------------------+--------------------------------------------------+
| blog-archives                        | NULL                                             |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog                                        |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=F1GMAT                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=ADG12F                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=GS13FA                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=SM66MS                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=CMBKA4                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=HGHAJ2                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=24GSJ2                      |
+--------------------------------------+--------------------------------------------------+

And on and on. We can end up with a ton of entries in the craft_templatecaches table that are really the same thing, which can result in performance degradation, and the dreaded hung Deleting Stale Template Caches task.

The same is true for the case where we {% cache %} things in our _layout.twig file, and our 404 template extends that _layout.twig: if we don’t exclude 404 Not Found HTTP status codes, we’ll end up with a craft_templatecaches entry for every single 404 that hits our site. Every. Single. One.

And believe me, there are swarms of bots out there on the Internet that probe your websites every day looking for vulnerabilities. These probes generate a massive number of 404 errors, and caching every one of them can have serious performance implications for your website.

So cache wisely.

Cache as Cache Can

The {% cache %} tag can do a whole lot more, which you can read up on at the Craft Templating Docs page. Hopefully this gentle introduction got you thinking about when and how to use it.

If you want the ultimate in cache-based performance, check out the Static Page Caching with Craft CMS article.