Andrew Welch · Insights · #craftcms #cache #performance



The Craft Cache Tag In-Depth

Craft CMS has a cache tag that can help with per­for­mance, if used effec­tive­ly. Here’s how it works.


The {% cache %} tag avail­able in Craft CMS can help with per­for­mance, if you under­stand it and apply it cor­rect­ly. How­ev­er, as so elo­quent­ly stat­ed on the Craft CMS Tem­plat­ing Docs page:

If you’re suffering from abnormal page load times, you may be experiencing a suboptimal hosting environment. Please consult a specialist before trying {% cache %}. {% cache %} is not a substitute for fast database connections, efficient templates, or moderate query counts. Possible side effects include stale content, excessively long-running background tasks, stuck tasks, and in rare cases, death. Ask your hosting provider if {% cache %} is right for you.

In other words, the {% cache %} tag is not something you should use to mask performance problems, be they from an anemic or poorly configured server, or templates that are written inefficiently. You can’t just write horribly inefficient templates, slap them up on GoDaddy shared hosting, and expect to “solve” the performance problems with the {% cache %} tag. We call that putting lipstick on a pig.

How­ev­er, it can be an extreme­ly use­ful tool once you under­stand what it is, and how to uti­lize it.

Why Do We Cache?

The gen­er­al idea behind a cache is that we want to keep data that we access fre­quent­ly around in a for­mat that can quick­ly be retrieved. We do this in our dai­ly lives, too. When you wake up in the morn­ing and open up the med­i­cine cab­i­net, every­thing you need to get ready is right there wait­ing for you.

Imag­ine how much time it would take you to get ready in the morn­ing if your toi­letries were scat­tered all over your house, and you had to hunt for them every morning…

We are pre­flight­ing our morn­ing rou­tine by caching the stuff we need all in one place, so we can get our job of get­ting ready done efficiently.

And that’s exact­ly why we cache data on our com­put­ers as well. Rather than com­pute some insane­ly com­pli­cat­ed for­mu­la pulled from a mas­sive dataset each time a web­page is loaded, we can cache it. We cal­cu­late the result at a non-crit­i­cal time, and present this cached result to the user when they load the page.

Everyone’s hap­py.

What Exactly Does the {% cache %} Tag Do?

So now that we under­stand why we want to cache data, let’s have a look at exact­ly how the {% cache %} tag in Craft CMS works. Here’s a pret­ty stan­dard chunk of Twig code:

{% for entry in craft.entries.section('blog').relatedTo(someCategory) %}
        <h1>{{ entry.title }}</h1>
        <p>{{ entry.summary }}</p>
        <img src="{{ entry.image.first().getUrl(myTransform) }}">
{% endfor %}

This code gen­er­ates a num­ber of data­base queries to pull in all of our blog entries that are relat­ed to someCategory and then loops through and dis­plays them. The {{ entry.image.first().getUrl(myTransform) }} actu­al­ly gen­er­ates sev­er­al data­base queries each time through the loop, because Assets are rela­tions that need to be looked up, and it has to look up the Asset trans­form. This is called the N+1 query prob­lem, and it basi­cal­ly just means that the more entries we have, the slow­er things are gonna get.

Do we real­ly need to do all of these data­base queries every time some­one loads our web page? Nope. We only post a new blog entry once a week at best; there’s no rea­son to do all of these data­base queries every sin­gle time a page is loaded.

So instead, we can wrap it in a {% cache %} tag:

{% cache %}
    {% for entry in craft.entries.section('blog').relatedTo(someCategory) %}
            <h1>{{ entry.title }}</h1>
            <p>{{ entry.summary }}</p>
            <img src="{{ entry.image.first().getUrl(myTransform) }}">
    {% endfor %}
{% endcache %}

All that the {% cache %} tag does is capture the parsed output of whatever it’s wrapped around and store the result as text in the database. So instead of a bunch of queries for our entries, plus several queries for entry.image.first().getUrl(myTransform) each time through the loop, we end up with just one query for our cached result.

This is why we want to use {% cache %} tags around blocks of Twig code that are either com­pute-inten­sive, or blocks of Twig code that gen­er­ate a large num­ber of data­base queries. If you just wrap plain old HTML text in {% cache %} tags, you won’t see any gains. It might even be ever so slight­ly slow­er, because we’ve added a data­base query where before there was none.

Tan­gent: Astute read­ers will note that the above exam­ple is a per­fect can­di­date for Eager Load­ing, but that’s beyond the scope of this article.

If you have devMode enabled, you can view your JavaScript Con­sole to see the num­ber of data­base queries exe­cut­ed on a page. Here’s what it looks like for this blog page you’re read­ing right now, with­out {% cache %} tags:


With­out {% cache %} tags

And here’s what the same blog page looks like with {% cache %} tags judi­cious­ly used:


With {% cache %} tags

We went from 102 database queries down to 43 database queries, and the time these queries took went from 1.84s down to just 0.49s! This is a huge gain, for very little work! And the number of queries for the page without {% cache %} tags will keep going up as we add more blog entries, while the number of queries for our cached page will stay constant. We have replaced many database queries with far fewer, which results in better performance.

The key take-away here is that what the {% cache %} tag is doing is stor­ing the parsed out­put of the Twig code that’s inside of it. If you have Twig code such as {{ now|date("M d, Y") }} inside a {% cache %} tag, what’s saved in the cache is not the Twig code, but rather the parsed result (in this case, a date). So the date dis­played on the web­page is not going to change as long as the data is cached.

So if our code is:

{% cache %}
    {{ now|date("M d, Y") }}
{% endcache %}

What’s actu­al­ly saved in the data­base for this cached chunk of Twig code is just:

Nov 30, 2016
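
One practical consequence is that anything that must stay current belongs outside the {% cache %} block. Here's a minimal sketch of that split, reusing the blog query from earlier; the surrounding markup is purely illustrative and not taken from a real template:

```twig
{# Rendered on every request, so the date is always current #}
<p>Today is {{ now|date("M d, Y") }}</p>

{# Rendered once, then served from the cache until it is regenerated #}
{% cache %}
    {% for entry in craft.entries.section('blog') %}
        <h2>{{ entry.title }}</h2>
    {% endfor %}
{% endcache %}
```

The date costs nothing to compute, so leaving it uncached loses no performance, while the query-heavy loop still gets the benefit of the cache.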

Tangent: The cacheDuration General Config Setting determines how long a template cache will be kept around. This setting defaults to P1D (1 day, or 24 hours), which means that your caches are going to be regenerated every 24 hours. While this is a sensible default setting, it is a bit conservative, given Craft’s automatic cache breaking. I typically set 'cacheDuration' => false in my craft/config/general.php file, which means that the template caches never expire on their own; they are only regenerated when content inside the {% cache %} tags changes. For a discussion of how to use multi-environment config settings, check out the Multi-Environment Config for Craft CMS article.
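
If a single block needs a different lifetime than the global setting, the {% cache %} tag also accepts a for parameter that overrides cacheDuration for just that block. A small sketch; the one-week duration is an arbitrary value chosen for illustration:

```twig
{# Expire this block after one week, regardless of the cacheDuration setting #}
{% cache for 1 week %}
    {% for entry in craft.entries.section('blog') %}
        <h2>{{ entry.title }}</h2>
    {% endfor %}
{% endcache %}
```

Craft's automatic cache breaking should still apply here, so the block is also regenerated if one of the entries inside it changes before the week is up.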

Cache-Busters!

That brings up an inter­est­ing conun­drum. What hap­pens if some­one changes one of the blog entries that was inside of our {% cache %} tags? Craft is actu­al­ly pret­ty clever about it: it keeps track of any ele­ments (entries, assets, etc.) that are inside of your {% cache %} tags, and will mark the cache as being stale if they have changed. So it will auto­mat­i­cal­ly break the cache for you, caus­ing it to regen­er­ate and re-cache the data.

This is great, but we also have to do our part by being smart about what we cache, and how we cache it. For example, a typical blog page will have the blog entry itself, as well as a “blog archives” section that lists all of our older blog entries.

Now, we could just wrap the entire template in {% cache %} tags and call it a day. However, think about what happens when a new blog entry is published: it will invalidate every single blog page cache, because our archives changed with the addition of a new blog post! This seems a bit silly, because our other blog entries didn’t change at all. All that changed was our blog archives section, to list the new blog entry.

If you think about it from an abstract point of view, we real­ly have two enti­ties we want to cache in this case:

  1. The blog entry itself
  2. The blog archives

These two really are independent of each other. We might add a new blog post that would cause the blog archives to change, but this doesn’t affect any of the individual blog entries.

Fur­ther, every sin­gle blog entry page is going to share the same exact blog archives sec­tion. So, we can do some­thing like this:

{# -- Our blog entry -- #}
{% cache globally using key craft.request.path %}
        <h1>{{ entry.title }}</h1>
        <p>{{ entry.body }}</p>
        <img src="{{ entry.image.first().getUrl(myTransform) }}">
{% endcache %}

{# -- Our blog archives -- #}
{% cache globally using key "blog-archives" %}
        {% for blogArchive in craft.entries.section('blog') %}
            <h1>{{ blogArchive.title }}</h1>
            <p>{{ blogArchive.summary }}</p>
        {% endfor %}
{% endcache %}

Here we have two sep­a­rate caches on one page. The first is for our actu­al blog entry; the sec­ond is a glob­al­ly cached out­put of our blog archives, stored in the data­base as blog-archives. This means that there is a unique cache of each blog entry, but a sin­gle glob­al­ly shared cache of our blog archives.

Which is great, because we don’t want every sin­gle blog page to have to be re-cached when­ev­er we add a new blog entry. It’ll just re-cache the blog archives cache, which all of our blog pages share.

You’ll also notice that instead of just using the {% cache %} tag on its own, we’re actu­al­ly using {% cache globally using key craft.request.path %}. We do this because the default behav­ior for the {% cache %} tag is to use the full path as a way of unique­ly iden­ti­fy­ing our cache (in addi­tion to a unique hash that Craft auto­mat­i­cal­ly gen­er­ates, so we can have more than one {% cache %} tag per page). This full path includes the query string, which is any­thing after the ? in a URI, e.g.: ?utm_source=GHDJ14J.

But we don’t really want the query string to cause a new, unique cached item, otherwise we could potentially end up with hundreds or even thousands of entries in the craft_templatecaches database table. For instance, if we just use the {% cache %} tag on its own, requests to /blog and /blog?utm_source=F1GMAT refer to the same page, but would each result in a separate craft_templatecaches entry.

If you end up with a large num­ber of entries in the craft_templatecaches data­base table, it can actu­al­ly hin­der per­for­mance rather than help it — which defeats the whole point of caching to begin with! Pix­el & Ton­ic is actu­al­ly chang­ing the default behav­ior of the {% cache %} tag in Craft 3 to not include the query string, for this very reason.

If you are using the Retour plu­g­in, and you are imple­ment­ing {% cache %} tags in your _layout.twig that oth­er tem­plates extend, you might con­sid­er using a pat­tern like this:

{% cache globally using key craft.request.path unless craft.retour.getHttpStatus != 200 %}

This will cause it to ignore the query string for the cache key, and also nev­er cache any­thing that’s not a 200 OK http sta­tus code. This is need­ed for prop­er error han­dling as described in the Han­dling Errors Grace­ful­ly in Craft CMS arti­cle. We don’t want to cache our error pages!

Keep in mind that if you use {% cache globally using key craft.request.path %} you can only have one {% cache %} tag per page, because the unique key will be the craft.request.path. So if you require more than one {% cache %} tag on a page, add a descrip­tive name to the key to make it unique, such as {% cache globally using key "header-block" ~ craft.request.path %}.
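
As a rough sketch of that idea, here are two independently keyed blocks on the same page. The "footer-block" name and the placeholder markup are hypothetical, added only to show the pattern:

```twig
{# Each block gets its own key: a descriptive prefix plus the request path #}
{% cache globally using key "header-block" ~ craft.request.path %}
    <header>{# header markup and queries #}</header>
{% endcache %}

{% cache globally using key "footer-block" ~ craft.request.path %}
    <footer>{# footer markup and queries #}</footer>
{% endcache %}
```

Because the prefixes differ, the two blocks can't collide with each other, and neither key includes the query string.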

Think about what is on your page, and how best to cache it effectively.

Cache Exemptions

A reasonably common pattern is that you want to cache your entire webpage, but there are certain exceptions that you don’t want cached. Let’s call them cache exemptions.

For instance, you might want to cache all of your pages by putting a {% cache %} tag in your main _layout.twig file, as mentioned above. But we don’t want to cache pages with non-200 http status codes, and maybe we don’t want to cache certain pages like search results pages, and so on.

A nice pat­tern you can use is some­thing like this:

{# ##### Cache exemption ##### #}
{% set cacheExempt = false %}
{# Exempt certain pages #}
{% set cacheExemptSegments = [
] %}
{% if craft.request.getSegment(1) in cacheExemptSegments %}
    {% set cacheExempt = true %}
{% endif %}
{# Also exempt pages with non-200 OK status codes #}
{% if craft.retour.getHttpStatus != 200 %}
    {% set cacheExempt = true %}
{% endif %}
{# Add any cache exemption conditions #}

{% cache globally using key craft.request.path unless cacheExempt %}

Using this technique, you can add as many first segments as you like to the cacheExemptSegments array, and any request whose first segment matches one of them won’t be cached. You can leave this code in with an empty array to have it not exempt any pages by first segment, to future-proof things in case you need it down the road.

You can then add any oth­er con­di­tions you might have, which makes this pat­tern exten­si­ble, and it also keeps your actu­al {% cache %} tag fair­ly clean.
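
To make the pattern concrete, here's a condensed sketch with the array filled in and the closing tag included. The 'search' and 'account' segments are hypothetical examples, and the Retour status check is left out for brevity:

```twig
{# Hypothetical first segments to exempt, e.g. /search and /account #}
{% set cacheExemptSegments = [
    'search',
    'account'
] %}
{# True when the first URL segment is in the exempt list #}
{% set cacheExempt = craft.request.getSegment(1) in cacheExemptSegments %}

{% cache globally using key craft.request.path unless cacheExempt %}
    {# page content goes here #}
{% endcache %}
```

With this in place, a request to /search/results would be rendered fresh every time, while /blog would still be served from the cache.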

A Peek Under the Hood

So let’s get down and dirty, and have a look at what the entries in the craft_templatecaches table actu­al­ly look like. Here’s the table schema for the craft_templatecaches table:

MariaDB [nystudio]> describe craft_templatecaches;
| Field      | Type         | Null | Key | Default | Extra          |
| id         | int(11)      | NO   | PRI | NULL    | auto_increment |
| cacheKey   | varchar(255) | NO   |     | NULL    |                |
| locale     | char(12)     | NO   | MUL | NULL    |                |
| path       | varchar(255) | YES  |     | NULL    |                |
| expiryDate | datetime     | NO   |     | NULL    |                |
| body       | mediumtext   | NO   |     | NULL    |                |
6 rows in set (0.00 sec)

The cache entries are sen­si­bly done on a per-locale basis, and have a cacheKey and a path that unique­ly iden­ti­fy them. If we just use the reg­u­lar old {% cache %} tag for our pages, the data­base entries in the table might look some­thing like this:

MariaDB [nystudio]> select cacheKey,path from craft_templatecaches;
| cacheKey                             | path                                             |
| blog-archives                        | NULL                                             |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog                                        |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog/creating-optimized-images-in-craft-cms |

As you can see, Craft stores both a cacheKey and a path for each entry. These two in com­bi­na­tion are what describe a unique entry in the craft_templatecaches table. The cacheKey is what we asked it to be for our glob­al blog-archives, and is a ran­dom hash for places where we just used the reg­u­lar old {% cache %} tag.

If we instead use the {% cache globally using key craft.request.path %} pat­tern, our craft_templatecaches ends up look­ing like this:

MariaDB [nystudio]> select cacheKey,path from craft_templatecaches;
| cacheKey                                    | path           |
| blog-archives                               | NULL           |
| blog                                        | NULL           |
| blog/creating-optimized-images-in-craft-cms | NULL           |

As you can see, it’s using the craft.request.path for the key, and there’s noth­ing stored in the path. If we don’t strip the query string out via {% cache globally using key craft.request.path %}, we could end up with a craft_templatecaches table that looks like this:

MariaDB [nystudio]> select cacheKey,path from craft_templatecaches;
| cacheKey                             | path                                             |
| blog-archives                        | NULL                                             |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog                                        |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=F1GMAT                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=ADG12F                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=GS13FA                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=SM66MS                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=CMBKA4                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=HGHAJ2                      |
| yYNGYDNbbsxeRCDCWhBqbmpHbN2NBELomimv | site:blog?utm_source=24GSJ2                      |

And on and on. We can end up with a ton of entries in the craft_templatecaches table that are real­ly the same thing, which can result in per­for­mance degra­da­tion, and the dread­ed hung Delet­ing Stale Tem­plate Caches task.

The same is true for the case where we {% cache %} things in our _layout.twig file, and our 404 tem­plate extends that _layout.twig: if we don’t exclude 404 Not Found http sta­tus codes, we’ll end up with a craft_templatecaches entry for every sin­gle 404 that hits our site. Every. Sin­gle. One.

And believe me, there are swarms of bots out there on the Internet probing your website every day, looking for vulnerabilities. They generate a massive number of 404 errors, which can seriously degrade your website's performance.

So cache wisely.

Cache as Cache Can

The {% cache %} tag can do a whole lot more, which you can read up on at the Craft Tem­plat­ing Docs page. Hope­ful­ly this gen­tle intro­duc­tion got you think­ing about when and how to use it.

If you want the ulti­mate in cache-based per­for­mance, check out the Sta­t­ic Page Caching with Craft CMS article.