Mitigating Disaster via Website Backups

A solid backup strategy is an insurance policy for your clients, and it can make you the hero if disaster strikes.

Building a website takes a ton of work; nobody knows this better than the web developers and designers who build it. Clients at least understand the amount of work involved when they pay their invoice, and yet more often than not their investment isn’t protected with a proper backup strategy.

This would be like building a house, but not bothering to get homeowner's insurance.

While many hosting facilities and VPS services offer “snapshot” backups, what they do is essentially make a backup disk image of your server.

Linode “snapshot” backups

While this is use­ful if you want to restore the entire serv­er to an arbi­trary point in time, it’s pret­ty heavy-hand­ed if all you need to do is restore a file that the client delet­ed (and you’ll lose any­thing else they’ve changed in the interim).

I view backups like this as good to have in an emergency, and useful if you want to do a backup before doing a major server upgrade, in case something goes awry. But they are not what I’m looking for most of the time; indeed, snapshot backups are not recommended as a way to reliably back up dynamically changing data such as a MySQL database.

So let’s see if we can’t come up with more practical backup strategies

Imple­ment­ing a prop­er back­up strat­e­gy is a ser­vice that you can pro­vide to your clients, because it has real val­ue. So let’s have a look at how we can do it.

The first thing we need to do is logis­ti­cal­ly sep­a­rate the con­tent that we cre­ate from the con­tent that the client cre­ates. Con­tent that we cre­ate, such as the design, HTML, CSS, JavaScript, tem­plates, and graph­ics should all be checked into a git repository.

This allows you to collaborate with other developers, gives you an intrinsically versioned backup of the website structure, and allows you to easily deploy the website. Whether you use GitHub.com private repos, Beanstalkapp.com, Bitbucket.org, GitLab.com, or your own private Git server for your git repos, it doesn’t really matter. Just start using them.

This all goes into one box, and we store that box in a git repo, so it’s already backed up. If you’re not doing it already, now is the time to get on board with using git repos. It’s getting to the point where it’s a standard part of web development.

Then the content that the client creates, in terms of the data they enter in the backend, images they upload, and so on, goes into another box. The Database & Asset Syncing Between Environments in Craft CMS article talks about this separation, and we’re going to leverage it here as well, again with the help of Craft-Scripts.

This box of client uploaded con­tent is the part that we have to devel­op a back­up strat­e­gy for.

Enter Craft-Scripts

Before we get into the nit­ty grit­ty of back­ups, let’s talk a lit­tle bit about the tools we’re going to use to make it happen.

Craft-Scripts are shell scripts to manage database backups, asset backups, file permissions, asset syncing, cache clearing, and database syncing between Craft CMS environments. In reality, they will work with just about any CMS out there, but we’ll focus on their use with Craft CMS here.

You may already be famil­iar with Craft-Scripts, if you use them for Hard­en­ing Craft CMS Per­mis­sions or Data­base & Asset Sync­ing Between Envi­ron­ments in Craft CMS. They also have handy scripts for doing backups.

In a nutshell, the way Craft-Scripts works is that you copy the scripts folder into each Craft CMS project’s git repo, and then set up a local settings file (which is never checked into git, thanks to .gitignore) on each environment where the project lives, such as live production, staging, and local dev. For more on multiple environments, check out the Multi-Environment Config for Craft CMS article.

Then you can use the same scripts in each environment, and they will know things like how to access the database and where the assets are, based on the settings in the local settings file.

The Craft-Scripts documentation covers setting up the local settings file in detail, so we won’t go into that here. However, I think real-world examples can be helpful, so here’s the full settings file that I use on my local dev environment for this very website:

# Craft Scripts Environment
# Local environmental config for nystudio107 Craft scripts
# @author    nystudio107
# @copyright Copyright (c) 2017 nystudio107
# @link
# @package   craft-scripts
# @since     1.1.0
# @license   MIT
# This file should be renamed to '' and it should reside in the
# `scripts` directory.  Add '' to your .gitignore.

# -- GLOBAL settings --

# What to prefix the database table names with

# The path of the `craft` folder, relative to the root path; paths should always have a trailing /

# The maximum age of backups in days; backups older than this will be automatically removed

# -- LOCAL settings --

# Local path constants; paths should always have a trailing /

# Local user & group that should own the Craft CMS install

# Local directories relative to LOCAL_ROOT_PATH that should be writeable by the $CHOWN_GROUP

# Local asset directories relative to LOCAL_ASSETS_PATH that should be synched with remote assets

# Craft-specific file directories relative to LOCAL_CRAFT_FILES_PATH that should be synched with remote files

# Absolute paths to directories to back up, in addition to `LOCAL_ASSETS_DIRS` and `LOCAL_CRAFT_FILE_DIRS`

# Local database constants

# If you are using mysql 5.6.10 or later and you have `login-path` setup as per:
# you can use it instead of the above LOCAL_DB_* constants; otherwise leave this blank

# The `mysql` and `mysqldump` commands to run locally

# Local backups path; paths should always have a trailing /

# -- REMOTE settings --

# Remote ssh credentials, and Remote SSH Port

# Remote path constants; paths should always have a trailing /

# Remote database constants

# If you are using mysql 5.6.10 or later and you have `login-path` setup as per:
# you can use it instead of the above REMOTE_DB_* constants; otherwise leave this blank

# The `mysql` and `mysqldump` commands to run remotely

# Remote backups path; paths should always have a trailing /

# Remote Amazon S3 bucket name

The only thing I’ve changed is that I’ve XXX’d out my REMOTE_DB_PASSWORD; everything else is exactly how I use it. Don’t worry about understanding what all of the settings are right now. I’m presenting it here just to give you a feel for what it looks like fully configured.
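
To make the shape of such a settings file a little more concrete, here is a rough sketch of how a handful of the settings named in this article might be filled in. Every value is a made-up placeholder rather than a real one, and the list settings are written as plain space-separated strings purely to keep the sketch simple; check the Craft-Scripts documentation for the actual format.

```shell
#!/bin/sh
# Hypothetical values for illustration only; substitute your own.

# -- GLOBAL settings --
# The maximum age of backups in days
GLOBAL_DB_BACKUPS_MAX_AGE=90

# -- LOCAL settings --
# Paths should always have a trailing /
LOCAL_ROOT_PATH="/home/example/sites/example.com/"
LOCAL_ASSETS_PATH="${LOCAL_ROOT_PATH}public/img/"
# Assumption: lists are shown here as space-separated strings
LOCAL_ASSETS_DIRS="blog clients users"
LOCAL_CRAFT_FILE_DIRS="rebrand userphotos"
LOCAL_DB_NAME="example"
LOCAL_BACKUPS_PATH="/home/example/backups/"

# -- REMOTE settings --
REMOTE_DB_PASSWORD="XXX"
REMOTE_BACKUPS_PATH="/home/example/backups/"
REMOTE_S3_BUCKET="backups.example"

echo "Backups for ${LOCAL_DB_NAME} live in ${LOCAL_BACKUPS_PATH}"
```

Because the file is just shell variable assignments, the scripts can pick the settings up by sourcing it.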

Now that the intro to Craft-Scripts is out of the way, let’s deal with some disasters!

Backups for Disasters Big and Small

When we talk about disaster recovery, we have to realize that disasters come in different shapes and sizes, and prepare for likely scenarios. By far the most common “disaster” is that the client has somehow lost data due to deleting the wrong entry, or deleting an asset by mistake.

In cases like this, what we really want are local backups that are easy to access on the server, and thus easy to restore. We want to ensure that the content that the client creates, in the form of database entries and uploaded assets, is tucked away safely, awaiting the inevitable human error.

Local Database Backups

So our first step is making sure that we keep daily backups of the database, for the times when client error causes data loss. For this, we’ll use the Craft-Scripts database backup script.

When this script is executed, it will make a local copy of the database, excluding cache tables we don’t want, neatly compressed and time-stamped, and save it in the directory you specify in LOCAL_BACKUPS_PATH.
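
To illustrate the idea (this is a minimal sketch under stated assumptions, not the actual Craft-Scripts code), a compressed, time-stamped dump that skips a cache table could look something like this. The database name, backups path, and excluded table name are all hypothetical, and `dump_database` is defined but not called because it needs a running MySQL server.

```shell
#!/bin/sh
# Sketch only: variable values and the excluded table name are hypothetical.
LOCAL_DB_NAME="example"
LOCAL_BACKUPS_PATH="/tmp/example-backups/"

# Build a path like <backups>/<db>/db/<db>-db-backup-YYYYMMDD-HHMMSS.sql.gz
backup_file_path() {
    printf '%s%s/db/%s-db-backup-%s.sql.gz\n' \
        "$LOCAL_BACKUPS_PATH" "$LOCAL_DB_NAME" "$LOCAL_DB_NAME" "$(date '+%Y%m%d-%H%M%S')"
}

# Dump the database, skipping one cache table, and compress the result.
# Not called here because it needs a MySQL server and credentials.
dump_database() {
    target="$(backup_file_path)"
    mkdir -p "$(dirname "$target")"
    mysqldump --ignore-table="${LOCAL_DB_NAME}.craft_templatecaches" "$LOCAL_DB_NAME" \
        | gzip > "$target"
    echo "*** Backed up local database to $target"
}

# Show the file name a backup taken right now would get
backup_file_path
```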

It will also rotate the backups, in that it will delete any backups that are more than GLOBAL_DB_BACKUPS_MAX_AGE days old. This way, you’ll never have to worry about running out of disk space due to backups gone wild.

I’ve found that in gen­er­al, prob­lems are usu­al­ly noticed with­in 30 days or so of them hap­pen­ing, but I’m para­noid, so I keep these local data­base back­ups around for 90 days. What you should set it to depends on your use-case, and how often you do the backups.
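
Rotation of this kind is commonly done with `find`. The following is a small sketch of the technique, assuming rotation by file modification time; the temp directory and file names are throwaway examples, and the real script may well do it differently.

```shell
#!/bin/sh
# Sketch only: the directory and file names are throwaway examples.
GLOBAL_DB_BACKUPS_MAX_AGE=90
BACKUP_DIR="$(mktemp -d)"

# One backup stamped years in the past, one created right now
touch -t 201701010000 "$BACKUP_DIR/example-db-backup-20170101-000000.sql.gz"
touch "$BACKUP_DIR/example-db-backup-recent.sql.gz"

# Delete compressed dumps last modified more than MAX_AGE days ago
find "$BACKUP_DIR" -name '*.sql.gz' -type f -mtime +"$GLOBAL_DB_BACKUPS_MAX_AGE" -delete

# Only the recent backup should be left
ls "$BACKUP_DIR"
```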

Here’s example output from running the script:

vagrant@homestead ~/sites/nystudio107/scripts (develop) $ ./
*** Backed up local database to /home/vagrant/backups/nystudio/db/nystudio-db-backup-20170320-022335.sql.gz
*** 2 old database backups removed; details logged to /tmp/nystudio-db-backups.log

The num­bers at the end of the back­up archive are a time­stamp in the for­mat of YYYYMMDD-HHMMSS.

Local Asset Backups

So great, we have the client’s database locally backed up. Next we need to back up their assets, the files that they upload into the CMS. To do this, we’ll use the Craft-Scripts asset backup script.

This script uses rsync to efficiently back up all of the asset directories specified in LOCAL_ASSETS_DIRS to the directory specified in LOCAL_BACKUPS_PATH. A sub-directory LOCAL_DB_NAME/assets inside the LOCAL_BACKUPS_PATH directory is used for the asset backups. By default, the script will also back up the Craft userphotos and rebrand directories from craft/storage; the directories it backs up are specified in LOCAL_CRAFT_FILE_DIRS.

Because rsync is used, the files are effec­tive­ly mir­rored into a sep­a­rate local direc­to­ry, so only files that have actu­al­ly changed are backed up. This makes the back­ups very quick, and because the files are stored uncom­pressed, you have quick and easy access to restore that won­der­ful image of a fluffy poo­dle that the client deleted.

If a file is deleted from one of the LOCAL_ASSETS_DIRS directories, it doesn’t get deleted from the LOCAL_BACKUPS_PATH, so you can easily find the file to rescue it.
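
This "mirror, but never delete" behavior is what you get from rsync when the `--delete` option is left off. Here is a minimal sketch of that behavior using throwaway temp directories in place of the real asset and backup paths; the exact rsync options Craft-Scripts passes are an assumption here.

```shell
#!/bin/sh
# Sketch only: temp directories stand in for the real asset and backup paths.
SRC_DIR="$(mktemp -d)"
DEST_DIR="$(mktemp -d)"

# Mirror SRC into DEST; without --delete, files removed from SRC stay in DEST
mirror_assets() {
    rsync -a "$SRC_DIR/" "$DEST_DIR/"
}

echo "poodle" > "$SRC_DIR/fluffy-poodle.jpg"
mirror_assets                      # first backup copies the file
rm "$SRC_DIR/fluffy-poodle.jpg"    # the client deletes the asset
mirror_assets                      # later backups leave the old copy alone

# The deleted image is still sitting in the backup directory
ls "$DEST_DIR"
```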

Here’s example output from the script:

vagrant@homestead ~/sites/nystudio107/scripts (develop) $ ./
sending incremental file list
         21,175 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=144/152)
        294,064 100%   25.49MB/s    0:00:00 (xfr#2, to-chk=29/152)
        320,383 100%   12.73MB/s    0:00:00 (xfr#3, to-chk=6/152)
*** Backed up assets from /home/vagrant/sites/nystudio107/public/img/blog
sending incremental file list
*** Backed up assets from /home/vagrant/sites/nystudio107/public/img/clients
sending incremental file list
*** Backed up assets from /home/vagrant/sites/nystudio107/public/img/users
sending incremental file list
*** Backed up assets from /home/vagrant/sites/nystudio107/craft/storage/rebrand
sending incremental file list
*** Backed up assets from /home/vagrant/sites/nystudio107/craft/storage/userphotos

Because rsync is used for these backups, you can put a .rsync-filter in any directory to define files/folders to ignore.

For exam­ple, if you don’t want any Craft image trans­forms backed up, your .rsync-filter file in each assets direc­to­ry might look like this:

# This file allows you to add filter rules to rsync, one per line, preceded by either
# `-` or `exclude` and then a pattern to exclude, or `+` or `include` and then a pattern
# to include. More info:
- _*
- _*/**
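
If you'd like to see a filter like this in action outside of Craft-Scripts, the sketch below uses rsync's `-F` option, which tells rsync to look for a `.rsync-filter` file in each directory it visits. Whether Craft-Scripts uses `-F` or an equivalent `--filter` rule is an assumption on my part, and the directory and file names are throwaway examples.

```shell
#!/bin/sh
# Sketch only: temp directories and file names are throwaway examples.
SRC_DIR="$(mktemp -d)"
DEST_DIR="$(mktemp -d)"

mkdir -p "$SRC_DIR/_thumbs"
echo "original"  > "$SRC_DIR/photo.jpg"
echo "transform" > "$SRC_DIR/_thumbs/photo.jpg"

# Same rules as the example filter file above
printf '%s\n' '- _*' '- _*/**' > "$SRC_DIR/.rsync-filter"

# -F makes rsync read the .rsync-filter file found in each directory
rsync -a -F "$SRC_DIR/" "$DEST_DIR/"

# photo.jpg is copied; the _thumbs directory is skipped
ls -A "$DEST_DIR"
```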

If you have arbitrary directories that you want backed up that exist outside of your project directory, you can use the Craft-Scripts directory backup script.

This script uses rsync to efficiently back up all of the directories specified in LOCAL_DIRS_TO_BACKUP to the directory specified in LOCAL_BACKUPS_PATH. A sub-directory LOCAL_DB_NAME/files inside the LOCAL_BACKUPS_PATH directory is used for the directory backups.

For exam­ple, if you have a wiki with data/cache and data/tmp direc­to­ries that you don’t want backed up, your .rsync-filter file in the wiki direc­to­ry might look like this:

# This file allows you to add filter rules to rsync, one per line, preceded by either
# `-` or `exclude` and then a pattern to exclude, or `+` or `include` and then a pattern
# to include. More info:
- public/data/cache
- public/data/tmp

Backups of Backups Offsite

Fan­tas­tic, we’ve got all of the web­site struc­ture we cre­at­ed backed up in git, and we have local data­base back­ups and local asset back­ups. We’re cov­ered for the most com­mon sce­nar­ios where data has been lost in one way or another.

But what about when some­thing goes tru­ly wrong, and our serv­er isn’t accessible?

What we need is some inception: backups of backups

While it’s great to have local backups (and they are by far the most useful in practice), we also want to have offsite backups that can be used if the proverbial sh*t hits the fan.

For this, we’ll use the Craft-Scripts script that pulls down all of the backups from the REMOTE_BACKUPS_PATH on a remote server to the LOCAL_BACKUPS_PATH on the computer it’s run from.

This pulls down all of the database and asset backups we’ve made on our remote server with the backup scripts, and it does so via rsync, so it’s very efficient in pulling down only the files that have changed.

This effec­tive­ly gives us an off­site mir­ror of all of our local back­ups that we can eas­i­ly access should the need arise. This off­site back­up can be to a local com­put­er, or it can be to anoth­er VPS that you spin up, as described in the How Agen­cies & Free­lancers Should Do Web Host­ing article.
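
Pulling backups over ssh with rsync can be sketched roughly as follows. The variable names `REMOTE_SSH_LOGIN` and `REMOTE_SSH_PORT`, along with all of the values, are hypothetical placeholders, and the function only prints the command unless you pass `run`, since actually running it needs a reachable server.

```shell
#!/bin/sh
# Sketch only: the login, port, and paths are hypothetical placeholders.
REMOTE_SSH_LOGIN="user@example.com"
REMOTE_SSH_PORT="22"
REMOTE_BACKUPS_PATH="/home/example/backups/"
LOCAL_BACKUPS_PATH="/home/example/offsite-backups/"

# Print the rsync command by default; pass "run" to actually execute it
pull_backups() {
    mode="$1"
    if [ "$mode" = "run" ]; then
        # -a preserves file attributes, -z compresses data in transit
        rsync -a -z -e "ssh -p $REMOTE_SSH_PORT" \
            "$REMOTE_SSH_LOGIN:$REMOTE_BACKUPS_PATH" "$LOCAL_BACKUPS_PATH"
    else
        echo "rsync -a -z -e 'ssh -p $REMOTE_SSH_PORT'" \
            "$REMOTE_SSH_LOGIN:$REMOTE_BACKUPS_PATH $LOCAL_BACKUPS_PATH"
    fi
}

pull_backups preview
```

The trailing slash on the remote path matters: it tells rsync to copy the contents of the remote backups directory rather than the directory itself.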

Assuming you have set up ssh keys, you won’t even have to enter your password for the remote server. Here’s what the output of the script looks like:

vagrant@homestead /htdocs/nystudio107/scripts (develop) $ ./
receiving incremental file list
        435,059 100%    2.46MB/s    0:00:00 (xfr#154, to-chk=5/180)
        436,133 100%    1.65MB/s    0:00:00 (xfr#155, to-chk=4/180)
        436,381 100%    1.25MB/s    0:00:00 (xfr#156, to-chk=3/180)
        436,533 100%    1.01MB/s    0:00:00 (xfr#157, to-chk=2/180)
        436,821 100%  863.53kB/s    0:00:00 (xfr#158, to-chk=1/180)
        436,839 100%  743.21kB/s    0:00:00 (xfr#159, to-chk=0/180)
*** Synced backups from /home/forge/backups/nystudio
vagrant@homestead /htdocs/nystudio107/scripts (develop) $

If you’d like to sync your backups to an Amazon S3 bucket, Craft-Scripts has you covered there, too.

A companion script syncs the backups from LOCAL_BACKUPS_PATH to the Amazon S3 bucket specified in REMOTE_S3_BUCKET.

This script assumes that you have already installed awscli and have configured it with your credentials. Here’s what the output of the script looks like:

forge@nys-production /htdocs/ (master) $ ./
upload: ../../backups/nystudio/db/nystudio-db-backup-20170322-000001.sql.gz to s3://backups.nystudio107/nystudio/db/nystudio-db-backup-20170322-000001.sql.gz
*** Synced backups to backups.nystudio107

It’s rec­om­mend­ed that you set up a sep­a­rate user with access to only S3, and set up a pri­vate S3 buck­et for your backups.
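
As a rough illustration of the idea, syncing a backups directory to S3 with awscli boils down to an `aws s3 sync` call. The sketch below assumes one prefix per site inside the bucket, which matches the example output above but is otherwise a guess; all values are placeholders, and `sync_backups_to_s3` is defined but not called because it needs awscli and credentials.

```shell
#!/bin/sh
# Sketch only: all values are hypothetical placeholders.
LOCAL_DB_NAME="example"
LOCAL_BACKUPS_PATH="/home/example/backups/"
REMOTE_S3_BUCKET="backups.example"

# Destination URI: one prefix per site inside the bucket (an assumption)
s3_destination() {
    printf 's3://%s/%s\n' "$REMOTE_S3_BUCKET" "$LOCAL_DB_NAME"
}

# Upload new or changed backup files.
# Not called here because it needs awscli and configured credentials.
sync_backups_to_s3() {
    aws s3 sync "${LOCAL_BACKUPS_PATH}${LOCAL_DB_NAME}" "$(s3_destination)"
    echo "*** Synced backups to $REMOTE_S3_BUCKET"
}

s3_destination
```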

Automatic Script Execution

If you want to run any of these scripts automatically on a set schedule, here’s how to do it. We’ll use one of the backup scripts as an example, but the same applies to any of the scripts.

If you’re using Forge you can set the script to run night­ly (or what­ev­er inter­val you want) via the Scheduler.

Forge scheduled backups

If you’re using ServerPilot.io or are managing the server yourself, just set the script to run via cron at whatever interval you desire.

Craft-Scripts includes a crontab-helper.txt that you can add to your crontab to make configuring it easier. Remember to use full, absolute paths to the scripts when running them via cron, as cron does not have access to your environment paths.
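
As a sketch of what a nightly entry might look like (the script path, script name, and 3:00 AM schedule are all hypothetical, and this is not the content of crontab-helper.txt), you could build and inspect the crontab line like this:

```shell
#!/bin/sh
# Sketch only: the script path and schedule are hypothetical placeholders.
SCRIPT_PATH="/home/example/sites/example.com/scripts/example_backup.sh"

# Fields: minute hour day-of-month month day-of-week command
# This one runs every night at 03:00 and discards the output
CRON_LINE="0 3 * * * $SCRIPT_PATH >/dev/null 2>&1"

echo "$CRON_LINE"

# To install it, append the line to the current user's crontab:
#   ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -
```

Note the absolute path to the script, per the advice above.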


There we go, set and for­get auto­mat­ed backups.

Becoming a Digital Nomad

The other fantastic benefit of implementing a backup system like this is that you effectively become a digital nomad. If you’ve set up your website via a provisioning service like Laravel Forge or ServerPilot.io as described in the How Agencies & Freelancers Should Do Web Hosting article, you’re no longer tethered to any particular hosting arrangement.

You can quick­ly spin up a new serv­er, deploy your web­site to it by link­ing it to your git repo, pull your assets down to it, pull your data­base down to it, and away you go!

​This kind of freedom is a wonderful thing

It makes what used to be a scary, fraught process of moving to a new server a piece of cake! Gone are the days when you dreaded a server migration, or didn’t update or enhance your server out of fear that you’d break something.

Disaster Recovery Drills

The final thing that I strongly recommend is that you run disaster recovery drills. Use this newfound freedom as a digital nomad to actually put your backups to the test.

Spin up a new VPS, and try restor­ing a web­site from scratch.

There’s no bet­ter way to gain con­fi­dence in your dis­as­ter recov­ery plan than to prac­tice doing it. It sure beats sac­ri­fic­ing chick­ens and pray­ing when you’re under the gun and fac­ing an actu­al disaster.

To help you with this, Craft-Scripts comes with a database restore script. You pass in a path to the database dump, and it will restore it to the local database (after backing up the local database first). You can pass in a path to either a .sql database dump or a .gz compressed database dump; either works.
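
Handling both plain and compressed dumps is simple enough to sketch. The following is my own minimal illustration rather than the Craft-Scripts implementation: it leaves out the safety backup of the local database, the database name argument is a hypothetical parameter, and `restore_database` is defined but not called because it needs a running MySQL server.

```shell
#!/bin/sh
# Write the SQL from a dump to stdout, whether or not it is gzip-compressed
read_dump() {
    case "$1" in
        *.gz) gunzip -c "$1" ;;
        *)    cat "$1" ;;
    esac
}

# Restore a dump ($1) into a database ($2, a hypothetical parameter).
# Not called here because it needs a MySQL server and credentials.
restore_database() {
    read_dump "$1" | mysql "$2"
}

# Demo with a throwaway compressed dump file
DUMP_DIR="$(mktemp -d)"
echo "SELECT 1;" | gzip > "$DUMP_DIR/example.sql.gz"
read_dump "$DUMP_DIR/example.sql.gz"
```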

Here’s the exam­ple out­put of the script:

vagrant@homestead /htdocs/nystudio107/scripts (develop) $ ./ /home/vagrant/backups/nystudio/db/nystudio-db-backup-20170320-022335.sql.gz
*** Backed up local database to /tmp/nystudio-db-backup-20170321.sql.gz
*** Restored local database from /home/vagrant/backups/nystudio/db/nystudio-db-backup-20170320-022335.sql.gz

If all this seems like a lot of work, just con­sid­er it prac­tice. Craft-Scripts does a lot of the heavy lift­ing for you. The first time you do it, it’ll take a bit of time to get famil­iar with how it all works, but after that you’ll gain the con­fi­dence that comes with experience.

And you’ll also add a very useful (and billable) skill set to your repertoire.