Andrew Welch · Insights · #craftcms #frontend #devops

Published , updated · 5 min read ·


Please consider 🎗 sponsoring me 🎗 to keep writing articles like this.

Database & Asset Syncing Between Environments in Craft CMS

Sync­ing the data­base & assets between envi­ron­ments is an impor­tant work­flow, but it’s a pain. Here’s some help

Once you have devel­oped and deployed a Craft CMS site, you then have the issue of ongo­ing main­te­nance, and often ongo­ing devel­op­ment. This requires that you come up with some kind of a sys­tem for sync­ing the data­base & assets between the live pro­duc­tion serv­er, and your local dev envi­ron­ment (and often a staging envi­ron­ment for client approval of changes, too).

Syn­chro­niz­ing data­bas­es is not new; tech­niques such as master/​slave data­base repli­ca­tion have been around for years. Tools such as MySQL Work­bench allow you to syn­chro­nize MySQL schema changes as well. How­ev­er for all but the most elab­o­rate web­site projects, these tech­niques are typ­i­cal­ly overkill.

And while Pix­el & Ton­ic has announced Con­tent Migra­tions for Craft 3, this is more of a frame­work for the future than any­thing that will allow you to adroit­ly move data­base schema changes between envi­ron­ments today.

So we need to come up with something that is practical, in terms of being relatively easy to set up and utilize on a daily basis.

The tech­niques pre­sent­ed here are not by any means the be-all and end-all of data­base & asset synch­ing method­ol­o­gy. There are many ways to do it, and what I present here is far from per­fect. How­ev­er, it works pret­ty well for small to mid-size projects, and is a rea­son­able mid­dle-ground between com­pre­hen­sive­ness and usabil­i­ty. It also does­n’t require a ton of set­up work.

But first, let’s talk about code flow.

This is typ­i­cal­ly how a project should flow in terms of how we devel­op and deploy the web­sites we work on. The web­site itself exists in a git repo some­where, whether you use GitHub​.com pri­vate repos, Beanstalk​a​pp​.com, Bit​Buck​et​.org, Git​Lab​.com, or your own pri­vate Git serv­er for your git repos, it does­n’t real­ly mat­ter. Just start using them.

This git repo serves both as a method to store your work in a col­lab­o­ra­tive, ver­sioned man­ner, and also as a flow con­trol device. All of the code you work on flows in and out of git. 

Note that this is a one-way street: we devel­op and test in local dev, then we push our changes to the git repo, and then we pull them down to live pro­duc­tion and/​or staging. While we may pull changes from oth­er devel­op­ers into our local dev envi­ron­ment, changes are nev­er com­ing from live pro­duc­tion or staging.

Updates to Craft CMS itself, as well as any plu­g­ins you use should always be done only in local dev, and then pushed to git after they’ve been test­ed. You can ensure this hap­pens by set­ting 'allowAutoUpdates' => false in your craft/config/general.php file, so there’s no Update but­ton for peo­ple to push. For more infor­ma­tion, check out the Mul­ti-Envi­ron­ment Con­fig for Craft CMS article.

This helps us avoid heart-wrench­ing moments where an update caus­es our live pro­duc­tion serv­er to error, because we’ve test­ed it in local dev already. Worst-case, because the changes have all flowed through git, we can always roll back to a pre­vi­ous com­mit to get the web­site back up and run­ning with alacrity.

This is much easier on your heart than having an “oh shit” moment, scrambling to find your most recent backups whilst you field your client’s panicked phone calls.

While a tuto­r­i­al on git is beyond the scope of this arti­cle, you should have a sim­i­lar one-way street par­a­digm when it comes to man­ag­ing your var­i­ous branch­es in git. Changes should nev­er, ever be pushed direct­ly to master. You should work in devel­op­ment or fea­ture branch­es, and then merge master with them after you’ve test­ed the changes.

Come up with a sys­tem that works for you when it comes to branch­es in git, and stick to it adamant­ly. This will pre­vent you from cre­at­ing a git graph that looks like Gui­tar Hero, and will save you many, many hours of frustration.

Excep­tion­al devel­op­ers have the best-prac­tices habits they do not out of dis­ci­pline, but out of pain. The pain of hav­ing done it the wrong way, and hav­ing had to deal with the fallout.

Link Enter: The Client

So this is all well and good, but what do we do now that we have a live web­site where the client is mak­ing con­stant changes to the data­base willy nil­ly, and uploads a new pic­ture of their cat every day? Since we can’t expect the client to deal with data­base merges, the onus falls upon us.

Here’s what I sug­gest you do:

​Designate one server, either live production or staging, as the “gold master” from which all other environments pull from. All database schema changes are made there, and all new asset uploads happen there as well.

Whichev­er you choose, the client should be locked out of the oth­er envi­ron­ment, to pre­vent them from acci­den­tal­ly mak­ing changes in the wrong place. A great way to do this is using the Main­te­nance plu­g­in, which allows you to lock all non-admin users from mak­ing changes in the back­end, and present them with a nice cus­tomiz­able mes­sage as well.

Anoth­er nice thing you can do for your clients (and your­self) is install the Envi­ron­ment Label plu­g­in, which will clear­ly let you know what envi­ron­ment you’re work­ing with at any giv­en time.

So here’s what the data­base & asset flow might look like:

In this case, we made our live pro­duc­tion data­base the gold mas­ter”, so we sync down to staging and local dev from it. This works well for the typ­i­cal sce­nario where we devel­op a web­site, hand it off to the client, they use it for some months, and only occa­sion­al­ly ask us to make some changes/​additions to it. Spo­radic enhance­ment”, let’s call it.

If instead you have a project where there is con­tin­u­ous devel­op­ment that’s being done on the web­site, often it’s bet­ter to des­ig­nate staging as the gold mas­ter”, and then you sync staging down to live pro­duc­tion when changes are test­ed and approved. Both you and the client make changes in staging only, and the client needs to be made aware that changes they make will not be on live pro­duc­tion immediately.

Tan­gent: The rea­son we’re using a script to sync assets is because we should be exclud­ing all client-uploaded images & oth­er files from the git repo via .gitignore. Fail­ing to do so can result in all man­ner of has­sles. What hap­pens is the files will ini­tial­ly exist on live pro­duc­tion when the client uploads them, then you down­load or sync them to local dev, and check them into git.

The prob­lem crops up when you try to deploy back to live pro­duc­tion, because the files already exist there, and git does­n’t want to over­write exist­ing untracked files. So it throws an error dur­ing deployment.

The golden rule is: if you created it, check it into git. If your client created it, exclude it from git, and sync it instead.

So okay, great. How?

Link Craft-Scripts to the Rescue!

Those of you who read my Hard­en­ing Craft CMS Per­mis­sions arti­cle are already famil­iar with using craft-scripts to secure Craft CMS install per­mis­sions. Well, we’re going to be using them again, but this time to sync the data­base & assets between environments. 

To use them, you’ll need to do the following:

  1. Down­load or clone the craft-scripts git repo
  2. Copy the scripts fold­er into the root direc­to­ry of your Craft CMS project
  3. Dupli­cate the example.env.sh file, and rename it to .env.sh
  4. Add .env.sh to your .gitignore file
  5. Then open up the .env.sh file into your favorite edi­tor, and replace REPLACE_ME with the appro­pri­ate settings.

All set­tings that are pref­aced with LOCAL_ refer to the local envi­ron­ment where the script will be run, not your local dev environment.

The set­tings should all be fair­ly self-explana­to­ry, but here’s my .env.sh for my local nystudio107.dev so you can see a work­ing example:

# Craft Scripts Environment
#
# Local environmental config for nystudio107 Craft scripts
#
# @author    nystudio107
# @copyright Copyright (c) 2017 nystudio107
# @link      https://nystudio107.com/
# @package   craft-scripts
# @since     1.1.0
# @license   MIT
#
# This file should be renamed to '.env.sh' and it should reside in the
# `scripts` directory.  Add '.env.sh' to your .gitignore.

# -- GLOBAL settings --

# What to prefix the database table names with
GLOBAL_DB_TABLE_PREFIX="craft_"

# The path of the `craft` folder, relative to the root path; paths should always have a trailing /
GLOBAL_CRAFT_PATH="craft/"

# The maximum age of backups in days; backups older than this will be automatically removed
GLOBAL_DB_BACKUPS_MAX_AGE=90

# -- LOCAL settings --

# Local path constants; paths should always have a trailing /
LOCAL_ROOT_PATH="/home/vagrant/sites/nystudio107/"
LOCAL_ASSETS_PATH=${LOCAL_ROOT_PATH}"public/img/"

# Local user & group that should own the Craft CMS install
LOCAL_CHOWN_USER="vagrant"
LOCAL_CHOWN_GROUP="vagrant"

# Local directories relative to LOCAL_ROOT_PATH that should be writeable by the $CHOWN_GROUP
LOCAL_WRITEABLE_DIRS=(
                "${GLOBAL_CRAFT_PATH}storage"
                "public/img"
                )

# Local asset directories relative to LOCAL_ASSETS_PATH that should be synched with remote assets
LOCAL_ASSETS_DIRS=(
                "blog"
                "clients"
                "users"
                )

# Craft-specific file directories relative to LOCAL_CRAFT_FILES_PATH that should be synched with remote files
LOCAL_CRAFT_FILE_DIRS=(
                "rebrand"
                "userphotos"
                )

# Absolute paths to directories to back up, in addition to `LOCAL_ASSETS_DIRS` and `LOCAL_CRAFT_FILE_DIRS`
LOCAL_DIRS_TO_BACKUP=(
                "/home/forge/wiki.nystudio107.com"
                )

# Local database constants
LOCAL_DB_NAME="nystudio"
LOCAL_DB_PASSWORD="secret"
LOCAL_DB_USER="homestead"
LOCAL_DB_HOST="localhost"
LOCAL_DB_PORT="3306"

# If you are using mysql 5.6.10 or later and you have `login-path` setup as per:
# https://opensourcedbms.com/dbms/passwordless-authentication-using-mysql_config_editor-with-mysql-5-6/
# you can use it instead of the above LOCAL_DB_* constants; otherwise leave this blank
LOCAL_DB_LOGIN_PATH="localdev"

# The `mysql` and `mysqldump` commands to run locally
LOCAL_MYSQL_CMD="mysql"
LOCAL_MYSQLDUMP_CMD="mysqldump"

# Local backups path; paths should always have a trailing /
LOCAL_BACKUPS_PATH="/home/vagrant/backups/"

# -- REMOTE settings --

# Remote ssh credentials, user@domain.com and Remote SSH Port
REMOTE_SSH_LOGIN="forge@nystudio107.com"
REMOTE_SSH_PORT="22"

# Remote path constants; paths should always have a trailing /
REMOTE_ROOT_PATH="/home/forge/nystudio107.com/"
REMOTE_ASSETS_PATH=${REMOTE_ROOT_PATH}"public/img/"

# Remote database constants
REMOTE_DB_NAME="nystudio"
REMOTE_DB_PASSWORD="XXX"
REMOTE_DB_USER="nystudio"
REMOTE_DB_HOST="localhost"
REMOTE_DB_PORT="3306"

# If you are using mysql 5.6.10 or later and you have `login-path` setup as per:
# https://opensourcedbms.com/dbms/passwordless-authentication-using-mysql_config_editor-with-mysql-5-6/
# you can use it instead of the above REMOTE_DB_* constants; otherwise leave this blank
REMOTE_DB_LOGIN_PATH=""

# The `mysql` and `mysqldump` commands to run remotely
REMOTE_MYSQL_CMD="mysql"
REMOTE_MYSQLDUMP_CMD="mysqldump"

# Remote backups path; paths should always have a trailing /
REMOTE_BACKUPS_PATH="/home/forge/backups/"

Oth­er than the X’d out pass­word, this is the exact .env.sh that I use for my local dev environment.

You might won­der why all of this is in a .env.sh file, rather than in the script itself. The rea­son is so that the same scripts can be used in mul­ti­ple envi­ron­ments such as local dev, staging, and live pro­duc­tion with­out mod­i­fi­ca­tion. We just cre­ate a .env.sh file in each envi­ron­ment, and keep it out of our git repo via .gitignore.

I upload any new images I need for my web­site on the pro­duc­tion serv­er via Craft’s Assets inter­face. Then to sync my local dev assets from the live pro­duc­tion serv­er, I sim­ply cd to the scripts direc­to­ry, and type:

./pull_assets.sh

Assum­ing you have set up ssh keys, you won’t even have to enter your pass­word for the remote serv­er. The out­put might look some­thing like this:

vagrant@homestead /htdocs/nystudio107/scripts (master) $ ./pull_assets.sh
receiving incremental file list
blog/
blog/security.jpg
      1,285,927 100%    3.94MB/s    0:00:00 (xfr#1, to-chk=13/25)
blog/unix_permissions.png
         24,273 100%   67.73kB/s    0:00:00 (xfr#2, to-chk=10/25)
blog/unix_permissions_numbers.png
         28,358 100%   78.45kB/s    0:00:00 (xfr#3, to-chk=9/25)
blog/_desktop/
blog/_desktop/997321-code.jpg
        125,628 100%  343.65kB/s    0:00:00 (xfr#4, to-chk=7/25)
blog/_desktop/cache-bg.jpg
        136,138 100%  367.26kB/s    0:00:00 (xfr#5, to-chk=6/25)
blog/_desktop/image_optimzation.jpg
         66,463 100%  176.85kB/s    0:00:00 (xfr#6, to-chk=5/25)
blog/_desktop/lightning-trees.jpg
         92,460 100%  242.72kB/s    0:00:00 (xfr#7, to-chk=4/25)
blog/_desktop/rainforest-environment.jpg
        216,071 100%  559.70kB/s    0:00:00 (xfr#8, to-chk=3/25)
blog/_desktop/security.jpg
        165,875 100%  419.66kB/s    0:00:00 (xfr#9, to-chk=2/25)
blog/_desktop/112/twig_code.jpg
         90,046 100%  224.90kB/s    0:00:00 (xfr#10, to-chk=0/25)
*** Synced assets from blog
receiving incremental file list
*** Synced assets from users

For the curi­ous, here’s what the script itself looks like. It’s pret­ty sim­ple, really:

#!/bin/bash

# Pull Assets
#
# Pull remote assets down from a remote to local
#
# @author    nystudio107
# @copyright Copyright (c) 2017 nystudio107
# @link      https://nystudio107.com/
# @package   craft-scripts
# @since     1.1.0
# @license   MIT

# Get the directory of the currently executing script
DIR="$(dirname "${BASH_SOURCE[0]}")"

# Include files
INCLUDE_FILES=(
            "common/defaults.sh"
            ".env.sh"
            "common/common_env.sh"
            )
for INCLUDE_FILE in "${INCLUDE_FILES[@]}"
do
    if [ -f "${DIR}/${INCLUDE_FILE}" ]
    then
        source "${DIR}/${INCLUDE_FILE}"
    else
        echo 'File "${DIR}/${INCLUDE_FILE}" is missing, aborting.'
        exit 1
    fi
done

# Make sure the local assets directory exists
if [[ ! -d "${LOCAL_ASSETS_PATH}" ]] ; then
    echo "Creating asset directory ${LOCAL_ASSETS_PATH}"
    mkdir -p "${LOCAL_ASSETS_PATH}"
fi

# Pull down the asset dir files via rsync
for DIR in "${LOCAL_ASSETS_DIRS[@]}"
do
    rsync -a -z -e "ssh -p ${REMOTE_SSH_PORT}" "${REMOTE_SSH_LOGIN}:${REMOTE_ASSETS_PATH}${DIR}" "${LOCAL_ASSETS_PATH}" --progress
    echo "*** Synced assets from ${REMOTE_ASSETS_PATH}${DIR}"
done

# Make sure the Craft files directory exists
if [[ ! -d "${LOCAL_CRAFT_FILES_PATH}" ]] ; then
    echo "Creating Craft files directory ${LOCAL_CRAFT_FILES_PATH}"
    mkdir -p "${LOCAL_CRAFT_FILES_PATH}"
fi

# Pull down the Craft-specific dir files via rsync
for DIR in "${LOCAL_CRAFT_FILE_DIRS[@]}"
do
    rsync -a -z -e "ssh -p ${REMOTE_SSH_PORT}" "${REMOTE_SSH_LOGIN}:${REMOTE_CRAFT_FILES_PATH}${DIR}" "${LOCAL_CRAFT_FILES_PATH}" --progress
    echo "*** Synced assets from ${REMOTE_CRAFT_FILES_PATH}${DIR}"
done

# Normal exit
exit 0

It sim­ply reads in the set­tings from the .env.sh file, and then uses the rsync com­mand to copy down any asset direc­to­ries that you’ve des­ig­nat­ed. rsync is pret­ty smart, it will only sync files that have changed, so it rarely takes long to do.

As a bonus, it will also pull down your rebrand and userphotos direc­to­ries from craft/storage (which should­n’t be checked into git).

Sim­i­lar­ly, I make any data­base schema changes need­ed for the web­site on the live pro­duc­tion serv­er via Craft’s AdminCP inter­face. Then to sync my local dev data­base from the live pro­duc­tion serv­er, I sim­ply cd to the scripts direc­to­ry, and type:

./pull_db.sh

Assum­ing you have set up ssh keys, you won’t even have to enter your pass­word for the remote serv­er. The out­put might look some­thing like this:

vagrant@homestead /htdocs/nystudio107/scripts (master) $ ./pull_db.sh
nystudio-db-dump-20161209.sql                                                               100% 2917KB   2.9MB/s   00:01
mysqldump: [Warning] Using a password on the command line interface can be insecure.
*** Backed up local database to /tmp/nystudio-db-backup-20161209.sql
mysql: [Warning] Using a password on the command line interface can be insecure.
*** Restored local database from /tmp/nystudio-db-dump-20161209.sql

Again for the curi­ous, here’s what the script looks like:

#!/bin/bash

# Pull Database
#
# Pull remote database down from a remote and restore it to to local
#
# @author    nystudio107
# @copyright Copyright (c) 2017 nystudio107
# @link      https://nystudio107.com/
# @package   craft-scripts
# @since     1.1.0
# @license   MIT

# Get the directory of the currently executing script
DIR="$(dirname "${BASH_SOURCE[0]}")"

# Include files
INCLUDE_FILES=(
            "common/defaults.sh"
            ".env.sh"
            "common/common_env.sh"
            "common/common_db.sh"
            )
for INCLUDE_FILE in "${INCLUDE_FILES[@]}"
do
    if [ -f "${DIR}/${INCLUDE_FILE}" ]
    then
        source "${DIR}/${INCLUDE_FILE}"
    else
        echo 'File "${DIR}/${INCLUDE_FILE}" is missing, aborting.'
        exit 1
    fi
done

# Temporary db dump path (remote & local)
TMP_DB_PATH="/tmp/${REMOTE_DB_NAME}-db-dump-$(date '+%Y%m%d').sql"
BACKUP_DB_PATH="/tmp/${LOCAL_DB_NAME}-db-backup-$(date '+%Y%m%d').sql"

# Get the remote db dump
ssh $REMOTE_SSH_LOGIN -p $REMOTE_SSH_PORT "$REMOTE_MYSQLDUMP_CMD $REMOTE_DB_CREDS $MYSQLDUMP_SCHEMA_ARGS > '$TMP_DB_PATH' ; $REMOTE_MYSQLDUMP_CMD $REMOTE_DB_CREDS $MYSQLDUMP_DATA_ARGS >> '$TMP_DB_PATH' ; gzip -f '$TMP_DB_PATH'"
scp -P $REMOTE_SSH_PORT -- $REMOTE_SSH_LOGIN:"${TMP_DB_PATH}.gz" "${TMP_DB_PATH}.gz"

# Backup the local db
$LOCAL_MYSQLDUMP_CMD $LOCAL_DB_CREDS $MYSQLDUMP_SCHEMA_ARGS > "$BACKUP_DB_PATH"
$LOCAL_MYSQLDUMP_CMD $LOCAL_DB_CREDS $MYSQLDUMP_DATA_ARGS >> "$BACKUP_DB_PATH"
gzip -f "$BACKUP_DB_PATH"
echo "*** Backed up local database to ${BACKUP_DB_PATH}.gz"

# Restore the local db from the remote db dump
zcat "${TMP_DB_PATH}.gz" | $LOCAL_MYSQL_CMD $LOCAL_DB_CREDS
echo "*** Restored local database from ${TMP_DB_PATH}.gz"

# Normal exit
exit 0

What it does is it sshs into your remote serv­er, then it dumps the data­base to /tmp using mysqldump, then it uses scp to copy the data­base dump down, backs up the local data­base to /tmp and final­ly it restores the data­base in the local envi­ron­ment by pip­ing it into mysql. The data­base dumps are all time­stamped as well.

If you’re using mysql 5.6 or lat­er, you’ll note the warn­ing from mysql (this is not an issue if you’re using Mari­aDB):

mysql: [Warning] Using a password on the command line interface can be insecure.

What the craft-scripts is doing isn’t any less secure than if you typed it on the com­mand line your­self; every­thing sent over the wire is always encrypt­ed via ssh. How­ev­er, you can set up login-path to store your cre­den­tials in an encrypt­ed file as per the Pass­word­less authen­ti­ca­tion using mysql_​config_​editor with MySQL 5.6 article.

If you set LOCAL_DB_LOGIN_PATH or REMOTE_DB_LOGIN_PATH it will use --login-path= for your db cre­den­tials on the respec­tive envi­ron­ments instead of send­ing them in via the com­mand line.

For exam­ple, for my local dev setup:

    mysql_config_editor set --login-path=localdev --user=homestead --host=localhost --port=3306 --password

…and then enter the pass­word for that user. And then in the .env.sh I set it to:

    LOCAL_DB_LOGIN_PATH="localdev"

…and it will use my stored, encrypt­ed cre­den­tials instead of pass­ing them in via com­mand line. You can also set this up on your remote serv­er, and then set it via REMOTE_DB_LOGIN_PATH

Link Sync Away!

Note that since the craft-scripts use a .env.sh for all of their con­fig­u­ra­tion, you can use the same scripts to sync from any envi­ron­ment to any environment.

Just set up a .env.sh on local dev, and you can sync to local dev from staging or live pro­duc­tion. Set up a .env.sh on live pro­duc­tion, and you can use these same scripts, the same way, to sync to live pro­duc­tion from staging.

That’s it… now kick back and sync in style.