How to back up your Shopify store for free (with a bit of virtual elbow grease)

I just had the … pleasure … of helping someone figure out how to completely back up their Shopify store. While there are guides out there, there’s one big (BIG) step they tend to gloss over – and it’s especially easy to handle using *nix or (if you’re on Windows 10 or later) WSL.

This might seem like a lot, but for the small shop that I helped back up, it saved us from having to manually cut, paste, and download almost 200 separate files. (This post also explains the why and the how as we go, so you’re not just running commands blind.)

The first step is to export every CSV file that you can from Shopify. That covers most of the data – customers, products, theme, orders, discount codes, gift cards, and financial data – but, as a big caveat, you’re only going to get the text data. No product images, blog posts, or pages – but we’ll get to those in a moment.

It’s not too bad to export that data – in your Shopify account, you have to go to each of those pages and click the Export button. Choose the CSV that is "for Excel, Numbers, and other spreadsheet programs" for each thing you want to export. Shopify will then email you a link to get the CSV file. Download that sucker and put it in a directory somewhere.

You’ll also want to get the free Shopify app ExIm (EXport/IMport data) in order to get the information from your pages, blogs, and theme. Unlike the built-in exporter, this app will allow you to directly download JSON files with the information (but again, not pictures) from those areas of your Shopify store.

Unzip the files you’ve downloaded and put those JSON and CSV files into a directory (or set of directories) of your choosing. How organized and subdivided you make this is entirely up to you and the complexity of your store.
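If you want a concrete starting point, a layout along these lines works – the directory names are just an example, so arrange things however suits your store:

mkdir -p shopify-backup/exports shopify-backup/images shopify-backup/site
mv *.csv *.json shopify-backup/exports/

The commands below use bare filenames, so run them from whichever directory actually holds your CSV and JSON files.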

This is where most guides tell you to "cut and paste" a lot of things and manually download them. Yeah, no. Let’s hit the terminal and @#$* this pig. [1]

First, if you’re running Debian or a derivative like Ubuntu (even under WSL), make sure you install a few SUPER handy command line tools with the command:

sudo apt install wget csvtool sed gawk grep coreutils

Most of these are probably already installed.
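If you’d rather see what’s already on your system before installing anything, this prints the path of each tool it finds (and prints nothing for any that are missing):

command -v wget csvtool sed gawk grep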

For each CSV file that you downloaded, run the following command:

csvtool namedcol "Image Src" FILENAME | grep -e "^http" | sed -E 's/\?.*$//' >> imgsrc.txt

Be sure to replace “FILENAME” with the actual file name – for example: csvtool namedcol "Image Src" products_export_1.csv | grep -e "^http" | sed -E 's/\?.*$//' >> imgsrc.txt

This pulls the image URLs out, drops anything that isn’t a URL (blank lines, column headers, etc.), strips the trailing “?blahblahblah” query string, and appends the URLs to a text file named imgsrc.txt, creating it if it doesn’t already exist. I’ve done byte comparisons of files downloaded both with and without that trailing part and they’re exactly the same. If your organizational needs are simple, you can put all of these image URLs in the same text file – we’ll do an automated check for duplicates in a moment.
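If you ended up with several product exports and don’t feel like retyping that for each one, a small loop handles it. This assumes your exports follow Shopify’s usual products_export_1.csv, products_export_2.csv naming – adjust the pattern if yours are named differently:

for f in products_export_*.csv; do
    csvtool namedcol "Image Src" "$f" | grep -e "^http" | sed -E 's/\?.*$//' >> imgsrc.txt
done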

The JSON files are a little more complicated. For each of the JSON files, run this command:

cat FILENAME | tr "<" "\n" | grep -e "^img " | awk -F '"' '{print $2}' | cut -d'\' -f 1 >> imgsrc.txt

Again, make sure that you substitute the actual filename for each file, e.g. cat articles-7234566231.json | tr "<" "\n" | grep -e "^img " | awk -F '"' '{print $2}' | cut -d'\' -f 1 >> imgsrc.txt. tr replaces every < in the JSON file with a line break, so each HTML tag starts on its own line; grep then only returns the lines that start with img (img src="https://blahblah.com/file.jpg"). The awk bit then only returns the part inside the quotation marks, and the cut statement removes the trailing \ that’s left over from the way the quotation marks are escaped inside the JSON. I’m sure there are much more stylish and elegant ways to get the data out, but this will work quickly and serve our purposes well enough. [2]
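Same trick here – if ExIm handed you a whole stack of JSON files, a loop over everything in the current directory saves some typing:

for f in *.json; do
    cat "$f" | tr "<" "\n" | grep -e "^img " | awk -F '"' '{print $2}' | cut -d'\' -f 1 >> imgsrc.txt
done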

Three more steps, my droogs. First, we get rid of any duplicates:

cat imgsrc.txt | sort | uniq > imgsrc_cleaned.txt
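Note that sort has to come before uniq, since uniq only drops duplicates that sit next to each other. If you’re curious how many duplicate URLs that weeded out, compare the line counts of the two files:

wc -l imgsrc.txt imgsrc_cleaned.txt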

Then run this from the terminal:

wget --wait=5 --random-wait -i imgsrc_cleaned.txt

This will download all the URLs (our images) in the file imgsrc_cleaned.txt, waiting between 2.5 and 7.5 seconds between each retrieval to ensure that you don’t slam the server too badly. Go get a drink of your favorite beverage – this may take a little while.
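One nice thing: if the download gets interrupted partway through, you don’t have to start from scratch. wget’s --no-clobber flag skips any file that has already been saved, so you can just point it at the same list again:

wget --wait=5 --random-wait --no-clobber -i imgsrc_cleaned.txt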

Final step! In a different directory, run this command:

wget --wait=5 --random-wait --recursive --level=3 --convert-links --backup-converted --html-extension https://myshopifystore.com

Be sure to replace the placeholder URL at the end with the URL of your shop. This will create a local copy of your shop that you can browse. If you want the links to point to the actual shop instead, remove the --convert-links portion of the command line. As before, go get another drink; this will take a while.
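Once everything has finished downloading, it’s worth rolling the whole backup into a single dated archive that’s easy to stash somewhere safe. The shopify-backup directory name here is just the example layout from earlier, so point tar at wherever your files actually live:

tar czf shopify-backup-$(date +%F).tar.gz shopify-backup/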

And that’s it. Does this seem like a lot of work? Eh, maybe. With everything already installed, the actual backing up of the shop took maybe half an hour to an hour. The alternative?

You have to copy and paste any images, content, and categories from Shopify into separate Word or Excel files.

Yeah. For the relatively small store that I helped back up, that’s about 200 images I would have had to cut and paste and download. No. Thank. You.

While it didn’t take too long to actually follow these instructions, writing this up took about two to three hours, so if you’ve found this post useful at all, toss me some coin over on Ko-Fi or PayPal. Thanks!

Featured Photo by Christelle BOURGEOIS on Unsplash

[1] Allegedly.
[2] By the by, yes, I’m using cat instead of inline editing simply because I’m writing for comprehension, not efficiency here.