Automate Your Web Site Backup! 05/05/08

During the weekend (Saturday), UbuntuLinuxHelp was down for almost 12 hours. Fortunately the hosting provider had data backups and there was no data loss. In any event, I also keep backups, so the added redundancy helps to protect the content. Up to now, the server has been configured to create a daily backup of databases and certain directories; and those (.gz files) are downloaded manually to another location later.

But, what if there were no backups? What if your hosting provider cannot restore data at their end? To be blunt, you’d be back to square one! Developing a whole new site or blog from the beginning! That’s a chilling thought, to lose everything and start again.

For peace of mind and to protect your data (intellectual property), today’s post will highlight some of the steps we’ve taken to fully automate the backup process. Hopefully this will help those of you who may run into the same issues, or who are simply looking for a proactive, automated backup system for your web sites, blogs, ecommerce sites, etc.

We’ll need 5 things to ensure this system works:

  • The remote host (your web hosting server).
  • The local host (your Ubuntu or other Linux based desktop).
  • The open source Rsync package.
  • OpenSSH.
  • Cron.

Let’s start with our desktop, which is the ‘localhost’. In my case the desktop is Ubuntu Linux 7.10, but this can be any Linux based system. This could also be another Linux server, if you tweak this a bit more. ;)

I know ‘cron’ is enabled on my Linux desktop (it’s part of the default installation). I also know SSH is installed (it’s installed by default and I’ve used it), but I’m not sure if ‘rsync’ is there and if it works over SSH.

Side note: For those not familiar with Rsync, “rsync is an open source utility that provides fast incremental file transfer. rsync is freely available under the GNU General Public License and is currently being maintained by Wayne Davison.” Source: http://samba.anu.edu.au/rsync/

To see if rsync is installed, use the following terminal command (it reports “Status: install ok installed” when the package is present):

dpkg -s rsync
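If you'd rather check all three tools in one go, here is a minimal sketch. The function name check_tools is made up for illustration; it simply asks the shell whether each program can be found on your PATH.

```shell
#!/bin/sh
# Report whether each named tool can be found on PATH.
# Returns 0 if everything was found, 1 if anything is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: found"
        else
            echo "$tool: MISSING"
            missing=1
        fi
    done
    return "$missing"
}

# The three tools this article relies on.
check_tools rsync ssh crontab || echo "Install whatever is missing, e.g. sudo apt-get install rsync"
```

On a default Ubuntu desktop all three will most likely show up as found.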

If you see it’s installed, to determine if rsync works over SSH, open a terminal and type the following command (substituting your correct information):

rsync -avz -e ssh RemoteUsername@RemoteServerHost:/remote/dir /local/dir/

Here is what the switches mean:

a: Use ‘archive’ mode.
v: Use ‘verbose’ output.
z: Use ‘compression’ during file transfer.
e: Specify the remote shell to use for the connection. In this case SSH.
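If you want to see the “incremental” part in action before touching your server, you can try rsync on two local directories. This is only a sketch: the function name, the temporary directories and the stand-in file are all invented for the demo, and -z and -e ssh are left out because nothing crosses the network.

```shell
#!/bin/sh
# Local-only demo of rsync's incremental behaviour; no server involved.
demo_rsync_local() {
    work=$(mktemp -d) || return 1
    mkdir "$work/src" "$work/dst"
    echo "pretend backup" > "$work/src/site-backup.txt"   # stand-in for a real backup file

    echo "First run (the file is new, so it is copied):"
    rsync -av "$work/src/" "$work/dst/"

    echo "Second run (nothing changed, so the file is not sent again):"
    rsync -av "$work/src/" "$work/dst/"

    rm -rf "$work"
}

if command -v rsync >/dev/null 2>&1; then
    demo_rsync_local
else
    echo "rsync is not installed; skipping the demo"
fi
```

The verbose output of the first run lists the file, while the second run should list no files at all. That is why the nightly download stays quick when little has changed.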

In my case the command could look something like this:

rsync -avz -e ssh backupadmin@ubuntulinuxhelp.com:/backupdir/daily /home/ubplay/sitebackups

After entering the above command, I’m prompted to enter the password and the file transfer begins.
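A password prompt is fine when you run the command by hand, but a cron job has nobody to type the password. One common way around this is key-based SSH login. The sketch below assumes you are comfortable keeping a key without a passphrase on your desktop; the key file name backup_key and the function name are hypothetical.

```shell
#!/bin/sh
# Create a key pair with an empty passphrase for unattended backups.
# Does nothing if the key file already exists.
make_backup_key() {
    key_file=$1
    if [ -f "$key_file" ]; then
        echo "Key already exists: $key_file"
        return 0
    fi
    mkdir -p "$(dirname "$key_file")"
    # -N "" sets an empty passphrase, -q keeps ssh-keygen quiet.
    ssh-keygen -q -t ed25519 -N "" -C "backup key" -f "$key_file"
}

# Typical use, run by hand once (host and paths are the article's examples):
#   make_backup_key "$HOME/.ssh/backup_key"
#   ssh-copy-id -i "$HOME/.ssh/backup_key.pub" backupadmin@ubuntulinuxhelp.com
#   rsync -av -e "ssh -i $HOME/.ssh/backup_key" backupadmin@ubuntulinuxhelp.com:/backupdir/daily /home/ubplay/sitebackups
```

ssh-copy-id will ask for the password one last time. After that the rsync command should run without prompting, provided your host allows key logins.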

In my case this is simple because the hosting provider uses ‘the’ industry standard software (Linux) with the standard applications: OpenSSH, rsync, cron, etc. And my local Linux system already had the tools installed. Now that I’ve determined it works, cron can automate the system. However, before moving to cron, make sure your server is configured to back up the files and databases on a daily (or other) schedule.

If you’re using industry standard hosting services, you’ll most likely be on a Linux box using cPanel. Personally, I’ve tried several others including Plesk, ISPConfig, etc.; however, in my opinion, they don’t have the flexibility or range of options that cPanel does. For a LAN, though, in my opinion nothing beats Webmin, which has the greatest flexibility and options. However, I’m going off topic here, so back to the subject at hand!

Log into your hosting control panel and use the interface to configure your scheduled backups to occur during low-traffic periods. Make a note of the directory the backups are saved to. WHM/cPanel is great for this as it’s configured via a simple GUI, and is easy to use. :) In my case the server backs up the web site files and databases and stores them in /backupdir (so that my cron job can download any files in this directory later). For privacy reasons, I’m not going to post the script as it contains a username and password among other “exposures”.

Before moving to cron itself, I needed to write a script that runs rsync over the SSH connection. Here are some examples I found on the rsync site: Rsync Examples. Another great resource I found is here: resync-incr. On this site you’ll see another methodology and example scripts. And finally, another great backup scripting resource here: Backup Script. I’m sure some of you have other great sites and resources; please comment below and add them. :)
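To give a rough idea of what such a script can look like, here is a minimal sketch. It is not the script used on this site. The host and directories are the examples from this article, while the log file path, the variable names and the function names are assumptions made up for illustration. It leaves out -z because the .gz backups are already compressed, and it relies on key-based SSH login so that no password is needed.

```shell
#!/bin/sh
# Sketch of a cron-friendly rsync wrapper. All values are placeholders.
REMOTE="${REMOTE:-backupadmin@ubuntulinuxhelp.com:/backupdir/daily}"
LOCAL_DIR="${LOCAL_DIR:-/home/ubplay/sitebackups}"
LOG_FILE="${LOG_FILE:-/home/ubplay/cron/rsync-backup.log}"   # hypothetical log path
RSYNC_BIN="${RSYNC_BIN:-rsync}"   # set RSYNC_BIN=echo to preview the command

# Download the remote backup directory into LOCAL_DIR.
run_backup() {
    mkdir -p "$LOCAL_DIR" || return 1
    # -a preserves permissions and timestamps; --partial keeps
    # partially transferred files if the connection drops.
    "$RSYNC_BIN" -a --partial -e ssh "$REMOTE" "$LOCAL_DIR"
}

# Run the backup and append a timestamped result line to LOG_FILE.
log_backup() {
    if run_backup >> "$LOG_FILE" 2>&1; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') backup OK" >> "$LOG_FILE"
    else
        status=$?
        echo "$(date '+%Y-%m-%d %H:%M:%S') backup FAILED (exit $status)" >> "$LOG_FILE"
        return "$status"
    fi
}

# Uncomment once the placeholders above are filled in:
# log_backup
```

The last line is commented out on purpose, so nothing runs until you have checked the placeholders. Since cron jobs run without a terminal, the log file is the easiest place to see whether last night's download actually worked.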

After you’ve set up your script, however you want it (there are hundreds of ways!), use cron to run it. Setting up the cron job is not very difficult:

0 2 * * * /home/ubplay/cron/rsync-ubuntulinuxhelp

This (above) downloads the backup at 2am every day. Make sure your server has finished creating its backup by this time; otherwise you won’t be downloading the files you expect. In my case I used nano to create a file called “rsync-ubuntulinuxhelp” in the …/cron directory; it contains the actual bash script. Remember to make it executable (chmod +x), or cron won’t be able to run it. To create the cron job itself (that calls the script), complete the following in a terminal:

sudo crontab -e

and use the following parameters:

* * * * * /path/to/script-or-command
- - - - -
| | | | |
| | | | +----- Day of week (0 - 7, where both 0 and 7 mean Sunday)
| | | +------- Month (1 - 12)
| | +--------- Day of month (1 - 31)
| +----------- Hour (0 - 23)
+------------- Minute (0 - 59)

(‘*’ means ‘every’).
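If you'd rather not open an editor, the entry can also be added from a script. The sketch below is one possible approach, not the method used above: merge_cron_entry is a made-up helper that reads an existing crontab on standard input and prints it back with the new line appended, unless that exact line is already there. The log file in ENTRY is also an assumption.

```shell
#!/bin/sh
# Cron line from the article, extended with a hypothetical log file.
ENTRY='0 2 * * * /home/ubplay/cron/rsync-ubuntulinuxhelp >> /home/ubplay/cron/cron.log 2>&1'

# Read a crontab on stdin and print it with the entry ($1) appended,
# skipping the append if an identical line is already present.
merge_cron_entry() {
    entry=$1
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    if ! printf '%s\n' "$existing" | grep -Fxq -e "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Preview the result without changing anything:
#   crontab -l 2>/dev/null | merge_cron_entry "$ENTRY"
# Install it for the user running the command (add sudo for root's crontab):
#   crontab -l 2>/dev/null | merge_cron_entry "$ENTRY" | crontab -
```

Because the helper skips duplicates, running it twice should not leave you with two copies of the job.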

Side note: to view your existing cron jobs, in a terminal, type:

sudo crontab -l

to delete all of your cron jobs (careful, this removes the whole crontab, not a single entry):

sudo crontab -r

As usual, I hope this helps some of you!

Enjoy :)


10 Responses to this article

 
Kenney May 6, 2008 Reply

Thank you very much. This was something I was looking for. I also had troubles with my hosting provider.

 
 
UbuntuLinuxHelp May 6, 2008 Reply

@Kenney – Thanks. :) One thing I also realized is when cPanel makes the backups on the server, it already compresses them to .gz files so no need to re-compress using rsync (which will make the download slower). Therefore I removed the -z switch. ;)

 
Arp May 6, 2008 Reply

Would this work in reverse as well? I’ve been looking for a way to back up my local data (photos, etc). My host has good redundancy, and my experience with ElephantDesktop was a travesty.

 
UbuntuLinuxHelp May 6, 2008 Reply

@Arp – I would think so! ;)
There’s no reason why you can’t send stuff to your server.

 
Rolf May 23, 2008 Reply

Try rsnapshot!

It takes advantage of the ext2/ext3 hard-linking features to save disk space for unchanged files. It gives you “full backups” but requires only a little more than the space of one full backup, plus incrementals.

It gives you the possibility to go back in time, restoring just a file, a directory and a lot more…

It is based on rsync.
http://www.rsnapshot.org/

 
vladi May 26, 2008 Reply

For rsync to work with cron through ssh, you “must” generate an auth key without a password, or your cron job may not connect to the server, because ssh will ask for a password and cron cannot enter one!

 
UbuntuLinuxHelp May 26, 2008 Reply

@vladi – The first time I connected manually during the set-up, I got the terminal prompt to add the key. It’s been running ever since without the username and password (because it uses the key). So… as you mentioned, ssh has to be properly configured on the web server; if not, then rsync will fail in this case. ;) Thanks. :)

 
Yvette January 13, 2009 Reply

Hello,

Great article you have here! However, I have a question: Is it possible to send the back up to an email instead?

 
 
UbuntuLinuxHelp January 14, 2009 Reply

@Yvette – Probably yes, but maybe the files will be too large for email? For web hosting clients, you can try here: http://www.dagondesign.com/articles/automatic-mysql-backup-script/ or here http://www.justin-cook.com/wp/2006/12/27/automatic-cpanel-backup-domain-mysql-with-cron-php/ or even here: http://jussi.ruokomaki.fi/tech/62/automatic-cpanel-backup-via-curlcron-v11/ I think these links will help you with the emailing part.

 
Jenny Sanders January 12, 2011 Reply

Since rsync requires SSH access to the host and not all hosts support it, this solution is host-specific. A more general-purpose solution that, IMO, works better is WebCron Site Backup. It incrementally backs up files and MySQL databases and works anywhere PHP is available. It also sends an e-mail containing a list of changes since the last backup, which can be used to detect unauthorized entry into the system.
