2011-04-06

TAMING RSYNC

When it comes to keeping directories in sync there are many tools out there, but I feel that good old rsync still does a very good job. Moreover, rsync jobs are easy to automate with standard Unix tools (cron).


For this article we will take a user's home directory and sync its content to another directory.
So /home/qatqat (my home dir, you will obviously use yours) will be mirrored to /storage/qatqat_home_bak

Now the basics:
a quick look at rsync's man page shows:

 rsync [OPTION...] SRC... [USER@]HOST:DEST


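Hmm... cryptic! Well, not really. That is the form for copying to a remote host; for a local copy you simply leave out the [USER@]HOST: part.
Let's apply it to our test scenario.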

execute all in one line
rsync -vraWogp --delete /home/qatqat /storage/qatqat_home_bak


Almost clear. Let's go through the options:

-v verbose, detailed output
-r recursive, go and sync recursively into subdirectories (fundamental for a good backup)
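-a archive, shorthand for -rlptgoD, so the -r, -o, -g and -p given here are redundant (harmless, but not needed)
-W whole files, skips rsync's delta-transfer algorithm: writes more, but compares less (already the default when source and destination are both local paths)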
-o preserve owner
-g preserve group
-p preserve permissions
--delete deletes files in the destination directory that are no longer present in the source directory
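
If you would rather see these options at work before pointing them at your real home, here is a minimal sketch that runs in a throwaway directory. The names demo, src and mirror are made up for the illustration, and the block simply does nothing if rsync is not installed.

```shell
#!/bin/sh
# Scratch-directory demo: nothing under /home or /storage is touched.
demo=$(mktemp -d)
mkdir -p "$demo/src/docs" "$demo/mirror"
echo "keep me" > "$demo/src/docs/a.txt"
echo "delete me later" > "$demo/src/docs/b.txt"

mirror() {
    # -a already implies -r, -o, -g and -p, so it is enough on its own here
    rsync -a --delete "$demo/src" "$demo/mirror"
}

if command -v rsync >/dev/null 2>&1; then
    mirror                      # first run copies both files
    rm "$demo/src/docs/b.txt"   # remove one file from the source
    mirror                      # second run removes it from the mirror too
    find "$demo/mirror" -type f # list what is left in the mirror
fi
```

The only file left in the listing is mirror/src/docs/a.txt. Note the src in that path: when the source has no trailing slash, rsync recreates the directory itself inside the destination.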

Now it all starts to make sense. The command copies /home/qatqat into /storage/qatqat_home_bak, keeping all the files' settings the same (which is good). Because the source has no trailing slash, the directory itself is recreated, so the copy ends up in /storage/qatqat_home_bak/qatqat. If I delete something in /home/qatqat, the next run also removes it from the backup (which is also good). The command prints many lines to the screen, so we want to catch them and save them to a log file.

execute all in one line
rsync -vraWogp --delete /home/qatqat /storage/qatqat_home_bak > /storage/qatqat_backup_results.txt
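
One thing to keep in mind: a plain > only captures standard output, so rsync's error messages still go to the terminal. A minimal sketch of a wrapper that saves both streams, plus a start time and the exit status, could look like this. The function name run_logged is my own invention, not something rsync provides.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Runs COMMAND and writes its stdout and stderr to LOGFILE (overwriting it),
# framed by a start timestamp and the command's exit status.
run_logged() {
    rl_log=$1
    shift
    {
        echo "=== started $(date '+%Y-%m-%d %H:%M:%S') ==="
        "$@"
        rl_status=$?
        echo "=== exit status $rl_status ==="
    } > "$rl_log" 2>&1
    return "$rl_status"
}

# Example call with the paths used in this article (commented out on purpose):
# run_logged /storage/qatqat_backup_results.txt \
#     rsync -vraWogp --delete /home/qatqat /storage/qatqat_home_bak
```

If all you need is the errors in the same file, appending 2>&1 to the original command does the job as well.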


Now we have a (very) detailed log of all the files that have been synchronised.
If you use a browser (I am pretty sure you do), you will notice in qatqat_backup_results.txt that many thousands of lines refer to your browser cache being synchronised too. Personally I don't like that, so here is a way to exclude some directories from the synchronisation process.

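Create a new file:

touch /storage/rsync_excluded_dirs

Add all the directories to exclude to it. Patterns starting with / are anchored at the root of the transfer (here the qatqat directory itself), not at the root of the filesystem:

echo "/qatqat/.mozilla/firefox/" >> /storage/rsync_excluded_dirs
echo "/qatqat/any_other_dir_to_exclude" >> /storage/rsync_excluded_dirs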

Now we tell rsync to read the file when synchronising:

execute all in one line
rsync -vraWogp --delete --exclude-from=/storage/rsync_excluded_dirs /home/qatqat /storage/qatqat_home_bak > /storage/qatqat_backup_results.txt
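
Exclude patterns are easy to get wrong, so it may be worth trying them on a scratch tree first. The sketch below assumes a made-up layout (home/user under a temporary directory) that mimics the one in this article, and it is skipped if rsync is not installed.

```shell
#!/bin/sh
# Scratch-directory demo of --exclude-from; no real data is touched.
demo=$(mktemp -d)
mkdir -p "$demo/home/user/.mozilla/firefox" "$demo/home/user/docs" "$demo/bak"
echo cache > "$demo/home/user/.mozilla/firefox/cache.bin"
echo notes > "$demo/home/user/docs/notes.txt"

# The source below is ".../home/user" without a trailing slash, so the
# transfer root contains the directory "user" and the pattern starts there.
printf '%s\n' '/user/.mozilla/firefox/' > "$demo/excluded_dirs"

if command -v rsync >/dev/null 2>&1; then
    rsync -a --delete --exclude-from="$demo/excluded_dirs" \
        "$demo/home/user" "$demo/bak"
    find "$demo/bak" -type f    # list the files that were copied
fi
```

Only bak/user/docs/notes.txt shows up in the listing; the firefox directory is skipped entirely.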

Pretty nifty.
What's left is to run the synchronisation periodically, so we need to wrap the command in a bash script.
A sample script is available here; edit it as you please.

qatqat_home_backup.sh.gz
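
In case the attachment is not at hand, here is a minimal sketch of what such a wrapper could look like. It is a hypothetical stand-in, not the contents of the file above: the variable names SRC, DEST, EXCLUDES and LOG and the function backup_home are my own, and the defaults are simply the paths used in this article.

```shell
#!/bin/sh
# Hypothetical stand-in for qatqat_home_backup.sh; the real script may differ.
# Defaults are the paths from this article; override them via the environment.
SRC=${SRC:-/home/qatqat}
DEST=${DEST:-/storage/qatqat_home_bak}
EXCLUDES=${EXCLUDES:-/storage/rsync_excluded_dirs}
LOG=${LOG:-/storage/qatqat_backup_results.txt}

backup_home() {
    # Refuse to run if the source is missing, so --delete cannot act on
    # a mistyped path.
    if [ ! -d "$SRC" ]; then
        echo "source directory $SRC not found" >&2
        return 1
    fi
    mkdir -p "$DEST" || return 1
    # Use the exclude file only if it exists; log stdout and stderr together.
    if [ -f "$EXCLUDES" ]; then
        rsync -vraWogp --delete --exclude-from="$EXCLUDES" "$SRC" "$DEST" > "$LOG" 2>&1
    else
        rsync -vraWogp --delete "$SRC" "$DEST" > "$LOG" 2>&1
    fi
}

backup_home || echo "backup failed" >&2
```

Remember to make the script executable (chmod +x) before handing it to cron.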


Now add the relevant line to /etc/crontab (you need to be root for this):

execute all in one line
echo "30 20 * * 1,3,5 root /storage/qatqat_home_backup.sh > /dev/null 2>&1" >> /etc/crontab

The line above runs the script as root every Monday, Wednesday and Friday at 20:30 (8:30 PM). Google CRONTAB SYNTAX if you are not familiar with cron.
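
A small caveat: running that echo twice leaves two identical entries in /etc/crontab, and the backup then runs twice. A minimal sketch of a safer append, using a helper I have named add_line_once (hypothetical, not a standard tool), is shown here on a scratch file.

```shell
#!/bin/sh
# add_line_once FILE LINE
# Appends LINE to FILE unless an identical line is already there.
add_line_once() {
    alo_file=$1
    alo_line=$2
    # -F: literal string, -x: whole line must match, -q: no output
    grep -qxF -- "$alo_line" "$alo_file" 2>/dev/null \
        || printf '%s\n' "$alo_line" >> "$alo_file"
}

cron_line='30 20 * * 1,3,5 root /storage/qatqat_home_backup.sh > /dev/null 2>&1'
scratch=$(mktemp)                         # stand-in for /etc/crontab
add_line_once "$scratch" "$cron_line"
add_line_once "$scratch" "$cron_line"     # second call adds nothing
grep -cxF -- "$cron_line" "$scratch"      # prints 1
```

For the real thing, replace the scratch file with /etc/crontab and run it as root.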

That's it for now.

Comments are welcome,

Ciao
QatQat