Snapsync is a wrapper around rsync, enabling snapshot style backups of directory trees.
snapsync.pl [OPTIONS] /path/to/dir /path/to/backups
Snapsync is a wrapper around rsync, enabling snapshot style backups of
directory trees. Snapsync is used on numerous production systems, working
equally well on small sets of data encompassing only a few files, huge systems
with gigs of files across thousands of directories, and anything in between.
In default mode, given a source dir and a destination dir, snapsync:
1. looks in destdir for backup sets named backup.datex
2. syncs srcdir against backup.datenow using hardlinks where the data
has not changed
3. rotates or deletes sets based on settings in the --rotation argument
The set backup.dateoldest is a full copy of the data, but all other backup sets
are mostly hardlinks, except for any differences between sets. This means that
if your data is size n, your entire set of backup sets is only size n + the
differences between sets.
Snapsync takes numerous arguments that can affect operation drastically.
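As a rough illustration of the default-mode steps above, the following Python
sketch builds a new set by hardlinking unchanged files against the previous set
and copying everything else. It is not snapsync's code: snapsync delegates this
work to its sync engine. The helper names unchanged and snapshot, the
size-and-mtime comparison, and the choice of the last directory in name order
as the previous set are all assumptions made for the example.

```python
import os
import shutil


def unchanged(src_file, old_file):
    """Assumed change test: same size and same whole-second mtime."""
    s, o = os.stat(src_file), os.stat(old_file)
    return s.st_size == o.st_size and int(s.st_mtime) == int(o.st_mtime)


def snapshot(src, dest_root, name):
    """Create dest_root/name from src, hardlinking files that did not change."""
    # Assumption: the newest existing set is the last one in name order.
    previous = sorted(os.listdir(dest_root))
    prev_dir = os.path.join(dest_root, previous[-1]) if previous else None
    new_dir = os.path.join(dest_root, name)

    for root, _dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        os.makedirs(os.path.normpath(os.path.join(new_dir, rel)), exist_ok=True)
        for fname in files:
            src_file = os.path.join(root, fname)
            new_file = os.path.normpath(os.path.join(new_dir, rel, fname))
            old_file = (
                os.path.normpath(os.path.join(prev_dir, rel, fname))
                if prev_dir
                else None
            )
            if old_file and os.path.isfile(old_file) and unchanged(src_file, old_file):
                # Unchanged data: share the inode with the previous set.
                os.link(old_file, new_file)
            else:
                # New or changed data: store a real copy (mtime is preserved).
                shutil.copy2(src_file, new_file)
    return new_dir
```

Calling snapshot() twice against the same destination leaves unchanged files
sharing one inode between the two sets, which is why the total stays close to
size n. Rotation and deletion of old sets are not part of this sketch.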
SYNC ENGINE SELECTION
Snapsync was originally written with rsync in mind, a well-tested and widely
used synchronization tool with a wide array of features. Psync is an
experimental internal algorithm that offers far fewer features than rsync, but
was written to test various ideas about synchronizing file system paths
containing massive numbers of files. Nosync lets snapsync apply its rotation
semantics with no syncing, allowing for the use of other external
copying/syncing tools.
Rsync does comparisons by building lists of files in the source and
destination locations, and appears to transfer data only after recursing the
entire folder structure. Psync builds no comparison lists, recursing the
folder structure comparing differences and transferring data in real time.
This is done with the goal of keeping memory consumption down and doing data
transfer at time of comparison.
Rsync is preferred, and psync is considered experimental. See --sync in the
list of options below.
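To make the difference concrete, here is a minimal Python sketch of the
compare-and-transfer-as-you-go approach described for psync above. It is not
psync itself. The names needs_copy and stream_sync and the size-and-mtime test
are assumptions for illustration, and the sketch ignores symlinks, deletions
and hardlinking.

```python
import os
import shutil


def needs_copy(entry, target):
    """Assumed change test: target missing, or size/whole-second mtime differ."""
    if not os.path.isfile(target):
        return True
    s, t = entry.stat(follow_symlinks=False), os.stat(target)
    return s.st_size != t.st_size or int(s.st_mtime) != int(t.st_mtime)


def stream_sync(src, dst):
    """Walk src one directory at a time, copying each changed file as soon as
    it is seen. Returns the number of files copied."""
    os.makedirs(dst, exist_ok=True)
    copied = 0
    # os.scandir yields entries lazily, so no full file list is ever built.
    with os.scandir(src) as entries:
        for entry in entries:
            target = os.path.join(dst, entry.name)
            if entry.is_dir(follow_symlinks=False):
                copied += stream_sync(entry.path, target)
            elif entry.is_file(follow_symlinks=False) and needs_copy(entry, target):
                shutil.copy2(entry.path, target)  # transfer at time of comparison
                copied += 1
    return copied
```

Memory use in this sketch is bounded by the recursion depth rather than by the
total number of files, which is the trade-off the psync description is after.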
REMOTE SOURCE DIRECTORIES
Snapsync can back up source directories on remote hosts. This is achieved by
specifying a source directory in the standard user@host:/path/to/dir format.
This only works when rsync is the sync engine: snapsync first connects to the
remote host over ssh to determine whether the directory exists, and then
executes rsync using the default remote shell mechanism on your system, which
is usually ssh as well on modern systems. While passwords can be entered at
time of execution, automated runs will require ssh keys.
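The following Python sketch shows one way a user@host:/path/to/dir source could
be told apart from a local path, and what an existence check over ssh might
look like. How snapsync actually parses the argument and which remote command
it runs are not documented here, so the regular expression, the function names
parse_source and remote_check_command, and the use of "test -d" are assumptions.

```python
import re
import shlex

# Assumption: a source is remote when a host part precedes the first colon
# and that part contains no slash (the same rule of thumb rsync applies).
REMOTE_RE = re.compile(r"^(?:(?P<user>[^@/:]+)@)?(?P<host>[^@/:]+):(?P<path>.+)$")


def parse_source(source):
    """Return (user, host, path); user and host are None for local paths."""
    match = REMOTE_RE.match(source)
    if match is None:
        return (None, None, source)
    return (match.group("user"), match.group("host"), match.group("path"))


def remote_check_command(source):
    """Build (but do not run) an ssh command that tests whether the remote
    directory exists. Returns None for local sources."""
    user, host, path = parse_source(source)
    if host is None:
        return None
    target = f"{user}@{host}" if user else host
    # The path is quoted because ssh hands the command to a remote shell.
    return ["ssh", target, "test -d " + shlex.quote(path)]
```

The returned list could be handed to subprocess.run(); a zero exit status
would then mean the directory exists on the remote host.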
GFS ROTATION
Snapsync's concept of rotation levels (--rotation) can be leveraged to
implement a grandfather-father-son rotation scheme to store backups reaching
back for as long as is desired. A setup might look like this:
snapsync.pl [OPTIONS] --rotation=daily:7 (run daily)
snapsync.pl [OPTIONS] --rotation=daily:7,weekly:4 (once per week)
snapsync.pl [OPTIONS] --rotation=daily:7,weekly:4,monthly:3 (once per month)
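The --rotation values used above follow the level:count form described under
--rotation in the option list. The Python sketch below parses such a value and
plans which sets would be promoted or deleted. It reflects one reading of the
documented behavior, not snapsync's implementation: the names parse_rotation
and plan_rotation, the oldest-first list layout, and the exact moment at which
a set is promoted are assumptions.

```python
def parse_rotation(spec):
    """'daily:7,weekly:4' -> [('daily', 7), ('weekly', 4)]"""
    levels = []
    for part in spec.split(","):
        name, _, count = part.partition(":")
        levels.append((name, int(count)))
    return levels


def plan_rotation(sets, levels):
    """sets maps a level name to its set labels, oldest first.
    Returns a list of ('promote', label, next_level) / ('delete', label)."""
    sets = {name: list(sets.get(name, [])) for name, _ in levels}
    actions = []
    for index, (name, limit) in enumerate(levels):
        if limit == 0:
            continue  # 0 disables both deletion and promotion at this level
        while len(sets[name]) > limit:
            oldest = sets[name].pop(0)
            if index + 1 < len(levels):
                # A further level exists: rename into it instead of deleting.
                next_name = levels[index + 1][0]
                sets[next_name].append(oldest)
                actions.append(("promote", oldest, next_name))
            else:
                actions.append(("delete", oldest))
    return actions
```

With eight daily sets and 'daily:7', the plan deletes the oldest daily set;
with 'daily:7,weekly:4' the same set is promoted to the weekly level instead.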
As for scheduling, cron has no direct way to say "the first Saturday of the
month" (when both the day-of-month and day-of-week fields are restricted, a job
runs when either one matches), but one way to achieve that is by chaining to
the date command, for example:
1 0 * * sat [ `date '+\%e'` -le 7 ] && snapsync.pl ... # first saturday
1 0 * * sat [ `date '+\%e'` -gt 7 ] && snapsync.pl ... # any other saturday
Adjust accordingly. Note that levels can be named arbitrarily, be run at
arbitrary intervals, and there can be an arbitrary number of them.
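The date test used in the cron lines above can be checked with a short Python
sketch: a Saturday is the first of its month exactly when its day number is 7
or less. The function names and the mapping from day to rotation argument are
illustrative assumptions based on the example setup, including the choice that
the weekly or monthly run replaces the daily run on Saturdays.

```python
from datetime import date


def is_first_saturday(day):
    """True when day is a Saturday falling on the 1st through 7th."""
    return day.weekday() == 5 and day.day <= 7  # Monday is 0, Saturday is 5


def rotation_for(day):
    """Pick the --rotation value the example GFS setup would use on this day."""
    if is_first_saturday(day):
        return "daily:7,weekly:4,monthly:3"  # once per month
    if day.weekday() == 5:
        return "daily:7,weekly:4"            # any other Saturday
    return "daily:7"                         # every other day
```

For instance, rotation_for(date(2012, 4, 7)) selects the monthly form, since
7 April 2012 was the first Saturday of that month.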
- -d, --dry-run
Do not actually do any operations. Combines well with --verbose to debug
- -e, --exclude
Path to exclude in the source directory when doing sync. This currently
applies when using sync engines rsync:atomic or psync. Behavior is the
same as rsync --delete-excluded: the atom will not appear in the backup set in
the destination directory. Multiple --exclude options are allowed.
Note that when using rsync non-atomically (the default), or when needing to
exclude a file or folder inside an atom when running atomically, one would need
to use rsync's native exclusion mechanisms, exposed by way of passing the
needed --options to snapsync, which are in turn passed to rsync. See man rsync.
- -f, --flock
Path to lockfile to prevent concurrent runs.
- -h, --help
Print brief usage message.
- -i, --include
Path to include in the source directory when doing sync. This currently
applies when using sync engines rsync:atomic or psync. This option indicates
that all non-specified atoms in the source directory will be excluded as in
the exclude option above. See that section for details.
- -l, --log
Path to logfile.
- -m, --mode
Determines the characteristics of the backup set, most visibly, the file
extension to place on the backup set folders in the destination folder. Can
be 'count', 'epoch', 'date', or 'reverse'. Count mode simply counts up,
the most recent set being the highest number. Both epoch and date modes use
the current timestamp as the extension name. Reverse mode is a special
version of count mode, in that sets are shuffled around so that the most
recent backup is always the lowest number (set 0), and the oldest backup is
the highest number set. Default: 'date'
WARNING: Running snapsync against a destination directory containing different
backup set mode types is not recommended and could lead to unpredictable
behavior. However, it may be possible for you to convert your sets to one
mode. See the -z option.
- -o, --options
String of command-line arguments to pass to the sync engine. Note that
snapsync always passes '-a --delete' to rsync, and '--delete' to psync, while
this argument merely specifies additional options. This is controlled by the
$RSYNC_BASE_OPTS and $PSYNC_BASE_OPTS variables at the top of the script.
- -r, --rotation
Defines how backup sets are named, rotated through levels, and purged. For
example, 'daily:7' would create folders named daily.X until the maximum of
seven sets existed, at which time it would purge the oldest set on the next
run. If this were changed to 'daily:7,weekly:4', however, sets would be
promoted (renamed) to weekly.X rather than being deleted, up to a maximum of
four at that second level. Setting any level to 0 turns off both deletion
and promotion at that level, and sets will never expire. Default: 'backup:0'
NOTE: See the section GFS ROTATION for ideas on how this option can be used to
implement a grandfather-father-son rotation scheme.
- -s, --sync
Determines which underlying synchronization engine to use for the backup,
in the form engine:option:option, where engine may be either 'nosync'
(rotation only), 'psync' (internal algorithm), or 'rsync' (the venerable rsync
program). The rsync engine offers these additional options:
atomic - descend one directory deep and do all ops atomically
linkcopy - separate cp -al step instead of using rsync --link-dest
The atomic option is very handy for isolating rsync to one file/directory at a
time, with the additional benefit of timing information for each atom. The
linkcopy option exists for rsync versions older than 2.5.6, which lacked the --link-dest
option. It is still offered because it seems to be quicker in some cases.
Nosync takes no options, only rotating sets and leaving a new empty set.
Psync takes no options, always doing a separate linkcopy step and inherently
operating atomically. Default: 'rsync', no options
- -v, --verbose
Print lots of useful information. For more verbosity, pass --verbose multiple
times.
- -z
Attempt to convert all backup sets in the destination directory to the new mode
given by the --mode argument, then exit.
WARNING: It is highly recommended that you first run in dry-run mode to see
how your backup sets will be affected by a conversion. Snapsync makes a best
guess based on the mtime of the backup sets, which may or may not work for you.
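As an illustration of the engine:option:option form accepted by --sync in the
list above, the Python sketch below splits such a value and rejects
combinations the text rules out (options on nosync or psync, or unknown rsync
options). The function name parse_sync and the error messages are assumptions;
snapsync's own validation may differ.

```python
# Engines and per-engine options as documented for --sync.
ENGINE_OPTIONS = {
    "nosync": set(),
    "psync": set(),
    "rsync": {"atomic", "linkcopy"},
}


def parse_sync(spec="rsync"):
    """'rsync:atomic:linkcopy' -> ('rsync', ('atomic', 'linkcopy'))"""
    engine, *options = spec.split(":")
    if engine not in ENGINE_OPTIONS:
        raise ValueError(f"unknown sync engine: {engine!r}")
    for option in options:
        if option not in ENGINE_OPTIONS[engine]:
            raise ValueError(f"engine {engine!r} does not accept {option!r}")
    return engine, tuple(options)
```

The default value 'rsync' parses to the rsync engine with no options, matching
the documented default.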
- Simple snapsync of a folder:
snapsync.pl /path/to/dir /path/to/backups
- Simple snapsync with all defaults spelled out:
snapsync.pl --rotation=backup:0 --mode=date --sync=rsync /path/to/dir /path/to/backups
- Use psync instead:
snapsync.pl --rotation=backup:0 --mode=date --sync=psync /path/to/dir /path/to/backups
- Use rsync atomically:
snapsync.pl --sync=rsync:atomic /path/to/dir /path/to/backups
- Exclude some atoms:
snapsync.pl --sync=rsync:atomic --exclude /path/to/dir/atom /path/to/dir /path/to/backups
- Include only some atoms:
snapsync.pl --sync=rsync:atomic --include /path/to/dir/atom /path/to/dir /path/to/backups
- A verbose dry-run test:
snapsync.pl --verbose --dry-run /path/to/dir /path/to/backups
- Do a separate cp -al step instead of relying on rsync --link-dest:
snapsync.pl --sync=rsync:linkcopy /path/to/dir /path/to/backups
- And again atomically:
snapsync.pl --sync=rsync:atomic:linkcopy /path/to/dir /path/to/backups
- Against a remote source directory:
snapsync.pl --sync=rsync:atomic:linkcopy user@host:/path/to/dir /path/to/backups
- Convert sets to epoch format:
snapsync.pl -z backup,weekly,mystuff --mode epoch /path/to/backups
snapsync 1.00 (20120418)
- Initial release.