Tardis-Backup

A Time Machine style backup system.

Tardis is a system for making incremental backups of filesystems, much like Apple's TimeMachine.

Like TimeMachine, Tardis is aimed primarily at "live backups", namely backups taken periodically and available for quick recovery, typically stored on attached or online disks rather than on archival backup media such as tapes.

Tardis runs in a client/server mode, or in a fully local mode.

Quick Start

Installation

Backing up data

Each time you run the tardis program, you will create a new backup set. By default, backup sets are named Monthly-YYYY-MM for the first backup set stored in any month, Weekly-YYYY-WW for the first backup set in any week (if not a monthly set), Daily-YYYY-MM-DD for the first set each day (unless weekly or monthly), and Hourly-YYYY-MM-DD-hh if run hourly. Additional backup sets (i.e., more frequent than hourly), or incomplete backup sets (where the tardis job failed before the backup completed) are named Backup_YYYY-MM-DD-hh:mm:ss. (YYYY = year, MM = month, DD = day of month, WW = week of year, hh = hour, mm = minute, and ss = seconds.) Normally, hourly backups are pruned from the database after 24 hours, daily sets after 30 days, and weekly sets after 6 months; monthly sets are kept forever. All these names and pruning parameters can be adjusted.
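The naming rules above can be sketched in a few lines of Python. This is only an illustration, not the Tardis implementation: the function name choose_set_name, the strftime directives (%U for week of year, %H for hour), and the idea of tracking existing names in a set are all assumptions made for the example.

```python
from datetime import datetime

# Candidate name formats, highest priority first, following the rules above.
# The exact strftime directives are an assumption for this sketch.
FORMATS = [
    "Monthly-%Y-%m",
    "Weekly-%Y-%U",
    "Daily-%Y-%m-%d",
    "Hourly-%Y-%m-%d-%H",
]
FALLBACK = "Backup_%Y-%m-%d-%H:%M:%S"


def choose_set_name(now, existing):
    """Return the first candidate name that is not already in `existing`."""
    for fmt in FORMATS:
        name = now.strftime(fmt)
        if name not in existing:
            return name
    # Every periodic name is taken, so fall back to a timestamped name.
    return now.strftime(FALLBACK)


# Simulate five backups started at the same moment.
existing = set()
names = []
when = datetime(2024, 3, 15, 9, 30, 0)
for _ in range(5):
    name = choose_set_name(when, existing)
    existing.add(name)
    names.append(name)
    print(name)
```

The first run claims the monthly name, the next ones fall through to weekly, daily, and hourly, and anything beyond that gets a Backup_ timestamp.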

Recovering data

There are two ways to access backed up data, depending on how you've done your backup. If you've backed your data up to the client machine, you can access the backup database directly. In this case you will point tools directly at the database. This can also be done if the database is accessible via a network file system, such as NFS or Samba/SMB/CIFS. If the backup drive is not directly accessible, it can be reached via an HTTP-exposed filesystem. Details on setting this up are below.

Tardis File System

The tardis dataset can also be mounted as a filesystem. Typically the command: tardisfs --database DatabasePath /path/to/mountpoint will mount the filesystem under the directory /path/to/mountpoint (mountpoint should be an empty directory). In this scenario, you will see files under mountpoint. At the first level, there will be a directory for each backup set, with the names as above. Within each directory will be a directory tree, with the data as backed up in that backup set.

Local and Remote Backups

The Tardis client (tardis) can be run in either a client/server (remote) or directly against a backup database (local) mode.

In the remote mode, there must be a backup server running, typically on a remote machine. In the local mode, no server is used, and data is backed up to a drive connected to the machine being backed up, either physically or via a network share.

Normally, the mode will be selected by the switches, and various config options. If a server is specified (via --server, the TARDIS_SERVER environment variable, or a configuration file), remote mode will be picked. If a database is specified (via the --database/-D option, the TARDIS_DATABASE environment variable, or a configuration file), local mode will be selected. In some circumstances, you can create a confusing situation, such as having both the TARDIS_SERVER and TARDIS_DATABASE environment variables set. In this case, use either the --local or --remote options to select local or remote mode, respectively.
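The selection rules can be sketched as follows. This is a hypothetical illustration, not the code in tardis: the function select_mode and its parameters are invented for the example, configuration files are left out, and raising an error on the ambiguous case is an assumption (the text only says the situation is confusing).

```python
import os


def select_mode(server=None, database=None, force=None, env=None):
    """Pick 'remote' or 'local' along the lines described above.

    `server` and `database` stand in for --server and --database;
    `force` stands in for --local / --remote ("local" or "remote").
    """
    env = os.environ if env is None else env
    # An explicit --local or --remote always wins.
    if force in ("local", "remote"):
        return force
    server = server or env.get("TARDIS_SERVER")
    database = database or env.get("TARDIS_DATABASE")
    if server and database:
        # Assumption: treat the ambiguous case as an error in this sketch.
        raise ValueError("both server and database set; use --local or --remote")
    if server:
        return "remote"
    if database:
        return "local"
    return None  # behaviour with neither set is not covered here


print(select_mode(server="backup.example.com", env={}))
print(select_mode(env={"TARDIS_DATABASE": "/mnt/backup"}))
```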

Components

Tardis consists of several components:

Tardis is written in Python 3, and only relies on a few non-pure python packages.

Tardis uses a modified version of the python-librsync library, adapted to support the most recent versions of librsync. When/if a correctly functioning version appears on PyPI, we'll use it instead. See https://github.com/smartfile/python-librsync

Tardis also comes with a number of "tools", which are not necessarily fully supported:

Future Releases

Several releases will be coming soon:

Support

Tardis has been done in my spare time, but still represents a significant amount of work. If it helps you, please donate to support its continued development. Thanks...

paypal

If you're using Tardis, I'd love to hear from you. Please, let me know how you like it. eric.koldinger@gmail.com

Installation

Setting up the server is relatively straightforward.

Server Setup

Server Requirements

The server should run on any system running Linux. Fedora, Ubuntu, and Raspbian have all been used successfully.

It does not need to be particularly powerful. A Raspberry Pi Model B has been used, but is a bit underpowered. A Raspberry Pi 2 Model B seems to work quite well, primarily due to the larger memory.

Typically, a faster processor and more memory will lead to shorter backup times, as will faster I/O connections to the disk drives. On a benchmark system, a Raspberry Pi server runs a backup in about 40-50 minutes, a Raspberry Pi 2 reduces that time to under 30 minutes, and a dual core 1.5GHz Celeron (with 4GB of memory and USB 3.0 disk drives) runs the benchmark in 3-5 minutes.

Running the Client

The client should probably be run as root. Basic operation is thus:

tardis [--port <targetPort>] [--server <host>] /path/to/directory-to-backup <more paths here>

If you want encrypted backups, add a password (via the --password, --password-file, or --password-prog options) to enable encryption. Note that a password can be added later, and the database encrypted at that point, but it is a very slow operation.

On your first backup, add the --create flag to initialize the backup set. Your first backup will take quite a while. Subsequent backups will be significantly faster. If the first backup does not complete, it can be continued simply by using the same command line, without the --create option. It will back up any data which wasn't backed up the first time, along with any changed data. This can be repeated any number of times.

Once you have an initial backup in place, put this in your cron job to run daily.

Note on Passwords

There is no mechanism for recovering a lost password. If you lose it, you're done.

Passwords can be changed with the sonic utility.

All client tools take a couple of password options. --password or -P will allow you to specify a password on the command line, or if no password is specified, it will prompt you to enter one. The second option is --password-file or -F, in which case you can specify a path to the file containing the password in plaintext. The path can be either a file path (relative or absolute) on the current system, or a URL of a remote file (file:, http:, https: or ftp:). A third option is --password-prog, after which you can specify a program command line to generate a password. The program should output the password to standard output, and the first line will be read and used as the password.
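As a rough illustration of the --password-prog behaviour described above (run a command, take the first line of its standard output), here is a minimal Python sketch. The helper name password_from_program is hypothetical, and this is not the code Tardis itself uses.

```python
import shlex
import subprocess
import sys


def password_from_program(command_line):
    """Run `command_line` and return the first line it writes to stdout."""
    result = subprocess.run(
        shlex.split(command_line),
        stdout=subprocess.PIPE,
        check=True,   # a failing program raises instead of yielding a password
        text=True,
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


# Demo: a stand-in "password program" that prints two lines.
demo_command = shlex.join(
    [sys.executable, "-c", "print('s3cret'); print('ignored')"]
)
print(password_from_program(demo_command))
```

Only the first line is used, so a helper program can emit extra diagnostics afterwards without affecting the password.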

Tardisfs supports all the same options, with slightly different syntax. All are specified via the -o syntax to fuse mount. -o password=*password* will use password as the password, -o password= will prompt for a password, -o pwfile=*path* will read the password from path (which accepts the same options as --password-file above), and -o pwprog=*program* will run program, same as --password-prog above.

Listing Versions of Files Available

Files can be listed in the tardisfs, or via the lstardis application.

lstardis can list all versions of a file available. See lstardis -h for details.

Comparing versions of files

tardiff can directly compare two versions of a file in the database, or a file in the database and its corresponding version in the filesystem.

See tardiff -h for details.

Recovering Files

Files can be recovered in two different ways: via the regenerate application, and via a tardisfs filesystem.

The filesystem approach is often the easiest method. In this technique, a filesystem is mounted which contains the results of all the backup sets. At the top level, there is a directory for each backup set. Underneath these directories is the full image of the backed up directories in a standard directory tree, as they appeared at the time of the backup. Files can easily be copied out of this tree to their desired locations.

Files can also be recovered via the regenerate application. The regenerate application takes the name of the file to be recovered, and can also be given a date for which to regenerate the file. Dates are given via the --date (-d) option, and can be specified in a large variety of forms. For instance, regenerate -d '3 days ago' filename will regenerate a version from 3 days earlier. Dates can also be specified explicitly in a wide variety of formats, such as "03/15/2014" to specify March 15, 2014.

Regenerate can be used to recover entire directory trees. In general, using regenerate to recover files will be significantly faster than rsync'ing out of tardisfs.

See regenerate -h for details.

At present, the regenerate application does NO permission checking to determine if a user has permission to read a file. Thus, any file in the database set can be accessed by anybody with access to the backup database. If this is a problem in your environment, it is recommended to disable the regenerate application (or at least protect the database with a password that you don't share with all users), and allow access primarily through a tardisfs filesystem controlled by the super-user. See Mounting the Filesystem below.

Utility Functions

The sonic program is useful for manipulating a backup. Sonic provides various functions which don't fit well elsewhere. These include:

These options are available as subcommands, for instance: sonic list <options> Each subcommand takes a different set of options, although many are common.

Environment Variables

| Variable | Description | Default | Users |
| --- | --- | --- | --- |
| TARDIS_DB | Location of the tardis database | /srv/tardis | User Tools |
| TARDIS_PORT | Port to use to connect to the Tardis Daemon | 7420 | Client, Daemon |
| TARDIS_DBNAME | Name of the database file containing tardis information | tardis.db | Daemon, Remote, User Tools |
| TARDIS_SERVER | Name (or IP address) of the tardis server | localhost | Client |
| TARDIS_CLIENT | Name of the backup client. | Current hostname | Client, User Tools |
| TARDIS_DAEMON_CONFIG | Name of the file containing the daemon configuration | /etc/tardis/tardisd.cfg | Daemon |
| TARDIS_EXCLUDES | Name of the file containing patterns to exclude below the current directory. | .tardis-excludes | Client |
| TARDIS_LOCAL_EXCLUDES | Name of the file containing patterns to exclude only in the local directory. | .tardis-local-excludes | Client |
| TARDIS_GLOBAL_EXCLUDES | Name of the file containing patterns to exclude globally | /etc/tardis/excludes | Client |
| TARDIS_SKIP | Name of a file whose presence excludes a current directory (and all directories below) | .tardis-skip | Client |
| TARDIS_PIDFILE | File to indicate that the daemon is running. | /var/run/tardisd.pid | Daemon |
| TARDIS_SCHEMA | File containing the schema for the database. | schema/tardis.sql | Daemon |
| TARDIS_LS_COLORS | Description of colors for lstardis | | lstardis |
| TARDIS_REMOTE_PORT | Port used for the HTTP Remote interface | 7430 | Remote, User Tools |
| TARDIS_REMOTE_CONFIG | Configuration file for tardisremote | /etc/tardis/tardisremote.cfg | Remote |
| TARDIS_REMOTE_PIDFILE | Path to the pidfile for tardisremote daemon. | /var/run/tardisremote.pid | Remote |
| TARDIS_DEFAULTS | Location of a defaults file. | /etc/tardis/system.defaults | All |
| TARDIS_RECENT_SET | Name to use for most recent, complete backup | Current | User tools |
| TARDIS_SEND_CONFIG | Send the running configuration to the server, mainly for debug. | True | Client |
| TARDIS_CONFIG | Configuration file to use | None | Client, User Tools |
| TARDIS_JOB | Job to run, from a configuration file | Tardis | Client, User Tools |
| TARDIS_VALIDATE_CERTS | Validate certificates if using SSL | | Client |

Notes:
* User tools are lstardis, regenerate, tardiff, tardisfs, and sonic.
* Client is the tardis app.
* Daemon is the tardisd app.

System Defaults

The above environment variables can have default values set via the system defaults file. This file is located at /etc/tardis/system.defaults, unless overridden by the TARDIS_DEFAULTS environment variable. The system.defaults file is not installed by default.

The format is a standard .ini file, with variables in the Tardis section, and each variable specified with the names in the table above.
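For illustration, a defaults file in this format can be read with Python's standard configparser module. The sample values and the helper names below are made up for the example, and this is a sketch rather than how Tardis itself loads the file.

```python
import configparser
import os

# Example contents of a system.defaults file (values are illustrative only).
SAMPLE = """\
[Tardis]
TARDIS_SERVER = backup.example.com
TARDIS_PORT = 7420
"""


def defaults_path(env=None):
    """Return the defaults file location, honouring TARDIS_DEFAULTS."""
    env = os.environ if env is None else env
    return env.get("TARDIS_DEFAULTS", "/etc/tardis/system.defaults")


def load_system_defaults(text):
    """Parse .ini text and return the variables in the Tardis section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep variable names upper-case as written
    parser.read_string(text)
    if not parser.has_section("Tardis"):
        return {}
    return dict(parser["Tardis"])


print(defaults_path({}))
print(load_system_defaults(SAMPLE))
```

Note that configparser returns every value as a string, so a port such as 7420 still needs converting to an integer by the caller.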

Configuration

All applications in the Tardis suite can take options from multiple locations. These locations are, in order: the default value, the system default, the configuration file, and the command line. Thus, the system default overrides the built in default, the configuration overrides either of those, and the command line arguments override all of the other options. Note that some options can only be specified on the command line.
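The override order can be modelled with collections.ChainMap from the Python standard library. The option names and the effective_options helper are hypothetical, chosen only to show the layering; Tardis may implement this differently.

```python
from collections import ChainMap


def effective_options(builtin, system_defaults, config_file, command_line):
    """Layer the four sources so that later ones override earlier ones."""
    # Drop unset (None) command-line values so they do not mask lower layers.
    cli = {k: v for k, v in command_line.items() if v is not None}
    # ChainMap looks keys up left to right: highest priority goes first.
    return ChainMap(cli, config_file, system_defaults, builtin)


opts = effective_options(
    builtin={"port": 7420, "server": "localhost", "timeout": 300},
    system_defaults={"server": "backup.example.com"},
    config_file={"timeout": 600},
    command_line={"port": 7421, "server": None},
)
print(dict(opts))
```

Here the port comes from the command line, the timeout from the configuration file, and the server from the system defaults.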

Command line arguments for many tools can be specified in a file, accessed via the @ symbol. For instance tardis @file -list will read file as if it were arguments presented on the command line.
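Python's argparse supports this style of argument file through its fromfile_prefix_chars parameter. The sketch below shows the mechanism with a throwaway parser; whether the Tardis tools use exactly this facility, and the one-argument-per-line file layout (argparse's default), are assumptions here.

```python
import argparse
import os
import tempfile


def build_parser():
    """A stand-in parser; the options are illustrative, not Tardis's own."""
    parser = argparse.ArgumentParser(prog="demo", fromfile_prefix_chars="@")
    parser.add_argument("--server")
    parser.add_argument("--port", type=int)
    parser.add_argument("directories", nargs="*")
    return parser


with tempfile.TemporaryDirectory() as tmp:
    args_file = os.path.join(tmp, "args")
    # By default argparse expects one argument per line in the file.
    with open(args_file, "w", encoding="utf-8") as f:
        f.write("--server\nbackup.example.com\n--port\n7420\n")
    # "@path" is expanded in place, as if its lines had been typed.
    args = build_parser().parse_args(["@" + args_file, "/home"])

print(args.server, args.port, args.directories)
```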

Client Configuration Files

Client tools can read from multiple configuration files. By default, configurations are read from the Tardis section, but this can be overridden by using the --job option.

| Name | Default Value | Environment Var | Definition |
| --- | --- | --- | --- |
| Server | localhost | TARDIS_SERVER | Server to use for backups |
| Port | 7420 | TARDIS_PORT | Port to listen on |
| Client | hostname | TARDIS_CLIENT | Name of the system to backup |
| Database | | TARDIS_DB | Directory for local backups |
| Force | False | | Force the backup, even if another one might still be running. |
| Full | False | | Perform a full backup (no deltas, full files for previous deltas). |
| Timeout | 300 | | Time out (in seconds) for connections. |
| Password | | | Password. Only one of the 3 password configs can be set. |
| PasswordFile | | | File name of a file containing the password |
| PasswordProg | | | Program to prompt for a password. |
| Crypt | True | | Encrypt data in the backup. A Password must be set to enable this. |
| KeyFile | | | File containing the keys. |
| CompressData | none | | Compress data using this algorithm. Choices are none, zlib, bzip, lzma, zstd |
| CompressMin | 4096 | | Minimum size file to compress. |
| NoCompressFile | | TARDIS_NOCOMPRESS | File containing a list of mime type files to not attempt to compress |
| NoCompress | | | Mime types to not compress |
| SendClientConfig | True | TARDIS_SEND_CONFIG | Send the client configuration (arguments) to the server. |
| Local | False | | Perform a local backup. Spawns a server as a child process. |
| CompressMsgs | none | | Compress messages to the server. Choices are none, zlib, zlib-stream, snappy |
| Purge | False | | Purge old content |
| IgnoreCVS | False | | Ignore source code control files (CVS, SVN, RCS, and git) |
| SkipCaches | False | | Skip cachedir directories |
| SendSig | False | | Always send a signature. Only valid for non-encrypted backups. |
| ExcludePatterns | | | Filename patterns to ignore. Glob file format |
| ExcludeFiles | | | File containing patterns to ignore. |
| ExcludeDirs | | | Directories to exclude. |
| GlobalExcludeFileName | | | Path to a global file containing filename patterns to exclude. |
| ExcludeFileName | .tardis-excludes | | Check for this file in each directory, and exclude files which match its patterns in the current directory and all below. |
| LocalExcludeFileName | .tardis-local-excludes | | Same, but only in the current directory. |
| SkipFileName | .tardis-skip | | If this file exists, skip this directory and all below. |
| ExcludeNoAccess | True | | Exclude files to which the user doesn't have access/permission. |
| LogFiles | | | List of files to log to. |
| Verbosity | 0 | | Verbosity level. |
| Stats | False | | Print some stats on the backup when complete. |
| Report | none | | Print a list of all files backed up when complete. Valid values are none, all, and dirs |
| Directories | . | | List of directories to backup. |

Server Configuration File

The server configuration file, usually in /etc/tardis/tardisd.cfg, is in the standard .ini file format. There is a single section, "[Tardis]", containing all the variables. The following configuration variables are defined:

| Name | Default Value | Environment Var | Definition |
| --- | --- | --- | --- |
| Port | 7420 | TARDIS_PORT | Port to listen on |
| BaseDir | /srv/tardis | TARDIS_DB | Directory containing all databases handled by this server |
| DBName | tardis.db | TARDIS_DBNAME | Name of the database containing all metadata |
| Schema | schema/tardis.sql | TARDIS_SCHEMA | Path to the file containing the database schema. |
| LogFile | None | | Filename for logging. stderr if not specified. |
| JournalFile | tardis.journal | | Journal file for logging which files are dependent on others. Stored in the DB directory for each client. |
| Profile | False | | If true, a profile of each session will be generated and printed to stdout |
| AllowNewHosts | False | | If True, any new host can connect and create a backup set. If False, a directory with the hostname that the client will provide must be created prior to the client attempting to perform a backup. |
| RequirePassword | False | | Require all backups to have a password. |
| SaveConfig | True | | Save the client's configuration, if sent. |
| LogExceptions | False | | Log full detail of all exceptions, including call chain. |
| MaxDeltaChain | 5 | | Maximum number of deltas to request before requesting an entire new copy of a file. |
| MaxChangePercent | 50 | | Maximum percentage change in file size allowed before requesting an entire new copy of a file. |
| SaveFull | False | | Always save entire copies of a file in the database. Ignored if the client is sending encrypted data. |
| AllowSchemaUpgrades | False | | Allow the server to automatically upgrade the database schemas |
| Single | False | | Run a single client backup session, and exit. |
| Local | None | | Path to a Unix Domain Socket to use. If specified, overrides the Port value. |
| Verbose | 0 | | Level of verbosity. 0 is silent, 1 gives summaries of each client session, 2 and above get very noisy. |
| Daemon | False | | Run as a daemon process, detaching from the initial process, and running in the background. |
| Umask | 2 (002) | | Mode mask used when creating files in the database. |
| User | None | | Name of the user to run as when run in daemon mode. |
| Group | None | | Name of the group to run as when run in daemon mode. |
| PidFile | None | | Path to the file indicating that a tardis daemon process is running. Must be set if Daemon is true. |
| SSL | False | | Use SSL over the socket. |
| CertFile | None | | Path to the certificate file for SSL communications. Must be set if SSL is true. |
| KeyFile | None | | Path to the key file for SSL communications. Must be set if SSL is true. |
| SkipFileName | .tardis-skip | TARDIS_SKIP | Skip file name to be created in the backup directories. |
| Formats | Monthly-%%Y-%%m, Weekly-%%Y-%%U, Daily-%%Y-%%m-%%d | | Formats of names to use for the different types of variables. A comma and whitespace separated list of formats. Format is of the same type as used by Python's time.strptime() function, although percent signs need to be doubled (i.e. %%Y, not %Y). Each name will be checked in order. |
| Priorities | 40, 20, 10 | | Priority value corresponding to the names in the Formats value. |
| KeepPeriods | 0, 180, 30 | | Number of days to keep for each backup type, corresponding to the names in the Formats value. |
| DBBackups | 5 | | Number of backup iterations of the database to keep. |
| LinkBasis | False | | Create a ".basis" symbolic link file to the basis file when deltas are created. |
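The doubled percent signs in Formats are what Python's configparser requires to escape its own % interpolation, so %%Y in the file is read back as %Y. The sketch below parses the three parallel lists from a sample section; the helper names are hypothetical, and reading a KeepPeriods value of 0 as "keep forever" is an inference from the default pruning rules, not something the table states.

```python
import configparser

# Sample server configuration fragment using the default values from the table.
SAMPLE = """\
[Tardis]
Formats = Monthly-%%Y-%%m, Weekly-%%Y-%%U, Daily-%%Y-%%m-%%d
Priorities = 40, 20, 10
KeepPeriods = 0, 180, 30
"""


def split_list(value):
    """Split a comma and whitespace separated value into its items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_retention(text):
    """Return (format, priority, keep_days) tuples from the Tardis section."""
    parser = configparser.ConfigParser()  # default interpolation turns %% into %
    parser.read_string(text)
    section = parser["Tardis"]
    formats = split_list(section["Formats"])
    priorities = [int(x) for x in split_list(section["Priorities"])]
    keep_days = [int(x) for x in split_list(section["KeepPeriods"])]
    return list(zip(formats, priorities, keep_days))


for fmt, priority, days in load_retention(SAMPLE):
    print(fmt, priority, days)
```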

TardisRemote Configuration File

| Name | Default Value | Environment Var | Definition |
| --- | --- | --- | --- |
| Port | 7420 | TARDIS_REMOTE_PORT | Port to listen on |
| Database | /srv/tardis | TARDIS_DB | Directory containing all databases handled by this server |
| DBName | tardis.db | TARDIS_DBNAME | Name of the database containing all metadata |
| LogFile | None | | Filename for logging. stderr if not specified. |
| LogExceptions | False | | Log full detail of all exceptions, including call chain. |
| Verbose | 0 | | Level of verbosity. 0 is silent, 1 gives summaries of each client session, 2 and above get very noisy. |
| Daemon | False | | Run as a daemon process, detaching from the initial process, and running in the background. |
| User | None | | Name of the user to run as when run in daemon mode. |
| Group | None | | Name of the group to run as when run in daemon mode. |
| PidFile | None | | Path to the file indicating that a tardis daemon process is running. Must be set if Daemon is true. |
| SSL | False | | Use SSL over the socket. |
| CertFile | None | | Path to the certificate file for SSL communications. Must be set if SSL is true. |
| KeyFile | None | | Path to the key file for SSL communications. Must be set if SSL is true. |
| Compress | True | | Allow compression of data across HTTP, if the client accepts it. |
| AllowCache | True | | Allow the client to cache responses to HTTP requests. |
| AllowSchemaUpgrades | False | | Allow the server to automatically upgrade the database schemas |

Mounting the filesystem

The backup sets can be mounted as a filesystem, thus:

tardisfs -o database=/path/to/database [-o host=hostname] [-o password=[your_password]] mountpoint

Here /path/to/database is the path specified in the BaseDir of the server configuration. The host parameter is the name of the host whose backups you wish to mount.

Password should only be set if a password is specified in the backup. If you leave it blank (ie, password=), it will prompt you for a password during mount.

tardisfs options are specified in a format to enable fstab mounting. Each option is specified as -o name=value. For instance, -o database=/nfs/tardis -o client=hostname. Options can be specified in a fstab, such as:

tardisfs#0				/mnt/tardis/ClientName	fuse	user,noauto,default_permissions,allow_other,database=/nfs/tardis/,client=ClientName	0 2

Due to the nature of FUSE filesystems, allowing any user to mount the filesystem can create a potential security hole, as most permissions are ignored. The most effective way to preserve some security is to mount the filesystem as root, with the "-o allow_other -o default_permissions" options specified. This allows all users to access the file system, and enforces standard Unix file permission checking.

Encrypting an Unencrypted Backup

If you've built an unencrypted backup, and wish to add encryption, this can be accomplished, primarily using the encryptDB.py application in the tools directory. Note, this is only semi-debugged. Use at your own risk.

The following steps should be performed:

You can run all the steps at once with the --all option. As with --names, do NOT run this more than once. If it fails, restart the other stages as appropriate.

Release Notes -- Version 1.1.5

1.1.5 changes the way directory hashes are generated. Prior to this, hashes were based on the encrypted file names. This caused a significant performance hit, as the filenames all had to be encrypted, taking up most of the runtime of an incremental backup. 1.1.5 changes this so that the hashes are generated based on the original filenames. This will cause a significant performance increase, but on the first backup after the upgrade, it will cause all directory hashes to be incorrect. New hashes will be generated automatically, but if you wish, you can generate new hashes using the tools/setDirHashes.py script (or with encryptDB.py --dirs). This only impacts encrypted backups. Unencrypted backups will see no difference.

Release Notes -- Version 1.1.2

1.1.2 adds several bug fixes, and a couple of minor features. The most visible feature is the addition of path matching in exclude files. Path matching uses the ** wildcard, and follows the Git wildmatch syntax.

Changes in behavior include:

Important Note -- Version 1.1.0

The 1.1.x releases for Python3 are mildly incompatible with the 1.0.x releases. The incompatibility is in the communications protocol between the tardis client, and the tardisd server. Both need to be at the same release. A 1.0.x client can't backup to a 1.1.x server, nor can a 1.1.x client backup to a 1.0.x server.

However, the remainder of the tools are compatible: lstardis, regenerate, tardisfs, tardiff, sonic, and tardisremote all work across both releases. You can use either the 1.0.x or 1.1.x tools to monitor the same data.

The database schema is also completely compatible. Data entered into a database under the 1.0.x branches can continue to be updated by the 1.1.x branches.

As such, it is recommended that you update both clients and servers to the 1.1.x versions at the same time, to allow ongoing backups to continue.

Important Note -- Version 0.33

New clients must be created with the tardis --create or sonic create options. If a password is required, tardis will prompt for one.

0.33's communications protocol is slightly incompatible with previous versions. Please upgrade both client and server simultaneously.

Important Note -- Version 0.32

Version 0.32 changes the communications protocol in some minor ways, but is incompatible with previous versions. The client, server, and all tools must be upgraded at the same time.

Version 0.32 also changes the login mechanism, from the rather insecure ad-hoc mechanism used previously, to a cryptographically secure version, using the Secure Remote Password (SRP) protocol. As such, the database must be changed to support the new password mechanism. The script tools/setSRP.py will set the database up for the new password mechanism. Note that this MUST be run on the server, and requires the user's password. Usage is:

Because the database schema updates at the same point, you will also need to upgrade the schema. The command for this is:

Note that no password is needed.

If your backup does not use a password, this step should be skipped. It only applies to secure backups.

Important Release Notes

0.31.12 or so changed the database journalling scheme to SQLite's "write ahead logging". This seemed to cause problems. If you are unable to do much with the database, execute the following command at the command line:

Post 0.31.11 changes the directory hashing scheme. It is recommended that you run the tools/setDirHashes.py program (or run encryptDB.py --dirs, but only if your database is encrypted) to reset the hashes to the new scheme. This is not necessary, but without it your next backup job will run MUCH longer than usual. It will self correct after the first backup run.

Logwatch Support

Basic logwatch support is available in the logwatch directory. You have to install these files by hand, no support is in setup.py yet.

MacOS X Support

I'm in the early stages of testing Tardis on MacOS X, but it appears that, for the most part, it works, at least the client.

Note, you need to use Homebrew to install Python and librsync.

Preliminary experiments indicate that it works on Yosemite, and Mojave, and should work on most versions of MacOS X.

tardisfs does not currently work on MacOS, as FUSE isn't natively supported. In theory it should work with MacFUSE, but I haven't tried.

Installation on Mac OS X

In most cases, the MacOS filesystem is case insensitive. Getting Tardis from git can be problematic, as there is a directory Tardis and a file tardis. The latter is not necessary, so the best approach is to download the zip file from github, unzip it manually, and when it asks what to do, replace the file.

I recommend doing all the above in a virtualenv environment until you're certain of the results.

Compression

Current versions of tardis support the following compression algorithms: zlib (gzip) deflate, lzma (xz), bzip2, and zstandard (zstd). The choice of compression algorithm is up to the user. The compression algorithm can be specified as an argument to the --compress-data (-Z) argument. If no compression is specified, zlib is chosen. The compression algorithm can differ between backup runs, and is handled transparently.

In general, zlib is the fastest scheme, but offers the least compression. Zstandard offers slightly better compression, at a slightly slower speed. bzip2 offers even better compression, and lzma has the best, though lzma shows a significant increase in computation time.

In general, zstandard and bzip2 probably give the best tradeoffs between compression, and performance, with zstandard being faster, and bzip2 giving better compression. lzma is probably too slow for general use.

Note that performance of any compression algorithm will be heavily influenced by the data being compressed.
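To get a feel for these tradeoffs on your own data, the three algorithms that ship with Python's standard library can be compared directly. This is a rough benchmarking sketch, not part of Tardis: zstandard is left out because it needs a third-party package on most Python versions, and the sample data is artificial, so the numbers say little about real files.

```python
import bz2
import lzma
import time
import zlib

# (compress, decompress) pairs from the standard library.
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bzip2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def compare(data):
    """Return {name: (compressed_size, seconds)} for each codec."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Sanity check: the data must survive a round trip.
        if decompress(packed) != data:
            raise RuntimeError(f"{name} round trip failed")
        results[name] = (len(packed), elapsed)
    return results


sample = b"The quick brown fox jumps over the lazy dog.\n" * 20000
for name, (size, seconds) in compare(sample).items():
    print(f"{name:6s} {size:8d} bytes  {seconds * 1000:8.2f} ms")
```

Swap sample for the contents of a representative file to see how the ranking changes with the data.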

Tardis will not attempt to compress certain types of files, which it views as already compressed. Compressed file types include standard compressed files (zip, gzip, xz, lzma, bzip2, etc) and some compressed data types (jpeg, mpeg, mp3, mp4, flac, etc). Additional types can be expressed via their mime type in the types.ignore file.

Encryption

Older versions of Tardis supported a single encryption scheme. File data was encrypted with AES-256 in CBC (Cipher Block Chaining) mode, with an HMAC-SHA512 authentication code. Filenames were encrypted using AES-256 in ECB (Electronic Code Book) mode. Separate keys were used for filename and data encryption.

Newer versions of Tardis support multiple encryption schemes.

The original filename scheme was weak because ECB mode allows common sections of data to appear as the same value. Thus, if two file names started with the same block of characters, it was possible that both file names would be similar when encrypted.

Newer versions will use ChaCha20/Poly1305 as the default encryption. On modern Intel x86-64 processors, AES-256-GCM may give better performance, due to the availability of specialized AES instructions (AES-NI). On lower powered processors, ChaCha20/Poly1305 will generally give better performance. Both schemes are believed to be cryptographically secure.

The encryption scheme can be specified when the backup is first created, or when the password is set via sonic. The scheme is specified via the --crypt option. Specifying scheme 0 will create an unencrypted backup, with password authentication.

Protection Model

The Tardis encryption scheme is designed to protect against unauthorized viewing of data, but is not designed to protect you against an attack. Simple checks on file data are made (via the HMAC, GCM, or Poly1305 authentication codes), but damage can easily be done by attacking the database. In addition, some data of minor value (such as inode numbers) are stored in plaintext in the database, allowing possible reconstruction of the shape of a directory tree. If the general structure of your data is known, Tardis may leak some information.

The system is designed so that the server never works with unencrypted data. All encryption and decryption are handled by the client being backed up, or being restored.

Notes on Data Storage

Data is stored in a database directory on the backup server. There is one directory for each client backed up, named after the client.

Within this directory are several files:

There are also up to 256 subdirectories, numbered in hex from 00-ff, containing the data. Within these are a second level of subdirectories, which then contain the actual data. The actual data for each individual file is stored in a file named with the hash value (either the MD5, or the HMAC-MD5 if a password is set) of the contents of the file. The contents hashed are the fully reconstructed contents, not the contents of the stored file itself (which may be a delta).
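A sketch of how such a layout might map a file's contents to a storage path is shown below. The hashing follows the description above (MD5, or HMAC-MD5 when a key is present), but the directory derivation is an assumption: the text does not say how the two levels are chosen, so this example simply uses the first two and next two hex characters of the hash. The function names are hypothetical.

```python
import hashlib
import hmac
import os


def content_hash(data, key=None):
    """Return the MD5 hex digest of `data`, or HMAC-MD5 if `key` is given."""
    if key is None:
        return hashlib.md5(data).hexdigest()
    return hmac.new(key, data, hashlib.md5).hexdigest()


def storage_path(base_dir, digest):
    """Build base/xx/yy/<digest>.

    Assumption: xx and yy are the first and second pairs of hex characters
    of the digest; the real naming of the two levels may differ.
    """
    return os.path.join(base_dir, digest[0:2], digest[2:4], digest)


digest = content_hash(b"hello")
print(storage_path("/srv/tardis/client", digest))
```

Spreading files over hash-derived subdirectories keeps any single directory from growing too large, which is the usual motivation for this kind of layout.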

If the data is unencrypted, it is stored directly in the file, as either the raw data of the file (possibly compressed, if the client so specified) or as an rdiff delta.

If the data is encrypted, the above data is encapsulated in the following format: the first 16 bytes (128 bits) are the initialization vector for the encryption, currently AES-256-CBC. After this comes the data, as above, encrypted. In schemes one and two, which use an explicit HMAC, this data is padded a la PKCS#7, in binary. The last 64 bytes (512 bits) of the file contain an HMAC of the data (including the pad) using HMAC-SHA512.
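The container layout for the schemes with an explicit HMAC can be illustrated by splitting a blob into its three parts and handling PKCS#7 padding. This sketch performs no encryption or HMAC verification, and the function names are hypothetical; the demo blob uses zero bytes as stand-ins for the IV and the HMAC, and unencrypted padded bytes as a stand-in for the ciphertext.

```python
IV_SIZE = 16      # 128-bit initialization vector at the start
HMAC_SIZE = 64    # 512-bit HMAC-SHA512 at the end
BLOCK_SIZE = 16   # AES block size, used for PKCS#7 padding


def split_encrypted(blob):
    """Split a stored blob into (iv, body, mac)."""
    if len(blob) < IV_SIZE + HMAC_SIZE:
        raise ValueError("blob is too short to hold an IV and an HMAC")
    return blob[:IV_SIZE], blob[IV_SIZE:-HMAC_SIZE], blob[-HMAC_SIZE:]


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so the length is a multiple of block_size."""
    pad_len = block_size - len(data) % block_size
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(data, block_size=BLOCK_SIZE):
    """Remove and validate PKCS#7 padding."""
    if not data or len(data) % block_size:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    pad_len = data[-1]
    if not 1 <= pad_len <= block_size or data[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-pad_len]


# Stand-in blob: zeroed IV, padded (but not actually encrypted) body, zeroed HMAC.
blob = bytes(IV_SIZE) + pkcs7_pad(b"payload") + bytes(HMAC_SIZE)
iv, body, mac = split_encrypted(blob)
print(len(iv), len(body), len(mac), pkcs7_unpad(body))
```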

Along with each file xxx, there is a corresponding file xxx.sig, containing the rdiff signature of the file, and a file xxx.meta, which contains information allowing reconstruction of the file (if not its filename) should the database be corrupted.