RSYNC-OS
A port of rsync written in Go, with object storage support
Introduction
rsync-os is a Golang implementation of Rsync built from scratch. Its goal is to be a modern rsync-compatible tool with features like:
- Supports file storage and object storage
- Uses a subset of the rsync wire protocol, skipping block-based efficient transfer (for both sender and receiver)
- Rsync APIs
- Safe
Usage
Use minio as backend
- Install and run minio, then configure config.toml.
- Build: go build
- Run: ./rsync-os rsync://[USER@]HOST[:PORT]/SRC minio
- For example: ./rsync-os rsync://mirrors.tuna.tsinghua.edu.cn/ubuntu minio
Roadmap
Client
Rsync wire protocol 27:
- Connects to an rsync daemon via rsync://: performs the handshake, fetches the file list, requests and downloads files
- Handles errors
Storage backend
- Minio: supports regular files, supports folder & symlink
- Minio: caches file list in boltdb
- Local
Other
- CLI
Detailed Information
openrsync has really good documentation describing how the rsync algorithm works.
Why do this?
As rsyn puts it, "The rsync C code is quite convoluted, with many interacting options and parameters stored in global variables affecting many different parts of the control flow, including how structures are encoded and decoded." I would like to provide an rsync written in clean and understandable Golang code.
rsync can perform poorly because it generates the file list recursively. Inspired by rsync-huai, rsync-os stores the file list in a database to avoid regenerating it recursively.
Modernized rsync: rsync-os supports both file storage and object storage.
What's the difference between rsync and rsync-os?
rsync-os is the express edition of rsync, with object storage support. It uses a subset of the rsync wire protocol (without the rolling block checksum).
rsync-os and rclone are completely different
rclone does not support the rsync wire protocol, although it is called "rsync for object storage". With rclone you can't transfer files between an rsync server and object storage.
Why don't we need a rolling block checksum for regular files?
The rsync algorithm requires random access to files to do the block exchange, but object storage does not support that.
rsync-os simplifies the rsync algorithm to avoid random reads and writes: since it skips the block exchange, rsync-os doesn't need to scan the file with a rolling checksum.
As a client, when a file has a different size or modified time compared with the remote file, rsync-os just pretends "the file does not exist here", sends a request to download the entire file from the server, and finally replaces the local copy.
As a server, TBC
Handshake
rsync-os supports rsync protocol 27. For now it sends the arguments --server, --sender, -l, -p, -r, -t to the remote rsyncd by default.
The File List
Given the arguments rsync-os sends, the file list should contain each file's path, size, modified time, and mode.
Request the file
rsync-os always saves the file list in its database (the local file list). Rather than checking each stored file, rsync-os compares the file list from the remote server (the remote file list) against the latest local file list. If a file in the remote file list has a different size or modified time in the local file list, or doesn't exist there, rsync-os downloads the whole file (without block exchange). To do that, rsync-os sends an empty block checksum together with the file's index in the remote file list.
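To make the "empty block checksum" request concrete, here is a minimal Go sketch of how such a message might be encoded. It assumes the protocol 27 layout described in openrsync's documentation: a 32-bit little-endian file index followed by four 32-bit integers (block count, block length, checksum length, remainder), all zero when no blocks are offered. The function name `encodeWholeFileRequest` is hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeWholeFileRequest builds a request for the file at the given index in
// the remote file list, with an empty checksum header (assumed layout:
// index, block count, block length, checksum length, remainder).
func encodeWholeFileRequest(index int32) []byte {
	buf := make([]byte, 20) // 5 fields x 4 bytes; checksum fields stay zero
	binary.LittleEndian.PutUint32(buf[0:4], uint32(index))
	return buf
}

func main() {
	req := encodeWholeFileRequest(3)
	fmt.Printf("% x\n", req)
}
```

Because the block count is zero, the sender has nothing to match against and falls back to sending the file's full content.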
Download the file
The rsync server sends the entire file as a stream of bytes.
Multiplex & De-Multiplex
Most rsync transmissions are wrapped in a multiplexing envelope protocol. The C code that multiplexes and de-multiplexes is obscure. rsync-os reimplements this part: it simply does the multiplexing and de-multiplexing in a goroutine.
Limitations
- Does not support block exchange for regular files. If a file was modified, rsync-os downloads the whole file.
- rsync-os can only act as a client/receiver for now.
Reference
- rsync
- openrsync, a BSD-licensed rsync
- rsync-huai, a modified version of rsync by Tsinghua University TUNA Association
- yajsync, a port of rsync written in Java
- rsyn, wire-compatible rsync in Rust
- acrosync-library
- repositoryd, An rsync-compatible RPKI repository daemon.
- https://rsync.samba.org/resources.html
- https://github.com/boundary/wireshark/blob/master/epan/dissectors/packet-rsync.c
- https://tools.ietf.org/html/rfc5781