Awesome
Git IPFS Remote Bridge
Git IPFS Remote Bridge is the set of programs written in Python 3 which allow Git user to clone, push, fetch, self-host or release Git repositories over IPFS decentralized data storage system.
Overview
Git IPFS Remote Bridge is written in Python 3. It provides the following programs:
git-ipfs
- the user interface program intended to be invoked by Git via wrapper program withgit ipfs
abbreviated command. This program act like a frontend solution allowing the user to install/remove IPFS remote, maintain the settings dedicated to communication with IPFS node. Also it provides the instrument to prepare a release snapshot from the given tag or commit, and immediately publish it separately on IPFS network.git-remote-ipfs
- the remote helper program. It normally should not be invoked by the end-user directly. Git invokes it maintaining the IPFS node address used as remote URL, which has format like:url = ipfs://<node-id>
. The program addressespush
,fetch
, andlist
commands of Git protocol to IPFS HTTP API of local node (by default) or remote node located somewhere in the network. The Git repository published on IPFS network is an immutable entry, so in the simplest case the program just callsgit remote set-url
to set the new CID as remote URL after pushing the data to IPFS. Otherwise, the IPNS cryptographically-signed entry key could be specified as remote URL. In this case, the program automatically invokes IPFS node API to automatically associate the obtained immutable CID calculated from the pushed repository data, with specified IPNS key.
Algorithms and logic here are partially inspired by Dropbox Git helper project.
Logo and Development Disclaimer
The program was developed under Remotization and Robotization Initiative at the Industrial Automation Lab of Elettra Sincrotrone Trieste by the members of the ICT/Scientific Computing Group. The logo created by @justeenduck
based on the image generated by Kandinsky 2.2. The logo is published under Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0) License.
Intended use cases
- Sharing tagged version-controlled open source code in closed communities of volunteer programmers (like TANGO)
- Sharing version-controlled datasets with publisher signature required
- Sharing mutable datasets with single publisher over untrusted download gateways
- Self-signing of version-controlled datasets with automatic provenance using shared public key
- FAIR-compliant data sharing via shared repositories compatible with Git
- Publishing immutable release records for ordinary Git repositories not intended being accessible from public
Features
- Git repository maintenance. All the basic Git operations for repository maintenance and remote communication are supported.
- Self-hosted Git data.. In case the IPFS node on which the
push
operation is performed, is connected to the global IPFS network, the repository does not require anymore the server with installed Git for being accessible from anywhere. Moreover, the CID associated with the given state of the repository may be pinned being once known, to any IPFS node on the network improving the redundancy and reliability of the stored data. - Backward-compatibility. The pushed repository data available on IPFS network are prepared also for URI-based dumb HTTP cloning. The updated repository would be afterwards shared directly as immutable IPFS entry via HTTP/HTTPS IPFS gateway. The ordinary
git clone
command will work with this repository as usual. - IPNS security and immutability. If the IPNS entry is specified as a remote, the IPFS node will allow associating it with the repository only if the private key associated with this entry, is available on its host machine. Thus, the IPNS updates of any kind, made using the programs, are out-of-the-box protected and cryptographically signed. Only the key owher may release the data using his private key, so there are plenty of opportunities for data provenance.
Releases
The very special feature of the git-ipfs
program is a possibility to order Git to prepare a release snapshot from the given commit or (usually) annotated tag. Executing git ipfs release
command, the program performs the following steps:
- Orders Git via invocation of the
git-archive
program to prepare an archive containing snapshot of the repository state addressed by specified commit ID or tag name. - Upload the created archive to IPFS network node obtaining an immutable entry, path-like IPFS link via prefix (
/release.tar.gz
by default) and associated CID. - If IPNS public key is specified on the CLI, it will be associated with the obtained CID. Thus, the release snapshot will be published under IPNS entry key with all the protection features described above.
Limitations
- The program is not compatible with Git LFS!. In case both Git LFS and Git IPFS Bridge hooks are installed, the LFS marked data will not be stored on IPFS network! Only the internal links will be packaged and stored on IPFS instead of the actual files, the marked data will be still uploaded on LFS server if its address was configured properly before.
- The upload process is single-threaded. Due to HTTP API implementation used, the upload process utilizes only one HTTP connection at the moment. This limitation affects uploading of the large repositories and requires sometimes adjustment of the
Timeout
parameter accordingly. - The program does not follow IPFS directory entry upload specification, maintaining only path-based addressing of the repository objects on the remote side. This was done to stay compatible with old Git clients supporting only
text/plain
-encoded HTTP responses during dumb cloning.
Installation
Debian Packages
Debian packages for the program are now available in the Releases section of the repository.
Ubuntu PPA
The Debian packages are now accessible also using Ubuntu PPA. To install them, add the PPA repository to your system:
sudo su
echo 'deb https://ppa.launchpadcontent.net/twdragon/ipfs/ubuntu <DISTRO> main' >> /etc/apt/sources.list.d/ipfs
echo 'deb-src https://ppa.launchpadcontent.net/twdragon/ipfs/ubuntu <DISTRO> main' >> /etc/apt/sources.list.d/ipfs
apt update
apt install git-ipfs-remote-bridge
exit
replacing DISTRO with your Ubuntu distribution codename (jammy
, noble
, focal
, and bionic
are supported).
On the latest Ubuntu releases (jammy
and later), the PPA could be added using a single command:
sudo add-apt-repository ppa:twdragon/ipfs
sudo apt update
Then just call apt:
sudo apt install git-ipfs-remote-bridge
Building a Package
Building your own Debian package requires debhelper
script set installed on the system. Once it is done, the package can be built using just one command:
debuild
It will build the package and place the file into the uplevel directory. For further information, see Debhelper documentation.
Baremetal/Source Installation
As the program is written in Python and the Python interpreter is indicated in the same PATH
context with Git, it is enough to copy the program executables somewhere to the directory also mentioned in this PATH
context. For example, on most mainstream Linux distributions this could be done as the following:
sudo cp -v git-ipfs /usr/local/bin
sudo cp -v git-remote-ipfs /usr/local/bin
sudo chmod +x -v /usr/local/bin/git-ipfs /usr/local/bin/git-remote-ipfs
Prerequisites
Git IPFS Bridge uses requests
, urllib
, pathlib
, and json
modules. From them only requests
is the third-party module out of the standard library. Please refer to your OS module installation system documentation to learn how to install the required modules and make them available for Python interpreter associated with Git.
To make the programs working, the functioning IPFS node with HTTP API entry point is needed. Please refer to https://docs.ipfs.tech to learn how to install Kubo or another implementation of IPFS node on your system.
Usage
For simplest case, the usage of the program starts from the following command invoked from the rood directory of the Git repository:
git ipfs install <URL>
where <URL>
is an API URL addressing HTTP endpoint of the chosen IPFS node (for the local node likely http://127.0.0.1
). This command will create IPFS infrastructure scripts and register them as Git hooks, then initialize a baremetal repository on IPFS as an entry point for IPFS data serializer. Afterwards, the origin
remote address table will be created on Git and the obtained baremetal repository CID will be associated with the created remote. In case the origin
remote was already exist before invocation of the command.
The repository with installed IPFS infrastructure can be used as any other Git repository. But when the push
operation will be executed, Git will upload all data to IPFS and then tune up the remote links to comply with immutable CID changes. In case it is needed to uninstall IPFS infrastructure, it can be done with command:
git ipfs uninstall
All the previously defined remote settings, include URLs which were existing before installation of the IPFS infrastructure, will be restored.
Cloning
The program supports direct cloning of the repository from IPFS network. It can be done with command:
git ipfs clone <ID> <directory> <URL>
where:
<ID>
is IPFS CID or IPNS entry key to clone the repository from. It will be also used as the remote address.<directory>
is a relative path to the directory in which the downloaded repository data will be placed.<URL>
is an IPFS API URL as it was described above.
Configuration
Configuration of the IPFS bridge is repository-specific. During the default installation process, the default configuration file is always installed. It complies with old INI key-value configuration format. The ordinary config file has the only section called IPFS
. It is always named config
being placed in .git/ipfs
directory. This file contains both the regular setting intended to be adjusted by the git ipfs
command, and advanced settings that should be adjusted manually.
Regular Settings
Parameter | Default value | Description |
---|---|---|
URL | http://127.0.0.1 | Host URL of IPFS API endpoint |
Port | 5001 | API server's TCP port to connect |
VersionPrefix | api/v0 | API URI prefix to stay compatible with future changes of the IPFS HTTP API |
Timeout | 30.0 | HTTP timeout for I/O operatons. Increase this parameter if the repository is big or you see the timeout messages during push operations |
UnpinOld | false | If this flag is set, the bridge will instruct the IPFS node to unpin the previous state of the repository from the IPFS network. Useful for large repositories shared over IPFS only using tags |
Republish | true | Attends to change the addressed immutable CID if an IPNS node is specified as the remote address in the Git repository |
IPNSTTLString | 24h | String respresenting the duration of IPNS key ownership announcing after republishing |
CIDVersion | 1 | CID version used to generate for the obtained immutable entries. Due the CIDs of version 0 are case-sensitive and incompatible with URI specs, it is recommended to use 0 value only if IPNS node is used in the remote |
UserName | Username to tell to HTTP API endpoint in case it is remote and Basic or Digest HTTP authentication is used. This option is commented by default | |
UserPassword | Username to tell to HTTP API endpoint in case it is remote and Basic or Digest HTTP authentication is used. This option is commented by default |
Advanced Settings
Parameter | Default value | Description |
---|---|---|
IPFSChunker | size-65536 | Chunker routine name for being used to generate DHT data blocks for IPFS. Default is a linear chunker routine with 64KB block size |
Commands and online help
The program works with the commands formulated like the following:
git ipfs <command> <mode> [options] <arguments>
Every command has its own online help. The -h
key should be used to view the online help of the given command. In the help message, all available modes and options for the specified command will be described. Note that if the argument/option is not required, its name will be written within the brackets ([]
). For example, for clone
command the online help looks as the following:
git ipfs clone -h
usage: git-ipfs clone [-h] [-t TIMEOUT] [-r [REMOTE_NAME]] [-b [BRANCH]] [-n [USERNAME]] [-p [PASSWORD]] ipfs_id directory api_url [api_port]
positional arguments:
ipfs_id IPFS CID or IPNS peer name to use as remote ID
directory Relative directory to clone the repository in
api_url IPFS node API URL (API must be active to view the remote Git database!), default is http://127.0.0.1
api_port IPFS node API port (will be attached to URL) [5001]
options:
-h, --help show this help message and exit
-t TIMEOUT, --timeout TIMEOUT
Network timeout for API communications, sec (float)
-r [REMOTE_NAME], --remote-name [REMOTE_NAME]
Gives the remote name to make an IPFS remote, default is origin
-b [BRANCH], --branch [BRANCH]
Gives name of the branch to checkout
-n [USERNAME], --username [USERNAME]
HTTP authentication username
-p [PASSWORD], --password [PASSWORD]
HTTP authentication password
Every working mode also has its own online help. If the user specifies -h
key after mentioning the working mode, for example, git ipfs config manage -h
, the online help for this mode will be printed.