GenSync-lib

GenSync-lib is the version of GenSync intended for packaging. It provides a library for synchronizing data efficiently across different machines on MacOS and Linux. The library includes implementations of several state-of-the-art protocols that optimize different metrics.

The current version is 2.0.4


<a name="Examples"></a>

Examples

Here is a simple example of GenSync in action. In the compilation commands given for it, replace /opt/local with your MacPorts directory on MacOS, or /usr/local with your libraries directory on Linux.

TryMe.cpp

This program launches two processes, connected by a network socket, and synchronizes the sets they hold.

Compilation

On MacOS, the program can be compiled with

$ g++ -I/opt/local/include -L/opt/local/lib -std=c++11 TryMe.cpp -lgensync -lntl -o tryme
$ ./tryme

On Linux, the program can be compiled with

$ g++ -I/usr/local/include -L/usr/local/lib -std=c++11 TryMe.cpp -lgensync -lntl -o tryme
$ ./tryme

Output

The output from the program shows both hosts with the same sets (note that the order of elements within a set does not matter):

host 1 now has a b c d 
host 2 now has b d c a 

Code

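#include <iostream>
#include <memory>    // make_shared
#include <unistd.h>  // fork
#include <GenSync/Syncs/GenSync.h>

using std::cout;
using std::endl;
using std::make_shared;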

int main() {
  // BUILD the first host
  GenSync host1 = GenSync::Builder().
    setProtocol(GenSync::SyncProtocol::CPISync). // CPISync protocol
    setComm(GenSync::SyncComm::socket).		 // communicate over network sockets
    setMbar(5).					 // required parameter for CPISync
    build();
  
  // BUILD the second host
  GenSync host2 = GenSync::Builder().
    setProtocol(GenSync::SyncProtocol::CPISync).
    setComm(GenSync::SyncComm::socket).
    setMbar(5).
    build();

  // ADD elements to each host
  // ... host 1
  host1.addElem(make_shared<DataObject>('a')); // DataObject containing a character 'a'
  host1.addElem(make_shared<DataObject>('b'));
  host1.addElem(make_shared<DataObject>('c'));

  // ... host 2
  host2.addElem(make_shared<DataObject>('b'));
  host2.addElem(make_shared<DataObject>('d'));

  // FORK into two processes
  if (fork()) {
    // ... PARENT process
    host1.clientSyncBegin(0);              // set up the 0-th synchronizer and connect to a server
    cout << "host 1 now has ";
    for (auto &i: host1.dumpElements())    // print out the elements at host 1
      cout << i << " ";
    cout << endl;
  } else {
    // ... CHILD process
    host2.serverSyncBegin(0);              // set up the 0-th synchronizer and wait for connections
    cout << "host 2 now has ";
    for (auto &i: host2.dumpElements())    // print out the elements at host 2
      cout << i << " ";
    cout << endl;
  }
}

TryMe2.cpp

A more complicated example allows the user to select various synchronization parameters from the command-line.

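#include <climits>   // CHAR_BIT
#include <cstdlib>   // exit
#include <cstring>   // strcmp
#include <iostream>
#include <memory>    // make_shared
#include <string>
#include <GenSync/Syncs/GenSync.h>

using std::cout;
using std::endl;
using std::make_shared;
using std::string;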

int main(int argc, char *argv[]) {
    // a mode and a sync type are both required (argv[2] is read below)
    if(argc<=2 || (strcmp(argv[1], "client")!=0 && strcmp(argv[1], "server")!=0)) {
        cout << "usage: '"
             << argv[0]
             << " client <sync type>' for client mode, '"
             << argv[0]
             << " server <sync type>' for server mode." << endl;
        cout << "run the client in one terminal instance and the server in another." << endl;
        exit(0);
    }

    GenSync::SyncProtocol prot;
    string type = string(argv[2]);

    // no string switch statements :(
    if(type == "CPISync") {
        prot = GenSync::SyncProtocol::CPISync;
    } else if (type == "InterCPISync") {
        prot = GenSync::SyncProtocol::InteractiveCPISync;
    } else if (type == "OneWayCPISync") {
        prot = GenSync::SyncProtocol::OneWayCPISync;
    } else if (type == "FullSync") {
        prot = GenSync::SyncProtocol::FullSync;
    } else if (type == "IBLTSync") {
        prot = GenSync::SyncProtocol::IBLTSync;
    } else if (type == "OneWayIBLTSync") {
        prot = GenSync::SyncProtocol::OneWayIBLTSync;
    } else {
        cout << "invalid sync type!" << endl;
        exit(1);
    }

    const int PORT = 8001; // port on which to connect
    const int ERR = 8; // inverse log of error chance
    const int M_BAR = 1; // max differences between server and client
    const int BITS = CHAR_BIT; // bits per entry
    const int PARTS = 3; // partitions per level for partition-syncs
    const int EXP_ELTS = 4; // expected number of elements per set

    GenSync genSync = GenSync::Builder().
        setProtocol(prot).
        setComm(GenSync::SyncComm::socket).
        setPort(PORT).
        setErr(ERR).
        setMbar(M_BAR).
        setBits((prot == GenSync::SyncProtocol::IBLTSync || prot == GenSync::SyncProtocol::OneWayIBLTSync ? BITS : BITS * CHAR_BIT)).
        setNumPartitions(PARTS).
        setExpNumElems(EXP_ELTS).
        build();

    genSync.addElem(make_shared<DataObject>('a'));
    genSync.addElem(make_shared<DataObject>('b'));
    genSync.addElem(make_shared<DataObject>('c'));

    if(strcmp(argv[1], "client")==0) {
        genSync.addElem(make_shared<DataObject>('d'));

        // the client connects to the waiting server
        cout << "connecting on port " << PORT << "..." << endl;
        genSync.clientSyncBegin(0);
        cout << "sync succeeded." << endl;

    } else {
        genSync.addElem(make_shared<DataObject>('e'));

        // the server waits for a client to connect
        cout << "listening on port " << PORT << "..." << endl;
        genSync.serverSyncBegin(0);
        cout << "sync succeeded." << endl;
    }
}

On MacOS, to compile use:

$ g++ -I/opt/local/include -L/opt/local/lib -std=c++11 TryMe2.cpp -lgensync -lntl -o tryme2

On Linux, to compile use:

$ g++ -I/usr/local/include -L/usr/local/lib -std=c++11 TryMe2.cpp -lgensync -lntl -o tryme2

To run, open two terminals. In the first, issue the command:

$ ./tryme2 server CPISync
listening on port 8001...
sync succeeded.

In the second, issue the command:

$ ./tryme2 client CPISync
connecting on port 8001...
sync succeeded.
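
TryMe2.cpp describes `ERR` as the inverse log of the error chance, and the extended use instructions describe the `setErr` argument as -log_2(prob of error). Read that way, a value e corresponds to an error probability bound of 2^(-e). The sketch below shows the conversion in both directions. `errorProbability` and `errBitsFor` are illustrative helpers, not GenSync functions, and the interpretation itself is an assumption based on those two comments.

```cpp
#include <cassert>
#include <cmath>

// Probability bound implied by an error parameter e, read as -log_2(p):
// p = 2^(-e).
double errorProbability(int errBits) {
    assert(errBits >= 0);
    return std::ldexp(1.0, -errBits);  // computes 1.0 * 2^(-errBits)
}

// Smallest integer parameter e such that 2^(-e) <= p, for 0 < p <= 1.
int errBitsFor(double p) {
    assert(p > 0.0 && p <= 1.0);
    return static_cast<int>(std::ceil(-std::log2(p)));
}
```

Under this reading, `ERR = 8` in TryMe2.cpp corresponds to a bound of 1/256, or roughly 0.39%, and a target error chance of 1% would call for a parameter of 7.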

<a name="UseInstructions"></a>

Extended Use Instructions:

  1. Initialize a GenSync object with the constructor or the Builder helper class on both the client and the server machine

        GenSync::Builder builder = GenSync::Builder().
           setProtocol(GenSync::SyncProtocol::CPISync).  // CPISync, InterCPISync, ProbCPISync, IBLTSync, FullSync, etc.
           setComm(GenSync::SyncComm::socket). // CommSocket or CommString
           setPort(8001).
           setHost("localhost").
        
           // Set the parameters for your specific sync type (not the same for every protocol; see the GenSync Builder Parameters section)
           setBits(64). // Bits per element
           setMbar(256). // Max differences to sync
           setErr(7); // -log_2(prob of error) to allow for errors that might otherwise underflow
           
        GenSync mySyncClientOrServer = builder.build();
    
  2. Add elements to your GenSyncs (If you need to add large elements use the ZZ class from NTL)

    • You may manually create a shared_ptr<DataObject> (Data/DataObject.h), or pass a value of a type compatible with DataObject, in which case a DataObject is created for you automatically and a pointer to it is returned
        mySyncClientOrServer.addElem(myZZ);
        mySyncClientOrServer.addElem(myLong);
        mySyncClientOrServer.addElem(myInt);
        mySyncClientOrServer.addElem(myMultiSet); // Multisets are serialized for use in IBLTSetOfSets. This does NOT add each element in the multiset to your sync.
    
  3. Run serverSyncBegin and clientSyncBegin on the server and client respectively

        mySyncServer.serverSyncBegin(0); //Add the index of the sync you would like to perform
        mySyncClient.clientSyncBegin(0);  //Multiple syncs or communicants may be added to one GenSync
    
  4. Collect relevant statistics

        //The sync index is determined by the order in which you added your syncs
        //If your GenSync has only 1 sync, the index is 0
        mySyncClient.printStats(syncIndex); //Returns a string of formatted stats about your sync
        mySyncClient.mySyncVec[syncIndex]->getName(); //Returns the name and parameters for your sync
        mySyncClient.getXmitBytes(syncIndex); //Returns the number of bytes transmitted by this sync
        mySyncClient.getRecvBytes(syncIndex); //Returns the number of bytes received by this sync
        mySyncClient.getCommTime(syncIndex); //Returns the amount of time in seconds that the sync spent sending and receiving info through a socket
        mySyncClient.getIdleTime(syncIndex); //The amount of time spent waiting for a connection or for a peer to finish computation
        mySyncClient.getCompTime(syncIndex); //The amount of time spent doing computations
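
The counters in step 4 can be combined into derived figures such as total bytes exchanged and effective throughput. The sketch below uses a stand-in struct, `SyncStats`, which is hypothetical and not a GenSync type. In real code its fields would be filled from `getXmitBytes`, `getRecvBytes`, `getCommTime`, `getIdleTime`, and `getCompTime`. Their exact return types are not stated here, so the field types are assumptions, as is the choice of seconds for the idle and computation times.

```cpp
#include <cassert>

// Stand-in for the values reported by one sync (hypothetical container,
// not part of GenSync).
struct SyncStats {
    long   xmitBytes;  // bytes transmitted
    long   recvBytes;  // bytes received
    double commTime;   // seconds spent sending and receiving
    double idleTime;   // time spent waiting on the peer (assumed seconds)
    double compTime;   // time spent computing (assumed seconds)
};

// Total traffic in both directions.
long totalBytes(const SyncStats& s) { return s.xmitBytes + s.recvBytes; }

// Wall-clock style total of the three timers.
double totalTime(const SyncStats& s) {
    return s.commTime + s.idleTime + s.compTime;
}

// Bytes per second over communication time only; 0 if no time was recorded.
double commThroughput(const SyncStats& s) {
    assert(s.commTime >= 0.0);
    return s.commTime > 0.0 ? totalBytes(s) / s.commTime : 0.0;
}
```

For example, a sync that sent 1200 bytes and received 800 bytes in 0.5 seconds of communication time has a throughput of 4000 bytes per second by this measure.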
    
        
    

<a name="BuilderParameters"></a>

GenSync Builder Parameters:

<a name="SyncTypes"></a>

Sync Types:

<a name="References"></a>

Reference:

If you use this software in your works, please cite the following paper (DOI):

Boškov, Novak, Ari Trachtenberg, and David Starobinski. "Gensync: A new framework for benchmarking and optimizing reconciliation of data." IEEE Transactions on Network and Service Management 19.4 (2022): 4408-4423.

or in BibTeX:

@article{bovskov2022gensync,
  title={Gensync: A new framework for benchmarking and optimizing reconciliation of data},
  author={Bo{\v{s}}kov, Novak and Trachtenberg, Ari and Starobinski, David},
  journal={IEEE Transactions on Network and Service Management},
  volume={19},
  number={4},
  pages={4408--4423},
  year={2022},
  publisher={IEEE}
}

Additional literature

The code, explanation, references, API, and a demo can be found on this web page. If you use this work, please cite any relevant papers below.

The main theoretical bases for the approaches in this work are:

Relevant applications and extensions can be found at:

Additional algorithms:

<a name="Contributors"></a>

Contributors:

Elements of the GenSync project code have been worked on, at various points, by:

<a name="Installation/Compilation"></a>

Installation/Compilation:

$ git clone --recurse-submodules git@github.com:nislab/gensync-lib.git
$ cd gensync-lib
$ cmake -B build
$ cmake --build build

Acknowledgments: