Awesome
Thinking around integrating Zerocash on Ethereum
Smart contracts
- AddressRegistry: Keeps the public keys of all peers. With these keys peers can encrypt and share secrets with other peers.
- Mixer/MerkleTree contract: Handles deposits of funds, and withdrawals. The Mixer contains a MerkleTree where the commitments are stored, and calls a Verifier to verify the provided proof during a withdrawal.
- Verifier: Contains the verification key and is in charge of verifying proofs provided by recipients during a withdrawal
- Broadcaster (Optional): This contract aims to be the recipient of all "private transactions". After a sender (Alice) has deposited a commitment in the Mixer, she encrypts the secret used to build the commitment with the recipient's public key (available in the KeyStore), and calls the "Broadcast" function of the Broadcaster contract. This "Brodcast" function only emits an event which field is the ciphertext sent by Alice. All nodes of the network listen to the events of the Broadcaster, and try to decrypt all payload of this event to see if they are the recipients of the transaction.
Raw protocol
- Both Alice and Bob generate their "private addresses" which is a tuple in the form
(Addr_pk, Addr_sk)
, whereAddr_pk = (a_pk, e_k)
,Addr_sk = (a_sk, d_k)
, anda_pk
is the address (public) derived froma_sk
(private), ande_k
is a public encryption key, andd_k
is a decryption key (private). Here the private/public key pair(e_k, d_k)
is used to be able to send some private data from one peer to another. - Both Bob and Alice publish the public part of their "private addresses" to the
AddressRegister
contract in order to be able to receive payment from other counterparts. (Note: This step is not absolutely necessary as Bob can send the public part of his address (Addr_pk = (a_pk, e_k)
) to Alice, by using another channel of communication (offchain communications)). - Let Alice and Bob have all their funds on their only Ethereum account, and
v_A
,v_B
respectively denote the amount on Ether that Alice and Bob have respectively.- Alice commits to the value of her balance, and to her "private address" by doing:
- Choose a random ρ (used to derive the serial number during the spending of a coin)
- Compute the commitment:
k <- COMM_(r_i) (a_pk_A || ρ)
, whereCOMM_(r_i)
represents the commitment function using randomnessr_i
, anda_pk_A
represents Alice'sa_pk
. - Compute the commitment
cm <- COMM_(s_i) (v_A || k)
, whereCOMM_(r_i)
represents the commitment function using randmonesss_i
, andv_A
represents all the funds owned by Alice on her account.
- The previous step results in a tuple:
c = (a_pk_A, v_A, ρ, r_i, s_i, cm)
, which Alice can use in order to generate a zk-SNARK proof that the commitmentcm
she wants to append as a leaf to the Merkle tree (held on the Merkle tree/Mixer contract) is a commitment to the value she is paying (thedeposit
function to append a commitment as a leaf of the Merkle tree is payable. This decreases Alice's balance, and increases the one of the smart contract), is the value she commited to in the commitment, and that the commitment has been computed following the "good rules". - Bob does the same: Creates a commitment and generates a proof.
- Alice commits to the value of her balance, and to her "private address" by doing:
- Both Alice and Bob have their tuples and their proof. They now call the
deposit
function of the Merkle tree/Mixer contract in order to append their commitment as leaves of the merkle tree. The merkle tree/mixer contract verifies the proof (either call of a precompiles or another verifier contract), and appends the leaves if the verification is successful.- Note1: Anyone watching the network can see that Alice and Bob just created commitments to the values of their ethereum accounts. Thus, the previous steps didn't provide any level of obfuscation for both users.
- Note2: Alice and Bob need to pay for gas in order to be able to interact with the smart contracts. Knowing this, we make, in the following points, the assumption that both Alice and Bob deposited almost all their Ether into the mixer contract, but kept a minimal amount of their balance in order to pay the gas for the different smart contract functions they want to execute (let's say Alice has 20 eth, she deposited 18 eth to the contract, and kept 2 eth for her to interact with the mixer in the future).
- Note3: Here we assume that Alice and Bob are NOT the only users of the merkle tree/mixer smart contract. We make the assumption that N users are in fact using the mixer (ie: Size of the anonymity set =
N
). As mentioned earlier, every user depositing to the contract has to do so from a funded Ethereum address. Assuming a link between every Ethereum address and the identity of the account owner, we have an anonymity set ofN
known identities.
- Now that Alice and Bob have their funds deposited to the mixer, they can start to transact in a "private" way.
- Let's say Alice deposited 18 eth to the mixer, and now wants to send Bob 2 eth. Then Alice can create
m
commitments tom
values (not necessarily different)v_1, v_2, ..., v_m
, and can generate a proof that∑ (i=1 to m) v_i = 18 (the value of the commitment she appended)
, and that them
commitments she generated are computed following the good rules. Moreover, she'll need to prove that the knows the private inputa_sk
that correspond to thea_pk
used to compute the commitment she "spends". Finally, she can compute the serial number of the commitment she wishes to "spent/split" by deriving it fromρ
. - Once Alice did that, she can append her
m
new commitments, and reveal the serial numbersn
of the one she splitted/spent, without leaking the specific commitment she just spent to anyone. - At that point Alice introduced some obfuscation regarding her funds, as people are not able to know whether she paid
m
personsm
different amounts, or whether she paidp
person (withp < m
) and took back some change, or just whether she split her funds to herself. - Now Alice can use her set of commitments in order to pay Bob or any other user of the network.
- Let's say Alice deposited 18 eth to the mixer, and now wants to send Bob 2 eth. Then Alice can create
- In order for Bob to be able to actually receive a payment from Alice, he needs to have access to the value of the "coin" (the tuple
c = (a_pk_A, v_A, ρ, r_i, s_i, cm)
), in order to see whether Alice commited to the good values (ie: If Alice owed Bob 3 ETH, Bob needs to make sure that Alice really commited to 3 in the commitment she appended to the merkle tree, and that she really used thea_pk
of Bob if the commitment generation). To do so, Alice needs to send this sensitive information to Bob, without anyone else being able to tamper/see it. A solution to this would be to use off-chain communications, and imagine a secure channel between Alice and Bob. Nevertheless, having this data recorded on-chain would be very useful (auditing, easy access to one's "private balance" via backup keys on another machine, ...). In order to do so, we can leverage theevents
emitted by smart contracts, and introduce abroadcast/relay
contract that would hav a single functionbroadcast
, that would take a ciphertext (C = E_(e_pk)(coin information)
) and emit an event containing this cyphertext. As a consequence, users of the mixing service could just implement anevent listener
to automatically receive everyBroadcast
events, and try to decrypt it using their privqte keyd_k
. If the decryption is successful, this meants that they potentially received a payment (potentially because they still need to recompute the commitment). If the decryption fails with their key, this means that they weren't the destination of the payment (and learn nothing about who the real recipient is).- Note: This
broadcast/relay
contract is useful to make the system more usable, as people can directly receive the events and try to decrypt them. This make the system easier to automate and easier to use. Nevertheless, requiring another smart contract to handle the responsability to broadcast cyphertexts is quite annoying as the action to append a commitment representing a payment to another user and the broadcasting of the event are 2 distinct transactions that can end up in different blocks. In order to make this sequence of actions atomic, we can add the ciphertext of the coin' data as an extra argument of the function appending a commitment to the merkle tree, and emit the event at the end of this function. That way both actions happen at the same time, and the recipient gets notified of him receiving a payment as soon as the deposit transaction is mined.
- Note: This
Privacy level of this system
As aformentioned (and as the title indicates), this small write-up aims to integrate the zerocash protocol on Ethereum. Nevertheless, the use of smart contracts introduce quite some changes in the amount of data that is available to an attacker.
The idea here is to convert the network users' balances into some sort of "abstractions".
In fact, as it is today, we know that a transaction where Alice sends 1ETH to Bob is going to trigger a state transition from a state T
- where Alice's balance is X
, and Bob's balance is Y
-, to a state T'
where Alice's balance is going to be X-1
, and Bob's balance is going to be Y+1
.
Thus, an attacker can look at the transaction object and get: sender, receiver, amount
.
Using the design described above, however, the amount of data leaked at each transaction is smaller. In fact, each user's balance becomes "abstracted" by a set of commitments, that are binding, but also hiding (you need to have some secret data to "open" the commitment). As a consequence, every call to the mixer contract to do "private" transfers to a counterpart still leaks the sender (Alice for instance), as the sender still needs to have a funded Ethereum address to be able to pay for the gas of the smart contract call, BUT hides the recipient and the amount. It is trully the ability to Join and Split the "commitments" that makes this design usable on Ethereum as it is today.
Of course, being able to know the identities of the set of users might not be enough in some cases. Nevertheless, this ability to combine commitments as we which means that, we are able to "break" the transaction graph. If we note the transaction graph by G = (V, E)
, where V
is the set of vertices (nodes), and E
is the set of edges (links), we go from a graph G_before
where V
contains all the users of the system, and E
contains the set of transactions that correspond to their interactions; to a graph G_after
, where V
still contains a set of known users, but where E
is the empty set (no interaction/link can be drawn from one user to another - assuming they all use the mixer contract obviously).
While calling the mixer contract and "spending" commitments still generates state transitions (the state transition here is basically encoded by the smart contract function we call, and modifies the merkle tree, as commitments are added, and serial numbers are revealed), but these state transition do not reveal the same amount of information as the state transition triggered by plain Ethereum transactions. (All state transitions not equal in term of information leakage!)
Remarks
- In some scenarios it can be hard to infer the cardinality of the anonymity set. In fact, assuming that we have this
AddressRegistry
contract that keeps the public part of the "private address" of some of the users desiring to transact privately, on-chain. We cannot be sure that the number of entries in this list gives the size of the anonymity set. In fact, Alice and Bob, can decide to do all their communications off-chain (exchange of their private addresses, and exchange of their coin secret). Thus, while Alice still needs to call the deposit function of the mixer from a funded address, Bob is the recipient of a commitment that is in the set of the leaves of the merkle tree, but does not appear on theAddressRegistry
. Thus, only Alice knows that the cardinality of the anonymity set hassize of AddressRegistry + 1
(+ 1 corresponding to Bob) as a lower bound. Other users of the network still havesize of AddressRegistry
as a lower bound of the anonimity set. They are able to refine this lower bound when they see Bob calling the mixer contract function from his public address for the first time. In fact, if Bob has never registered himself to theAddressRegistry
, and has never communicated with the mixer contract in the past; if he successfully reveals a serial number and appends a new commitment, this means that he just did a private payment via the mixer, and thus the other users can adjust their estimate of the lower bound of the cardinality of the anonymity set. - It should also be possible for Carol, who does not care about privacy, and thus who does not want to use to mixer, to deposit on the mixer "on-behalf" of Alice (who cares about her privacy), by creating a commitment to the value she deposits, and use Alice's address from the
AddressRegistry
. That way, Alice, sees her "private balance incremented" (she "controls" another commitment in the set of leaves of the merkle tree) wihtout needing to create it herself from her public Ethereum address. She can just decide to receive a payment either using her Ethereum address, or, "directly in the mixer". - Leakage of this solution: We can distinguish between users using the mixer (users part of the anonymity set, ie: users who publised the public pqrt of their address on the
AddressRegistry
contract), and those who are not (users who have never used theAddressRegistry
contract can be part of the anonimity set, as they can still communicate their address to the sender using a separate communication channel). As a consequence, it is important to be aware about the fact that registering your address to theAddressRegistry
leaks the fact that you are user doing private transactions (which can be a problem in some scenarios!). A way to ensure indistinguishability of all users of the system, would be to "force" every user to register on theAddressRegistry
(which can be possible in a permissioned chain with a fairly small amount of users, but can quickly become untractable in the case of a public setting...). - In order to implement such a solution, we can imagine a multi-threaded piece of software that would run a "listener" in one thread, and another thread dedicated to submit payments. The listener, would listen to the events emitted by the mixer, and try to decrypt them. If the decryption is successful, then the plaintext could be given to a function trying to recompute the commitment from the coin's data (make sure the sender actually commited to the given data), and if this check is succesfull, this coin would be added to a "coinstore" kept locally. In parallel, this piece of software should also be able to emit payments by the user (ie: Compute hashes and commitments to be submitted to the merkle tree of the mixer).
- The real advantage of using an encrypted broadcast in order to "deliver" the coin's data to the recipient, is to "store" this data on-chain. By doing so, we can imagine a scenario where each user of the system keeps a backup of its keys on a cold storage, and is still able to access his funds by "scaning" the chain (looking at the events emitted by the mixer contract by using the bloom filters for instance). We can even imagine a solution where each user would have an "on-chain" backup of his keys, by using Shamir's secret sharing in order to distrbute shares of his secrets to "friends" on the network. These shares could be given back to the user in order to help him recover his secret and scan the blockchain to access his funds again (and repopulate his coinstore).
Disclaimer
This write-up is a modest proposal to try an integrate part of the ZeroCash protocol on Ethereum. Experiments around this topic have been done:
Although, to the best of my knowledge, none of them has done a full integration supporting the JoinSplit feature (that is very important for the privacy level).
Happy to get feedback about this, and happy to discuss it (see if I made some mistakes, see if we can refine this, have new ideas, ...)
Illustrations