By Michael Lodder
Zero-Knowledge proofs (ZKPs) are an increasingly popular means of enhancing privacy both in identity and cryptocurrency. The initial zero-knowledge proof ideas were first published in the 1980’s which at the time were dismissed as cryptography that wouldn’t go anywhere. Around that time, David Chaum pioneered the idea of enabling zero knowledge electronic payments. Unfortunately, these ideas came with royalties and patents and thus didn’t see much application. Lately, ZKPs have been applied to various cryptocurrencies as a means of both hiding the transacting parties and the amounts exchanged between them. ZKPs can also be used for shrinking the size of the blockchain so validators and observers can more quickly validate transactions.
In the cryptocurrency world, trading and spending works only if transactions cannot be double spent. So, two checks are performed: one, a transaction must be valid; and two, a transaction must not have been spent. Essentially the first check is a set membership test, and the second is a set non-membership test. A membership test is used to determine if an element is present in a set of values. A non-membership test is used to determine the absence of an element in a set of values. The most popular set of values in production are Merkle-Trees since checking if a value is in the set doesn’t require knowledge of the entire tree, only enough of the subset to validate a specific path to a value. Yet, applying ZKPs to a Merkle-Tree is complicated. The reason to use ZKPs is to hide the value and the path being checked to preserve privacy. But in order to do so, custom circuit-based ZKPs have to be designed which consider the various setup parameters:
- Which cryptographic hash function to use?
- What’s the maximum depth?
- Are transactions allowed to be pruned?
- Selection of elliptic curves
- Proving time and computational requirements
- Verification time
- Proof size
- Proving key size
- Set size
zkSNARK is the technique receiving the most attention and research to optimize the landscape. Coda is applying zkSNARKS in a recursive manner to shrink the size of the blockchain for validation. ZCash uses SNARKS to anonymize parties and obscure amounts exchanged by performing multiple proofs: one for signature validation, the other to ensure the transaction is valid. However, SNARKs come with some drawbacks. For one thing, the desired arithmetic circuit needs to be designed so a proper Common Reference String (CRS) can be generated. The CRS is the public parameters of the system [ZCash]. Then, which system should the SNARK use? Spartan, Halo, Hyrax, Libra, Sonic, Plonk, Marlin… Each one has tradeoffs between trusted setups, proof size, bilinear pairings or other elliptic curves, proof generation and generation time. SNARKS are not the only circuit based systems that can be used. Bulletproofs, Bulletproofs+, STARKS, SNARGS… the list goes on but there is no clear winner and all of them require advanced knowledge in cryptography. Fortunately, another cryptographic primitive exists for performing fast set membership (and non-membership) proofs that preserves privacy (in that the element being checked is not revealed), and provides succinct proofs. These are called cryptographic accumulators.
Accumulators allow parties to prove that an element x is in a set S or not regardless of the number of elements in S, while not disclosing which element was checked. Accumulators work by adding a value x to (or removing from) a constant-size accumulator A, and proving that membership witness u, when combined with x, equals accumulator A. Accumulator values are short fixed-length digests (similar to hash functions), but also support short proofs for membership and non-membership checks for any element in the set. In other words, an accumulator represents the entire set of elements by a single value, the size is independent of the number of elements. An accumulator is dynamic if elements can be efficiently added or removed from the set. Otherwise the accumulator is static. A universal accumulator is dynamic and supports both membership and non-membership proofs. Otherwise the accumulator is said to only support membership proofs.
Due to the simpler setup and shorter proofs, these offer an interesting alternative to SNARKs. This is not a new concept to replace the membership test portion of proofs with an accumulator [see OWWB19]. However, OWWB19 proposes using an accumulator inside the SNARK versus dropping SNARKs altogether. They show the accumulator SNARK combination has less expensive cost requirements in space and performance as will be shown in this article as well.
Accumulators fall into the following three categories: Unknown-Order e.g. [Sander97, BP97, BM93, CL02, LLX07, Lip12, BCDLRSY17, BBF18, OWWB19, DGS20], Bilinear Maps [Nguyen05, ATSM09, CKS08, DT08, Thakur19, VB20], and hash-based [CHKO08]. Each one offers tradeoffs between setup parameters, accumulator sizes, witness and proof performance and space. Unknown order accumulators are subdivided into RSA-based and Hyperelliptic curve based (class orders). The remainder of this article focuses on the story about implementing the RSA-based accumulator and how they work.
RSA accumulators afford the most efficient tradeoff between the number of setup parameters, updating witness requirements, proof computation and succinctness. The accumulator value and witness are the size of an RSA modulus. When implemented, the modulus, accumulator value and witnesses are 256 bytes with 32 byte elements, and membership proofs are 800 bytes and only require 90 milliseconds(ms) to compute, uses 2KB of RAM and 30ms to verify. Witness updates can be computed in 5–10ms depending on the underlying system. Non-membership proofs require creating two proofs but the proofs can be computed in parallel which takes the same amount of time but twice the space with 1.6KB.
Contrast this to ZCash’s JubJub SNARK which requires downloading the entire Merkle-Tree which is a couple of gigabytes, downloading the proving key around 32MB, generating a 458KB witness in 1 second using 150MB of ram, generating the proof using 106MB of RAM, taking 3 seconds, and producing a 1.4 KB byte proof [CryptQuest2018]. The advantage of cryptographic accumulators is that proofs are succinct and fast to compute.
To set up an RSA accumulator requires generating two safe primes (p, q) of sufficient size (≥ 1024-bits) to compute the coprime modulus n and creating a generator that is a quadratic residue g, essentially find a random number and compute the modular squaring. g is the empty accumulator value and used for creating non-membership proofs. The generation of (p, q) is commonly done with a trusted dealer but some distributed methods allow it to be done in a decentralized manner [AVD19] (trustless) albeit more complicated. Each element to be accumulated must be a unique prime number. Values can be mapped to prime numbers by a function that is usually either a lookup table or a hashing algorithm and added to the set S. The accumulator value is computed as follows:
Values are accumulated by computing the modular exponentiation of the current accumulator value to the new element. Removals can be computed two ways: with knowledge of p, q and without. The trusted accumulator creator (or dealer) uses p, q to compute Euler’s totient and compute:
Mind your p’s and q’s, because without them, the accumulator has to be recomputed by multiplying all elements in the accumulator with a modular exponentiation, again. A user has a membership witness or non-membership witness (or both) depending on the type of proof that is expected by a relying party. The membership witness u is computed:
The witness must be updated to reflect any changes made to the accumulator otherwise proofs will be invalid. Users can efficiently update their witnesses as elements are added to and deleted from the accumulator. Unfortunately, implementing the non-membership variant was much more complicated than the membership version. The batching paper shows the non-membership witness as
This works okay, proofs compute normally and work out just fine. The caveat happens when attempting to update the witness w. The batching paper refers to another paper LLX07 for this math. At first the math appears simple in section 4.2 where it says to do the following for addition:
To verify the correctness of the update, compute a current non-membership witness without doing an update and compare for equality. However, they don’t match (overturn desk).
To check if the match is working properly, I insert multiple assert statements that check the assumptions made in LLX07. All pass minus the final assumption. After being puzzled by this for a while, I read the LLX07 from the beginning and discover that the non-membership witness differs by a sign, Bg-b. I made this change and non-membership updates work correctly, but now the proofs are broken. The only option is to work through the math again to make sure things check out and look for where it doesn’t. The equation at fault here is the proof that computes the proof of knowledge of exponents (POE, proof that the exponents are known to the prover without revealing them) with the current accumulator A and non-membership value a. Instead of inverting V, the generator g needed to be inverted, moving the modular inverse operation. Here’s a before and after:
The post appeared first on The Coinbase Blog