Verifiable Information Dispersal (or VID) has its roots in the work of Michael Rabin, 1989, which introduced the notion of Information Dispersal (ID). Adding verifiability (referred to as binding in this post) to obtain VIDs was done by Garay, Gennaro, Jutla, and Rabin, 1998 (in a work called SSRI). Cachin and Tessaro, 2004 introduced the notion of Asynchronous VID (or AVID). See LDDRVXG21 for the state of the art.
VIDs are important primitives at the intersection of distributed computing and cryptography. Their centrality can be seen through four different perspectives:
- VIDs through the lens of BFT: an important optimization in BFT protocols is to allow proposing larger messages without increasing the communication complexity. This line of work has its roots in TC84 and was extended to use error correction codes in FH06 and LV11. VIDs (and their variations) are a key building block that allows state-of-the-art BFT protocols to do this efficiently (for example, see BLLN21, Dumbo-MVBA, and DispersedLedger).
- VIDs through the lens of Secure Multi-Party Computation (MPC): VIDs can be seen as Verifiable Secret Sharing (VSS) without the hiding property. VIDs and VSSs share the same binding properties and the same termination properties. Many of the variations of VSS (for example, packed VSS and publicly verifiable VSS) can be cast as VID variations and vice versa.
- VIDs through the lens of traditional distributed computing: VIDs can be viewed as the extension from crash failures to Byzantine failures (and the corresponding transition from erasure codes to error correction codes) of the fundamental distributed computing primitive of Single Writer Multi Reader objects. These objects go back to the foundational ABD protocol for sharing memory robustly by Attiya, Bar-Noy, and Dolev, 1990. The extension of ABD to use erasure coding (see Cadambe, Lynch, Medard, and Musial, 2014) is used in all major cloud storage solutions (see here).
- VIDs through the lens of blockchain scalability: the recent surge of interest in VIDs is related to the realization that they are a key building block in obtaining blockchain scalability (see this post). The influential work of Al-Bassam, Sonnino, and Buterin led to the creation of modular data availability networks and is a key part of Ethereum's danksharding design. The use of VIDs is an important part of many other Data Availability solutions, like EigenDA and Espresso DA. Just as SNARKs are vital for $O(1)$ scaling of blockchain execution overhead, VIDs are vital for $O(1)$ scaling of blockchain storage overhead.
It's fascinating that different communities can see the same object through different lenses, but underneath it all, it's the same object and the work is very much connected!
What is a VID?
A VID is a pair of protocols: dispersal and retrieval. In the dispersal protocol, a designated sender commits to a message $M$ and disperses parts of it to the whole network. In the retrieval protocol, parties can now cooperate to all retrieve the committed message.
This two-protocol pattern mirrors VSS's share and recover operations and a SWMR register's write and read operations.
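To make the two-protocol pattern concrete, here is a minimal interface sketch in Python. It is only illustrative: the names (`Share`, `disperse`, `retrieve`) and types are assumptions of this post's framing, not part of any particular construction. The Cachin and Tessaro scheme below can be read as one concrete instantiation of this shape.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Share:
    """Hypothetical per-party fragment produced by dispersal."""
    index: int       # which party this fragment is addressed to
    fragment: bytes  # the party's piece of the encoded message
    proof: bytes     # evidence that the fragment matches the commitment


class VID(Protocol):
    """Illustrative two-protocol interface: disperse, then retrieve."""

    def disperse(self, message: bytes) -> list[Share]:
        """Run by the designated sender: commit to the message and
        produce one share per party in the network."""
        ...

    def retrieve(self, shares: list[Share]) -> Optional[bytes]:
        """Run cooperatively: given enough valid shares, output the
        committed message, or None (standing in for bottom) if the
        commitment does not encode a well-formed message."""
        ...
```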
A VID has the following key properties (the termination properties can be strengthened in synchrony):
- Binding: at the time the first non-faulty party completes the dispersal protocol, there exists a committed value, $M'$. Slightly more formally, it is sometimes required to prove that there exists an extractor that can generate $M'$ from the views of the honest parties.
- Validity: if the designated sender is non-faulty, then the committed value is its input, $M=M'$.
- Agreement: any two non-faulty parties that output a value in the retrieval protocol output the same value, $M'$ (defined in the binding property).
- Termination of dispersal: if the designated sender is non-faulty, it will complete the dispersal protocol.
- Totality of dispersal: if any non-faulty party completes the dispersal protocol, then all non-faulty parties will complete it.
- Termination of retrieval: if all non-faulty parties start the retrieval protocol after all non-faulty parties terminate the dispersal, then all non-faulty parties will complete it.
For some use cases totality of dispersal is not needed, and termination of retrieval is strengthened so that even if only one honest party terminated the dispersal, all will complete the retrieval. We will call this variation provable but un-total AVID. See the end of this post for why it is crucial for efficient BFT protocols.
Security of VIDs
VIDs vary in the assumptions needed to obtain their properties: the size of the adversary, its adaptiveness, and the cryptographic assumptions.
An important variant is where the retrieval can be run for a particular recipient, often with reduced communication complexity. Additional properties like being additive and/or packable are also sometimes useful.
Complexity measures
Three main measures:
- Round complexity: how many rounds the dispersal and retrieval protocols take. Often the retrieval is a single round. See this post for how rounds are defined in asynchrony.
- Communication complexity: as a function of the number of parties $n$ and the size of the input message $\ell=|M|$. Dispersal costs of $O(\ell + \kappa n^2)$ and retrieval costs of $O(\ell + \kappa n)$, where $\kappa$ is a cryptographic security parameter, are the current state of the art.
- Storage overhead: the ratio between the total storage of all parties and the message size $\ell=|M|$. Obtaining a small constant ratio is important both in theory and in practice.
Lower bounds
Running dispersal and then retrieval is equivalent to reliable broadcast. Therefore:
- The best resilience one can hope for is $t<n/3$ in partial synchrony and $t<n/2$ in synchrony (here the contradiction is on the committed value of the dispersal).
- The Dolev-Reischuk lower bound holds. This means the total message complexity of both protocols together must be $\Omega(n^2)$ messages (in the worst case, or in the average case against a strongly adaptive adversary that can claw back messages).
This implies that getting a total communication of $O(n \ell)$ is only possible when the size of the message $\ell$ is at least $\Omega(n)$. This is again the power of batching.
Understanding the exact round complexity lower bounds remains an interesting open problem.
Storage overhead: Since we want to be resilient to any $t$ malicious parties, storing $\ell$ bits requires that any $n-t$ parties be able to retrieve the message. In particular, the $n-t$ parties with the least storage must jointly hold at least $\ell$ bits while holding at most an $\frac{n-t}{n}$ fraction of the total storage, so the total storage divided by $\ell$ cannot be smaller than $\frac{n}{n-t}$. For example, if 7 parties, 2 of which may crash, want to store 5 bits, then every party needs to store at least one bit, so the storage overhead is at least $7/5=1.4$.
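For concreteness, here is a tiny (purely illustrative) Python check of the bound and the example above:

```python
def storage_overhead_lower_bound(n: int, t: int) -> float:
    # Any n - t parties must jointly hold the full message, so
    # total storage / message size is at least n / (n - t).
    return n / (n - t)


print(storage_overhead_lower_bound(7, 2))  # 1.4, matching the example above
```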
The classic Asynchronous VID of Cachin and Tessaro
This elegant scheme uses four building blocks:
- A cryptographic hash function, which we denote $H(x)$.
- A Merkle tree, used as a succinct vector commitment (a minimal sketch follows this list).
- The fact that a non-trivial polynomial of degree at most $t$ (over a finite field) has at most $t$ roots, and hence the bijective mapping between its $t+1$ coefficients and its evaluations at any $t+1$ distinct points.
- Reliable Broadcast, used to broadcast the root of the Merkle tree.
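Here is a minimal toy sketch of the Merkle-tree vector commitment. The helper names `merkle_commit`, `merkle_prove`, and `merkle_verify` are ours for illustration; a production implementation would, among other things, domain-separate leaves from internal nodes.

```python
import hashlib


def H(data: bytes) -> bytes:
    """The cryptographic hash function H(x) from the text (SHA-256 here)."""
    return hashlib.sha256(data).digest()


def merkle_commit(leaves: list[bytes]) -> tuple[bytes, list[list[bytes]]]:
    """Return (root, levels). The root is the succinct vector commitment.
    This toy version pads the vector to a power of two with empty leaves."""
    level = [H(leaf) for leaf in leaves]
    while len(level) & (len(level) - 1):
        level.append(H(b""))
    levels = [level]
    while len(level) > 1:
        level = [H(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return level[0], levels


def merkle_prove(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Opening proof for position `index`: the sibling hash at each level."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def merkle_verify(root: bytes, index: int, leaf: bytes, proof: list[bytes]) -> bool:
    """Check that `leaf` is the committed value at position `index`."""
    h = H(leaf)
    for sibling in proof:
        h = H(h + sibling) if index % 2 == 0 else H(sibling + h)
        index //= 2
    return h == root
```

Note that the root is a single hash output, so broadcasting the commitment to the whole vector of evaluations costs only one word per message.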
We assume $n=3t+1$ and will be working over a finite field of order larger than $n$. The designated sender has a message $M=m_0,\dots, m_t$ consisting of $t+1$ field elements. Let $P(X)$ be the polynomial $\sum_{0\le i\le t} m_i X^i$.
Dispersal protocol:
- The designated sender computes $P(1), \dots ,P(n)$ and builds a Merkle tree $T$ whose leaves are the values $P(1), \dots, P(n)$ (see the sketch after this list).
- The designated sender sends $root(T)$ via reliable broadcast.
- The designated sender sends each party $i$ the value $P(i)$ and the Merkle proof $proof(T,P(i))$.
- Party $i$, upon receiving a root $r$ via reliable broadcast along with a value $p_i$ and a proof $proof_i$, checks the validity of the proof relative to the root $r$ and the value $p_i$. If the proof is valid, it sends an $ACK$ to all parties.
- Each party that sees $2t+1$ $ACK$ messages sends a $DONE$ message to all.
- Each party that sees $t+1$ $DONE$ messages sends a $DONE$ to all (if it did not do so earlier).
- Each party that sees $2t+1$ $DONE$ messages completes the Dispersal protocol.
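To make the sender's steps and a party's $ACK$ check concrete, here is a toy sketch reusing the Merkle helpers from the earlier sketch. The prime $p$, the 8-byte leaf encoding, and the function names are illustrative assumptions, not part of the original protocol description.

```python
def evaluate(coeffs: list[int], x: int, p: int) -> int:
    """Evaluate P(x) = sum_i m_i * x^i over the field of order p (Horner's rule)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def disperse_sender(message: list[int], n: int, p: int):
    """Compute P(1),...,P(n), commit with a Merkle tree, and produce the
    root (sent via reliable broadcast) plus the per-party messages
    (i, P(i), proof_i)."""
    evals = [evaluate(message, i, p) for i in range(1, n + 1)]
    leaves = [e.to_bytes(8, "big") for e in evals]   # toy field-element encoding
    root, levels = merkle_commit(leaves)             # from the earlier sketch
    per_party = [(i, evals[i - 1], merkle_prove(levels, i - 1))
                 for i in range(1, n + 1)]
    return root, per_party


def should_ack(root: bytes, i: int, p_i: int, proof_i: list[bytes]) -> bool:
    """Party i's check before sending ACK: does (i, p_i) open the broadcast
    root at leaf i?"""
    return merkle_verify(root, i - 1, p_i.to_bytes(8, "big"), proof_i)
```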
For a variant that is provable but un-total, the $ACK$s are sent back to the sender, which forwards $2t+1$ $ACK$s to all. A set of $2t+1$ $ACK$s is a certificate that retrieval will succeed.
Retrieval protocol:
- Each party $i$ that has a valid $p_i$ sends it to all along with its $proof_i$.
- Each party that sees $t+1$ values with valid proofs relative to the root interpolates these values to a polynomial and checks the validity of the Merkle tree. That is, it computes a Merkle tree whose leaves are the evaluations of the polynomial at the points 1 to $n$ and checks that it has the same root as the one broadcast in the dispersal protocol. If the check passes, it outputs the coefficients of the polynomial; otherwise it outputs $\bot$ (a toy sketch follows the note below).
Note that the check above works because a Merkle tree is a deterministic commitment to a vector.
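Here is a matching toy sketch of the retrieval check, again reusing `merkle_verify`, `merkle_commit`, and `evaluate` from the earlier sketches; names and encodings are illustrative. The Lagrange interpolation returns coefficients, matching the bijection noted among the building blocks.

```python
BOT = object()  # stands in for the output bottom


def interpolate(points: list[tuple[int, int]], p: int) -> list[int]:
    """Lagrange interpolation over the field of order p: return the
    coefficients (low degree first) of the unique polynomial of degree
    <= len(points) - 1 passing through `points`."""
    coeffs = [0] * len(points)
    for j, (xj, yj) in enumerate(points):
        basis = [1]   # coefficients of prod_{k != j} (X - x_k), built up below
        denom = 1
        for k, (xk, _) in enumerate(points):
            if k == j:
                continue
            basis = ([(-xk * basis[0]) % p]
                     + [(basis[d - 1] - xk * basis[d]) % p for d in range(1, len(basis))]
                     + [basis[-1]])
            denom = denom * (xj - xk) % p
        scale = yj * pow(denom, p - 2, p) % p   # divide by prod_{k != j} (x_j - x_k)
        for d in range(len(basis)):
            coeffs[d] = (coeffs[d] + scale * basis[d]) % p
    return coeffs


def run_retrieval(root: bytes, received, t: int, n: int, p: int):
    """Toy retrieval: keep the proof-valid points, interpolate t+1 of them,
    re-evaluate at 1..n, recompute the Merkle root, and compare."""
    valid = [(i, y) for (i, y, proof) in received
             if merkle_verify(root, i - 1, y.to_bytes(8, "big"), proof)]
    if len(valid) < t + 1:
        return None          # keep waiting for more shares
    coeffs = interpolate(valid[:t + 1], p)
    evals = [evaluate(coeffs, i, p) for i in range(1, n + 1)]
    recomputed, _ = merkle_commit([e.to_bytes(8, "big") for e in evals])
    return coeffs if recomputed == root else BOT


# Example wiring (n = 3t + 1, prime p > n, e.g. p = 2**61 - 1):
#   root, per_party = disperse_sender(message, n, p)
#   run_retrieval(root, per_party, t, n, p) == message
```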
Proof of binding
Once the first non-faulty party completes the dispersal protocol, it has received $2t+1$ $DONE$ messages. The first non-faulty party to send such a message did so after receiving at most $t$ $DONE$ messages (only faulty parties could have sent a $DONE$ before it), so it must have sent its $DONE$ after receiving $2t+1$ $ACK$ messages. This means that by the time the first non-faulty party completes the protocol, at least $t+1$ non-faulty parties sent $ACK$ messages after receiving points with correct proofs.
Choose some set of $t+1$ non-faulty parties that sent such $ACK$s, and interpolate their points to a polynomial $P$ of degree at most $t$.
Now compute a Merkle tree of the evaluations of $P$ at the points 1 to $n$ and check whether its root equals the root that the designated sender sent via reliable broadcast. If it does, the committed value is $P$: assuming the Merkle tree commitment is binding, the only polynomial that can be interpolated during retrieval is $P$.
Otherwise, the committed value is set to $\bot$: due to the binding of the Merkle tree, during retrieval all parties will see that the committed vector does not encode a degree at most $t$ polynomial and hence output $\bot$.
Note that this operation is an extraction of $P$ (or of $\bot$) from the views of the non-faulty parties at the time the first non-faulty party completes the dispersal protocol.
Proof of validity
If the designated sender is non-faulty and has polynomial $P$ as input, then at least $t+1$ honest parties will get their shares and proofs during dispersal, so during retrieval all parties will be able to reconstruct. Due to the binding properties of the Merkle tree, the adversary cannot provide proofs for points not on $P$, so the committed value is the sender's input, $M=M'$.
Proof of agreement
If the committed value is the coefficients of a polynomial $P$, then the Merkle root is indeed a commitment to the vector $(P(1),\ldots,P(n))$. In that case, no party $j$ will be able to generate a correct opening proof for a point $p_j\neq P(j)$. This means that honest parties will only receive points on the polynomial $P$, and thus interpolate it and output its coefficients.
On the other hand, if the committed value is $\bot$, then there are $t+1$ parties that received openings of the Merkle tree with proofs, and interpolating these points results in some polynomial $Q$ such that the Merkle root is not a commitment to $(Q(1),\ldots,Q(n))$. Now, observe some honest party that completes the retrieval protocol. We would like to show that it outputs $\bot$. Assume by way of contradiction that it does not. This means that it received $t+1$ points with proofs, interpolated them to a polynomial $Q'$, and saw that committing to $(Q'(1),\ldots,Q'(n))$ yields the original Merkle root. However, by the binding of the Merkle tree, it is not possible to open the root to differing values at the same index. This means that $Q(i)=Q'(i)$ for the $t+1$ openings of the root defined above. Since both are polynomials of degree $\leq t$, $Q=Q'$. In total, we found that the root is not a commitment to $Q$ but is a commitment to $Q'=Q$, which is a contradiction. Therefore, any honest party that completes the retrieval protocol outputs $\bot$.
Here we use the binding property of the Merkle tree and the fact that any $t+1$ points on $P$ interpolate to the same polynomial $P$.
Proof of termination of dispersal
If the designated sender is non-faulty, then all non-faulty will eventually send an $ACK$. Hence, eventually all will send a $DONE$ and eventually terminate.
Proof of totality of dispersal
Suppose some non-faulty party terminates after seeing $2t+1$ $DONE$ messages. At least $t+1$ of those were sent by non-faulty parties, which sent them to all, so every non-faulty party will eventually see at least $t+1$ $DONE$ messages and send a $DONE$ itself. Hence all non-faulty parties eventually send a $DONE$, see $2t+1$ $DONE$ messages, and terminate.
Proof of termination of retrieval
Consider the first non-faulty party that sends a $DONE$: it saw $2t+1$ $ACK$ messages, so at least $t+1$ non-faulty parties sent an $ACK$ and hence hold points with correct proofs. So during retrieval, each non-faulty party will hear from at least $t+1$ non-faulty parties with correct proofs. Hence all will terminate.
Complexity Analysis
Round complexity
For a non-faulty designated dealer, the reliable broadcast takes 3 rounds, after which the $ACK$ and $DONE$ take another two rounds.
Note that sending an $ACK$ requires validating the proof, which requires the root, which means seeing the output of the reliable broadcast. One optimization is to wait for 2 rounds to get $n-t$ $ECHO1$ messages. This will save one round but also requires checking that the reliable broadcast arrived before sending a $DONE$ message (a similar round reduction trick is done here).
For a malicious designated dealer, the reliable broadcast may take 4 rounds, after which it may take one round of $ACK$ and two rounds of $DONE$ for a total of 7 rounds.
The retrieval protocol takes a single round.
Message complexity
We will assume each field element is a word (typically $O(\log n)$ bits).
We will also assume that the hash function maps to a word (a more detailed analysis would call this out explicitly as $\kappa$ bits, where $\kappa$ is the required security parameter).
The reliable broadcast takes $O(n^2)$ words since its content is the root, which is a single hash output.
The sender sends $O(n \log n)$ words because each Merkle proof is $O(\log n)$ words long. Each party sends $O(n)$ words of constant-size $ACK$ and $DONE$ messages, for a total of $O(n^2)$ words for the dispersal protocol.
Finally, the retrieval requires each party to send $O(n \log n)$ words because it sends its value and its $O(\log n)$-word Merkle proof to all parties, for a total of $O(n^2 \log n)$ words for the retrieval protocol.
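Putting these pieces together (under the word-size assumptions above), the rough tally is:

$$\underbrace{O(n^2)}_{\text{reliable broadcast of the root}} \;+\; \underbrace{O(n \log n)}_{\text{shares and proofs from the sender}} \;+\; \underbrace{O(n^2)}_{ACK \text{ and } DONE \text{ messages}} \;=\; O(n^2) \text{ words for dispersal},$$

$$\underbrace{n \cdot n \cdot O(\log n)}_{\text{each party sends its value and proof to all}} \;=\; O(n^2 \log n) \text{ words for retrieval}.$$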
Storage overhead
Each party stores $O(\log n)$ words and the message size is $t+1$ words. So the storage ratio is $O(n \log n)/(t+1) = O(\log n)$ (recall that $n=3t+1$).
Analysis of the provable but un-total AVID variant
Note that using succinct signatures, the dispersal now has a message complexity of just $O(n)$.
This allows running $n$ copies of dispersal, then choosing a leader and running a single $O(n^2)$ retrieval for just that one dispersal, for a total message complexity of $O(n^2)$. Many efficient BFT protocols use this pattern.
Acknowledgments
Many thanks to Qiang Tang for insightful comments and feedback.
Your feedback on Twitter.