Continuing the series on simple ways where randomization can help solve consensus. The model is lock-step (synchrony) with $f<n/2$ omission failures. We know that in the worst case reaching agreement takes at least $f+1$ rounds. Can randomization help reduce the expected number of rounds? In the post, we show a simple randomized consensus algorithm including a simple weak coin protocol.

Adaptive adversaries

As in the previous post for crash failures and in any use of randomness, we need to make sure that the randomness is unpredictable and that the adversary can only adapt to the randomness when it’s too late for it to matter.

Here we show how to build a weak common coin against a weakly adaptive adversary.

A weakly adaptive adversary in the lock-step model needs to decide all its actions in round $j$ after seeing everything that happened in previous rounds but before seeing any random value in round $j$ and any message in round $j$.

One way to think about this model is that the adversary is adaptive but needs one round of delay to take action in order to corrupt parties. This is in contrast to a strongly adaptive adversary which can corrupt parties after seeing what messages they sent and even claw back messages in the same round it corrupts.

A weak common coin for minority omission failures against a weakly adaptive adversary

A weak common coin for round $j$ has the following properties:

Unpredictable: The coin value for round $j$ cannot be predicted before the beginning of round $j$.

$\epsilon$-correct: for any $b \in {0,1}$, with probability at least $\epsilon$, all parties output $b$. In particular, here we will aim for $1/4$-correct.

The following one round protocol for $f<n/2$ omission faults is a $1/4$ correct weak coin:

Each party randomly chooses:
    a rank in [1,...,n^2]
    a bit in [0,1]
    sends (rank,bit) to all parties

Each party that hears n-f (rank,bit) values:
    outputs the bit associated with the highest rank it heard
    (break ties arbitrarily)

Clearly, unpredictability holds because the randomness is chosen in round $j$. What about $\epsilon$-correctness?

With probability at least $1/2$, the maximum rank is obtained by a non-faulty party and is unique. Conditioned on this event, all non-faulty parties will output the coin of this non faulty party. Hence we obtain $1/4$-correctness.

Note that the argument above uses the fact that the adaptive power of the adversary is weak. Otherwise, it could have learned the value of the max rank and adaptively corrupted that party to potentially cause disagreement.

Exercise: show how an adversary that can see all the random coins in round $j$ and then decide how to corrupt parties in round $j$ can cause an execution of $\Omega(f)$ rounds. Can you show $f+1$ rounds?

Next, we build an agreement protocol from this simple $1/4$-correct weak random coin.

Binary agreement for minority omission failures

Each party has an input 0 or 1 and the goal is to output a common value (agreement) that is an input value (validity).

The protocol runs in phases. Each phase consists of 3 rounds. A party that does not hear $n-f$ messages can simply shut itself down (crash) because it knows it has omission failures:

value := input

round 3j-2:
    send <value> to all parties
    wait for n-f or crash
    if all values are b, then value := b
    otherwise value := bot

round 3j-1:
    send <value> to all parties
    wait for n-f or crash
    if some value is b not bot, then value := b
    if all values are b, then output b

round 3f:
    send <rank, bit> to all parties
    wait for n-f or crash
    Let bit be from the highest rank
    if value := bot, then value := bit

Protocol in words: in the first round parties send their value and then either keep their value or switch to $\bot$ if they hear a conflict. In the second round, parties stay with $\bot$ only if they don’t hear any other value and they output $b$ if they hear $n-f$ values of $b$. Finally, in the third round, parties that end with $\bot$ use the weak coin protocol to obtain a new value.

The protocol’s analysis relies on quorum intersection: when $f<n/2$, any two sets of size $n-f$ must intersect by at least $n-2f>0$ elements.

Analysis

Validity: if all parties start with $b$ then all parties will hear $n-f$ values of $b$ in round one and hence all non-faulty will send $b$ in round 2. Hence all non-faulty will output $b$ at the end of round 2.

Weak agreement for first round: By quorum intersection, at the end of round one, it cannot be the case that parties have values 0 and 1.

Agreement: consider the first phase $v^\star$ where a party $I$ outputs $b$ in the second round of phase $v^\star$. Since $I$ saw $n-f$ messages for $b$ and from quorum intersection, any party that reaches the end of round 2 must see at least one value of $b$. Moreover, from the weak agreement for the first round, no party will see $1-b$. Hence all such parties have their value set to $b$ and hence will ignore the coin value. Then in phase $v^\star +1$, all parties start with $b$, so similar to the validity property will output $b$ in the second round of this phase.

Expected Termination: Conditioned on no party outputting a value in phases $<j$, the adversary must choose either to cause all parties to output $\bot$ or to lean into one of the values. Assume the adversary chooses to bind to the value $b$.

We use the fact that the adversary had to choose which value to bind to before it know the value of the coin:

From the weak common coin properties, with probability of at least $1/4$ all parties will see $b$. Hence by the end of phase $j$ all parties have the value $b$. Either because this is the value they had at end of the second round, or because they use the coin value at the end of the third round.

Zooming out

Each phase of the protocol consists of a graded crusader agreement protocol followed by a weak common coin protocol. This approach can be extended to the asynchronous setting as done in this post.

How did randomization help? we used the common coin as a virtual leader. To maintain validity, we only listen to the virtual leader when we are sure no party outputs a value in this round. Finally, for safety, parties that output $b$ are guaranteed that all parties have the value $b$.

Acknowledgments

Many thanks to Sravya Yandamuri and Naama Ben-David for insightful discussions and comments.

Your thoughts on Twitter.