We continue to explore the marvelous world of consensus in the Asynchronous model. In this post, we present Ben-Or’s classic protocol from 1983. In the next post, we will present a more modern version that is a simplified version from our paper.

In the previous post we defined the problem of Asynchronous Agreement, so without further ado, here is Ben-Or’s protocol for Binary Asynchronous Agreement with $n=2f+1$ parties assuming the adversary can crash $f$ parties:

```
Ben-Or's Protocol for party i
input: v (0 or 1)
r:=1
while true
send <echo1, r, v> to all
wait for f+1 <echo1, r, *>
if all have value w, then send <echo2, r, w> to all
otherwise send <echo2, r, bot> to all
wait for f+1 <echo2, r, *>
if all have the same non-bot value u, then decide(u)
if all have the value bot, then v:= coin()
otherwise, v:=u where u is a non-bot value from <echo2, r, *>
r++
```

Note that this protocol does not terminate when it decides. Adding a *termination gadget* that provides termination in the crash model is quite simple:

```
Termination Gadget for crash failures
If receive <decide u>, then
decide(u)
If decide(u), then
send <decide u> to all
Terminate
```

Why does Ben-Or’s protocol work? The first thing to check is **Weak Validity** (if all are non-faulty and start with $b$ then all decide $b$). Indeed, if all parties have the same input $b$, and no party is faulty, then clearly all parties will send `<echo1, 1, b>`

, hence will see $f+1$ `<echo1, 1, b>`

message after one round, hence will send `<echo2, 1, b>`

, hence will see $f+1$ `<echo2, 1, b>`

message and decide (and terminate from the termination gadget).

The next property to verify is **Agreement** (that no two parties decide on different values). This follows from two simple claims:

*Claim 1*: there cannot be `<echo2, r, 0>`

and `<echo2, r, 1>`

. This follows from quorum intersection, sending echo2 requires $f+1$ echo1 of the same value, but this would mean one party sent both `<echo1, r, 0>`

and `<echo1, r, 1>`

which is not possible.

*Claim 2*: if one party sees $f+1$ messages for `<echo2, r, u>`

then all parties will see at least one message for `<echo2, r, u>`

. This follows since at least one of the $f+1$ parties that sent `<echo2, r, u>`

is non-faulty and again from quorum intersection.

Combing both claims, we can now look at the first round $r$ in which a party decided value $u$, and conclude that any party that reaches the end of round $r$ will have $v=u$. We can now use the argument above for Weak Validity and prove that all non-faulty parties will decide $u$ by round $r+1$.

Now for the hard part. Why does this protocol have **Finite Expected Termination**?

It turns out that this is a non-trivial theorem for an adaptive adversary. It took over 15 years to realize via a rather complicated argument:

**Theorem: [Aguilera, Toueg 1998]**: Ben-Or’s protocol with $n=2f+1$ parties has an Finite Expected Termination of $O(2^{2n})$.

This leaves us in an unsatisfying state. While Ben-Or’s protocol is elegant, the proof for Finite Expected Termination is non-trivial and not easily taught in a standard class (or a short blog post). Is there a simpler proof? Maybe there is a reason for the complexity? Another concrete question is about efficiency: can we improve the $O(2^{2n})$ bound to a more natural $O(2^n)$ bound?

More fundamentally, the question we ask is: how should we argue about the expected termination of Asynchronous protocols with an adaptive adversary? Is there a more general framework that can allow us to decompose the problem?

We will answer this in the next post.

Thoughts and comments on Twitter.