We covered the classic DLS88 split brain impossibility result against a Byzantine adversary in a previous post:
In a follow up post, we discussed how CJKR12 strengthen this result by observing that it holds even if the adversary is weaker in that it must faithfully transfer messages. In this post we strengthen this result yet again by observing that it holds even against a much weaker rollback adversary suggested by Matetic, Kostiainen, Dhar, Sommer, Gervais, Juels, Capkun 2017:
DLS88: (modern version) It is impossible to solve Agreement under partial synchrony against a rollback adversary if $f \geq n/3$.
The rollback adversary is a model that tries to capture the power of the adversary given a secure enclave (trusted execution environment) that does not have any local trusted state.
The Rollback Adversary
Recall that an omission adversary can block any message sent to or from a corrupted party, but it must run the correct protocol, from its correct beginning state, in an honest manner even on a corrupted party. In contrast, a Byzantine adversary can run any protocol on a corrupted party. A Rollback adversary has slightly more power than an omission adversary:
- The adversary can block any message sent to or from a corrupted party.
- All parties start from a correct initial state and run the honest protocol
- On any corrupted party, the adversary can choose to:
- Either make one more step forward in executing the honest protocol;
- Or rollback the execution to its correct initial state
The Rollback adversary models an adversary that cannot corrupt the execution, but since the execution is stateless, it can rollback the execution and feed it different interactions and inputs. See Matetic et al 2017 for more discussion on this adversary. This model tries to capture the (limited) power of an adversary that can control everything inside a node except for a secure enclave (trusted execution environment) under the assumption that the secure enclave (or TEE) does not have any access to trusted storage (like a trusted monotonic counter).
Observe that to execute the lower bound of DLS88, all the adversary needs to do is to run two “split brains”. This can be done if the adversary is allowed to run two instances in parallel, or as in the case of the rollback adversary, run one instance and then rollback to run the other.
For the $n=3, f=1$ case, assume the adversary is $B$ and the honest parties are $A,C$ and as usual, communication between $A$ and $C$ will be delayed. Here is the relevant part of the proof in our blog post on DLS:
The adversary will use its Byzantine power to corrupt $B$ to perform a split-brain attack and make $A$ and $C$ each believe that they are in their respective worlds. $B$ will equivocate and act as if its starting value is 1 when communicating with $A$ and as if its 0 when communicating with $C$. If the adversary delays messages between $A$ and $C$ for longer than the time it takes for $A$ and $C$ to decide in their respective worlds, then by an indistinguishability argument, $A$ will commit to 1 and $C$ will commit to 0 (recall the time to decide cannot depend on GST or $\Delta$). This violates the agreement property.
A rollback adversary that has two inputs can do the following:
- Start $B$ in its initial correct state;
- Give $B$ input 1, and communicate only with $A$”, using its power of omission failures to block messages to $C$;
- Wait for $B$ to terminate, and then rollback $B$ to its initial correct state;
- Give $B$ input 1, and communicate only with $C”, using its power of omission failures to block messages to $A$;
We also need to assume the adversary has access to two different inputs, in this case “0” and “1” (and that the input is not part of its “sealed” correct beginning state). In the state machine replication setting, this implies having more than one client or having one client that has more than one choice of input value.
This concludes the observation that the main “split brain” world can be conducted by a rollback adversary. The remainder of the proof follows exactly as in our blog post about the DLS88 lower bound showing that agreement is impossible in partial synchrony.
Extension to Broadcast
As in this post, the lower bound also holds for Reliable Broadcast, in fact, it even holds for Provable Broadcast. In the state machine replication setting, this implies that the lower bound holds if the client can be malicious and may have (at least) two different values to output. In particular, it holds for write once objects in this setting.
ROTE: Rollback Protection for Trusted Execution reuqires $3f+1$ to be safe and live
Matetic et al 2017 suggest a system called ROTE that provides rollback protection. ROTE uses $n=f+2u+1$ severs to overcome $f$ malicious servers and to provide liveness when there are $u$ unresponsive servers. With $f$ malicious servers that can also be unresponsive, Rote would need $u=f$ and hence $n=3f+1$ servers to obtain safety and liveness (not circumventing the lower bound). Setting $u$ < $f$ would imply that ROTE is not live if $f$ servers are unresponsive and hence any protocol relying on it would not obtain liveness so would not solve agreement (again not circumventing the lower bound).
A snapshot rollback adversary
The rollback adversary needed for this lower bound could only rollback to the initial correct state. In reality a slightly more powerful adversary may be able to create snapshots and rollback a corrected party to any previous snapshot (not just the initial state). This can be a useful attack in a multi-shot (SMR or ledger) setting.
Many thanks to Andrew Miller and Guy Gueta for insightful discussions.
Your comments on Twitter.