Synchronous consensus protocols can tolerate $f < n/2$ Byzantine failures but for $n/3 <f <n/2$ must depend on the maximum network delay $\Delta$ for their safety and progress. So these protocols must set $\Delta$ to be much larger than the actual network delay $\delta \ll \Delta$. The good news is in the multi-shot (blockchain) scenarios, modern synchronous protocols such as Sync HotStuff can essentially pipeline the $\Delta$-dependent delay:
- Under an honest leader Sync HotStuff can commit $k$ blocks in $2\Delta+O(k\delta)$ time.
Protocols such as Sync HotStuff fall under the stable-leader paradigm where a single leader coordinates multiple blocks until a view-change is executed to replace the faulty leader. In this post, we focus on the rotating-leader paradigm where the leader is rotated after every proposal. The stable-leader approach typically provides better performance metrics. The rotating-leader approach provides better fairness and censorship resistance and overall more uniform distribution of work. Our main result shows that:
- Under a sequence of $k$ consecutive honest leaders our protocol can commit $k$ blocks in $2\Delta+O(k\delta)$ time.
In our paper, we provide tight upper and lower bounds:
- A lower bound showing that an optimistically responsive rotating-leader protocol tolerating $f < n/2$ Byzantine faults must have a commit latency for a single slot of at least $2\Delta$ time.
- An optimistically responsive synchronous BFT protocol tolerating $f < n/2$ Byzantine faults in the rotating-leader paradigm with a commit latency of $2\Delta + O(\delta)$.
Combing these two results shows that committing $k$ blocks in $2\Delta+O(k\delta)$ time under a sequence of $k$ consecutive honest leaders is asymptotically optimal (when $\delta$ is negligible).
Background and Related Work
An earlier post discusses the notion of good-case latency to capture commit latency in the stable-leader paradigm. A follow-up post discusses a Byzantine fault tolerant state machine replication (BFT SMR) protocol with a good-case latency of $\Delta + O(\delta)$ latency which is optimal.
Several recent BFT SMR protocols (Dfinity, PiLi, Streamlet, and HotStuff) have been designed in the rotating leader paradigm. However, Synchronous BFT protocols such as PiLi and Streamlet incur large latency ($65\Delta$ and $12\Delta$ respectively) to commit a single decision. Moreover, naively using Sync HotStuff and optimal good-case latency BFT protocols in rotating-leader paradigm would incur at least $3\Delta$ time to commit and change leaders as they incur $\Delta$ and $2\Delta$ time respectively in the view-change phase. In our work, we study the good-case latency in the context of rotating-leader protocols. In a rotating-leader paradigm, the leaders keep changing through a view-change process after each block proposal. Thus, the overall latency of a rotating-leader protocol depends on the combination of good-case latency and the latency of the view-change. In our work, we explore protocols that change leaders and have the new leader propose responsively in $O(\delta)$ time compared to waiting for $\Omega(\Delta)$ delay during view-change. We call such protocols optimistically responsive rotating-leader protocols.
A Lower Bound on the Good-Case Latency of Optimistically Responsive Rotating-leader Protocols
Our lower bound studies the good-case latency of rotating leader protocols where the next leaders propose responsively without waiting for $\Omega(\Delta)$ time after receiving proposals from previous leaders. In particular, the lower bound captures the relationship between the latency of the view-change and the good-case latency to commit a decision. Essentially, it says that if a consensus protocol tolerating $f < n/2$ Byzantine faults allows a new leader to propose responsively in $O(\delta)$ time, the commit latency for a single slot cannot be less than $2\Delta-O(\delta)$. Thus, the sum of latencies has to be at least $2\Delta$.
Informally, our lower bound result says the following: There exists an execution in an optimistically responsive rotating-leader consensus protocol that can tolerate $n/3 \le f < n/2$ Byzantine faults and when all messages between non-faulty replicas arrive instantaneously, where the following two conditions do not hold simultaneously:
- the good-case commit latency is less than $2\Delta-O(\delta)$, and
- honest leaders propose responsively in at most $O(\delta)$ time after receiving a proposal from the previous honest leader.
Rotating Leader Synchronous BFT protocol with Optimal Good-case latency
We give a complete picture and provide a rotating-leader synchronous BFT protocol with optimal good-case commit latency of $2\Delta$ time. Our protocol follows “block-chaining” paradigm where a block extends a previously proposed block called its parent. Our protocol allows a new leader to propose a block responsively as soon as it receives a certificate (i.e., $f+1$ votes assuming $n=2f+1$) for a block proposed by a previous leader. Replicas then commit a decision within $2\Delta$ time of receiving a certificate for the proposed block.
- Enter view. Upon receiving a certificate for a block proposed by previous leader, enter view $v$ and set view-timer to $7\Delta$ and start counting down.
- Propose. Upon entering the view, the current leader proposes its current block proposal $B_k$.
- Vote. On receiving the first valid block proposal from the leader, echo the proposal and multicast a vote for proposed block $B_k$.
- Commit. Upon receiving $f+1$ votes for block $B_k$, echo the certificate. If certificate for $B_k$ received such that view-timer $\ge 2\Delta$ then commit $B_k$ within $2\Delta$ if nothing bad happens.
BFT protocols usually include a certificate ranking rule to rank the certificate. In our protocol, we rank the certificates by their view i.e., certificates in higher view are ranked higher. In general, BFT protocols ensure that a certificate for a block to be committed is unique and sufficient honest replicas have received the highest ranked certificate before they vote in a higher view. Honest replicas do not vote on blocks that do not extend the highest ranked certificate known to them. This ensures safety of committed blocks.
In our protocol, replicas wait for a certificate for proposed block $B_k$, echo the certificate and wait for $2\Delta$ time to check for any “bad events” from the leader before committing $B_k$. This ensures two properties (i) a conflicting block proposal certificate does not exist (the argument is similar to the one presented in Sync HotStuff), and (ii) all honest replicas receive the highest ranked block certificate. In our protocol, each leader proposes a single block in a view. Thus, the certificate for $B_k$ is already the highest ranked. In addition, all honest replicas enter a view within $\Delta$ time of each other. This is due to the synchrony assumption. Since an honest replica commits only when it receives the certificate for $B_k$ such that there is sufficient time in the view (i.e., view-timer $\ge 2\Delta$), all honest replicas will receive a certificate for $B_k$ before they enter a higher view. This ensures no honest replica will vote blocks that do not extend $B_k$.