A few years ago if you asked “Can blockchains scale?” most people would give three reasons why, fundamentally, the answer is “No!”
- Data: every transaction needs to be replicated by every miner (validator). So increasing security by adding more validators inherently causes more replication. Having $n$ validators implies that your transaction is replicated a total of $n$ times.
- Execution: every transaction needs to be executed (and validated) by every miner (validator). So again increasing the security by adding more validators inherently causes more execution overhead. Having $n$ validators implies that your transaction is executed a total of $n$ times.
- Consensus: all the miners participate in consensus protocol. So again, increasing the security by adding more mining power (or more validators) inherently causes more work to be done. Consensus protocols, even in the optimistic case, require a linear number of messages (and quadratic in the worst case). Having $n$ validators implies that committing a block of transactions costs $O(n)$ or even $O(n^2)$ in the worst case.
These three barriers seem to indicate a fundamental tradeoff: adding security (by increasing the number of validators) seems to increase the overhead at least linearly.
What does scaling mean?
Scaling = can add more security (by adding more validators) without increasing the data, execution, and consensus costs per transaction.
Said differently, a Byzantine fault tolerant state machine replication system can scale if the data, execution, and consensus overhead of a transaction is constant ($O(1)$, not a function of $n$), no matter how many validators there are.
The power of batching
In fact, batching (and a sprinkle of modern cryptography) overcomes all three barriers. It allows to add security (by increasing the number of validators) without adding any overhead (per transaction).
Said differently, with batching, the data, execution, and consensus overhead of a transaction is $O(1)$, no matter how many validators there are.
- Scaling data via batches and data availability proofs: Storing each transaction on all $n$ validators implies a total storage cost of $O(n)$ for just one transaction. This does not scale. Instead, a scalable blockchain can use error correction codes and polynomial commitments to store a batch of $B>n$ transactions at a total storage cost of $O(B)$ - so the per transaction storage cost is constant (for some functionalities $B>n^2$ is needed). Using data availability schemes it is possible to write a batch and read a batch from the blockchain with constant overhead per transaction.
- Scaling execution via batches and validity proofs: Executing all transactions on all validators causes an un-scalable compute overhead. Instead, just one validator (or one prover) can execute a batch of transactions and generate a (succinct, non-interactive) validity proof. The other validators just need to verify the short validity proof. This results in constant additive overhead per transaction (formally polylog that depends on a security parameter).
- Scaling consensus via batches: instead of reaching agreement on a small block of transactions, modern consensus protocols can batch $O(n)$ (or $O(n^2)$) transactions together with surprisingly low latency. This results in constant overhead per transaction.
Batching solves all three scalability bottlenecks. Batching in data, execution, and consensus allows increasing the security (increasing the number of validators) without increasing the performance overhead per transaction.
In 2023 we don’t yet have blockchains in production that have a data, execution, and consensus cost of $O(1)$ per transaction, but many of the necessary components are getting close to production ready.
The dark side of batching
Batching adds latency. Some transactions are latency sensitive. Furthermore, without privacy preserving techniques, high latency allows parties with asymmetric information and power to be able to extract more value (e.g., MEV).
Who needs unbounded scalability?
We defined scalability in this post as $O(1)$ overhead per transaction for any number of validators. Is this property really necessary?
On the one side, in practice, maybe all we need are systems with a few hundred (or a few thousand) validators. Perhaps replicating the data and execution on a few hundred (or a few thousand) validators that are running a highly optimized consensus protocol is not so bad. Having hundreds (or thousands) of truly independent, non-colluding validators seems to provide a very high level of security that may be sufficient for many use cases.
On the other side, maybe one day we will have systems with millions of validators (or more), in such a future, the limits of scalability explored in this post may become more relevant.
Perhaps surprisingly, there seems to be no inherent theoretical tradeoff between security (number of validators) and performance overhead (throughput) once batching is used. So maybe the new trilemma is between security, throughput, and latency?
Can blockchains scale? your thoughts and comments.