On the left we try and form a continuous story about the history of block chain. On the right we provide isolated notes that fill in the exact details of the proofs and algorithms.
The word on the street is that all blockchains do is provide a general solution to the byzantine generals problem. Thus, we cannot discuss the story of blockchains without discussing the simpler problem of The Byzantine Generals Problem (1982) by Leslie Lamport, Robert Shostak, and Marshall Pease.
But even before we discuss solutions to the byzantine generals problem, we need some kind of a formal framework to state the problem and analyse the efficacy and formal guarantees provided by the solution. Informally, n computers are connected using authenticated point to point channels. This means that when computer 1 receives a message from say computer 2, it knows exactly who sent them this message (there’s no ambiguity about the sender). Furthermore, unless specified otherwise, no other computer can see what computer 2 sent computer 1. With this point-to-point setup in place, over time, the computers send each other messages to arrive at some kind of decision. The exact contents of the messages and what the decision looks like depends on the problem they are trying to solve. In the Byzantine generals problem, n generals are stationed at the perimeter of a city and they wish to know whether they should attack the city or retreat. Each of them have a personal opinion on what they should do, but success depends on them coordinating and arriving at a common conclusion. They cannot stand in a circle and yell out their opinions, so each general talks to other generals via messengers (point to point channels). Now if nothing bad ever happened each general could just pass along their true opinion, and once every general receives a message from every other general – they just output the majority. So what’s the problem?
The problem is at most f out n generals might be corrupted, and they might lie or not send messages on time. Assuming that the messages sent by n-f honest generals do get delivered, the goal of byzantine agreement is arrive at a common decision that satisfies validity, agreement and termination (see notes for more details).
My notes:
Byzantine Agreement and Byzantine Broadcast
Digital Signatures - brief overview
The above notes are based on ALP Lectures at A16z Part I Roughgarden Lectures 1-4
One of the first questions researchers asked was how many corruptions one could tolerate before we could no longer have any guarantees of reaching agreement. Of course the answer would depend on which communication model we used. The simplest model is the synchronous model, where all messages are delivered to the intended recipient within a publicly known interval \Delta from departure.
In this setting, in Authenticated Algorithms for Byzantine Agreement (1983) by Danny Dolev and H. Raymond Strong, they show with that if we assume that digital signatures exist1 then we can solve byzantine broadcast with any number of corruptions. This upper bound is also tight. On the other hand if we do not have cryptography, “The Phase King Protocol” protocol from the paper Bit Optimal Distributed Consensus (1992) by Piotr Berman, Juan Garay, and Kenneth Perry, solves agreement even in presence of f < n/3 corruptions in synchrony. This limit on corruption is tight. In Section 3 of Easy impossibility Proofs for Distributed Consensus Problems (1986) by Michael Fischer, Nancy Lynch, and Michael Merritt , the authors show via the famous hexagon argument that if f \geq n/3, agreement is impossible.
Efficiency: Just solving agreement is not enough. We also want to do this efficiently. But what does efficiently even mean?
Considering consensus is possible, we would like to minimise rounds of communication (latency) and the size and number of the messages honest parties need to send (communication complexity) to achieve consensus. Once again we have lower bounds on how many rounds of communication are needed by Marcos Aguilera and Sam Toueg - A Simple Bivalency Proof that t-Resilient Consensus Requires t+1 Rounds. We also have a lower bound on the number of messages that must be sent by Danny Dolev and Rüdiger Reischuk – Bounds on Information Exchange for Byzantine Agreement (1985)
Dolev-Strong Protocol: BB with PKI can tolerate up to n-1 failures.
Hexagon argument: BA or BB is impossible without PKI if f \geq n/3
Phase King Protocol: If f < n/3 then we can solve BA even without PKI.
The story in the synchronous model seems to be complete. At least from the perspective how many corruptions we can tolerate. Here we ask how what the upper and lower bound on f is in asynchronous model. The famous Fischer Lynch Patterson impossibility result tells us that agreement or broadcast is impossible in the asynchronous network model if we use a deterministic model. In other words, even for f=1, with a simple crash corruption (one of the computers just gets disconnected from the network), deterministic agreement or broadcast is impossible. However, if we relaxed the requirements of the broadcast problem (essentially not requiring all honest nodes terminate), we can solve interesting problems like reliable broadcast2 and consistent broadcast problem3. These problems can be efficiently solved in the asynchronous model using a deterministic algorithm. Consistent broadcast and reliable broadcast will later be used as important subroutines to get consensus in the partially synchronous model.
The notes above were based on
TODO
TODO
Finally, we get to the partially synchronous model introduced by Consensus in the Presence of Partial Synchrony (1988) by Cynthia Dwork, Nancy Lynch, and Larry Stockmeyer. This is most relevant for blockchains. One may also say that this is mathematically the most interesting model given that we understand the synchronous model well, and lots of things can be impossible in the asynchronous model.
We also consider multi-shot consensus instead of single shot consensus. Multi-shot consensus is a fancy way of saying we want to reach consensus again and again. The community formally calls this the state machine replication problem. Once again we ask how many corruptions we can tolerate? The answer unsurprisingly is once again that f \geq n/3 is impossible, and f < n/3 is possible. At the heart of this f < n/3 bound is a very simple counting argument, which states that two distinct sets of 2n/3 parties always has one honest party in common. As we can trust this honest party we can preserve safety or consistency.
The SMR problem is not a new problem. It has been studied since the early 90’s and the famous Paxos protocol by Lamport solves SMR with benign adversaries. Later on Barbara Liskov and her team proposed Practical Byzantine Fault Tolerance which solves SMR in the partially synchronous setting with byzantine adversaries.
More recently, there has been an influx of protocols that solve SMR for blockchains that look very much like PBFT. See Tendermint, Hotstuff, Hotstuff 2.0, Lewis-Pye, Fever, Lumiere The differences are in terms of efficiency, but even discussing what efficiency means requires some subtlety.
Although blockchains solve the SMR problem, the solutions discussed above do not solve the blockchain problem in a stand-alone manner. The solutions above are permissioned protocols in that the number of parties communicating are known before hand. Blockchains require us to modify the above protocols to make the permissionless (and deal with Sybil resistance).
A note on chaining: Casper FFG
State Machine Repication The problem that is closer to the “Blockchain problem”.
CAP Theorem and Impossibility Results: f >= n/3 lower bound for the partially synchronous model. Proof for n=3 and f=1, and this generalises for n\geq 3.
The Tendermint Protocol: The write-up for these proofs are follows from the Roughgarden lectures. It helped me get an understanding of one protocol end to end before diving into coming up with a single framework.
Lewis-Pye’s View Synchronisation Solution With Quadratic Message Complexity TODO
The concept of chaining: Reducing the latency in a view with pipelining or batching TODO
In Roughgarden Lecture 8 And 9, Tim breaks down to Nakamoto consensus into bite sized pieces. He breaks a protocol like like bitcoin into
Before diving into the Roughgarden notes on which sets us up for permissionless consensus, we review Streamlet: Textbook Streamlined Blockchains (2020) by Benjamin Chan and Elaine Shi, a longest chain consensus protocol which was primarily written for teaching purposes.
All the leader based BFT-SMR algorithms discussed above are described in the permissioned. The key ingredient that allows us to analyse these methods under the blockchain lens is the idea of a proof of stake. At a high level, its quite a simple idea – a leader is selected based on how much money they have in escrow.