Chapter 1 Introduction

Distributed computing systems are characterized by their structure: a typical distributed computing system will consist of some large number of interacting devices that each run their own programs but that are affected by receiving messages or observing shared-memory updates from other devices. Examples of distributed computing systems range from simple systems in which a single client talks to a single server to huge amorphous networks like the Internet as a whole.

As distributed systems get larger, it becomes harder and harder to predict or even understand their behavior. Part of the reason for this is that we as programmers have not yet developed the kind of tools for managing complexity (like subroutines or objects with narrow interfaces, or even simple structured programming mechanisms like loops or if/then statements) that are standard in sequential programming. Part of the reason is that large distributed systems bring with them large amounts of inherent nondeterminism: unpredictable events like delays in message arrivals, the sudden failure of components, or in extreme cases the nefarious actions of faulty or malicious machines opposed to the goals of the system as a whole.

Because of the unpredictability and scale of large distributed systems, it can often be difficult to test or simulate them adequately. Thus there is a need for theoretical tools that allow us to prove properties of these systems that will let us use them with confidence.

The first task of any theory of distributed systems is modeling: defining a mathematical structure that abstracts out all relevant properties of a large distributed system. There are many foundational models for distributed systems, but for this class we will follow [AW04] and use simple automaton-based models. Here we think of the system as a whole as passing from one
global state or configuration to another in response to events, e.g. local computation at some processor, an operation on shared memory, or the delivery of a message by the network. The details of the model will depend on what kind of system we are trying to represent:
• Message-passing models (which we will cover in Part I) correspond to systems where processes communicate by sending messages through a network. In synchronous message-passing, every process sends out messages at time t that are delivered at time t + 1, at which point more messages are sent out that are delivered at time t + 2, and so on: the whole system runs in lockstep, marching forward in perfect synchrony. Such systems are difficult to build when the components become too numerous or too widely dispersed, but they are often easier to analyze than asynchronous systems, where messages are delivered eventually after some unknown delay. Variants on these models include semi-synchronous systems, where message delays are unpredictable but bounded, and various sorts of timed systems. Further variations come from restricting which processes can communicate with which others, by allowing various sorts of failures (crash failures that stop a process dead, Byzantine failures that turn a process evil, or omission failures that drop messages in transit), or, on the helpful side, by supplying additional tools like failure detectors (Chapter 11) or randomization (Chapter 23).
• Shared-memory models (Part II) correspond to systems where processes communicate by executing operations on shared objects. In the simplest case these are simple memory cells supporting read and write operations, but they could be more complex hardware primitives like compare-and-swap (§18.1.3), load-linked/store-conditional (§18.1.3), atomic queues, or more exotic objects from the seldom-visited theoretical depths.
Practical shared-memory systems may be implemented as distributed shared-memory (Chapter 16) on top of a message-passing system in various ways. Like message-passing systems, shared-memory systems must also deal with issues of asynchrony and failures, both in the processes and in the shared objects.
• Other specialized models emphasize particular details of distributed systems, such as the labeled-graph models used for analyzing routing or the topological models used to represent some specialized agreement problems (see Chapter 28).
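
As a rough illustration of the synchronous message-passing model described in the list above, the following Python sketch simulates lockstep rounds on a small ring of processes. The ring topology, the choice of problem (spreading the maximum input), and the name run_synchronous_ring are all assumptions made for this example; they do not come from the notes or from [AW04]. The only feature taken from the text is that every message sent in round t is delivered before round t + 1 begins.

```python
from typing import Dict, List


def run_synchronous_ring(inputs: List[int], rounds: int) -> List[int]:
    """Simulate lockstep rounds on a unidirectional ring (hypothetical example).

    Process i starts with inputs[i]. In each round every process sends the
    largest value it has seen so far to its clockwise neighbor, and all
    messages sent in a round are delivered before the next round starts.
    """
    n = len(inputs)
    state = list(inputs)  # state[i] is the largest value process i has seen
    for _ in range(rounds):
        # Send phase: each process emits one message based on its current state.
        # The dictionary maps a receiver's index to the value in transit to it.
        in_transit: Dict[int, int] = {(i + 1) % n: state[i] for i in range(n)}
        # Delivery phase: all messages arrive together, then states are updated.
        state = [max(state[i], in_transit[i]) for i in range(n)]
    return state


if __name__ == "__main__":
    # With 4 processes, 3 rounds are enough for the maximum to reach everyone.
    print(run_synchronous_ring([3, 9, 1, 4], rounds=3))
```

Separating the send phase from the delivery phase is what makes the simulation synchronous: no process can react to a round-t message until every round-t message has been sent. An asynchronous model would instead let an adversary choose when each message in transit is delivered.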
We’ll see many of these at some point in this course, and examine which of them can simulate each other under various conditions.

Properties we might want to prove about a model include:
• Safety properties, of the form “nothing bad ever happens” or more precisely “there are no bad reachable states of the system.” These include things like “at most one of the traffic lights at the intersection of Busy and Main is ever green.” Such properties are typically proved using invariants, properties of the state of the system that are true initially and that are preserved by all transitions; this is essentially a disguised induction proof.
• Liveness properties, of the form “something good eventually happens.” An example might be “my email is eventually either delivered or returned to me.” These are not properties of particular states (I might unhappily await the eventual delivery of my email for decades without violating the liveness property just described), but of executions, where the property must hold starting at some finite time. Liveness properties are generally proved either from other liveness properties (e.g., “all messages in this message-passing system are eventually delivered”) or from a combination of such properties and some sort of timer argument where some progress metric improves with every transition and guarantees the desirable state when it reaches some bound (also a disguised induction proof).
• Fairness properties are a strong kind of liveness property of the form “something good eventually happens to everybody.” Such properties exclude starvation, a situation where most of the kids are happily chowing down at the orphanage (“some kid eventually eats something” is a liveness property) but poor Oliver Twist is dying for lack of gruel in the corner.
• Simulations show how to build one kind of system from another, such as a reliable message-passing system built on top of an unreliable system (TCP), a shared-memory system built on top of a message-passing system (distributed shared-memory), or a synchronous system built on top of an asynchronous system (synchronizers; see Chapter 13).
• Impossibility results describe things we can’t do. For example, the classic Two Generals impossibility result (Chapter 3) says that it’s impossible to guarantee agreement between two processes across an
unreliable message-passing channel if even a single message can be lost. Other results characterize what problems can be solved if various fractions of the processes are unreliable, or if asynchrony makes timing assumptions impossible. These results, and similar lower bounds that describe things we can’t do quickly, include some of the most technically sophisticated results in distributed computing. They stand in contrast to the situation with sequential computing, where the reliability and predictability of the underlying hardware makes proving lower bounds extremely difficult.

There are some basic proof techniques that we will see over and over again in distributed computing.

For lower bound and impossibility proofs, the main tool is an indistinguishability argument. Here we construct two (or more) executions in which some process has the same input and thus behaves the same way, regardless of what algorithm it is running. This exploitation of a process’s ignorance is what makes impossibility results possible in distributed computing despite being notoriously difficult in most areas of computer science.¹

For safety properties, statements that some bad outcome never occurs, the main proof technique is to construct an invariant. An invariant is essentially an induction hypothesis on reachable configurations of the system; an invariant proof shows that the invariant holds in all initial configurations, and that if it holds in some configuration, it holds in any configuration that is reachable in one step.

Induction is also useful for proving termination and liveness properties, statements that some good outcome occurs after a bounded amount of time. Here we typically structure the induction hypothesis as a progress measure, showing that some sort of partial progress holds by a particular time, with the full guarantee implied after the time bound is reached.
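
The two steps of an invariant proof described above (the invariant holds in every initial configuration, and every single step preserves it) can be checked mechanically when the state space is small. The Python sketch below does this for a toy version of the traffic-light safety property mentioned earlier. The two-light model, its transition rules, and every name in the code are assumptions invented for this illustration; none of them is the notes' formal model.

```python
from itertools import product
from typing import Callable, Iterator, Tuple

State = Tuple[str, str]  # (north-south light, east-west light); hypothetical model
COLORS = ("red", "green")
INITIAL: State = ("red", "red")


def invariant(s: State) -> bool:
    """Safety property: at most one of the two lights is green."""
    return s.count("green") <= 1


def guarded_successors(s: State) -> Iterator[State]:
    """One-step transitions in which a light turns green only if the other is red."""
    for i in (0, 1):
        if s[i] == "green":
            new_color = "red"      # a green light may always turn red
        elif s[1 - i] == "red":
            new_color = "green"    # guard: the other light must be red
        else:
            continue               # red light blocked because the other is green
        t = list(s)
        t[i] = new_color
        yield (t[0], t[1])


def unguarded_successors(s: State) -> Iterator[State]:
    """Broken variant for contrast: either light may flip at any time."""
    for i in (0, 1):
        t = list(s)
        t[i] = "red" if s[i] == "green" else "green"
        yield (t[0], t[1])


def is_inductive(step: Callable[[State], Iterator[State]]) -> bool:
    """Check the base case and the induction step of an invariant proof."""
    if not invariant(INITIAL):  # base case: holds in the initial configuration
        return False
    for s in product(COLORS, repeat=2):
        # Induction step: if the invariant holds in s, it holds after one step.
        if invariant(s) and not all(invariant(t) for t in step(s)):
            return False
    return True


if __name__ == "__main__":
    print(is_inductive(guarded_successors))
    print(is_inductive(unguarded_successors))
```

Note that the induction step ranges over every configuration satisfying the invariant, not only the reachable ones. This is what makes the check an induction rather than a search, and it is also why an invariant sometimes has to be strengthened before it becomes provable this way.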
¹ An exception might be lower bounds for data structures, which also rely on a process’s ignorance.