“Everything about distributed systems is terrible” by Hillel Wayne

Distributed Systems

Threads == Computers

Fundamentally, concurrent threads acting on shared memory is a lot like concurrent machines writing to a common DB.

Minus partial failures for the threads + memory.

Many Possible States for Concurrent, Non-Deterministic Agents

Without concurrency control, the combinations of states for agents interacting with shared data grows very fast.

For 2 agents, each doing this:

everything_distributed_terrible_agent1_agent2.PNG

Results in 13 states, 4 end-to-end behaviors (where an end-to-end behavior is a chain of states, starting at init, ending at one of the leaf states)

everything_distributed_terrible_agent1_agent2_states.PNG

For 2 agents, 4 steps each, we have about 550 states.

The number of states and behaviors grows fast

Over a Long Enough Time, a System Will do Everything

Systems have:

Eventually, invalid behaviors will occur.

Code With Concurrency Control/Abstractions Is Not Enough

Good to have semaphores, locks, promises, monads, etc.: they reduce the number of states the system be in.

However, these coding abstractions do not identify states, transitions and behaviors exhaustively and explicitly.

We Need Formal Specification

Second half of the talk is Hillel giving an example usage of TLA+.