“Chaos & Intuition Engineering at Netflix” by Casey Rosenthal

Control Plane at Netflix

Focus of Optimization: Performance, Fault Tolerance and Availability

Microservice Architecture

Great for feature velocity
A microservice often depends on another microservice, which in turn depends on another, and so on
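
To make the dependency chain concrete, here is a minimal sketch that walks a service graph and collects everything a service relies on, directly or indirectly. The service names and the `DEPENDENCIES` map are made up for illustration and do not come from the talk.

```python
from typing import Dict, List, Set

# Hypothetical service graph: each service maps to the services it calls directly.
DEPENDENCIES: Dict[str, List[str]] = {
    "api-gateway": ["playback", "recommendations"],
    "playback": ["licensing"],
    "recommendations": ["user-profile"],
    "licensing": ["user-profile"],
    "user-profile": [],
}


def transitive_dependencies(service: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return every service that `service` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = list(graph.get(service, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            # Follow the chain: the dependency's own dependencies count too.
            stack.extend(graph.get(dep, []))
    return seen


if __name__ == "__main__":
    print(sorted(transitive_dependencies("api-gateway", DEPENDENCIES)))
```

Even in this toy graph, the gateway calls only two services directly but depends on four in total, which is why a failure deep in the chain can surface far from its origin.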

Emergence of Undesirable System-Level Behavior

Interactions among a system’s components can make the system as a whole behave poorly, even if each component behaves reasonably in isolation.

Imagined Example of Positive Feedback Loop

[Without more details, this scenario sounds a bit silly; sounds like the scaling service messed up or the cache-only policy messed up]

Chaos Monkey and Chaos Kong

Chaos Monkey

Randomly turns off a prod server during employee working hours
Failures therefore happen during working hours, when engineers are around to notice them
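
The idea can be sketched in a few lines. This is not Netflix's implementation: the working-hours window, the function names, and the instance IDs are all assumptions for illustration, and the sketch only selects a victim rather than terminating anything.

```python
import random
from datetime import datetime
from typing import List, Optional

# Assumed working-hours window (weekdays, 09:00 to 17:00); not from the talk.
WORK_START_HOUR = 9
WORK_END_HOUR = 17


def is_working_hours(now: datetime) -> bool:
    """True on Monday through Friday within the assumed working-hours window."""
    return now.weekday() < 5 and WORK_START_HOUR <= now.hour < WORK_END_HOUR


def pick_victim(instances: List[str], now: datetime, rng: random.Random) -> Optional[str]:
    """Pick one instance at random, but only during working hours.

    Returns None outside working hours or when there is nothing to pick.
    """
    if not instances or not is_working_hours(now):
        return None
    return rng.choice(instances)


if __name__ == "__main__":
    # Hypothetical instance IDs.
    fleet = ["i-001", "i-002", "i-003"]
    victim = pick_victim(fleet, datetime.now(), random.Random())
    print(f"Would terminate: {victim}")
```

Passing in the clock and the random generator keeps the selection logic easy to test without touching real servers.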

Chaos Kong

Randomly turns off servers for an entire region

Chaos Engineering (Principles of Chaos)

“Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system’s capability to withstand turbulent conditions in production”

Intuition Engineering

Idea that large complex systems need to be understood at some intuitive level

Example: Visualization Tool (Vizceral)