“Designing for Performance” by Martin Thompson

What is Performance?

Many aspects, e.g. throughput and latency.

These aspects are not independent: pushing throughput up can cause latency to increase, even if latency looked fine at lower throughput.

Queueing Theory

System utilization vs response time is not linear; response time rises sharply as utilization approaches saturation: design_for_perf_utilisation.PNG
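As a rough intuition (a minimal sketch assuming an M/M/1 queue, which the talk may or may not use): mean response time grows hyperbolically with utilization, which is why the curve hockey-sticks near saturation.

```latex
% Mean response time R for an M/M/1 queue (assumption: single server, Poisson arrivals)
% S = service time, \rho = utilization
R = \frac{S}{1 - \rho}
% e.g. \rho = 0.5 gives R = 2S, \rho = 0.9 gives R = 10S, \rho = 0.99 gives R = 100S
```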

Parallel Speedup

Parallelism can help, but don’t forget Amdahl’s law: how much of a request can actually be processed in parallel?
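For reference, the standard form of Amdahl’s law, where p is the fraction of the work that can run in parallel and N is the number of workers:

```latex
% Amdahl's law: speedup with N workers when a fraction p of the work is parallelizable
S(N) = \frac{1}{(1 - p) + \frac{p}{N}}
% the serial fraction (1 - p) caps the speedup at 1/(1 - p) no matter how large N gets
```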

Universal Scalability Law

Amdahl’s law doesn’t take into account the synchronization overhead of parallel tasks maintaining a shared, common view of the world: the coherence penalty.

design_for_perf_universal_scalability_law_equation.PNG
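The usual statement of Gunther’s Universal Scalability Law (reconstructed here for reference; the image above presumably shows the same), where α models contention and β the coherence penalty:

```latex
% Universal Scalability Law: capacity/speedup C(N) for N workers
% \alpha = contention (serialization) penalty, \beta = coherence (crosstalk) penalty
C(N) = \frac{N}{1 + \alpha (N - 1) + \beta N (N - 1)}
% with \beta = 0 this reduces to Amdahl's law; with \beta > 0, C(N) peaks and then declines
```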

Coherence penalty increases as parallelism increases (e.g. more CPU, threads, machines).

At some point, the coherence penalty eats away at a good portion of the benefit.

AWS experiment with a normal 150 microsecond coherence penalty: design_for_perf_universal_scalability_law.PNG
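A minimal sketch of how the coherence term behaves (the α and β values are illustrative only, not the numbers behind the AWS experiment): speedup climbs, flattens, and then goes retrograde as workers are added.

```java
// Sketch: evaluate the Universal Scalability Law for a growing worker count.
// alpha = contention penalty, beta = coherence penalty (values are illustrative only).
public final class UslSketch
{
    static double usl(final int n, final double alpha, final double beta)
    {
        return n / (1.0 + alpha * (n - 1) + beta * n * (n - 1));
    }

    public static void main(final String[] args)
    {
        final double alpha = 0.05;   // 5% of the work is serialized
        final double beta = 0.001;   // small per-pair coherence cost

        for (int n = 1; n <= 64; n *= 2)
        {
            // Speedup climbs at first, flattens, then declines once the
            // N*(N-1) coherence term outweighs the benefit of more workers.
            System.out.printf("workers=%2d speedup=%.2f%n", n, usl(n, alpha, beta));
        }
    }
}
```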

Penalties of Logging Frameworks

They usually cannot do any of their work in parallel; every thread is serialized through the logging call.
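A minimal sketch of why, assuming a typical synchronous appender (not the code of any particular framework): every caller funnels through one lock, so logging becomes a serialization point on the hot path.

```java
// Sketch of a typical synchronous appender: all threads contend on one lock
// and each call pays the formatting + I/O cost on the caller's critical path.
import java.io.PrintWriter;

final class SynchronousAppender
{
    private final Object lock = new Object();
    private final PrintWriter out;

    SynchronousAppender(final PrintWriter out)
    {
        this.out = out;
    }

    void log(final String message)
    {
        synchronized (lock)   // serialization point: only one thread logs at a time
        {
            out.println(System.currentTimeMillis() + " " + message);
            out.flush();      // I/O inside the lock makes the critical section even longer
        }
    }
}
```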

Clean & Representative

Anecdotally, clean code also tends to be performant.

Abstractions: Create Them When Certain of the Benefits

Leaky Abstractions

Memory System Abstraction

Hardware perf:

Hardware caching is counting on predictable software behavior (e.g. sequential access and good locality):
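A minimal sketch of the kind of behavior the cache and prefetcher reward (array size and stride are arbitrary): a sequential walk keeps hitting cache lines that were already fetched, while a large stride touches a new line on almost every access.

```java
// Sketch: sequential vs strided traversal of the same array.
// Both sum every element; the sequential walk is cache/prefetcher friendly,
// the strided walk touches a different cache line on almost every access.
public final class TraversalSketch
{
    static long sumSequential(final long[] data)
    {
        long sum = 0;
        for (int i = 0; i < data.length; i++)
        {
            sum += data[i];            // adjacent accesses share cache lines
        }
        return sum;
    }

    static long sumStrided(final long[] data, final int stride)
    {
        long sum = 0;
        for (int start = 0; start < stride; start++)
        {
            for (int i = start; i < data.length; i += stride)
            {
                sum += data[i];        // large jumps defeat spatial locality and prefetching
            }
        }
        return sum;
    }

    public static void main(final String[] args)
    {
        final long[] data = new long[1 << 24];      // ~128 MB of longs
        java.util.Arrays.fill(data, 1L);

        System.out.println(sumSequential(data));
        System.out.println(sumStrided(data, 4096)); // same result, very different memory behavior
    }
}
```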

Performance Ideas

Coupling & Cohesion

Relationships

Batching

Amortize expensive fixed costs (e.g. a system call or a network round trip) using batches:
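A minimal sketch of the idea, with a stream flush standing in for any expensive fixed cost (class and method names are illustrative): pay the fixed cost once per batch instead of once per item.

```java
// Sketch: amortizing a fixed per-call cost (here, flushing a BufferedOutputStream)
// over a batch of items instead of paying it for every item.
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

final class BatchingWriter
{
    private final OutputStream out;

    BatchingWriter(final OutputStream out)
    {
        this.out = new BufferedOutputStream(out);
    }

    // Naive: the expensive flush is paid once per message.
    void writeEach(final List<String> messages) throws IOException
    {
        for (final String msg : messages)
        {
            out.write(msg.getBytes(StandardCharsets.UTF_8));
            out.flush();                 // fixed cost paid N times
        }
    }

    // Batched: the fixed cost is paid once and amortized over the whole batch.
    void writeBatch(final List<String> messages) throws IOException
    {
        for (final String msg : messages)
        {
            out.write(msg.getBytes(StandardCharsets.UTF_8));
        }
        out.flush();                     // fixed cost paid once per batch
    }
}
```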

Branches

Loops

Composition: Size Matters

APIs

Data Organization

Performance Testing

Measuring Response Time

Measure the Right Things

Benchmarking

Measuring Running System

“It does not matter how intelligent you are, if you guess and that guess cannot be backed by experimental evidence - then it is still a guess” - Feynman