PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs
Graphs are Ubiquitous
- Used by many major products (e.g. Google, Netflix)
- Gigantic: billions of vertices and edges
- Used for ML, data mining
Natural Graphs (Power-Law Degree Distribution)
- Graph data from natural phenomena (the web, social networks) follows a power-law degree distribution
- A small set of vertices has a very large number of connections, while most vertices have only a few
- Difficult to partition well by assigning vertices to machines (see the sketch below)
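A minimal sketch of what that skew looks like, assuming the usual model where the probability that a vertex has degree d is proportional to d^(-alpha) with alpha around 2; the function names and constants are illustrative only:

```python
import random

def sample_powerlaw_degrees(n, alpha=2.0, max_degree=10_000):
    """Sample n vertex degrees with P(degree = d) proportional to d**(-alpha).

    alpha near 2 (typical of natural graphs) means most vertices get a low
    degree while a handful end up with a very high one."""
    degrees = range(1, max_degree + 1)
    weights = [d ** -alpha for d in degrees]
    return random.choices(list(degrees), weights=weights, k=n)

degs = sorted(sample_powerlaw_degrees(100_000), reverse=True)
total = sum(degs)
top_1_percent = sum(degs[: len(degs) // 100])
print(f"top 1% of vertices hold {top_1_percent / total:.0%} of all edge endpoints")
```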
Classic Distributed Graph Approach:
- Evenly distribute vertices to different machines
- Issue: a high-degree vertex’s edges cannot all fit on one machine, i.e. many of its edges connect to vertices living on other machines
Pregel Communication Overhead for Highly-Connected Vertex
- A high-degree vertex incurs communication cost for every one of its outgoing messages in Pregel
- Even when all of those messages go from Machine 2 to Machine 1, one message per edge is still sent (each has a different destination vertex, so they cannot be merged)
- Incoming-message overhead is mitigated by combiners, which merge messages coming from the same machine (see the sketch after this list)
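A rough sketch of the combiner idea, assuming a simple sum combiner (names here are illustrative, not Pregel's actual API): messages produced on one machine for the same destination are merged before crossing the network, which helps fan-in but does nothing for the fan-out of a single high-degree vertex.

```python
def combine_outgoing(messages, combine=lambda a, b: a + b):
    """Pregel-style sum combiner (illustrative): merge messages produced on
    this machine that target the same destination vertex, so only one value
    per destination crosses the network."""
    combined = {}
    for dest, value in messages:
        combined[dest] = combine(combined[dest], value) if dest in combined else value
    return combined

# Fan-in: three local messages to vertex "v7" collapse into one network message.
print(combine_outgoing([("v7", 0.2), ("v7", 0.1), ("v3", 0.5), ("v7", 0.3)]))
# Fan-out from one high-degree vertex cannot be combined: every message below
# has a different destination vertex, so nothing collapses.
print(combine_outgoing([(f"v{i}", 0.25) for i in range(4)]))
```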
GraphLab Communication Overhead for Highly-Connected Vertex
- GraphLab avoids the outgoing-message problem by ‘ghosting’ a vertex onto neighboring machines
- One change to D implies one single outgoing update per machine holding a ghost of D
- Incoming updates, however, still pose a problem: every change to one of D’s many neighbors must be pushed to D’s machine to keep its ghosts fresh (see the sketch after this list)
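A back-of-the-envelope sketch of that asymmetry, with made-up names (not GraphLab's real API): the outgoing side is bounded by the number of machines, while the incoming side grows with the vertex's degree.

```python
def ghost_sync_messages(owner, neighbors, vertex="D"):
    """Count GraphLab-style ghost synchronization messages around one
    high-degree vertex (illustrative, not GraphLab's real API).

    Outgoing: one update per distinct remote machine holding a ghost of
    `vertex`. Incoming: one update per remote neighbor change that has to
    reach `vertex`'s machine to refresh its local ghosts."""
    home = owner[vertex]
    remote_machines = {owner[n] for n in neighbors} - {home}
    outgoing = len(remote_machines)                            # bounded by #machines
    incoming = sum(1 for n in neighbors if owner[n] != home)   # grows with degree
    return outgoing, incoming

owner = {"D": 0, **{f"n{i}": i % 4 for i in range(1000)}}
print(ghost_sync_messages(owner, [f"n{i}" for i in range(1000)]))  # (3, 750)
```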
Graph Partitioning: Cut Across Edges
- Vertices are assigned to machines, and edges may connect to neighboring vertices located on other machines
- Issue: natural graphs cannot be partitioned so that only a few edges cross machines
- Issue: any such partition of a natural graph produces many cut edges (edges whose endpoints sit on different machines), as in the sketch below
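A small sketch of why edge-cuts hurt on natural graphs, assuming vertices are placed by a simple hash (the placement scheme is an assumption for illustration): nearly every edge of a high-degree vertex ends up cut.

```python
def count_cut_edges(edges, num_machines):
    """Count edges whose endpoints land on different machines when vertices
    are placed by a simple hash (the classic edge-cut approach)."""
    machine = lambda v: hash(v) % num_machines
    return sum(1 for u, v in edges if machine(u) != machine(v))

# One hub vertex with 999 neighbors: nearly all of its edges get cut.
edges = [(0, i) for i in range(1, 1000)]
print(count_cut_edges(edges, num_machines=8), "of", len(edges), "edges are cut")
```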
PowerGraph
New: Cut Across Vertices (a vertex may span machines, while every edge lives entirely on one machine)
Workflow Observation
- Most graph-parallel workflows follow 3 steps:
- Gather data from neighboring nodes
- Calculate a new value for own node
- Update neighboring nodes with new own value
- Named GAS:
- (1) Gather (in parallel), (2) Apply (on own node), (3) Scatter (in parallel)
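As a concrete example, PageRank decomposes naturally into GAS. The toy single-machine sketch below mirrors the three phases; it is illustrative only and is not PowerGraph's actual C++ vertex-program interface.

```python
def pagerank_gas(adj, iters=20, damping=0.85):
    """Toy single-machine PageRank written as Gather / Apply / Scatter steps
    (illustrative only; PowerGraph's real interface is a C++ vertex program)."""
    ranks = {v: 1.0 for v in adj}
    out_deg = {v: max(len(nbrs), 1) for v, nbrs in adj.items()}
    for _ in range(iters):
        # Gather: each vertex sums the contributions arriving on its in-edges.
        gathered = {v: 0.0 for v in adj}
        for u, nbrs in adj.items():
            for v in nbrs:
                gathered[v] += ranks[u] / out_deg[u]
        # Apply: each vertex computes its new value from the gathered sum.
        ranks = {v: (1 - damping) + damping * g for v, g in gathered.items()}
        # Scatter: a real system would signal only neighbors of changed vertices here.
    return ranks

print(pagerank_gas({"a": ["b", "c"], "b": ["c"], "c": ["a"]}))
```

The gather results are combined with a commutative, associative sum, which is what lets the gather be split across machines and merged as partial sums at a master copy of the vertex.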
Distributed Execution
- Gather: Machines 1, 2, 3, and 4 each locally gather neighbor data into a partial sum
- Gather: Master Machine 1 receives the partial sums
- Gather: Master Machine 1 sums all partial sums
- Apply: master Machine 1 updates its local copy of the vertex and sends the new value to Machines 2, 3, and 4 so they update their local copies
- Scatter: within each machine, the new vertex value is communicated to on-machine neighbors (see the sketch below)
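A condensed sketch of this flow for one cut vertex; the function and data-structure names are assumptions for illustration, not PowerGraph's real interface.

```python
def distributed_gather_apply(partitions, vertex, ranks, out_deg, damping=0.85):
    """Sketch of one GAS round for a single cut vertex (hypothetical names).

    `partitions` maps machine -> the (src, dst) edges stored on it. Each
    machine with in-edges of `vertex` computes a local partial sum; the
    master combines them, applies, and the new value would then be sent
    back to every mirror before Scatter runs locally on each machine."""
    # Gather: every machine produces one partial sum over its local in-edges.
    partial_sums = [
        sum(ranks[src] / out_deg[src] for src, dst in edges if dst == vertex)
        for edges in partitions.values()      # each partial sum is one network message
    ]
    # Apply: the master combines the partial sums and updates the vertex value.
    return (1 - damping) + damping * sum(partial_sums)

partitions = {
    1: [("a", "v"), ("b", "v")],   # master machine for "v"
    2: [("c", "v")],               # mirror
    3: [("d", "v"), ("v", "e")],   # mirror (also holds an out-edge of "v")
}
ranks = {x: 1.0 for x in "abcdv"}
out_deg = {x: 1 for x in "abcdv"}
print(distributed_gather_apply(partitions, "v", ranks, out_deg))
```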
Communication Overhead
- Communication happens during a vertex’s gather and apply phases
- Edge data never crosses machines, because every edge is stored entirely on one machine
Edge Placements
- Partitioning must attempt to minimize how many machines a vertex spans (and thus must communicate with)
- Edges are assigned to machines, not vertices; a vertex whose edges land on several machines is replicated on each of them (see the replication-factor sketch below)
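The cost of a vertex-cut is usually summarized by the replication factor: the average number of machines a vertex spans. A minimal sketch of computing it from an edge assignment (illustrative names):

```python
from collections import defaultdict

def replication_factor(edge_assignment):
    """Average number of machines each vertex spans, given a map from
    (src, dst) edge to machine. Lower is better: it bounds the amount of
    gather/apply communication per vertex."""
    spans = defaultdict(set)
    for (src, dst), machine in edge_assignment.items():
        spans[src].add(machine)
        spans[dst].add(machine)
    return sum(len(machines) for machines in spans.values()) / len(spans)

# "v" spans machines 1 and 2; "a", "b", "c" each span one machine -> 5/4 = 1.25
print(replication_factor({("a", "v"): 1, ("b", "v"): 2, ("v", "c"): 2}))
```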
Random
- Each edge is hashed to a machine independently, with no coordination
- Random placement is also fairly predictable, which makes the needed resources (memory, communication) easy to estimate (see the sketch below)
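A minimal sketch of random placement, assuming each edge is simply hashed to a machine; note how a single high-degree vertex ends up mirrored almost everywhere:

```python
def random_placement(edges, num_machines):
    """Hash each edge to a machine independently (random vertex-cut): fast,
    needs no coordination, and the expected load per machine is easy to
    predict, but high-degree vertices get mirrored on nearly every machine."""
    return {edge: hash(edge) % num_machines for edge in edges}

edges = [("hub", f"v{i}") for i in range(1000)]
assignment = random_placement(edges, num_machines=16)
hub_span = len({m for (u, v), m in assignment.items() if "hub" in (u, v)})
print(f"'hub' is replicated across {hub_span} of 16 machines")
```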
Coordinated Greedy
- Machines coordinate to minimize how many machines each vertex spans
- (vs Random) requires coordination, so the partition is slower to build
- (vs Random) lower communication overhead
- On placing an edge, the heuristic tries, in order (see the sketch below):
- a machine that already contains both vertices of the edge
- a machine containing one of the vertices (tiebreaker: less loaded)
- if no machine contains either vertex: assign to the least loaded machine
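A compact sketch of that heuristic (illustrative, not PowerGraph's exact implementation). Coordinated greedy keeps the `placed` table globally consistent across machines; Oblivious greedy runs the same logic using each machine's private, possibly stale copy of it.

```python
from collections import defaultdict

def greedy_place(edges, num_machines):
    """Greedy vertex-cut placement following the heuristic above:
      1. prefer a machine that already holds both endpoints,
      2. else a machine holding one endpoint (tie-break: least loaded),
      3. else the least loaded machine overall."""
    placed = defaultdict(set)        # vertex -> machines already holding it
    load = [0] * num_machines        # edges assigned to each machine so far
    assignment = {}
    for u, v in edges:
        both = placed[u] & placed[v]
        either = placed[u] | placed[v]
        candidates = both or either or set(range(num_machines))
        m = min(candidates, key=lambda c: load[c])
        assignment[(u, v)] = m
        placed[u].add(m)
        placed[v].add(m)
        load[m] += 1
    return assignment

print(greedy_place([("a", "b"), ("c", "d"), ("a", "c"), ("b", "d")], num_machines=2))
# e.g. {('a','b'): 0, ('c','d'): 1, ('a','c'): 0, ('b','d'): 1}: only "b" and "c" span both machines
```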
Oblivious Greedy
- Middle ground between Random and Coordinated: each machine runs the greedy heuristic on its own local, possibly stale placement information, with little coordination between machines
Fault-Tolerance
- Periodic snapshots of the graph state are taken for fault recovery
- A few seconds per snapshot
- Stop-the-world