Use two choices. It’s easy to get hugely better performance by moving from one choice to two choices. It’s very hard to do better.

The full title is The Power of Two Choices in Randomized Load Balancing, by Michael David Mitzenmacher.

I love the simple theory of this: use two choices.

Two choices is exponentially better than one choice, and having three or more choices don’t offer much more improvement.

  • Need to hash? Hash it twice, then choose the bucket with fewer items.
  • Need to load balance? Choose two servers at random, then pick the one with less traffic.
  • Need to cache? TWO CHOICES.

The math of this thesis is beyond me, but that’s true of most compsci theses now. (Can anyone retain their math knowledge if they’re not actively teaching or publishing?)

I love papers with simple and powerful takeaways.