In this series of posts, we'll go through a group of load balancing algorithms, studying their strengths, weaknesses, guarantees, and applications.

What and why?

It all started when a friend of mine complained about how the load balancing algorithm he was using wasn't delivering the outcomes he was expecting. Because of his somewhat unique use case I was very intrigued, so I decided to help him solve the issue. Not long after, I found myself researching the literature on all the load balancing algorithms I could find. After a while, I decided to build a simulation to better evaluate the algorithms based on:

Algorithm simplicity
Load distribution
Affinity and cache friendliness
The guarantees provided

Each algorithm serves a particular use case on a different side of the spectrum; together they should cater to most applications, ranging from CDNs to CRUD applications that keep their state in a centralized database.

All the algorithms are implemented and published as part of Liblb.

Comparison

List of algorithms

Simulation

The simulation is derived from a day's worth of anonymized real traffic logs donated by a major online classifieds site in the Middle East, which makes the study more meaningful than one based on randomly generated traffic. Before explaining the simulation, let's talk about its components:

Client: a user that asks for the country of a certain IP. It sends the request to the GeoServer host that liblb picks for it.
GeoServer Host: responds to incoming queries by returning the country of the given IP, either from its local cache if found or directly from the database.
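
To make the two components concrete, here is a minimal sketch in Go. It is not the actual simulation code: the Balancer interface, the RoundRobin placeholder, and every type and method name below are hypothetical and are not taken from Liblb. It also assumes that a host stores a result in its cache after a database lookup and that the database can be modeled as a plain map.

```go
package main

import "errors"

// Balancer is a hypothetical stand-in for the load balancing library:
// given a key (here the client's IP), it names the host to contact.
type Balancer interface {
	Balance(key string) (string, error)
}

// RoundRobin is a deliberately simple placeholder algorithm that
// ignores the key and cycles through the hosts in order.
type RoundRobin struct {
	hosts []string
	next  int
}

func (r *RoundRobin) Balance(key string) (string, error) {
	if len(r.hosts) == 0 {
		return "", errors.New("no hosts configured")
	}
	host := r.hosts[r.next%len(r.hosts)]
	r.next++
	return host, nil
}

// GeoServer models one host: a local in-memory cache in front of a database.
type GeoServer struct {
	cache  map[string]string
	db     map[string]string // stand-in for the real database
	Hits   int               // queries answered from the cache
	Misses int               // queries that had to go to the database
}

func NewGeoServer(db map[string]string) *GeoServer {
	return &GeoServer{cache: make(map[string]string), db: db}
}

// Lookup returns the country for ip, preferring the local cache.
func (g *GeoServer) Lookup(ip string) string {
	if country, ok := g.cache[ip]; ok {
		g.Hits++
		return country
	}
	g.Misses++
	country := g.db[ip]
	g.cache[ip] = country // assumption: results are cached after a miss
	return country
}

// Client asks the balancer for a host, then queries that host.
type Client struct {
	lb      Balancer
	servers map[string]*GeoServer
}

func (c *Client) CountryOf(ip string) (string, error) {
	host, err := c.lb.Balance(ip)
	if err != nil {
		return "", err
	}
	srv, ok := c.servers[host]
	if !ok {
		return "", errors.New("unknown host: " + host)
	}
	return srv.Lookup(ip), nil
}
```

With two hosts and this round-robin placeholder, sending the same IP three times produces a miss on each host before the first cache hit. That is exactly the kind of affinity effect the cache hit counter is meant to expose.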

Now that we've got this out of the way, let's see how the simulation works:

Client: Extracts the user's IP from the logs.
Client: Sends the IP to the GeoServer host that liblb gave it, based on the selected algorithm.
GeoServer Host: Checks its in-memory cache; if the IP is found, it increments a cache hit counter and returns the cached result.