Tuesday, October 30, 2012

Hashing functions and MurmurHash

Today I watched a presentation on cloud applications. One of the interesting points was that users have to be hashed across shards in a relatively uniform way, so that they end up evenly distributed.

The presenter mentioned that they had used an in-house hashing algorithm, but realized that a good hashing algorithm can make a big difference. His suggestion was to use Murmur.
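
To make the sharding idea a bit more concrete, here is a minimal sketch in Python of mapping user IDs to shards with a hash. This is not what the presenter showed: the names `shard_for`, `shard_counts` and `NUM_SHARDS`, the shard count of 8, and the `user-N` ID format are all made up for illustration. Python's standard library does not ship MurmurHash, so the sketch uses BLAKE2b from `hashlib` as a stand-in; in a real system you would probably swap in a Murmur implementation from a well-tested library.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 8  # hypothetical shard count, chosen only for this example


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a user ID to a shard index in the range [0, num_shards)."""
    # BLAKE2b is a stand-in for Murmur here (assumption: any hash with
    # well-mixed output bits illustrates the same idea).
    digest = hashlib.blake2b(user_id.encode("utf-8"), digest_size=8).digest()
    # Interpret the 8-byte digest as an unsigned integer, then reduce it
    # to a shard index.
    return int.from_bytes(digest, "big") % num_shards


def shard_counts(user_ids, num_shards: int = NUM_SHARDS) -> list:
    """Count how many of the given user IDs land on each shard."""
    counts = Counter(shard_for(u, num_shards) for u in user_ids)
    return [counts.get(i, 0) for i in range(num_shards)]


if __name__ == "__main__":
    # Hypothetical user IDs; sequential IDs are a good stress test because
    # a weak hash tends to cluster them.
    users = [f"user-{i}" for i in range(8000)]
    print(shard_counts(users))
```

With a well-behaved hash, each of the 8 counts should come out close to 1000. A poorly mixing in-house function can leave some shards much fuller than others, which is the problem the presenter was describing.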

I found a great post on StackExchange on the subject, which is a must-read before picking a hashing function:

Which hashing algorithm is best for uniqueness and speed?

And, of course, leave aside the Not Invented Here syndrome and don't go implement your own function :-)
