Kafka: a lesson in high-performance computing
While waiting for compiles today, I was poking around at the OSS Scala libraries, and came across Kafka, LinkedIn’s in-house package for distributed messaging. The architecture writeup is a thing of beauty:


It’s fascinating reading, and recommended to any engineer who has the time. It explains in fair detail how they designed a system that handles seriously high-performance pub/sub messaging across a large cluster of commodity hardware. Along the way, it illustrates a lot of places where the conventional wisdom about performance is just plain wrong, or at least misses the subtleties you need in order to squeeze out real speed. Many of their major speedups come not so much from going to the bare metal as from understanding how the metal works: the gap between sequential write speed and random-access latency, the amount of time that gets wasted on buffer copying in a more naïve stack, and so on. The result is a system that makes some unusual but very pragmatic functional tradeoffs, and which looks like it works well for a bunch of applications.
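To make the write-speed point a bit more concrete, here is a minimal sketch of the general idea of an append-only log: every write lands at the end of the file, and readers address records by byte offset. This is a toy under stated assumptions, not Kafka's actual code or on-disk format; the `AppendOnlyLog` class, the 4-byte length prefix, and the `demo` helper are all made up for illustration.

```python
import os
import struct
import tempfile

# 4-byte big-endian length prefix. This framing is an assumption for the
# sketch, not Kafka's real message format.
_LEN = struct.Struct(">I")


class AppendOnlyLog:
    """Toy log: records are only ever appended, and read back by byte offset."""

    def __init__(self, path):
        self._path = path
        # "ab" opens for appending, so every write lands at the end of the file.
        self._writer = open(path, "ab")

    def append(self, payload: bytes) -> int:
        """Append one record and return the byte offset at which it starts."""
        offset = self._writer.seek(0, os.SEEK_END)
        self._writer.write(_LEN.pack(len(payload)) + payload)
        self._writer.flush()
        return offset

    def read(self, offset: int):
        """Return (payload, next_offset) for the record starting at offset."""
        with open(self._path, "rb") as reader:
            reader.seek(offset)
            (length,) = _LEN.unpack(reader.read(_LEN.size))
            payload = reader.read(length)
        return payload, offset + _LEN.size + length

    def close(self):
        self._writer.close()


def demo():
    """Write three messages, then replay them in order from the first offset."""
    with tempfile.TemporaryDirectory() as tmp:
        log = AppendOnlyLog(os.path.join(tmp, "topic.log"))
        first = log.append(b"hello")
        log.append(b"kafka")
        # End of log = start of the last record + its prefix + its 1-byte payload.
        end = log.append(b"!") + _LEN.size + 1
        replayed, offset = [], first
        while offset < end:
            payload, offset = log.read(offset)
            replayed.append(payload)
        log.close()
        return replayed
```

The consumer in this sketch only has to remember one integer, the next offset to read, and both the writes and the replay walk the file front to back rather than seeking around in it.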

Bracing stuff, and a good reminder that the details matter…
