Justin du Coeur (jducoeur) wrote,

Kafka: a lesson in high-performance computing

While waiting for compiles today, I was poking around at the OSS Scala libraries, and came across Kafka, LinkedIn’s in-house package for distributed messaging. The architecture writeup is a thing of beauty:

http://sna-projects.com/kafka/design.php

It’s fascinating reading, and recommended to any engineer who has the time. It explains in fair detail how they designed a system that manages seriously high-performance pub/sub messaging in a large cluster, using commodity hardware. Along the way, it illustrates a lot of places where the conventional wisdom about performance is just plain wrong, or at least misses the subtleties that you need in order to squeeze out serious speed. They achieve a lot of major speedups not so much by going to the bare metal as by understanding how the metal works: how fast sequential writes are compared to random-access latencies, how much time gets wasted in buffer copying in a more naïve stack, and so on. The result is a system that makes some unusual but very pragmatic functional tradeoffs, but which looks like it works well for a bunch of applications.
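To make the write-speed-versus-random-access point a bit more concrete, here is a toy sketch of the general idea of an append-only log: writers only ever add to the end of a file, and readers scan forward from a byte offset they track themselves. Everything in it is my own illustration rather than anything taken from Kafka: the `AppendOnlyLog` class, the 4-byte length prefix, and the `demo` helper are all hypothetical, and a real system would keep files open, batch writes, and deal with durability.

```python
import os
import struct
import tempfile

# 4-byte big-endian length prefix: an assumption for this sketch,
# not Kafka's actual on-disk or wire format.
HEADER = struct.Struct(">I")


class AppendOnlyLog:
    """Toy log: messages are only appended; readers scan forward from an offset."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it does not exist yet

    def append(self, payload: bytes) -> int:
        """Append one message and return the byte offset where it starts."""
        with open(self.path, "ab") as f:
            offset = f.seek(0, os.SEEK_END)  # always write at the end
            f.write(HEADER.pack(len(payload)))
            f.write(payload)
        return offset

    def read_from(self, offset: int, max_messages: int):
        """Read up to max_messages sequentially; return (messages, next_offset)."""
        messages = []
        with open(self.path, "rb") as f:
            f.seek(offset)  # one seek, then purely sequential reads
            while len(messages) < max_messages:
                header = f.read(HEADER.size)
                if len(header) < HEADER.size:
                    break  # reached the end of the log
                (length,) = HEADER.unpack(header)
                payload = f.read(length)
                if len(payload) < length:
                    break  # incomplete trailing message; stop here
                messages.append(payload)
                offset += HEADER.size + length
        return messages, offset


def demo():
    """Write three messages, then consume them in two sequential reads."""
    with tempfile.TemporaryDirectory() as tmp:
        log = AppendOnlyLog(os.path.join(tmp, "topic.log"))
        offsets = [log.append(m) for m in (b"alpha", b"beta", b"gamma")]
        first, next_offset = log.read_from(0, 2)
        rest, end_offset = log.read_from(next_offset, 10)
        return offsets, first, rest, end_offset
```

The consumer’s entire state here is one integer, the next offset to read from, and neither side ever has to jump around inside the file, which is roughly the access pattern that disks and OS caches tend to handle well.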

Bracing stuff, and a good reminder that the details matter…
Tags: programming