June 1st, 2010


Parallelism: not exactly a new field of study

Thanks to Lambda the Ultimate for the pointer to this ACM article on the history of parallel computing. It doesn't say anything radically new -- indeed, it is specifically about stuff that is old -- but it's useful for providing perspective. The main point of the article is that, while parallel computing looks like The Big New Thing, most of the groundwork and research was actually done in the '60s and '70s, and that those who are trying to do it seriously should go back and read up on all that old (and largely forgotten) research.

The main upshot seems to be a fine argument for Scala, really. The authors argue, somewhat indirectly, that if you want to do parallelism seriously you should be using a functional language, but that people really want something object-oriented. So while they don't say so in so many words, they effectively argue that multi-paradigm languages like Scala and Oz are the way of the future.

And speaking of parallelism and Scala, the really hardcore may want to check out this presentation from ScalaDays (30 min video; here's the low-res version). It demonstrates a new open-source project out of LinkedIn, called Norbert, which provides a solid high-level API for creating and managing a distributed system of nodes for doing real work. Towards the end, the presenter spends a while showing specifically why Scala worked so well for this -- not only the Actors architecture for parallelism, but also the rich trait system that lets them eliminate duplicate code.

More on Scala and scalability: Akka

Continuing from the previous thought: wandering through the Scala Days presentations, I came across this neat presentation on the Akka Project. This is the latest evolution of the work that Jonas Bonér has been doing for a while now, and it's getting to the point of being ready for industrial-grade apps.

High concept (through my own particular lens, anyway) -- Erlang is a kind of sucky language, but it got one thing totally right: by baking an Actor-style architecture in from the beginning, properly written Erlang apps wind up astonishingly scalable and fault-tolerant, often orders of magnitude better than what you would typically get from more conventional languages. Scala picked up the Actor concept early on, but the built-in version of Actors is somewhat half-baked. The Akka project took these ideas and has carried them *much* further, picking up all the best concepts from the Erlang world and blending them neatly into a Scala environment. It adds concepts like easy remoting, software transactional memory (STM) for transactions that span multiple Actors, lots of APIs, and so on.
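
For anyone who hasn't run into Actors before, here's a minimal sketch of the core idea: each Actor owns its state privately, and the only way to affect it is to drop a message in its mailbox, which the Actor works through one message at a time. To be clear about the assumptions: this is plain Python rather than Scala or Erlang, chosen only so that it runs with nothing beyond the standard library, and it is not Akka's API. The names `CounterActor`, `send`, and `demo` are made up for illustration, and it leaves out everything that makes Erlang and Akka interesting at scale (supervision, remoting, lightweight scheduling).

```python
import queue
import threading


class CounterActor:
    """A toy actor: private state, a mailbox, and one thread that drains it."""

    _STOP = object()  # sentinel message that shuts the actor down

    def __init__(self):
        self._count = 0                # only the actor's own thread touches this
        self._mailbox = queue.Queue()  # thread-safe FIFO of incoming messages
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message, reply_to=None):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put((message, reply_to))

    def stop(self):
        """Ask the actor to finish, then wait for its thread to exit."""
        self._mailbox.put((self._STOP, None))
        self._thread.join()

    def _run(self):
        # Messages are handled strictly one at a time, so no locks are
        # needed around self._count.
        while True:
            message, reply_to = self._mailbox.get()
            if message is self._STOP:
                return
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.put(self._count)


def demo(senders=4, per_sender=1000):
    """Hammer one actor from several threads and return its final count."""
    actor = CounterActor()

    def hammer():
        for _ in range(per_sender):
            actor.send("increment")

    threads = [threading.Thread(target=hammer) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All increments are already queued, so FIFO order guarantees that
    # "get" is processed after every one of them.
    reply = queue.Queue()
    actor.send("get", reply_to=reply)
    total = reply.get()
    actor.stop()
    return total


if __name__ == "__main__":
    print(demo())  # 4000
```

The thing to notice is that four threads pound on the same counter without a single explicit lock in the Actor's own code; the mailbox is the only shared structure. Swap "a queue in the same process" for "a socket to another machine" and the calling code barely changes, which is roughly why this style scales out so gracefully.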

Mind, it's not perfect yet. I find the actual client APIs still a bit low-level and clunky: it needs the sort of robust clustering capabilities that the Norbert project is building, so that you stop having to worry about which node a particular Actor can be found on. And the questions at the end point out some places where more research is needed -- for example, you can't distribute transactions across the network yet. (OTOH, that's a pretty hard problem, so I'm not going to fault them much for it. I would recommend trying to avoid Transactions in an Actor-oriented system as much as possible anyway.)

Overall, this is turning into a very compelling platform for server development. I've been saying for a long time now that the world needs a really good Scala-based XMPP server (which could be fabulously useful if done right); it sounds like Akka is probably the platform to build it on. In general, the above presentation is well worth watching if you are interested in modern scaling techniques -- while the Hadoop-style MapReduce approach is very well-known, Actors are still less known than they should be, given that they are *clearly* the right architecture for many communication-centric apps...