Monday, November 29, 2010

Swarm: A true distributed programming language

Fundamentals

The fundamental concept behind Swarm is that we should “move the computation, not the data”.
The Swarm prototype is a simple stack-based language, akin to a primitive version of the Java bytecode interpreter. I wanted the proof of concept to be quick to implement, while demonstrating that the concept could work for a popular runtime like the JVM or Microsoft’s CLR.
Update (Sept 17th 09): Swarm is now implemented as a Scala library, so you program in normal Scala rather than in a custom stack-based language, as with the prototype described here. It uses the Scala 2.8 continuations plugin to achieve this. See the end of the blog post for further information.

The Prototype

The prototype is implemented in Scala, and I will use snippets of Scala code below, but a knowledge of Scala won’t be required to understand the rest of this article. I chose Scala because I wanted to learn it, and because its rich semantics tend to make coding easier and faster than in Java (my normal language of choice).
As with the JVM, there are three places to store data in the Swarm VM: the stack, a local variable array, and the store. The stack is used for intermediate values in computations, so data here tends to be very short-lived. In the prototype it is implemented as a List[Any]. The local variable array is for data that is used within a block of code (it’s implemented as a Map[Int, Any]).
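To make the stack and the local variable array a little more concrete, here is a minimal sketch of how a few instructions might act on them. Only the two types, List[Any] and Map[Int, Any], come from the prototype as described above. The instruction names (Push, Add, StoreLocal, LoadLocal) and the step and run functions are invented for illustration and are not Swarm's actual instruction set.

```scala
object StackAndLocalsSketch {
  // The two short-lived data areas described above.
  type Stack  = List[Any]      // head of the list is the top of the stack
  type Locals = Map[Int, Any]  // local variable array, keyed by slot index

  // Hypothetical instructions, made up for this illustration only.
  sealed trait Instr
  case class Push(value: Any)      extends Instr
  case object Add                  extends Instr
  case class StoreLocal(slot: Int) extends Instr
  case class LoadLocal(slot: Int)  extends Instr

  // Execute one instruction and return the new stack and locals.
  def step(stack: Stack, locals: Locals, instr: Instr): (Stack, Locals) =
    instr match {
      case Push(v) =>
        (v :: stack, locals)
      case Add =>
        stack match {
          case (b: Int) :: (a: Int) :: rest => ((a + b) :: rest, locals)
          case _ => sys.error("Add needs two Ints on top of the stack")
        }
      case StoreLocal(slot) =>
        stack match {
          case top :: rest => (rest, locals + (slot -> top))
          case Nil         => sys.error("StoreLocal on an empty stack")
        }
      case LoadLocal(slot) =>
        (locals(slot) :: stack, locals)
    }

  // Run a whole program, starting from an empty stack and empty locals.
  def run(program: List[Instr]): (Stack, Locals) =
    program.foldLeft[(Stack, Locals)]((Nil, Map.empty)) {
      case ((stack, locals), instr) => step(stack, locals, instr)
    }
}
```

For example, StackAndLocalsSketch.run applied to the program Push(2), Push(3), Add, StoreLocal(0), LoadLocal(0) leaves the sum in local slot 0 and again on top of the stack. The intermediate values 2 and 3 exist only briefly on the stack, which is the short-lived role described above.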

The “Store”

The “store” is somewhat analogous to the JVM heap. It is used for long-term storage of data. Indeed, in an actual implementation it may be persistent and/or transactional, but in the prototype it is in-memory. The store contains “objects”, each of which is a list of key-value pairs. The values may be references to other objects. The store is implemented as a Map[Int, Map[String, Any]].
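As a rough illustration of that shape, the sketch below builds a tiny store with two objects, one of which refers to the other. The Map[Int, Map[String, Any]] type is taken from the description above. How the prototype actually marks a value as a reference is not stated here, so the Ref wrapper and the field, deref and put helpers are assumptions made for this example.

```scala
object StoreSketch {
  type Obj   = Map[String, Any]  // one object: key-value pairs
  type Store = Map[Int, Obj]     // object id -> object

  // Hypothetical wrapper marking a value as a reference to another object.
  case class Ref(id: Int)

  // Look up a field on an object; None if the object or key is missing.
  def field(store: Store, id: Int, key: String): Option[Any] =
    store.get(id).flatMap(_.get(key))

  // Follow a value if it is a reference; None for plain values.
  def deref(store: Store, value: Any): Option[Obj] = value match {
    case Ref(id) => store.get(id)
    case _       => None
  }

  // Return a new store with one field set (the original is unchanged).
  def put(store: Store, id: Int, key: String, value: Any): Store = {
    val updated: Obj = store.getOrElse(id, Map.empty[String, Any]) + (key -> value)
    store + (id -> updated)
  }

  // Two objects; object 1 holds a reference to object 2.
  val example: Store = Map(
    1 -> Map[String, Any]("name" -> "alice", "friend" -> Ref(2)),
    2 -> Map[String, Any]("name" -> "bob")
  )
}
```

Reading the "friend" field of object 1 yields Ref(2), and passing that to deref returns object 2, whose "name" is "bob". Because the maps are immutable, put returns an updated copy rather than modifying the store in place. A persistent or transactional store would presumably look quite different, so this only mirrors the in-memory prototype.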

Read more: Hypergraphia Indulged