Wednesday, February 13, 2008

Automatic Node Discovery

One of the first problems we ran into on EC2 was having Erlang nodes find each other. Erlang ships with a node discovery strategy based upon a static hosts file. On EC2, the idea is to start and stop nodes frequently; and EC2 will typically hand out different hostnames each time (I guess that repeats are possible but unlikely). So a static hosts file won't work.

Fortunately Erlang is ready for the set of nodes to change constantly, it is just not spelled out how to find them. I've noticed this about Erlang: often an excellent design is coupled with a simplistic stub implementation. It's the "Tommy" school of software: even the most casual observer is supposed to be able to figure out the next step.

Initially we thought we'd use multicast UDP for discovery, which was extremely easy to do but then we discovered that EC2 does not support multicast. So then I overthought the problem and came up with a discovery protocol based upon using S3 as a blackboard. That worked but was overkill, since just parsing the output of ec2-describe-instances contains all the information needed, plus it allows EC2 security groups to define the Erlang node groups. This is convenient because the EC2 security group(s) can be determined dynamically from the instance metadata.

As an aside, the parsing of the output of ec2-describe-instances is done with a list comprehension and built-in pattern matching only (parse_host_list/2 and parse_host_list/4).

All three strategies are now available on google code. If you're on EC2, ec2nodefinder is the recommendation. Note, you can still in addition use a static hosts file (or any other node discovery strategy), e.g., if you have some non-EC2 hosts you'd like to connect as well.


dysinger said...

What about SimpleDB ?

Nick Hristov said...

Having a simple name server to which your nodes register is not good enough?

Paul Mineiro said...

@nick: do you mean an external dns server, or a custom registration server? you could do that, nodefinder is an interface as well as a few implementations. On EC2 Amazon keeps the information for you in the EC2 API so that seemed the natural place to get it, hence ec2-nodefinder.