Wednesday, March 12, 2008

mnesia and ec2

Mnesia is Erlang's built-in multi-master distributed database. It's one of the reasons why we chose Erlang for our startup. And while it is mostly a great fit for EC2, it needs some tweaks to work. In particular we wanted the following to be automatic:
  1. Discover nodes entering or leaving the system
  2. Have them join the Mnesia database
  3. Have them take responsibility for a portion of the data
We've already talked about and published solutions to the first problem (nodefinder) and the third problem (fragmentron). I have now published a solution to the second problem on google code (schemafinder). This part took a while to release, partly because we kept finding small problems with it, but mostly because we're very busy.

It's actually surprisingly complicated, given that at root the way to add a node to a running database is a single call to mnesia:change_config/2, as in mnesia:change_config (extra_db_nodes, NodeList). However, while adding nodes is easy, removing them is harder. Detecting nodes to remove is not that hard; we go with the (EC2-centric) strategy of having each node periodically update a record in a table to "stay alive" (the same record also gives us a way to mark clean shutdown); nodes that fail to check in are eventually considered dead. To handle multiple node failures, we patch mnesia to provide mnesia_schema:del_table_copies/2, the multiple-table analog of mnesia:del_table_copy/2. To guard against a node rejoining the global schema after having been removed, but without having removed its mnesia database directory (this is a no-no), we check with the other database nodes to see if they think we've been removed.
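To make the join and "stay alive" steps concrete, here is a minimal sketch. It is not the schemafinder code: the module name, the keepalive record, and the function names are all hypothetical, it uses erlang:system_time/1 (which is newer than the release discussed here), and it removes a node with the stock mnesia:del_table_copy/2 rather than the patched multi-table call.

```erlang
-module(keepalive_sketch).
-export([join/1, init_table/0, heartbeat/0, dead_nodes/3, sweep/1, remove/1]).

%% Hypothetical record: one row per node, refreshed periodically.
-record(keepalive, {node, last_seen}).

%% Ask the local Mnesia to connect to the given nodes.
%% Returns {ok, ConnectedNodes} or {error, Reason}.
join(NodeList) ->
    mnesia:change_config(extra_db_nodes, NodeList).

%% Create the (hypothetical) keepalive table with a RAM copy on this node.
init_table() ->
    mnesia:create_table(keepalive,
                        [{attributes, record_info(fields, keepalive)},
                         {ram_copies, [node()]}]).

%% Record that this node is alive right now (seconds since the epoch).
heartbeat() ->
    mnesia:dirty_write(#keepalive{node = node(),
                                  last_seen = erlang:system_time(second)}).

%% Pure helper: nodes whose last check-in is more than Timeout seconds old.
dead_nodes(Records, Now, Timeout) ->
    [N || #keepalive{node = N, last_seen = T} <- Records, Now - T > Timeout].

%% Read every keepalive row and report the nodes considered dead.
sweep(Timeout) ->
    Records = mnesia:dirty_match_object(#keepalive{_ = '_'}),
    dead_nodes(Records, erlang:system_time(second), Timeout).

%% Remove one dead node from the schema using the stock API.
%% Mnesia must already be stopped on Node for this to succeed.
remove(Node) ->
    mnesia:del_table_copy(schema, Node).
```

In a sketch like this, each node would call heartbeat/0 on a timer, and some node would periodically call sweep/1 and pass each result to remove/1. Replicating the keepalive table to newly joined nodes, and the clean-shutdown marker mentioned above, are left out.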

Even with all this complication, we have yet to address the problem of handling network partitioning automatically. We'll see if we have time to address that before it bites us in the butt.

Now available on google code.

2 comments:

Roberto Saccon said...

that's great stuff you are building! About the patching of mnesia: it seems to me that you are patching mnesia at its real source location. Have you considered just keeping the patched mnesia source files within schemafinder and making sure the ebin folder with the patched files is placed on the code path so that the patched beams get picked up instead of the original ones?

Roberto Saccon
http://rsaccon.com

Paul Mineiro said...

roberto,

the patch is applied at build time; the resulting complete .erl and .beam files are installed with the schemafinder package. the original mnesia files are not modified. this hopefully provides some modicum of adaptability to new erlang releases, although we have only used r11b-5 in production so far. it does mean you need consistency between your build environment and your production environment; if you're using an OS packaging system i'd recommend an exact dependency.

wrt code path, we do what you say; when the application is (hot) deployed to the running erlang instance, our hot code deployment system ensures that its code path is prepended, and then we do some tests to see if an unpatched mnesia module has been loaded (see delirium.erl), loading our modules if necessary.
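for readers who want a feel for that check, here is a rough sketch of the idea. it is not what delirium.erl actually does; the module and function names are hypothetical, and it assumes that the presence of the extra exported function is enough to tell a patched module from an unpatched one.

```erlang
-module(patched_check).
-export([is_patched/2, ensure_patched/3]).

%% True if Module (loaded on demand) exports Function/Arity.
is_patched(Module, {Function, Arity}) ->
    _ = code:ensure_loaded(Module),
    erlang:function_exported(Module, Function, Arity).

%% Prepend EbinDir to the code path, then reload Module from it
%% if the currently loaded version lacks the expected export.
ensure_patched(EbinDir, Module, FunArity) ->
    case code:add_patha(EbinDir) of
        true ->
            case is_patched(Module, FunArity) of
                true ->
                    ok;
                false ->
                    %% Drop any old code first (this kills processes still
                    %% running it), then load the first match on the path.
                    _ = code:purge(Module),
                    case code:load_file(Module) of
                        {module, Module} ->
                            case is_patched(Module, FunArity) of
                                true -> ok;
                                false -> {error, still_unpatched}
                            end;
                        {error, Reason} ->
                            {error, Reason}
                    end
            end;
        {error, bad_directory} ->
            {error, bad_directory}
    end.
```

a call would look like patched_check:ensure_patched ("/path/to/schemafinder/ebin", mnesia_schema, {del_table_copies, 2}), where the directory is a placeholder for wherever the package installs its beams.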

(we haven't released our hot code deployment framework yet; it basically allows you to integrate hot code deployment with an OS packaging system like apt-get; stay tuned.)