Saturday, August 22, 2009

Metaprogramming with ct_expand

Yesterday I posted about putting arbitrary Erlang terms into HTTP cookies. I suggested that constructing the base64 codec at compile time was an excellent application for ct_expand, but I didn't provide any details. Therefore, here is a follow-up.

ct_expand provides a simple interface to Erlang metaprogramming: it evaluates an arbitrary Erlang expression at compile time and substitutes the resulting term into the source code during compilation. This is especially useful for initializing a data structure at compile time, and in particular, it can be used to construct the forward and inverse maps for the base64 codec. We will specify our codec by providing a list of 64 unique characters, and ct_expand will do the rest. The following code is functionally identical to the termcookie module presented previously.

-module (termcookie2).
-compile ({ parse_transform, ct_expand }).
-export ([ decode/2,
           encode/2 ]).

-define (CODEC, "abcdefghijklmnopqrstuvwxyz"
                "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "0123456789.,").

%
% Public
%

decode (Encoded, Secret) when is_binary (Encoded) ->
  <<Signature:28/binary, Payload/binary>> = Encoded,
  Signature = to_base64 (crypto:sha ([ Payload, Secret ])),
  erlang:binary_to_term (from_base64 (Payload)).

encode (Term, Secret) ->
  Payload =
    to_base64
      (erlang:term_to_binary (Term,
                              [ compressed,
                                { minor_version, 1 } ])),
  Signature = to_base64 (crypto:sha ([ Payload, Secret ])),
  <<Signature/binary, Payload/binary>>.

%
% Private
%

to_base64 (Bin) when (8 * byte_size (Bin)) rem 6 =:= 0 ->
  to_base64_padded (Bin);
to_base64 (Bin) when (8 * byte_size (Bin)) rem 6 =:= 2 ->
  to_base64_padded (<<Bin/binary, 0:16>>);
to_base64 (Bin) when (8 * byte_size (Bin)) rem 6 =:= 4 ->
  to_base64_padded (<<Bin/binary, 0:8>>).

to_base64_padded (Bin) ->
  << <<(element (N + 1,
                 ct_expand:term (
                   begin
                     64 = length (?CODEC),
                     64 = length (lists:usort (?CODEC)),
                     list_to_tuple (?CODEC)
                   end
                 )
                )
       ):8>>
     || <<N:6>> <= Bin >>.

from_base64 (Bin) ->
  << <<(element
          (N + 1,
           ct_expand:term
             (element
                (2,
                 lists:foldl
                   (fun (X, { K, T }) ->
                      { K + 1, setelement (X + 1, T, K) }
                    end,
                    { 0, erlang:make_tuple (256, -1) },
                    ?CODEC)
                )
             )
          )
       ):6>>
     || <<N:8>> <= Bin >>.
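
Separately from the termcookie2 listing above, the mechanism can be seen in isolation in a much smaller module. This is only a sketch: the module name ct_expand_demo and the function squares/0 are made up for illustration, and it assumes the ct_expand parse transform (these days shipped, as far as I know, with Ulf Wiger's parse_trans library) is on the compiler's code path.

```erlang
-module(ct_expand_demo).
%% Assumption: the ct_expand parse transform is available at compile time.
-compile({parse_transform, ct_expand}).
-export([squares/0]).

%% The list comprehension is evaluated once, while the module is being
%% compiled; the compiled function body is just the resulting literal list.
squares() ->
    ct_expand:term([X * X || X <- lists:seq(1, 5)]).
```

Calling ct_expand_demo:squares() then returns the precomputed list without running the comprehension at run time.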

Some notes:
  • Lines 6-8 of the termcookie2 listing (the ?CODEC macro) contain the specification of the codec.
  • Lines 43-44 (the two 64 = length (...) matches) are compile-time assertions on the codec specification, namely that it consists of 64 unique characters. If you modify the specification to violate these assertions, the module will not compile, although the resulting error message will be nearly unintelligible (try it!).
  • Line 45 (list_to_tuple (?CODEC)) constructs the forward mapping from the specification; the result is a constant 64-element tuple, indexed by 6-bit value, which is consulted at run time.
  • Lines 56-65 (the lists:foldl call) construct the inverse mapping from the specification; the result is a constant 256-element tuple, indexed by character code and holding -1 for characters outside the codec, which is consulted at run time.
Note that you cannot call any functions from the current module inside the ct_expand:term/1 argument, because the current module has not been compiled yet! If you need to do something really complicated and don't like an unwieldy inline expression, you can place helper code in a separate module which is compiled first.
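
As a rough sketch of that separate-module approach, the two table constructions from termcookie2 could be moved into a helper like the one below. The module name codec_tables and the function names forward/1 and inverse/1 are hypothetical; the bodies simply restate the inline expressions from the listing.

```erlang
-module(codec_tables).
-export([forward/1, inverse/1]).

%% Forward table: a 64-element tuple mapping a 6-bit value N (0..63)
%% to the character stored at position N + 1.
%% The two matches fail unless Chars holds exactly 64 unique characters.
forward(Chars) ->
    64 = length(Chars),
    64 = length(lists:usort(Chars)),
    list_to_tuple(Chars).

%% Inverse table: a 256-element tuple mapping a character code C to its
%% 6-bit value, stored at position C + 1; characters that are not part
%% of the codec keep the default value -1.
inverse(Chars) ->
    {_, Table} =
        lists:foldl(fun(C, {K, T}) -> {K + 1, setelement(C + 1, T, K)} end,
                    {0, erlang:make_tuple(256, -1)},
                    Chars),
    Table.
```

Assuming this helper is compiled first and is on the code path when termcookie2 is compiled, the inline expressions should be reducible to ct_expand:term (codec_tables:forward (?CODEC)) and ct_expand:term (codec_tables:inverse (?CODEC)).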

The resulting software is easier to maintain than the original version, because if the codec needs to be changed, only the specification is modified. For instance, we can replace lines 5-8 with

-define (CHARS, "abcdefghijklmnopqrstuvwxyz"
                "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "0123456789.,").
-define (CODEC,
         ct_expand:term (
           fun () ->
             random:seed (1, 2, 3),
             [ X ||
               { _, X } <-
                 lists:sort (
                   [ { random:uniform (), Y }
                     || Y <- ?CHARS
                   ]
                 )
             ]
           end ()
         )).

which results in a proper codec utilizing a permutation of the original codec. Note that the call to random:seed/3 happens at compile time and sets the random seed of the compilation process, so the same permutation is produced on every build. Therefore this is a stable codec definition. (Unfortunately, it doesn't really increase the opacity of the encoding scheme; the bits in an encoded Erlang term are highly degenerate, so any interested party would be able to deduce the permutation given enough cookies.)
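
The seeded shuffle can also be tried at run time, outside of any macro. The sketch below is an assumption-laden stand-in rather than the code from the post: it uses the newer rand module (with the exsss algorithm) instead of random, and the names shuffle_demo and shuffle/1 are invented for the example. The technique is the same: tag each character with a random key, sort by key, then drop the keys.

```erlang
-module(shuffle_demo).
-export([shuffle/1]).

%% Assumption: rand with the exsss algorithm stands in for the older
%% random module; the permutation it yields differs from the one the
%% post's random:seed (1, 2, 3) would produce.
%% Seeding with a fixed value makes the sequence of random keys, and
%% therefore the resulting permutation, identical on every call.
%% Side effect: this resets the calling process's rand state.
shuffle(Chars) ->
    _ = rand:seed(exsss, {1, 2, 3}),
    Keyed = [{rand:uniform(), C} || C <- Chars],
    [C || {_, C} <- lists:sort(Keyed)].
```

Calling shuffle_demo:shuffle/1 twice with the same input gives the same ordering both times, which is the property that makes the compile-time version a stable codec definition.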
