Friday, August 21, 2009

Erlang Terms in Cookies

Erlang is an increasingly popular choice for web development. Such projects tend to lean heavily on HTTP cookies, and because Erlang defines an external format for any Erlang term, it is very easy to store an arbitrary term in a cookie (within cookie size limits), which can be a useful trick.

The technique outlined here generates a signed but not encrypted cookie. That means it's fairly simple for anyone possessing one of your cookies to determine its contents, but difficult for them to forge a novel cookie. Resistance to forgery is typically important for the application, but it matters even more here: we will be calling erlang:binary_to_term/1 on the cookie value, and passing arbitrary data to erlang:binary_to_term/1 is a bad idea (for example, it could cause an extremely large memory allocation).

Here's the code:

-module (termcookie).
-export ([ decode/2,
           encode/2 ]).

%
% Public
%

decode (Encoded, Secret) when is_binary (Encoded) ->
  <<Signature:28/binary, Payload/binary>> = Encoded,
  Signature = to_base64 (crypto:sha ([ Payload, Secret ])),
  erlang:binary_to_term (from_base64 (Payload)).

encode (Term, Secret) ->
  Payload =
    to_base64
      (erlang:term_to_binary (Term,
                              [ compressed,
                                { minor_version, 1 } ])),
  Signature = to_base64 (crypto:sha ([ Payload, Secret ])),
  <<Signature/binary, Payload/binary>>.

%
% Private
%

to_base64 (Bin) when (8 * byte_size (Bin)) rem 6 =:= 0 ->
  to_base64_padded (Bin);
to_base64 (Bin) when (8 * byte_size (Bin)) rem 6 =:= 2 ->
  to_base64_padded (<<Bin/binary, 0:16>>);
to_base64 (Bin) when (8 * byte_size (Bin)) rem 6 =:= 4 ->
  to_base64_padded (<<Bin/binary, 0:8>>).

to_base64_padded (Bin) ->
  << <<(to_base64_char (N)):8>> || <<N:6>> <= Bin >>.

to_base64_char (N) when N >= 0, N =< 25 -> $a + N;
to_base64_char (N) when N >= 26, N =< 51 -> $A + (N - 26);
to_base64_char (N) when N >= 52, N =< 61 -> $0 + (N - 52);
to_base64_char (62) -> $.;
to_base64_char (63) -> $,.

from_base64 (Bin) ->
  << <<(from_base64_char (N)):6>> || <<N:8>> <= Bin >>.

from_base64_char (N) when N >= $a, N =< $z -> N - $a;
from_base64_char (N) when N >= $A, N =< $Z -> 26 + (N - $A);
from_base64_char (N) when N >= $0, N =< $9 -> 52 + (N - $0);
from_base64_char ($.) -> 62;
from_base64_char ($,) -> 63.

We are taking advantage of the fact that erlang:binary_to_term/1 will ignore extra bytes at the end, which allows us to mindlessly pad for base 64 encoding.
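
The padding arithmetic can be checked in isolation. The following is a minimal sketch and not part of the module above; the module name pad_demo and both function names are hypothetical. It computes how many zero bits the to_base64/1 clauses append for a given binary, and round-trips a term through term_to_binary/1 with that padding attached.

```erlang
-module(pad_demo).
-export([pad_bits/1, roundtrip_with_padding/1]).

%% Number of zero bits to append so that the bit length of Bin
%% becomes a multiple of 6 while remaining a whole number of bytes.
%% 8 * N rem 6 can only be 0, 2 or 4, hence three cases.
pad_bits(Bin) when is_binary(Bin) ->
    case (8 * byte_size(Bin)) rem 6 of
        0 -> 0;
        2 -> 16;
        4 -> 8
    end.

%% Serialize Term, append the zero padding, and decode the padded
%% binary again. The trailing zero bytes are expected to be ignored.
roundtrip_with_padding(Term) ->
    Bin = term_to_binary(Term),
    Pad = pad_bits(Bin),
    Padded = <<Bin/binary, 0:Pad>>,
    binary_to_term(Padded).
```

For example, a 20-byte SHA-1 digest has 160 bits, 160 rem 6 is 4, so 8 zero bits are appended and the result encodes to 28 characters, which is where the 28 in decode/2 comes from.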

If you really like to squeeze the last few drops of efficiency out of code, you can change those to_base64_char/1 and from_base64_char/1 functions into tuple lookups. If you are extra cool you can use Ulf Wiger's ct_expand parse transform to construct the tuples at compile time from a specified character list.
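
As a rough illustration of the tuple-lookup idea, here is a sketch that assumes the same alphabet as the code above (a-z, A-Z, 0-9, '.', ','). The module name b64_lookup and its function names are hypothetical, and for brevity the tables are built at call time; the whole point of the ct_expand approach is to have them built once at compile time instead.

```erlang
-module(b64_lookup).
-export([to_char/1, from_char/1]).

%% The 64-character alphabet used by the post.
-define(ALPHABET,
        "abcdefghijklmnopqrstuvwxyz"
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "0123456789.,").

%% 6-bit value -> character, via a tuple indexed from 1.
to_char(N) when N >= 0, N =< 63 ->
    element(N + 1, encode_table()).

%% Character -> 6-bit value, via a 256-slot tuple. Slots for bytes
%% outside the alphabet hold the atom 'invalid', which makes the
%% case below fail with a case_clause error.
from_char(C) when C >= 0, C =< 255 ->
    case element(C + 1, decode_table()) of
        N when is_integer(N) -> N
    end.

%% NOTE: rebuilt on every call in this sketch; a real implementation
%% would make these tuples compile-time constants.
encode_table() ->
    list_to_tuple(?ALPHABET).

decode_table() ->
    Indexed = lists:zip(?ALPHABET, lists:seq(0, 63)),
    list_to_tuple([proplists:get_value(C, Indexed, invalid)
                   || C <- lists:seq(0, 255)]).
```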

This code will throw an exception if anything is amiss with the input, including a signature failure.
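
If you would rather get a tagged result than an exception, a small wrapper is enough. This is only a sketch; the module name cookie_guard, the function try_decode/1 and the atom invalid are hypothetical names chosen for illustration.

```erlang
-module(cookie_guard).
-export([try_decode/1]).

%% Run a zero-argument decoding fun and turn any error-class
%% exception (for example a badmatch on the signature, a
%% function_clause on an unexpected character, or a badarg from
%% binary_to_term/1) into the atom 'invalid'.
try_decode(DecodeFun) when is_function(DecodeFun, 0) ->
    try DecodeFun() of
        Term -> {ok, Term}
    catch
        error:_ -> invalid
    end.
```

With the termcookie module compiled, a call would look like cookie_guard:try_decode (fun () -> termcookie:decode (Cookie, Secret) end), where Cookie and Secret are whatever your application supplies.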

% erl
Erlang (BEAM) emulator version 5.6.5 [source] [async-threads:0] [kernel-poll:false]

Eshell V5.6.5 (abort with ^G)
1> crypto:start ().
ok
2> termcookie:encode ({ "omg", erlang, rulz }, "wazzup").
<<"nQCwmuMgeK3bTPzBqKDSmSylIciaG2GdAWadB21NzaagzxjSyw5NzaaeCNvSEGaa">>
3> termcookie:decode (termcookie:encode ({ "omg", erlang, rulz }, "wazzup"), "wazzup").
{"omg",erlang,rulz}
4> termcookie:decode (termcookie:encode ({ "omg", erlang, rulz }, "huh"), "wazzup").
** exception error: no match of right hand side value <<"nQCwmuMgeK3bTPzBqKDSmSylIcia">>
in function termcookie:decode/2
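
On a current Erlang/OTP release, the same sign-then-decode idea could be sketched with an HMAC and the safe option of erlang:binary_to_term/2. This is a different construction from the plain SHA-1 over payload and secret used above, not a drop-in replacement. It assumes OTP 22.1 or later for crypto:mac/4, takes the secret as a binary, and leaves out the cookie-safe text encoding, so the result is a raw binary. The module name termcookie_hmac is hypothetical.

```erlang
-module(termcookie_hmac).
-export([encode/2, decode/2]).

%% Prefix the serialized term with a 32-byte HMAC-SHA256 tag.
%% Assumption: Secret is a binary, OTP >= 22.1 (crypto:mac/4).
encode(Term, Secret) when is_binary(Secret) ->
    Payload = erlang:term_to_binary(Term),
    Mac = crypto:mac(hmac, sha256, Secret, Payload),
    <<Mac/binary, Payload/binary>>.

%% Recompute the tag and only deserialize when it matches.
%% As in the original code, the comparison is an ordinary pattern
%% match rather than a constant-time comparison.
decode(<<Mac:32/binary, Payload/binary>>, Secret) when is_binary(Secret) ->
    case crypto:mac(hmac, sha256, Secret, Payload) of
        Mac ->
            %% 'safe' refuses data that would create new atoms
            %% or external fun references.
            erlang:binary_to_term(Payload, [safe])
    end.
```

As before, a mismatched secret or a truncated cookie raises an exception instead of returning a term.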

6 comments:

  1. Oh, my god. I think it is much better to store JSON in cookies.

    http://github.com/maxlapshin/erlyvideo/blob/master/src/rtmp_session.erl

    Take a look: I encode the session to JSON, then Base64 it, then sign it with SHA1 and get reliable session storage.

  2. If you really plan to use this straight as a cookie value, beware that the '=' character is interpreted as a value assignment in some browsers.

    Base64 uses the '=' character for padding, and mochiweb refuses to set that cookie (error, quoting necessary). The solution is to use a modified base64, which uses other characters:

    ...
    to_base64_char (61) -> $_;
    to_base64_char (62) -> $-;
    to_base64_char (63) -> $+.

    ...
    from_base64_char ($_) -> 61;
    from_base64_char ($-) -> 62;
    from_base64_char ($+) -> 63.

    for more info, google for base64_url

  3. Hey argl ...

    I'm trying to understand your comment, since my solution does not use a '=' character in its codec. I use a-z, A-Z, 0-9, '.', and ','.

  4. Hi Max,

    It seems your code does not work with newer mochiweb, as it now rejects ','.

  5. Thanks for pointing that out gebi.

    To fix this, you could either change lines 41 and 50 or define your own codec via ct_expand as in the next post http://dukesoferl.blogspot.com/2009/08/metaprogramming-with-ctexpand.html

  6. Yea thx, i changed it to:
    to_base64_char (63) -> $+.
    from_base64_char ($+) -> 63.

    btw... just to have it in the comments for other users:

    -include_lib("eunit/include/eunit.hrl").

    % The binary patterns in this comment were lost when the page was
    % rendered; <<X:16>> and the B1 match are a best-guess reconstruction.
    base64_bijective_test() ->
      lists:foreach(fun(X) ->
                      B = <<X:16>>,
                      B64 = to_base64(B),
                      <<B1:2/binary, _/binary>> = from_base64(B64),
                      ?assertEqual(B, B1)
                    end,
                    lists:seq(1,64535)).

    base64_mochicookie_test() ->
      B64 = [ to_base64(<<X:16>>) || X <- lists:seq(0,4096) ],
      lists:foreach(fun(X) -> mochiweb_cookies:cookie("t", X) end, B64).
