Records
The problem with translating a JSON object into an Erlang record is that the record structure must be known before decoding. jason solves this problem by creating and loading, at runtime, dynamically named modules that declare and export a record type, along with functions to manage it.
The module and the record share the same name, an atom computed from a hash (phash/2) of the object's keys and value types.
{"akey": "a string"} and {"akey": "another string"} are considered the same object, while {"akey": "a string"} and {"akey": 1.2} are not, and result in two different modules.
Those modules are Jason's crew, so they are called argonauts hereafter.
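To make the naming rule concrete, here is a minimal sketch of how such a name could be derived. It is not jason's actual implementation: the module name argonaut_name_sketch, the helper type_of/1, the type categories and the use of erlang:phash2/1 as the hash are all assumptions made for illustration. The point it shows is that only keys and value types enter the hash, never the values themselves.

```erlang
-module(argonaut_name_sketch).
-export([name/1]).

%% Crude, illustrative classification of a decoded value.
%% These categories are an assumption, not jason's real type rules.
type_of(V) when is_integer(V) -> integer;
type_of(V) when is_float(V)   -> float;
type_of(V) when is_binary(V); is_list(V) -> string;
type_of(V) when is_atom(V)    -> literal;
type_of(_)                    -> other.

%% Build a signature from keys and value types only, hash it,
%% and turn the hash into an atom usable as module/record name.
%% erlang:phash2/1 is a stand-in for the hash used by the library.
name(Pairs) ->
    Signature = [{Key, type_of(Value)} || {Key, Value} <- Pairs],
    list_to_atom(integer_to_list(erlang:phash2(Signature))).
```

With this sketch, name([{akey, "a string"}]) and name([{akey, "another string"}]) return the same atom, because both reduce to the signature [{akey, string}], while a float value produces a different signature.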
Argonaut modules are compiled and loaded into the VM only once, the first time an object is matched while decoding a JSON structure. This can add a non-negligible time overhead to the first call, linear in the number of distinct objects found. Subsequent calls have no such overhead as long as the modules remain loaded. Restarting the VM removes all argonaut modules, unless they were dumped to disk in a directory in the include path.
Tip: Decoding a JSON sample as part of your process's initialization can be a reasonable way to create the argonaut modules that process needs. This is not mandatory if the overhead of the first decoding call is acceptable.
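The tip above could be applied along these lines. This is only a sketch: the module name argonaut_warmup, the function warm_up/1 and the sample path are hypothetical, while jason:decode_file/2 and the {mode, record} option are the ones used elsewhere on this page. The error branches assume that a failed decode returns something other than {ok, _}.

```erlang
-module(argonaut_warmup).
-export([warm_up/1]).

%% Decode a representative JSON sample once, so that the argonaut
%% modules it needs are compiled and loaded before real work starts.
%% SamplePath is a hypothetical sample file shipped with your application.
warm_up(SamplePath) ->
    case code:ensure_loaded(jason) of
        {module, jason} ->
            case jason:decode_file(SamplePath, [{mode, record}]) of
                {ok, _Record} -> ok;            % argonauts are now loaded
                Other         -> {error, Other}
            end;
        {error, Reason} ->
            %% jason itself is not available on the code path
            {error, {jason_not_loaded, Reason}}
    end.
```

A process would typically call warm_up/1 from its init callback and decide there whether a failed warm-up is fatal or can be ignored.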
Argonaut modules export:
- a module attribute called jason with argonaut as value (to distinguish them from other modules)
- the record type as opaque 'xxxx':'xxxx'(), where 'xxxx' is the record/module name
- fields/0, which returns the list of record fields
- size/0, which returns the size of the record
- def/0, which returns a string representation of the record's definition
- an arity 1 function per key, to get that key's value from a record
- an arity 2 function per key, to modify that key's value in a record
- new/0, which creates a new 'empty' record
1> '22207878':module_info().
[{module,'22207878'},
{exports,[{new,0},
{fields,0},
{size,0},
{def,0},
{glossary,1},
{glossary,2},
{module_info,0},
{module_info,1}]},
{attributes,[{vsn,[81124041750904415376220006066684788135]},
{jason,[argonaut]}]},
{compile,[{options,[]},{version,"7.0.3"}]},
{native,false},
{md5,<<61,7,236,16,44,66,129,140,127,117,138,248,85,109,
125,167>>}]
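The jason attribute visible in the module_info/0 output above can be used to tell argonaut modules apart from ordinary ones. The helper below is a minimal sketch of that idea; the module name argonaut_check and both function names are hypothetical and not part of jason.

```erlang
-module(argonaut_check).
-export([is_argonaut/1, loaded_argonauts/0]).

%% True when Module is loadable and carries the attribute {jason, [argonaut]}.
is_argonaut(Module) when is_atom(Module) ->
    case code:ensure_loaded(Module) of
        {module, Module} ->
            Attrs = Module:module_info(attributes),
            lists:member(argonaut, proplists:get_value(jason, Attrs, []));
        {error, _} ->
            false
    end.

%% List every currently loaded module that is flagged as an argonaut.
loaded_argonauts() ->
    [M || {M, _File} <- code:all_loaded(), is_argonaut(M)].
```

After decoding a document in record mode, loaded_argonauts/0 should list the modules created for it, assuming they carry the attribute as shown in the output above.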
Calling fields/0 and def/0 on '22207878' shows that the record has a single field, glossary, which must contain a '34707836' record.
2> '22207878':fields().
[glossary]
3> '22207878':def().
"-record('22207878', {glossary = '34707836':new() :: '34707836':'34707836'()})."
Because decoding is done bottom-up by a leex/yecc parser, nested objects are handled cleanly. A call to '22207878':new/0 returns a complete new nested record.
4> '22207878':new().
#'22207878'{
glossary =
#'34707836'{
title = [],
'GlossDiv' =
#'6257036'{
title = [],
'GlossList' =
#'131402670'{
'GlossEntry' =
#'49933946'{
'ID' = [],'SortAs' = [],'GlossTerm' = [],'Acronym' = [],
'Abbrev' = [],
'GlossDef' = #'111785629'{para = [],'GlossSeeAlso' = []},
'GlossSee' = []}}}}}
There is another advantage to using the arity 2 functions of an argonaut module: guards prevent invalid values from being assigned, whereas plain Erlang does nothing to stop invalid data from being assigned to a record field, even when the field is typed in the record definition.
5> B = A#'22207878'{glossary= "invalid value"}. % Hirk !
#'22207878'{glossary = "invalid value"}
6> C = '22207878':glossary(A, "invalid value"). % Raise exception on invalid value
** exception error: no function clause matching '22207878':glossary(#'22207878'{
glossary =
#'34707836'{ ... }},
"invalid value") (, line 1)
7> C = '22207878':glossary(A, #'34707836'{title="A valid glossary title"}).
#'22207878'{
glossary =
#'34707836'{
title = "A valid glossary title",
'GlossDiv' =
#'6257036'{
title = [],
'GlossList' =
#'131402670'{
'GlossEntry' =
#'49933946'{
'ID' = [],'SortAs' = [],'GlossTerm' = [],'Acronym' = [],
'Abbrev' = [],
'GlossDef' = #'111785629'{para = [],'GlossSeeAlso' = []},
'GlossSee' = []}}}}}
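The function_clause error above is what a guarded setter produces when its guard rejects the new value. The hand-written module below is a sketch of how such accessors might look; it is not the code jason generates, and the module name guarded_setter_sketch and the record names outer and inner are hypothetical stand-ins for the hash-named records.

```erlang
-module(guarded_setter_sketch).
-export([new/0, glossary/1, glossary/2]).

%% Hypothetical stand-ins for two nested, generated records.
-record(inner, {title = "" :: string()}).
-record(outer, {glossary = #inner{} :: #inner{}}).

%% Create a complete nested record with default values.
new() -> #outer{}.

%% Arity 1: read the field.
glossary(#outer{glossary = G}) -> G.

%% Arity 2: update the field. The guard accepts only an inner record,
%% so any other value fails with a function_clause error.
glossary(R = #outer{}, V) when is_record(V, inner) ->
    R#outer{glossary = V}.
```

Calling glossary/2 with a string instead of an inner record fails with a function_clause error, whereas the plain record update syntax would accept the string at runtime.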
Decoding JSON to Erlang records requires the option {mode, record}.
Using JSON-to-record decoding in the Erlang shell is made easier by the ability to create a header file containing all record definitions discovered in the JSON structure.
Use the option {to, "/path/to/a/file"}, where the target file MUST either not exist or be empty (its parent directory MUST exist).
Loading this header file with the shell helper function rr/1 lets you see the usual record structures.
1> {ok, A} = jason:decode_file("priv/ref1.json",[{mode, record}, {to, "/tmp/records.hrl"}, {return, tuple}]).
{ok,{'22207878',{'34707836',"example glossary",
{'6257036',"S",
{'131402670',{'49933946',"SGML","SGML",
"Standard Generalized Markup Language","SGML",
"ISO 8879:1986",
{'111785629',"A meta-markup language, used to create markup languages such as DocBook.",
["GML","XML"]},
"markup"}}}}}}
2> rr("/tmp/records.hrl").
['111785629','131402670','22207878','34707836','49933946',
'6257036']
3> A.
#'22207878'{
glossary =
#'34707836'{
title = "example glossary",
'GlossDiv' =
#'6257036'{
title = "S",
'GlossList' =
#'131402670'{
'GlossEntry' =
#'49933946'{
'ID' = "SGML",'SortAs' = "SGML",
'GlossTerm' = "Standard Generalized Markup Language",
'Acronym' = "SGML",'Abbrev' = "ISO 8879:1986",
'GlossDef' =
#'111785629'{
para =
"A meta-markup language, used to create markup languages such as DocBook.",
'GlossSeeAlso' = ["GML","XML"]},
'GlossSee' = "markup"}}}}}
Nested records can then be accessed in the shell without argonaut module functions, using the usual Erlang record syntax.
4> (A#'22207878'.glossary)#'34707836'.title .
"example glossary"
5> B = A#'22207878'{glossary = (A#'22207878'.glossary)#'34707836'{title = "another example" }}.
#'22207878'{
glossary =
#'34707836'{
title = "another example",
'GlossDiv' =
#'6257036'{
title = "S",
'GlossList' =
#'131402670'{
'GlossEntry' =
#'49933946'{
'ID' = "SGML",'SortAs' = "SGML",
'GlossTerm' = "Standard Generalized Markup Language",
'Acronym' = "SGML",'Abbrev' = "ISO 8879:1986",
'GlossDef' =
#'111785629'{
para =
"A meta-markup language, used to create markup languages such as DocBook.",
'GlossSeeAlso' = ["GML","XML"]},
'GlossSee' = "markup"}}}}}
Once a .hrl file is written to disk, you can easily use it for your own purposes:
- include it in your own modules' code
- modify it to set your own default values (the defaults are: string = "", integer = 0, float = 0.0 and literal = null)
- give records a more human-readable name instead of hash atoms. See example.