-
Notifications
You must be signed in to change notification settings - Fork 43
Nested query results
SQL queries take "flat" tables (i.e. multisets of records whose fields are values of "base" type) and produce "flat" tables as results. By default, Links supports SQL-like queries with flat inputs and flat outputs, for example:
for (d <-- departments)
for (e <-- employees)
where(d.name == e.dept)
[(dept = d.name, name = e.name)]
This query will return a table with one row for each employee, giving their name and their department name. There is a fair amount of redundancy: if there are 50 people in the "Department of Redundancy Department", then there will be 50 copies of the department name in the result.
Links now also supports nested query results using a technique called query shredding, described in a SIGMOD 2014 paper and tech report referenced below. When this feature is enabled, Links's query
keyword will evaluate queries whose results involve arbitrary combinations of collection (list/multiset), record, and base types. For example:
for (d <-- departments)
[(dept = d.name, emps = for (e <-- employees)
where(d.name == e.dept)
[e.name])]
Notice that the above query has return type [(dept:String,emps:[String])]
, i.e. it is not a flat table result but has some nesting of collections in the emps
field. By default Links would reject this query because the result is not flat. With shredding enabled, Links executes this query; it is translated to two SQL queries yielding flat results and then the results are "stitched" together into a nested Links value. Moreover, with shredding enabled Links can handle queries with arbitrary nesting in the query result, subject to some caveats below.
To use nested query results, the following configuration options should be added to the config
file:
relax_query_type_constraint=on
shredding=on
If the first option is not enabled, Links's typechecker will complain if you try to write a query with nested results. Probably it would be a good idea to combine these options, and perhaps also to rename the jargon-y shredding
option to something like nested_query_results
.
Shredding is currently only known to be supported by PostgreSQL. It should work with other databases that support SQL:1999 but this hasn't been tested. Trying to use shredding with a database other than PostgreSQL will result in undefined behavior (probably just a runtime exception). It would be good to catch this statically or, failing that, make sure that the runtime exception is not too mysterious.
Links supports some simple forms of aggregation (e.g. SQL's count
), limiting (e.g. SQL's limit N
) and sorting (e.g. order by
). The interactions between nested query results and these features are not well understood, so for programs that rely on these features, shredding may not be used. This is a possible area for future work.
It might be a good idea to allow the programmer to choose whether to use shredding or not, e.g. with different variants of the query
keyword, rather than using the shredding
option to globally change the behavior.