Skip to content

Nested query results

James Cheney edited this page Jul 10, 2017 · 7 revisions

What does it do?

SQL queries take "flat" tables (i.e. multisets of records whose fields are values of "base" type) and produce "flat" tables as results. By default, Links supports SQL-like queries with flat inputs and flat outputs, for example:

for (d <-- departments)
for (e <-- employees)
where(d.name == e.dept)
[(dept = d.name, name = e.name)]

This query will return a table with one row for each employee, giving their name and their department name. There is a fair amount of redundancy: if there are 50 people in the "Department of Redundancy Department", then there will be 50 copies of the department name in the result.

Links now also supports nested query results using a technique called query shredding, described in a SIGMOD 2014 paper and tech report referenced below. When this feature is enabled, Links's query keyword will evaluate queries whose results involve arbitrary combinations of collection (list/multiset), record, and base types. For example:

for (d <-- departments)
[(dept = d.name, emps = for (e <-- employees)
                        where(d.name == e.dept)
                        [e.name])]

Notice that the above query has return type [(dept:String,emps:[String])], i.e. it is not a flat table result but has some nesting of collections in the emps field. By default Links would reject this query because the result is not flat. With shredding enabled, Links executes this query; it is translated to two SQL queries yielding flat results and then the results are "stitched" together into a nested Links value. Moreover, with shredding enabled Links can handle queries with arbitrary nesting in the query result, subject to some caveats below.

Configuration

To use nested query results, the following configuration options should be added to the config file:

relax_query_type_constraint=on
shredding=on

If the first option is not enabled, Links's typechecker will complain if you try to write a query with nested results. Probably it would be a good idea to combine these options, and perhaps also to rename the jargon-y shredding option to something like nested_query_results.

Limitations / Future Work

Shredding is currently only known to be supported by PostgreSQL. It should work with other databases that support SQL:1999 but this hasn't been tested. Trying to use shredding with a database other than PostgreSQL will result in undefined behavior (probably just a runtime exception). It would be good to catch this statically or, failing that, make sure that the runtime exception is not too mysterious.

Links supports some simple forms of aggregation (e.g. SQL's count), limiting (e.g. SQL's limit N) and sorting (e.g. order by). The interactions between nested query results and these features are not well understood, so for programs that rely on these features, shredding may not be used. This is a possible area for future work.

It might be a good idea to allow the programmer to choose whether to use shredding or not, e.g. with different variants of the query keyword, rather than using the shredding option to globally change the behavior.

References