Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can we have a api to separate the complex multi-table join query to sub-queries? #48

Open
lenahi opened this issue Nov 16, 2021 · 4 comments

Comments

@lenahi
Copy link

lenahi commented Nov 16, 2021

For a simple join query blow:
q = "SELECT Orders.OrderID, Customers.CustomerName FROM Orders INNER JOIN Customers ON Orders.CustomerID = Customers.CustomerID;"

we hope it can separate something similar:
"Orders": "SELECT OrderID, CustomerID FROM Orders",
"Customers": "SELECT CustomerName, CustomerID FROM Customers"

@klahnakoski
Copy link
Owner

@RemilYoucef has made a pull request that may do some of the work you are looking for: #26.

The mo-sql-parsing library has no concept of the database schema, and does not understand what the dot-delimited variable names mean; Splitting joins into queries is probably hard work.

@klahnakoski
Copy link
Owner

I think about this more often than is useful: By converting (multi-table) joins to subqueries, we can translate queries to document stores and hierarchical databases.

@DoGoodToday
Copy link

DoGoodToday commented May 20, 2022

A similar feature would be separating subqueries into common table expressions, making them more readable.

Ex. select a,b
from (select 'stuff' a, id from table1) t
join (select "more stuff" b, id from table2) v on t.id=v.id

converts to:

with t as (
select 'stuff' a, id from table1
{

,v as (
select "more stuff" b, id from table2
)

select a,b from t join v on t.id=v.id

Presumably, one path would be converting the JSON subquery output to a JSON "with" output, then "formatting" to sql.

Any suggestions on where I would start to implement this?

@klahnakoski
Copy link
Owner

klahnakoski commented May 21, 2022

@DoGoodToday

Any suggestions on where I would start to implement this?

Your suggestion goes along the lines of my own thinking. I have a project that transpiles SQL into other query languages, it covers many simple use cases, but it is mostly incomplete. What remains is the feature you propose (well, the reverse mapping from subquery-form to join-form).

  • I have the sense that all joins have an implicit group by: In your example the group by is the id, which is unique, but There are joins on non-unique columns too.
  • I was thinking of turning joins into subqueries (or lateral views, or window functions) , which allows parameterization; quite similar to your suggestion.
  • It is best if you write a bijection (from join-form to subquery-form AND from subquery-form to join-form). This bijection will help validating your algorithm, and double the usefulness of the tests.

So, how should you proceed? There are two options:

  1. Make a few test cases, and start programming - You can deal with the easy cases first, and increment on your design. This will get you started fast, and may cover all your real word cases; a general solution may never be required. The downside is you will be rewriting parts/all of your code as you handle ever-more-complex cases. There is a good chance you will code yourself into a corner and must do a full rewrite, forcing you to the second option ...
  2. Find the underlying model/symbolism/datastructures that makes implementing the transformation "easy" - Someone has probably done this work before; Find it. If it does not exist then you must discover it: Figure out how to encode the SQL expression so that transformation between join-form and subquery-form is trivial. The downsides are, a) you have tI he wrong model, so you end up doing the first option. b) There is no model, or it can not be found, leaving you only the first option.

I am willing to collaborate with you on this project if you are willing to attempt the heavy lifting: It is best if this project has it's own repo. Make one, and tell me where it is, I can setup the test suite, and add test cases. I can also provide feedback in reviews, or be around to discuss ideas.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants