Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Functions to extract all tables, columns used by a query #226

Open
niroshaimos opened this issue Jan 16, 2024 · 1 comment

Comments

@niroshaimos
Copy link

A function that could get a query\json repr of a query and output all tables/columns used by a query would be useful for cases in which you want to check if the user of an attempted query has sufficient permissions before attempting execution of the query.
Thanks!

@a1ea321
Copy link

a1ea321 commented Aug 21, 2024

I'd also find this useful, for another purpose. But I don't think the functionality should be in this library. This parser has enough on its plate as it is. It gets confused too often when I throw queries at it from the real world. After all, it supports multiple DBs.

Anyway, I need it so much I started writing it myself. And I've seen it is not possible to really do it without a DB connection. Consider these two queries:

  • select t1.x, t2.y from t1 join t2 on ...
  • select x, y from t1 join t2 on ...

These are equivalent, as long as there is no x in t2 and no y in t1. Therefore, the function @niroshaimos wanted must have the query as one input parameter, and a database connection as the other parameter. Alternatively, the output could claim the table to be ? in such cases and the programmer might have to do some post-processing.

Not a problem for what I am writing. All the input queries will be written by me, so I can be explicit (prepend the table names to the column names).

But, aliases that are the same and/or nested are what made me gave up. In the PostgreSQL manual there was a good example illustrating what I mean. But I couldn't find it. I hope you get the idea with my following example:

-- Preparation:
create view t1 as (select 42 a); 
create view t2 as (select 43 b);
-- This is the query to illustrate my point:
select t1.b, t2.b
from (select b from t2) t1
join t2 on 1=1;

I'd consider this bad style, but it is valid SQL. Having to handle this sort of mess is enough to keep me away from trying to write a function that works for every valid query.

I have a piece of code that lists tables but because of what I mentioned above I gave up trying to extract the column names as well. You may use my progress if you find it useful. Here. It just outputs tables, not columns.

I gave up but what I need to do still needs to get done. For PostgreSQL queries I am going to use pglast and use visitors to write something myself - if the library does not have already. For MS SQL... Not sure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants