Replies: 3 comments
-
Thanks for starting this thread. I managed to finish writing down my thoughts on this. Supporting views is not a trivial change, but it's also not that different from supporting tables: Tables, databases and schemas are currently the only objects that are synced from MotherDuck. This means that if people create views in MD they cannot query them in Postgres. To query them from postgres we'll need to create Postgres objects that we can query in the same way. There are three viable object types that we could use:
All of these would work the same with regards to querying them, but there are other things to consider as well. Tl;dr conclusionWe should sync view definitions from MotherDuck to Postgres as a Postgres VIEW. The resulting view will look as if it was created using a Postgres command that looks like this: CREATE VIEW myview AS
SELECT * FROM duckdb.view($$ FROM 's3://some.parquet' $$) AS (column1 int, column2 text); Being able to create MotherDuck views from within Postgres would be a follow up task, with less priority. Dropping MotherDuck views should be fairly simple to implement though, and thus should probably be part of the initial syncing support. Inspecting viewsCatalogs in Postgres are also inspected, either using the tables in information_schema, psql it’s meta commands like \d and \dv, or using some visual client application like DBeaver or pgAdmin. The MotherDuck views will show up in these places as the type of object that we choose. So creating them as views would feel most natural to Postgres users when inspecting them. There are two slight problems with VIEWs though. VIEW definitions contain the query string, and these are also shown when inspecting views. But this query string needs to be understandable by postgres. So we need to store some other string than the actual query string. But it seems nice to still embed the actual query string though, so people can see it. SELECT duckdb.view($$ FROM 's3://some.parquet' $$); Differentiation from Postgres objectsAnother thing that users want to do when inspecting their catalog. Is to differentiate between MotherDuck and Postgres views. With tables they can do so by looking at the table's table access method. If that’s “heap” it’s a Postgres table, if that's “duckdb” it’s an MD table. That’s not the case for duckdb views, because they don’t have a table access method. Using the DDL on MotherDuck viewsViews can not only be read from, but they can also be created/altered/dropped. Actually implementing this might not be necessary for the first level of support, but it’s something to consider. Dropping and altering MotherDuck viewsDropping views is easy with all approaches. But for users having it be a view is most natural, because then DROP VIEW works. Otherwise people need to use DROP TABLE or DROP FOREIGN TABLE. The same is true for altering views. Right now RENAME is the only supported ALTER command by DuckDB. All proposed objects. Creating MotherDuck views from PostgresCreating views is the most difficult operation. An approach that would work for all of our objects would be to use a procedure that would build the DuckDB CREATE VIEW command for us. CALL duckdb.create_view('myview', $$ FROM 's3://some.parquet' $$) That seems quite straightforward, but it gets more complicated once we want to support some other CREATE VIEW options, because we need to encode all these things in Postgres arguments. Take for instance the following CREATE VIEW command, which uses OR REPLACE and specifies the column names: CREATE OR REPLACE VIEW v1(a) AS SELECT 42; Another approach is to use Postgres' dedicated DDL to create the object in question. If we use views that could look something like this: CREATE OR REPLACE VIEW myview(a) WITH (using = duckdb) AS SELECT 42; Note that this uses This doesn’t work though if the view query is not valid postgres syntax. For those we’d somehow need to encode the actual query in a valid Postgres query. We could then detect this special query pattern instead of or in addition to detecting CREATE VIEW myview AS SELECT duckdb.view($$ FROM 's3://some.parquet' $$); With tables CREATE TABLE something like this would work: CREATE TABLE myview() USING duckdb WITH (view = $$ FROM 's3://some.parquet' $$); And with CREATE FOREIGN TABLE we’d need syntax like this: CREATE FOREIGN TABLE myview() SERVER duckdb
OPTIONS (view = $$ FROM 's3://some.parquet' $$); Both foreign CREATE TABLE and CREATE FOREIGN TABLE would only be able to support OR REPLACE using another WITH option though. ConclusionBased on the options above it seems clear that using a VIEW is the most natural option. The only problem is that we need to wrap the query in a duckdb.view call when storing it in Postgres. |
Beta Was this translation helpful? Give feedback.
-
Basic support for read-only syncing and querying of MD views (i.e. mirroring them with Postgres views) would already be a significant win! |
Beta Was this translation helpful? Give feedback.
-
There is a not-so-good way to query the contents of duckdb. SELECT duckdb.raw_query($$ SELECT * from main.duckdb_tables $$);
|
Beta Was this translation helpful? Give feedback.
-
I brought this up through another channel as well, but raising here for public visibility: is there any technical reason why querying DuckDB views through
pg_duckdb
is not yet supported? This would be really key for us.Beta Was this translation helpful? Give feedback.
All reactions