Insights into adapter, connector, dialect, driver, and integration workbenches #186
amotl
started this conversation in
Show and tell
Introduction
To learn how creating adapters and dialect implementations for CrateDB works, this article uses real-world examples and enumerates typical genesis patches that unlock adapters for CrateDB. The one and only valid answer to the question of how it works, how long it takes, and how complex it is must always be: It Depends.
The good news is that the excellent decision to implement the PostgreSQL wire protocol 💯 keeps the effort relatively low, both in complexity and in the amount of code that needs to be written. In this spirit, it isn't rocket science at all, because adapters can build upon the large amount of community work already done for PostgreSQL. Adapters for CrateDB mostly only need to "reasonably subtract" unsupported features from the PostgreSQL adapters, and add adjustments for CrateDB's special features and data types in other spots. Even the LangChain adapter implementation, which uses modern PostgreSQL extensions like pgvector, could largely be retrofitted for CrateDB with ease, in this case thanks to the magic of SQLAlchemy.
Many dialect adapters can derive from the vanilla PostgreSQL adapters within the corresponding libraries and frameworks in one way or another. It is therefore mostly a matter of "disabling" features CrateDB does not provide, or working around them: the adapter inherits from the PostgreSQL implementation and overrides only the particular spots where CrateDB acts differently. Other PostgreSQL-compatible databases like Materialize, Redshift, and RisingWave apply the same object inheritance pattern within such database adapter libraries.
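The inheritance pattern described above can be sketched in plain Python. This is a toy model, not the API of any real adapter library: the class names `PostgresDialect` and `CrateDialect`, the `supports_foreign_keys` flag, and the `create_table_sql` method are all hypothetical and exist only for illustration. It assumes two facts about CrateDB: it has no foreign key constraints, and its `CREATE TABLE` statement accepts a `CLUSTERED INTO n SHARDS` clause.

```python
class PostgresDialect:
    """Hypothetical stand-in for a library's stock PostgreSQL adapter."""

    name = "postgresql"
    supports_foreign_keys = True

    def create_table_sql(self, table, columns, foreign_keys=()):
        """Render CREATE TABLE from {column: type} and (column, reference) pairs."""
        parts = [f"{column} {sqltype}" for column, sqltype in columns.items()]
        if self.supports_foreign_keys:
            parts += [
                f"FOREIGN KEY ({column}) REFERENCES {reference}"
                for column, reference in foreign_keys
            ]
        return f"CREATE TABLE {table} ({', '.join(parts)})"


class CrateDialect(PostgresDialect):
    """Hypothetical CrateDB adapter: inherit everything, override the differences."""

    name = "crate"

    # "Reasonably subtract": switching the flag off makes the inherited
    # renderer skip foreign key clauses without any further code.
    supports_foreign_keys = False

    def create_table_sql(self, table, columns, foreign_keys=(), shards=None):
        """Reuse the inherited renderer, then append a CrateDB-specific clause."""
        sql = super().create_table_sql(table, columns, foreign_keys)
        if shards is not None:
            sql += f" CLUSTERED INTO {int(shards)} SHARDS"
        return sql


if __name__ == "__main__":
    columns = {"id": "BIGINT", "owner_id": "BIGINT"}
    foreign_keys = [("owner_id", "owners(id)")]
    print(PostgresDialect().create_table_sql("pets", columns, foreign_keys))
    print(CrateDialect().create_table_sql("pets", columns, foreign_keys, shards=4))
```

The subclass contains only the two deviations, a disabled feature and an added DDL option, while everything else is inherited. Real adapters follow the same shape, but against the extension points their host library actually provides.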
Details
Items are sorted alphabetically. Some patches have already landed, while others are still in progress. Some are large, some are small. It Depends.
Apache Flink
Apache Spark
Apache Superset
DataGrip
Django ORM
dbt
Estuary
JDBC
LangChain
Meltano
{singerfile,github}-to-cratedb
cratedb-examples#190
Metabase
MLflow
pandas/Dask
Records
Tableau
SQLAlchemy
SQLAlchemy is certainly a major pillar, and encapsulates many details of a database abstraction layer in Python land, so that others don't have to. In this spirit, it needs elevated attention, because the frameworks sitting on top of it can only operate successfully if it does everything right. For example, type mapping support, forward and reverse, specifically for CrateDB's special data types, is important.
Many of those polyfills and support extensions were needed for, and came out of, adjustments to support MLflow, LangChain, pandas/Dask, and others. I've omitted the chore and maintenance commits, so what follows is effectively just the gist of what happened on the SQLAlchemy dialect adapter, in terms of high-level dialect support details, after it finally gained some velocity to make it work well with real-world applications.
The most recent patches, which unlock or improve interoperability with real-world third-party applications and frameworks:
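To make "forward and reverse" type mapping concrete, here is a minimal sketch in plain Python. It is not the mapping used by the SQLAlchemy dialect: the table `PYTHON_TO_CRATEDB` and the helpers `column_type_for` and `python_type_for` are hypothetical names, and the selection of types is deliberately small. The forward direction picks a column type when writing a value, and the reverse direction is what reflection needs when reading a schema back.

```python
import datetime

# Forward mapping: Python type -> CrateDB column type name.
# Deliberately small and illustrative; OBJECT is the CrateDB-specific entry.
PYTHON_TO_CRATEDB = {
    bool: "BOOLEAN",
    int: "BIGINT",
    float: "DOUBLE PRECISION",
    str: "TEXT",
    datetime.datetime: "TIMESTAMP WITH TIME ZONE",
    dict: "OBJECT",
}

# Reverse mapping, derived from the forward one so both stay in sync.
CRATEDB_TO_PYTHON = {name: pytype for pytype, name in PYTHON_TO_CRATEDB.items()}


def column_type_for(value):
    """Forward: choose a column type name for a Python value.

    Uses the exact type, because bool is a subclass of int and an
    isinstance() check would map True to BIGINT.
    """
    return PYTHON_TO_CRATEDB[type(value)]


def python_type_for(type_name):
    """Reverse: map a reflected column type name back to a Python type."""
    return CRATEDB_TO_PYTHON[type_name.upper()]


if __name__ == "__main__":
    print(column_type_for({"sensor": "a1", "value": 42.0}))
    print(python_type_for("object"))
```

Deriving the reverse table from the forward one keeps the two directions consistent, which is the property frameworks on top of the dialect rely on.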
FLOAT_VECTOR data type and KNN_MATCH function (sqlalchemy-cratedb#9)
get_table_names() reflection method (sqlalchemy-cratedb#10)
CrateIdentifierPreparer for properly quoting reserved words (sqlalchemy-cratedb#21)
DateTime fields (sqlalchemy-cratedb#22)
datetime.date values on DateTime fields (sqlalchemy-cratedb#25)
get_pk_constraint to return list instead of set type (sqlalchemy-cratedb#26)
table_kwargs context manager to make pandas/Dask support CrateDB's special SQL DDL options (sqlalchemy-cratedb#139)
support.util.refresh_table (sqlalchemy-cratedb#140)
CrateDDLCompiler (sqlalchemy-cratedb#141)
quote_relation_name support utility function (sqlalchemy-cratedb#155)
error_trace connect_args option, by using crate-1.0.0dev1 (sqlalchemy-cratedb#161)
do_execute... dialect methods to store their response (sqlalchemy-cratedb#162)
Other enhancements, some of them significant, haven't made it into mainline yet.
Patches that need dear support.
ObjectArray.as_generic (sqlalchemy-cratedb#23)
JSON(B) types using CrateDB's OBJECT (sqlalchemy-cratedb#27)
asyncpg and psycopg3 drivers (sqlalchemy-cratedb#11)
SQLFrame/SQLGlot