Integrate SQLAlchemy for db conn management and introduce new SqlStorage abstraction (#93)

* Update expected VRS IDs for VCF tests
* Update VRS IDs for variation tests
* Add new SqlStorage implementation as an abstract base class for all RDBMS storage implementations. The base class uses SQLAlchemy for connection management and SQL statement execution because it is the only connection pooling library that works with the Snowflake connector. It includes the background db write capabilities from the Snowflake implementation and the statement execution where standard SQL suffices; abstract methods are defined for queries where the SQL or database APIs are not standard (see the base class sketch after this list)
* Switch to the snowflake-sqlalchemy package
* Update the Postgres storage implementation to be a subclass of the new SqlStorage base class. Primarily removed code that is now in the base class and reorganized the remainder into the base class API shape. Because the Snowflake connector only supports SQLAlchemy 1.4, which in turn only supports psycopg2, the batch insert logic had to be modified to use a different API
* Update the Snowflake storage implementation to be a subclass of the new SqlStorage base class. Removed code that is now in the base class and reorganized the remainder into the base class API shape
* Update unit tests to cover the use of background writes in the Postgres storage implementation. Refactored mocks for SQLAlchemy-based testing into a separate module
* Rename test file and replace unused variable names with underscores
* Add a storage option to always fully flush on batch context exit (sketched below)
* When storage construction does not complete, the batch_thread and conn_pool attributes are sometimes not created, leading to spurious errors on close(). Check for these attributes before attempting to clean them up (see the close() sketch below)
* Depending on the underlying database, the returned column value can be a string or a dict (see the normalization sketch below)
* Add batch add mode settings to control what type of SQL statement to use when adding new VRS objects to the database
* Update variation test data to match VRS 2.0 changes
* Comment out the response model to return full VRS objects instead of the serialized version
* Make get location/variation behave consistently even when the object store does not throw a KeyError on a missing key
* Remove code added to make debugging easier
* Update queries to use the specified table name
* Fix bug in detecting column value type on fetch
* Batch add mode only makes sense for Snowflake, because in Postgres the vrs_objects table has a primary key and uses "ON CONFLICT" on inserts
* Switch to using question mark bind variables for Snowflake because named parameters were not working (see the qmark sketch below). Pick up the table name from the environment in unit tests
* Update the batch insert to play nicely with Snowflake quirks
* Update the example URL to be SQLAlchemy friendly
* Use super() to invoke __init__()
* Add support for Snowflake private key auth (sketched below)
* Add monkey patch workaround for a bug in Snowflake SQLAlchemy
* Update collation in the temp loading table
* Storage implementations should be consistent with the MutableMapping API and throw KeyError when an item is not found (see the __getitem__ sketch below)
* Correct the path used for the missing allele id test
* Get location and get variation should be consistent in behavior when an id is not found
* Revert unnecessary change
* Throw KeyError when an id is not found
* Add missing argument to _get_connect_args
* Code formatting
* Suppress SQL injection warning as elsewhere
* Code formatting
* Add missing SQL injection warning suppressions
* Update README to reflect changes
* Address "Incomplete URL substring sanitization" warning (see the URL parsing sketch below)
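A minimal sketch of the SqlStorage shape described above. Only SqlStorage, batch_thread, and conn_pool are named in the commit; pending_writes, _insert_one, and the table name are assumptions for illustration:

```python
import abc
import queue
import threading

from sqlalchemy import create_engine


class SqlStorage(abc.ABC):
    """Abstract base for RDBMS-backed VRS object stores.

    SQLAlchemy supplies the connection pool (the only pooling approach
    that works with the Snowflake connector); subclasses override only
    the statements whose SQL or database APIs are not standard.
    """

    def __init__(self, db_url: str, table_name: str = "vrs_objects"):
        self.table_name = table_name
        self.conn_pool = create_engine(db_url)  # pooled connections
        self.pending_writes: queue.Queue = queue.Queue()
        # Background writer keeps inserts off the request path.
        self.batch_thread = threading.Thread(
            target=self._run_background_writes, daemon=True
        )
        self.batch_thread.start()

    def _run_background_writes(self) -> None:
        # Drain queued (id, object) pairs and insert them one at a time.
        while True:
            vrs_id, vrs_object = self.pending_writes.get()
            with self.conn_pool.begin() as conn:
                self._insert_one(conn, vrs_id, vrs_object)
            self.pending_writes.task_done()

    @abc.abstractmethod
    def _insert_one(self, conn, vrs_id: str, vrs_object: dict) -> None:
        """Insert one object; the statement is database-specific
        (e.g. ON CONFLICT in Postgres, PARSE_JSON in Snowflake)."""
```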
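The flush-on-batch-exit option could look like the following; the batch_manager shape and the option name flush_on_batchctx_exit are assumptions, not the repository's actual API:

```python
from contextlib import contextmanager


@contextmanager
def batch_manager(storage):
    # On exit, optionally block until the background writer has drained
    # every queued insert, so callers know the batch is fully persisted.
    storage.batch_mode = True
    try:
        yield storage
    finally:
        storage.batch_mode = False
        if getattr(storage, "flush_on_batchctx_exit", False):
            # Unblocks once task_done() has been called for every item.
            storage.pending_writes.join()
```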
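The defensive close() described above, as a sketch against the base class internals assumed in the earlier example:

```python
def close(self) -> None:
    # __init__ can fail before batch_thread or conn_pool is assigned, so
    # guard each cleanup step instead of assuming the attributes exist.
    if hasattr(self, "batch_thread"):
        self.pending_writes.join()  # let queued background writes finish
    if hasattr(self, "conn_pool"):
        self.conn_pool.dispose()  # return all pooled connections
```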
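Normalizing the fetched column value, with a hypothetical helper name: Postgres JSONB columns typically come back already deserialized as dicts, while Snowflake VARIANT columns arrive as JSON strings:

```python
import json


def _deserialize_object(value):
    # Some drivers return the stored JSON as a str, others as a dict;
    # normalize so callers always receive a dict.
    return json.loads(value) if isinstance(value, str) else value
```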
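A sketch of the Snowflake insert using question-mark (qmark) bind variables; the table and column names are illustrative, and exec_driver_sql is one SQLAlchemy 1.4 way to pass a driver-native paramstyle through:

```python
import json


def _insert_one(self, conn, vrs_id: str, vrs_object: dict) -> None:
    # qmark binds, since named parameters were not working through the
    # connector; INSERT ... SELECT PARSE_JSON(?) accommodates Snowflake's
    # restriction on binding directly into VARIANT columns.
    conn.exec_driver_sql(
        f"INSERT INTO {self.table_name} (vrs_id, vrs_object) "  # nosec
        "SELECT ?, PARSE_JSON(?)",
        (vrs_id, json.dumps(vrs_object)),
    )
```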
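Key-pair auth following the documented snowflake-sqlalchemy pattern; the helper name matches the _get_connect_args mentioned above, but its signature, the key path, and the example URL are assumptions:

```python
from cryptography.hazmat.primitives import serialization
from sqlalchemy import create_engine


def _get_connect_args(private_key_path: str) -> dict:
    # The Snowflake connector accepts the private key as DER bytes via
    # connect_args, bypassing password auth entirely.
    with open(private_key_path, "rb") as f:
        p_key = serialization.load_pem_private_key(f.read(), password=None)
    return {
        "private_key": p_key.private_bytes(
            encoding=serialization.Encoding.DER,
            format=serialization.PrivateFormat.PKCS8,
            encryption_algorithm=serialization.NoEncryption(),
        )
    }


# SQLAlchemy-friendly URL form (no password component with key auth)
engine = create_engine(
    "snowflake://user@account/db/schema",
    connect_args=_get_connect_args("/path/to/rsa_key.p8"),
)
```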
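The MutableMapping-consistent lookup, assuming the vrs_id/vrs_object column names and the conn_pool engine attribute from the earlier sketch:

```python
import json

from sqlalchemy import text


def __getitem__(self, vrs_id: str):
    with self.conn_pool.connect() as conn:
        row = conn.execute(
            text(f"SELECT vrs_object FROM {self.table_name} "  # nosec
                 "WHERE vrs_id = :id"),
            {"id": vrs_id},
        ).fetchone()
    if row is None:
        # MutableMapping contract: missing keys raise KeyError, giving
        # get_location and get_variation uniform behavior across backends.
        raise KeyError(vrs_id)
    value = row[0]
    return json.loads(value) if isinstance(value, str) else value
```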
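The "Incomplete URL substring sanitization" warning (a CodeQL check) flags substring tests like `"snowflake" in db_url`; parsing the URL and comparing the scheme is the usual fix. The predicate below is illustrative, not the repository's exact check:

```python
from urllib.parse import urlparse


def _is_snowflake(db_url: str) -> bool:
    # Compare the parsed scheme rather than searching the whole URL, so
    # e.g. a password containing "snowflake" cannot satisfy the check.
    return urlparse(db_url).scheme.split("+")[0] == "snowflake"
```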