DM-46747 refactor pqserver per SQR-072 #65
Conversation
```python
# Check SQLAlchemy version
required_version = (2, 0, 0)
```
Maybe we could put this in our requirements.txt or something?
I've removed the version check from the code, but I think setting up requirements.txt or similar calls for a new ticket. The needs of consdb are fairly complicated given that there are several products coming from this repository.
If you wanted to add requirements on such a major refactoring, that seems fine - I don't see that needing its own ticket.
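A declarative alternative (a sketch only; the exact pin and its location would be decided on the follow-up work discussed above) would be to state the SQLAlchemy requirement in `requirements.txt` rather than checking it at runtime:

```text
# requirements.txt (hypothetical location and pin)
SQLAlchemy>=2.0,<3
```

With the requirement declared, pip refuses to install an incompatible version up front, and the runtime check becomes unnecessary.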
Co-authored-by: Valerie <[email protected]>
This organization of dependencies (external, internal, models), with pqserver.py reduced to being the 'main app', seems to fit SQR-072 and matches my familiarity with FastAPI.
We also talked about handling globals and 'requirements.txt' sometime soon.
```python
md.reflect(engine)
self.table_names.update([str(table) for table in md.tables])
self.schemas = md
self.obs_id_column = dict()
```
Redundant (along with the next line).
```python
self.get_db = get_db

self.table_names = set()
self.schemas = dict()
```
Isn't this a potential type problem if you're replacing it with a `MetaData`?
I see the attraction of initializing these all up front, but sometimes initializing to the final, fixed contents can be more readable.
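One way to follow that suggestion, sketched below with a hypothetical helper name (`build_schema_state` is not in the PR), is to construct the final contents in one step so the attribute types never change from `dict` to `MetaData`:

```python
import sqlalchemy

def build_schema_state(engine):
    # Reflect the database once and derive everything from the result,
    # instead of initializing empty placeholders that are later replaced
    # with values of a different type.
    md = sqlalchemy.MetaData()
    md.reflect(engine)
    table_names = {str(table) for table in md.tables}
    return md, table_names
```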
```python
col_name = col_name.value
if col_name in md.tables[table].columns:
    self.obs_id_column[table] = col_name
    break
```
I think it would be good to comment here that this breaks ties based on the ordering in the enum if more than one such column is in a table. (I do think that ordering is correct.)
```python
        columns = self.timestamp_columns[table]
        return columns

    def get_schema_version(self) -> Version:
```
I'm not sure how this might be used, but if we need it, we should pass it through from the source of truth rather than inferring it. For a later ticket.
```python
def get_day_obs_and_seq_num(self, exposure_id: int) -> tuple[int, int]:
    exposure_table_name = f"cdb_{self.instrument}.exposure"
    exposure_table = self.schemas.tables[exposure_table_name]
    query = sqlalchemy.select(exposure_table.c.day_obs, exposure_table.c.seq_num).where(
```
If this ever becomes a bottleneck, we can replace it with per-instrument formulae.
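A per-instrument formula might look like the sketch below. This rests on an unverified assumption: that the integer exposure ID packs the two values as `day_obs * 100000 + seq_num`, which holds for some instruments' ID schemes but must be confirmed per instrument before replacing the query:

```python
def day_obs_and_seq_num(exposure_id: int) -> tuple[int, int]:
    # ASSUMPTION: exposure_id == day_obs * 100000 + seq_num.
    # Only valid for instruments whose IDs are packed this way.
    return exposure_id // 100000, exposure_id % 100000
```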
```python
schema_table_name = table_name + "_schema"
if table_name in md.tables and schema_table_name in md.tables:
    schema_table = md.tables[schema_table_name]
    stmt = sqlalchemy.select(schema_table.c["key", "dtype", "doc", "unit", "ucd"])
```
I guess it's too difficult to share this code with the `refresh_` method below.
```python
    self.postgres_url = url
    return url

return "ERROR DATABASE CONNECTION NOT SPECIFIED"
```
Should this raise a `ValueError`?
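A minimal sketch of that suggestion (function name and environment variable are illustrative, not the PR's actual code): raising makes the failure impossible to mistake for a real URL, whereas a returned error string could be passed along to the engine.

```python
import os

def get_postgres_url() -> str:
    # Hypothetical sketch: fail loudly instead of returning a sentinel
    # string that callers might treat as a valid connection URL.
    url = os.environ.get("POSTGRES_URL")
    if url is None:
        raise ValueError("database connection not specified")
    return url
```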
```python
logger = logging.getLogger()

# Check whether the instrument name is valid
if _instrument_list is None:
```
Call `get_instrument_list()`?
```python
inspector = inspect(engine)
instrument_list = [name[4:] for name in inspector.get_schema_names() if name.startswith("cdb_")]

if instrument not in [i.lower() for i in instrument_list]:
```
Call `validate_instrument_name()`?
```python
    return data


class UnknownInstrumentException(Exception):
```
Actually, it seems like it could inherit from `BadValueException` with a particular `kind`.
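That hierarchy could look like the following sketch. The constructor signature of `BadValueException` is assumed here, not taken from the PR:

```python
class BadValueException(Exception):
    # Hypothetical base class: records what kind of value was bad
    # and the offending value itself.
    def __init__(self, kind: str, value: object):
        super().__init__(f"Unknown {kind}: {value}")
        self.kind = kind
        self.value = value

class UnknownInstrumentException(BadValueException):
    # Reuses the base class with a fixed kind, per the review suggestion.
    def __init__(self, instrument: str):
        super().__init__("instrument", instrument)
```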
This divides the functionality in pqserver into separate units. It also switches from using a `Connection` object to `Session` objects so that SQLAlchemy is managing a connection pool.

Note that the error behavior is not preserved by this change. My impression is that no one is relying on the existing error behavior, so it is to our advantage to move to the error formats that are provided by FastAPI.