Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DM-46747 refactor pqserver per SQR-072 #65

Merged
merged 23 commits into from
Jan 27, 2025
Merged

DM-46747 refactor pqserver per SQR-072 #65

merged 23 commits into from
Jan 27, 2025

Conversation

bbrondel
Copy link
Collaborator

@bbrondel bbrondel commented Jan 7, 2025

This divides the functionality in pqserver into separate units. It also switches from using a Connection object to Session objects so that sqlalchemy is managing a connection pool.

Note that the error behavior is not preserved by this change. My impression is that no one is relying on the existing error behavior, so it is to our advantage to move to error formats that are provided by FastAPI.

@bbrondel bbrondel requested a review from Vebop January 7, 2025 19:03
python/lsst/consdb/cdb_schema.py Outdated Show resolved Hide resolved
python/lsst/consdb/cdb_schema.py Show resolved Hide resolved
python/lsst/consdb/dependencies.py Outdated Show resolved Hide resolved
python/lsst/consdb/dependencies.py Outdated Show resolved Hide resolved
python/lsst/consdb/dependencies.py Outdated Show resolved Hide resolved
python/lsst/consdb/dependencies.py Show resolved Hide resolved
)

# Check SQLAlchemy version
required_version = (2, 0, 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could put this in our requirements.txt or something?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the version check from the code, but I think setting up requirements.txt or similar calls for a new ticket. The needs of consdb are fairly complicated given that there are several products coming from this repository.

Copy link
Contributor

@JeremyMcCormick JeremyMcCormick Jan 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you wanted to add requirements on such a major refactoring, that seems fine - I don't see that needing its own ticket.

python/lsst/consdb/handlers/external.py Outdated Show resolved Hide resolved
python/lsst/consdb/models.py Outdated Show resolved Hide resolved
python/lsst/consdb/models.py Outdated Show resolved Hide resolved
@bbrondel bbrondel requested a review from ktlim January 15, 2025 20:30
@bbrondel bbrondel marked this pull request as ready for review January 15, 2025 20:30
Copy link
Collaborator

@Vebop Vebop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This organization of dependencies, external, internal, models, and the pqserver.py being reduced to being the 'main app' seems to fit sqre-072 and my familiarity with fastapi.
We also talked about globals and 'requirements.txt' to handle sometime soon.

md.reflect(engine)
self.table_names.update([str(table) for table in md.tables])
self.schemas = md
self.obs_id_column = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant (along with the next line).

self.get_db = get_db

self.table_names = set()
self.schemas = dict()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a potential type problem if you're replacing it with a MetaData?

I see the attraction of initializing these all up front, but sometimes initializing to the final, fixed contents can be more readable.

col_name = col_name.value
if col_name in md.tables[table].columns:
self.obs_id_column[table] = col_name
break
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to comment here that this breaks ties based on the ordering in the enum if more than one such column is in a table. (I do think that that ordering is correct.)

columns = self.timestamp_columns[table]
return columns

def get_schema_version(self) -> Version:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this might be used, but if we need it, we should pass it through from the source of truth rather than inferring it. For a later ticket.

def get_day_obs_and_seq_num(self, exposure_id: int) -> tuple[int, int]:
exposure_table_name = f"cdb_{self.instrument}.exposure"
exposure_table = self.schemas.tables[exposure_table_name]
query = sqlalchemy.select(exposure_table.c.day_obs, exposure_table.c.seq_num).where(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this ever becomes a bottleneck, we can replace it with per-instrument formulae.

schema_table_name = table_name + "_schema"
if table_name in md.tables and schema_table_name in md.tables:
schema_table = md.tables[schema_table_name]
stmt = sqlalchemy.select(schema_table.c["key", "dtype", "doc", "unit", "ucd"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's too difficult to share this code with the refresh_ method below.

self.postgres_url = url
return url

return "ERROR DATABASE CONNECTION NOT SPECIFIED"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this raise a ValueError?

logger = logging.getLogger()

# Check whether the instrument name is valid
if _instrument_list is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call get_instrument_list()?

inspector = inspect(engine)
instrument_list = [name[4:] for name in inspector.get_schema_names() if name.startswith("cdb_")]

if instrument not in [i.lower() for i in instrument_list]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call validate_instrument_name()?

return data


class UnknownInstrumentException(Exception):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually seems like it could inherit from BadValueException with a particular kind.

@bbrondel bbrondel merged commit 323ad56 into main Jan 27, 2025
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants