-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python module validation of observation_id field - too stringent? #12
Comments
see https://www.opencadc.org/caom2/#ObservationURI and https://www.opencadc.org/caom2/#PlaneURI Those two types specify the structure of those URIs, but since they do so in reference to other fields they only loosely imply that (i) those field values must be valid to use in URIs and (ii) additional path separators (/) are not allowed. This is admittedly a poorly documented part of the model. Since URIs are used internally and also services are expected to be able to provide ivoa publisher ID values based on these same strings, the restriction on chars that cannot be used in URIs (eg space) is pretty much impossible to remove. |
Since ObservationURI contains {collection} and {observationID} (one /) and PlaneURI contains that plus {productID}, the URIs can currently be validated based on number of path components to make sure s/w is using the right one in the right place... we could think about allowing additional separators (/) but that would weaken the validation of those two types of URI. |
@alaity47 Can you provide some concrete use case/example of observationID and/or productID values? Where they come from, who decides what those strings are, etc. |
The format/content of observationid for us varies by mission. We only have a couple of collections where a space or a slash can be in there, and it's because we're either using part or all of a filename path, or some kind of target/object name. Some examples:
I completely understand enforcing these restrictions on URIs, but the way we've implemented things here there's no direct relationship between observation_id and any URIs. To me the documentation reads as if that field can be any string (https://www.opencadc.org/caom2/2.3/#Observation.observationID) |
At MAST, the observation_id is typically the "fileroot" of the primary science file. For example, if the file is ipppssoot_drz.fits, the observation_id would be ipppssoot. I know we use URI parsing logic for the artifact URI and would not be surprised if we also have some for the observation URI. Many times that parsing logic is very simple and just splits by "/" so having additional ones would likely break things on our end. |
closing in favour of #3 |
Several of our collections at IRSA have observationid values that include spaces or slashes, and we're running into the following error trying to export them with the python caom2 library:
invalid observation_id: may not contain space ( ), slash (/), escape (\), or percent (%)
I reached out to @pdowler, who said this is to ensure that observation_id does not contain any characters aren't URI-safe, and suggested I open an issue here for discussion.
We don't use that field for URI population and haven't been able to find any documentation indicating that the observation_id string has any restrictions.
Is the caom2 module possibly being too stringent in how it validates this column?
(FYI we use CAOM 2.3, and the python module caom2 2.3.8.4)
The text was updated successfully, but these errors were encountered: