Chore/schema constraints #81
base: develop
cc @dmiller15 as well |
Just some things:
So you're in good company w/ the I'll also double check on the value ranges for the percentage fields, but I'd suspect that yeah, we should be capping them. |
@dmiller15 looks like you're right about the I grabbed one of the percent fields, In general, how have you guys been finding these value ranges? Maybe I'm missing something here, but it seems like the CDE pages on this stuff are rather lacking... |
@dmiller15 and @allisonheath I've added upper bound constraints for the aforementioned percentage fields, as well as converted |
Hi there. @allisonheath sent me this conversation in case I am able to contribute. I looked at the restrictions that have been added and have a couple of comments. If you're interested in building more validation for additional elements, I'm happy to share some suggestions. Looks like @dmiller15 is also working on these – let me know if I can help in any way.
@tlicht3 sounds great, suggestion away. |
@dmiller15 @allisonheath any thoughts on modifying tests to match? |
Discussed w/ @dmiller15 offline and I think we're going to go ahead with modifying the tests to match these changes + suggestions from @tlicht3. |
This is a slew of small changes to the schema, primarily around imposing type and range constraints. I've kept the commits broken down so we can cherry pick certain ones into separate prs if desired. But mostly just want to get the conversation going.
Most of these add a `minimum` to `number` types, with notable exceptions being `age_at_diagnosis` and `days_to_birth` which have `maximum`s set. The CDE pages on `age_at_diagnosis` and `days_to_birth` specify that the former is in a range `[0,150]` and that the latter is explicitly negative. For everything else, it seems like the definition of the value range is more vague, but does not explicitly state negative values.

`year` fields were changed to be `integer` type, following the CDE description of a year value being a four-character field w/ 0 decimal places.

The other two major changes are in `md5sum` where I've added a string regex to match the `MD5` format, and restricted the `file_size` to be an `integer` type with minimum value of 1. This was discussed w/ @allisonheath and, unless there's a compelling reason not to, this should restrict us from accepting 0-byte files.

@NCI-GDC/ucdevs thoughts?
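
For anyone skimming, here is a rough sketch of how the constraints described above behave, written as JSON-Schema-style keywords in a plain Python dict with a tiny hand-rolled checker. It is not the actual schema from this PR: the dict layout, the `check` helper, the `year_of_birth` field name, the exact MD5 regex, and the choice of `maximum: 0` for `days_to_birth` are all assumptions made for illustration.

```python
import re

# Hypothetical property definitions mirroring the constraints in the description.
# Only age_at_diagnosis, days_to_birth, md5sum and file_size are named in the PR;
# year_of_birth stands in for "year fields", and the regex is an assumed MD5 pattern.
PROPERTIES = {
    "age_at_diagnosis": {"type": "number", "minimum": 0, "maximum": 150},
    "days_to_birth": {"type": "number", "maximum": 0},  # assumed bound for "negative"
    "year_of_birth": {"type": "integer"},
    "md5sum": {"type": "string", "pattern": r"^[a-f0-9]{32}$"},
    "file_size": {"type": "integer", "minimum": 1},  # rejects 0-byte files
}

# Python types accepted for each schema type keyword.
PY_TYPES = {"integer": int, "number": (int, float), "string": str}


def check(name, value):
    """Return a list of constraint violations for one field (empty if valid)."""
    rules = PROPERTIES[name]
    expected = rules["type"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, PY_TYPES[expected]):
        return [f"{name}: expected {expected}, got {type(value).__name__}"]

    errors = []
    if "minimum" in rules and value < rules["minimum"]:
        errors.append(f"{name}: {value} is below minimum {rules['minimum']}")
    if "maximum" in rules and value > rules["maximum"]:
        errors.append(f"{name}: {value} is above maximum {rules['maximum']}")
    if "pattern" in rules and re.search(rules["pattern"], value) is None:
        errors.append(f"{name}: value does not match {rules['pattern']}")
    return errors


if __name__ == "__main__":
    for field, value in [("file_size", 0), ("age_at_diagnosis", 42), ("days_to_birth", 30)]:
        print(field, value, check(field, value) or "ok")
```

With these assumed rules, a `file_size` of 0 and a positive `days_to_birth` are rejected, while an `age_at_diagnosis` inside `[0,150]` passes. The real schema would express the same keywords in the dictionary's own format and be enforced by the project's validator rather than a helper like this.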