Clean up incoming ID3C data #32
Comments
I would really prefer to keep the entire UUID in the strain name. The whole reason for using the UUIDs in the first place is that they are universally unique; a property that we lose if we truncate them. If we don't use the UUID, then we've lost all its benefits and shouldn't have used them from the start. I would also caution against using opaque acronyms like
instead?
@tsibley --- I'm afraid I don't agree. We should aim to be as consistent as possible with how the entire flu field treats strain names. It will be super weird if there are canonical names like The strain name itself is meant to be unique, but short enough to be usable. Even
(Field order is important too; extra slashes are non-standard and would break parsing.) I might even say to just name this as
Ok! It seems like I don't understand how these names are used in practice, if that's considered unwieldy. (From my naive, outside perspective, it doesn't seem unwieldy to me.) Are these names regularly spoken, as opposed to copied/programmatically processed?
Yes. Regularly spoken aloud and used to point people around a tree or around a titer table. If you'd like to keep the UUID, we can provide it as a "sample ID" in the flat-file data download, paired with the strain name.
I think it would be smart to keep the full UUID linked one way or another. It is an identifier equivalent in utility to the GenBank accession.
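One way to keep the full UUID linked to the shorter strain name is a two-column flat file. The following is only a sketch of that idea: the column names `strain` and `sample_id` and the helper `write_strain_table` are hypothetical, not taken from augur-build or ID3C.

```python
import csv
import io


def write_strain_table(rows, handle):
    """Write strain name / full sample UUID pairs as tab-separated text.

    The column names "strain" and "sample_id" are assumptions for
    illustration, not the actual download format.
    """
    writer = csv.DictWriter(
        handle,
        fieldnames=["strain", "sample_id"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)


# Example pairing taken from the issue description.
example_rows = [
    {
        "strain": "A/Washington/43eef879/2019",
        "sample_id": "fe1a1206-21ef-45ff-8be0-9d7643eef879",
    },
]

buffer = io.StringIO()
write_strain_table(example_rows, buffer)
table_text = buffer.getvalue()
```

With a file like this, the short name stays usable in trees and titer tables while the full identifier remains recoverable by lookup.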
One additional request here: just using I've added this as request number 4 above.
Yet one more request. Can we restrict rows in
I've added this as request number 5 above.
@trvrb do you still only want the new
In a [GitHub issue](seattleflu/augur-build#32), Trevor requested that encountered date no longer be formatted as a timestamp but rather a date in YYYY-MM-DD format for the `shipping.metadata_for_augur_build_v*` views.
In a [GitHub issue](seattleflu/augur-build#32), Trevor requested that encountered date no longer be formatted as a timestamp but rather a date in YYYY-MM-DD format for the `shipping.metadata_for_augur_build_v2` view.
In seattleflu/augur-build#32, Trevor requested that encountered date no longer be formatted as a timestamp but rather a date in YYYY-MM-DD format for the `shipping.metadata_for_augur_build_v2` view.
In seattleflu/augur-build#32, Trevor requested that we include `age_range_coarse` as a column in the view.
In seattleflu/augur-build#32, Trevor requested that encountered date no longer be formatted as a timestamp but rather a date in YYYY-MM-DD format for the `shipping.metadata_for_augur_build_v2` view. Co-authored-by: Thomas Sibley <[email protected]>
This is now fixed on master.
This column is now present on master.
@joverlee521 ---

There are a small handful of upstream fixes we need to make to the shipping views.

1. The `date` field in `v2/shipping/augur-build-metadata` was formatted as `2019-09-25T19:37:35.483+00:00`. This should just read `2019-09-25`. I've fixed this on the augur side here: https://github.com/seattleflu/augur-build/blob/master/scripts/download_sfs_metadata.py#L25 for the time being.
2. […] `B/Washington/2/2019`. This means that sample UUID `fe1a1206-21ef-45ff-8be0-9d7643eef879` would be strain `A/Washington/43eef879/2019`, i.e. taking `A` or `B` depending on flu A or flu B and taking the year from the date.
3. Use `neighborhood` (within Seattle proper) / `puma` (outside Seattle proper) for `location`. I believe that @kairstenfay may have started on this already in ID3C.
4. Include `age_range_coarse` as a field in the shipping view.
5. Restrict rows in `shipping.augur-build-metadata` to only those samples that have sequencing data.

Edited to update format for strain name in item 2 and to include items 4 and 5.
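The date and strain-name requests above can be sketched in Python as follows. This is a minimal illustration, not the code in `download_sfs_metadata.py` or in the shipping views: the function names are hypothetical, and the rule "use the last 8 hex characters of the UUID" is inferred from the single example in this issue.

```python
import uuid
from datetime import datetime


def encountered_date(timestamp: str) -> str:
    """Reduce an ISO 8601 timestamp to a plain YYYY-MM-DD date string."""
    return datetime.fromisoformat(timestamp).date().isoformat()


def strain_name(sample_uuid: str, flu_type: str, date: str,
                location: str = "Washington") -> str:
    """Build a strain name such as A/Washington/43eef879/2019.

    Assumes the short ID is the last 8 hex characters of the sample UUID,
    which matches the example in this issue but is not otherwise confirmed.
    """
    if flu_type not in ("A", "B"):
        raise ValueError(f"flu_type must be 'A' or 'B', got {flu_type!r}")
    short_id = uuid.UUID(sample_uuid).hex[-8:]  # also validates the UUID
    year = date[:4]  # date is expected in YYYY-MM-DD form
    return f"{flu_type}/{location}/{short_id}/{year}"


date = encountered_date("2019-09-25T19:37:35.483+00:00")
name = strain_name("fe1a1206-21ef-45ff-8be0-9d7643eef879", "A", date)
```

Here `date` is `2019-09-25` and `name` is `A/Washington/43eef879/2019`, matching the values given in the issue.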