HCMI sample file is missing important metadata and does not cover the entire repository #186

Closed
sgosline opened this issue May 21, 2024 · 0 comments

sgosline commented May 21, 2024

We need to update how we capture HCMI metadata so that all samples are fully tracked. Currently there is no tie back to the case_ids, which are the primary identifiers, nor is there any tracking of cancer diagnosis, tumor vs. normal, or primary vs. metastatic. This can be fixed by updating the data capture step to follow an explicit schema. With the updated approach we can capture all the data in the current HCMI sample schema:

We should map the following CoderData fields to the HCMI fields. As with the broad_sanger data, there will be multiple rows for a single sample ID.
cancer_type: this should be the 'Clinical tumor diagnosis'
common_name: this should be the 'sample_submitter_id'
other_id: there should be MULTIPLE of these per sample, which can be duplicated:

  1. case_submitter_id (other_id_source should be 'case_submitter_id')
  2. if available: diagnosis_id (other_id_source should be 'diagnosis_id') (there can still be multiple aliquots per dx)
  3. if available: treatment_id (other_id_source should be 'treatment_id')
  4. sample_uuid (other_id_source should be 'sample_id'). <----this should be 1:1 with improve_sample_id
  5. case_uuid (other_id_source should be 'case_id')
  6. other_name: add 'tissue_type' here
  7. other_name: add 'tumor_descriptor' here
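
The mapping above could be sketched roughly as follows. This is a minimal illustration, not the actual CoderData build code: the function name `hcmi_to_coderdata_rows`, the flat input dictionary layout, and joining the two `other_name` values with a comma are all assumptions made for the example. It emits one row per available identifier, so a single sample produces multiple rows that share the same `improve_sample_id`.

```python
from typing import Any, Dict, Iterator, List, Tuple

# (HCMI field, other_id_source label) pairs taken from the list above.
ID_FIELDS: List[Tuple[str, str]] = [
    ("case_submitter_id", "case_submitter_id"),
    ("diagnosis_id", "diagnosis_id"),   # only if available
    ("treatment_id", "treatment_id"),   # only if available
    ("sample_uuid", "sample_id"),       # expected to be 1:1 with improve_sample_id
    ("case_uuid", "case_id"),
]


def hcmi_to_coderdata_rows(
    record: Dict[str, Any], improve_sample_id: int
) -> Iterator[Dict[str, Any]]:
    """Yield one CoderData-style row per identifier present in an HCMI record.

    `record` is a hypothetical flat dict of HCMI fields for one sample.
    """
    # Assumption: tissue_type and tumor_descriptor are stored together
    # in other_name as a comma-separated string.
    other_name = ",".join(
        str(record[key])
        for key in ("tissue_type", "tumor_descriptor")
        if record.get(key)
    )
    for field, source in ID_FIELDS:
        value = record.get(field)
        if not value:
            continue  # skip identifiers that are not available
        yield {
            "improve_sample_id": improve_sample_id,
            "cancer_type": record.get("Clinical tumor diagnosis"),
            "common_name": record.get("sample_submitter_id"),
            "other_id": value,
            "other_id_source": source,
            "other_name": other_name,
        }


# Placeholder record; every value here is invented for illustration.
example = {
    "Clinical tumor diagnosis": "example diagnosis",
    "sample_submitter_id": "EXAMPLE-SAMPLE-01",
    "case_submitter_id": "EXAMPLE-CASE",
    "sample_uuid": "sample-uuid-placeholder",
    "case_uuid": "case-uuid-placeholder",
    "tissue_type": "Tumor",
    "tumor_descriptor": "Primary",
}
for row in hcmi_to_coderdata_rows(example, improve_sample_id=1):
    print(row["other_id_source"], row["other_id"])
```

Emitting one row per identifier keeps `other_id` and `other_id_source` paired, which matches the multiple-rows-per-sample layout described for the broad_sanger data.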

If fixed, this should address the need for #185.

@sgosline sgosline self-assigned this May 21, 2024
@sgosline sgosline added the bug Something isn't working label May 21, 2024
@sgosline sgosline moved this to Ready in CoderData May 21, 2024
@sgosline sgosline moved this from Ready to In progress in CoderData May 21, 2024
@sgosline sgosline moved this from In progress to Done in CoderData May 22, 2024