We need to update the way we capture HCMI metadata so that all samples are fully tracked. Currently there is no tie back to the case_ids, which are the primary identifiers, nor is there any tracking of cancer diagnosis, tumor vs. normal, or primary vs. met. This can be rectified by updating the data capture algorithm using a well-thought-out schema. Using this updated approach we can capture all the data in the current HCMI sample schema:
We should map the following CoderData fields to the HCMI fields. Like the broad_sanger data, there will be multiple rows for a single sampleID.
cancer_type: this should be the 'Clinical tumor diagnosis'
common_name: this should be the 'sample_submitter_id'
other_id: there should be MULTIPLE of these per sample, which can be duplicated:
case_submitter_id (other_id_source should be 'case_submitter_id')
if available: diagnosis_id (other_id_source should be 'diagnosis_id') (there can still be multiple aliquots per dx)
if available: treatment_id (other_id_source should be 'treatment_id')
sample_uuid (other_id_source should be 'sample_id'). <----this should be 1:1 with improve_sample_id
(other_id_source should be 'case_id')
other_name: add 'tissue_type' here
other_name: add 'tumor_descriptor' here
If fixed, this should address the need for #185.
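The mapping described in this issue could be sketched roughly as follows. This is only an illustration, not the existing capture code: the function name `hcmi_to_sample_rows`, the keys of the input record, and the choice to join the two other_name values with a semicolon are all assumptions. The 'case_id' source is left out because the field it should be read from is not named above.

```python
# Hypothetical sketch of the proposed mapping; not the current CoderData code.
# Each pair is (assumed key in the HCMI record, value to store in other_id_source).
ID_SOURCES = [
    ("case_submitter_id", "case_submitter_id"),
    ("diagnosis_id", "diagnosis_id"),   # only if available
    ("treatment_id", "treatment_id"),   # only if available
    ("sample_uuid", "sample_id"),       # expected to be 1:1 with improve_sample_id
]


def hcmi_to_sample_rows(record, improve_sample_id):
    """Return one row per available identifier for a single HCMI sample."""
    # Assumption: tissue_type and tumor_descriptor share the other_name field,
    # joined with "; ". The issue does not say how the two values are stored.
    other_name = "; ".join(
        str(record[key])
        for key in ("tissue_type", "tumor_descriptor")
        if record.get(key)
    )
    rows = []
    for key, source in ID_SOURCES:
        value = record.get(key)
        if not value:
            continue  # skip identifiers that are missing for this sample
        rows.append(
            {
                "improve_sample_id": improve_sample_id,
                "cancer_type": record.get("Clinical tumor diagnosis"),
                "common_name": record.get("sample_submitter_id"),
                "other_id": value,
                "other_id_source": source,
                "other_name": other_name,
            }
        )
    return rows
```

A sample with no treatment_id would produce three rows that share the same improve_sample_id, cancer_type, and common_name and differ only in other_id and other_id_source, which matches the "multiple rows for a single sampleID" layout used for the broad_sanger data.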