How to match BIDS IDs and CanProCo IDs #88
Comments
We are working on a more thorough solution, but in the meantime the participants.tsv file in Montreal's BIDS dataset can be used to link scans to the original CanProCo IDs. For example, 'sub-tor092' corresponds to 'CAN-01-CON-092' and 'ses-M0' corresponds to M0. Sequence names in Montreal's BIDS dataset can be remapped back to the original names (that @leelisae is using) like so: T2w -> 3D-T2W-S
To shed some light on this situation, @leelisae actually has direct access to UBC's internal server and has been pulling new scans regularly, rather than using the packaged BIDS-structured datasets (like what Montreal has received). This is because she was previously a UBC student and still has a UBC login and VPN access. @leelisae's data is structured differently because she is using the source data before it has been passed through our script that restructures the dataset to BIDS. Hope this helps! |
Thank you for your input, @zachvav. What I still don't understand, though, is that during a conversation with @leelisae this week, she mentioned that patients are also organized by phenotype, and that the ID number could be the same (and what would distinguish them would be the phenotype). E.g.: |
I believe @leelisae might have been talking about the ID numbers being shared between sites e.g. |
@leelisae would you be able to confirm? Thank you |
@zachvav is correct. I apologize, @jcohenadad - I misunderstood & miscommunicated. I meant to say that the ID numbers, at rare times, could be shared between sites (e.g., CAN-02-PPM-201 & CAN-03-RRM-201). It seems like once we receive the document from UBC matching filenames, we'd be good to go! |
To clarify, the IDs in the BIDS dataset and the original are actually the same IDs, just coded differently (without the phenotype, and with a three-letter site code, e.g. 'tor', rather than a site ID number). We do not have a document that matches IDs, as they are already implicitly linked. If it would be helpful to you @leelisae, however, I can generate a one-time CSV that has the BIDS IDs of Montreal's data and their corresponding CanProCo IDs. |
I would instead suggest writing a script that does the conversion based on this logic (simple regex) and uploading it to this repo (eg under a |
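A minimal sketch of the regex-based ID conversion discussed above, going from an original CanProCo ID to a BIDS subject ID. It is an illustration, not the script used at UBC. The function name `canproco_to_bids` and the `SITE_CODES` table are hypothetical; only the `01` -> `tor` entry comes from this thread ('CAN-01-CON-092' <-> 'sub-tor092'), and the ID pattern (two-digit site, three-letter phenotype, three-digit number) is assumed from the examples quoted here.

```python
import re

# Site number -> three-letter BIDS site code.
# Only "01" -> "tor" is taken from the thread; the other sites are NOT known
# here and must be filled in from the real dataset before use.
SITE_CODES = {"01": "tor"}

# Assumed layout of an original ID: CAN-<site>-<phenotype>-<number>,
# e.g. CAN-01-CON-092 (inferred from the examples in this thread).
CANPROCO_ID = re.compile(
    r"^CAN-(?P<site>\d{2})-(?P<phenotype>[A-Z]{3})-(?P<number>\d{3})$"
)


def canproco_to_bids(canproco_id: str, site_codes: dict[str, str] = SITE_CODES) -> str:
    """Convert e.g. 'CAN-01-CON-092' to 'sub-tor092' (the phenotype is dropped)."""
    match = CANPROCO_ID.match(canproco_id)
    if match is None:
        raise ValueError(f"Not a CanProCo ID: {canproco_id!r}")
    site = match.group("site")
    if site not in site_codes:
        raise KeyError(f"No BIDS site code known for site {site!r}")
    return f"sub-{site_codes[site]}{match.group('number')}"


if __name__ == "__main__":
    print(canproco_to_bids("CAN-01-CON-092"))
```

Because the BIDS ID omits the phenotype, the reverse direction (BIDS to CanProCo) cannot be done by pattern matching alone and would need a lookup table such as participants.tsv.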
Thanks for the suggestion. I have created a script to backwards-match the BIDS NII files to files in our original structure on the UBC server and have sent Lisa the output. To match files instead of just IDs, the conversion is actually slightly more complicated than a simple regex. As an example, the BIDS structure recoded MT sequences like so:
Because of this, the script itself requires access to both the BIDS and the original directory trees, and therefore must be run from within UBC's network and would be unusable by external sites. Instead of including the script itself in the repo, I suggest we both include the script output as a new TSV file and add the original CanProCo ID to the |
Thank you both for your help! @jcohenadad and/or @plbenveniste - Since it seems like we can now match the BIDS IDs to the original CanProCo IDs, would I be able to receive the baseline (M0) SC lesion masks that you've already created? Or do you advise that I still run Pierre-Louis' pipeline myself to regenerate the SC lesion masks? Also, as we previously discussed: @jcohenadad - Would you be able to write example code to linearly register SC lesion masks in subject PSIR space to subject MT space? @plbenveniste - Would you be able to write a few sentences about the new SC lesion segmentation tool and send me any citations, so I can add this to the Methods section of the manuscript? Of course, I will include you as a co-author too. |
Yes, @plbenveniste is on it. In fact, ideally, the segmentations should be pushed to the main repo by @zachvav, and a new version of the dataset re-sent to @leelisae
Because your file structure (non-BIDS) is different than my file structure (BIDS), if I design a script based on my file structure it won't work on your file structure. So, my suggestion is that you send me an example subject, with your current analysis script, which I will modify to add the code for PSIR registration. Moving forward, we should all be working with the same file structure. |
@leelisae You can find the manual segmentations for all the M0/baseline participants that were manually segmented in the following zip file: canproco_M0_lesion_segmentations.zip. However, some participants were not segmented because the image quality was not good enough (the excluded subjects should all be in the following exclude.yml file).
find ./ -type f -name '*ses-M0*lesion-manual.*' -exec cp {} ~/Desktop/canproco_M0_lesion_segmentations \;
Here is a brief description of the model created for automatic spinal cord lesion segmentation: A deep learning model for cervical spinal cord MS lesion segmentation was developed using the self-configuring nnUNet v2 framework (https://pubmed.ncbi.nlm.nih.gov/33288961/). It is a region-based model, outputting a single segmentation image containing 2 classes representing the spinal cord and MS lesions. Training data was based on the CanProCo dataset (M0) and consisted of sagittal PSIR 0.7×0.7×3 mm³ (4 sites, 333 participants) and sagittal STIR 0.7×0.7×3 mm³ (1 site, 92 participants). The ground truth spinal cord labels were generated by the contrast-agnostic model (https://arxiv.org/abs/2310.15402) with manual corrections when required (~5% of the images), and the ground truth MS lesion labels were generated manually from scratch by a trained radiologist. As for the citation: |
I agree it would be much easier if we all used the BIDS structure going forward! I have now included the masks provided by @plbenveniste in the BIDS repository on our end. Any new datasets that we send will include the M0 lesion masks. While doing this I noticed that the following NII files do not have a corresponding .JSON file:
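For anyone who prefers a script over the `find` one-liner, here is a rough Python sketch that collects the same files. The function name `collect_m0_lesion_masks` and its arguments are hypothetical; the glob pattern is the one from the command above. Unlike `cp`, this sketch refuses to overwrite an existing file, and it assumes the output folder lives outside the dataset tree.

```python
import shutil
from pathlib import Path


def collect_m0_lesion_masks(dataset_root: Path, output_dir: Path) -> list[Path]:
    """Copy every '*ses-M0*lesion-manual.*' file under dataset_root into output_dir.

    Files are copied into a single flat folder, as with the find/cp command.
    output_dir is assumed to be outside dataset_root.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # sorted() materialises the match list before anything is copied.
    for path in sorted(dataset_root.rglob("*ses-M0*lesion-manual.*")):
        if not path.is_file():
            continue  # skip directories and unresolved links
        target = output_dir / path.name
        if target.exists():
            # Deliberate difference from `cp`: never silently overwrite.
            raise FileExistsError(f"Refusing to overwrite {target}")
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

A call such as `collect_m0_lesion_masks(Path("."), Path.home() / "Desktop" / "canproco_M0_lesion_segmentations")` would mirror the command above.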
We don't actually need the JSON files for anything, so this isn't a problem itself; however, I wanted to make sure this is expected behavior and there aren't any files that have been accidentally missed. |
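A check like the one described above could be sketched as follows. This is only an illustration: the function name `nifti_without_json` is hypothetical, and it assumes compressed `.nii.gz` images whose sidecar shares the same base name with a `.json` extension.

```python
from pathlib import Path


def nifti_without_json(root: Path) -> list[Path]:
    """Return every .nii.gz file under root that has no matching .json sidecar."""
    missing = []
    for nifti in sorted(root.rglob("*.nii.gz")):
        # 'sub-x_PSIR.nii.gz' -> 'sub-x_PSIR.json' in the same folder
        sidecar = nifti.with_name(nifti.name[: -len(".nii.gz")] + ".json")
        if not sidecar.exists():
            missing.append(nifti)
    return missing


if __name__ == "__main__":
    for path in nifti_without_json(Path("derivatives/labels")):
        print(path)
```

Running it on the labels folder lists the masks whose provenance sidecar is absent, which makes it easy to tell expected gaps from accidentally missed files.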
Thanks for highlighting this issue @zachvav. This is an issue for us: we use the JSON files to trace where the segmentations come from. I will look into this. For now, I can just say that the 4 segmentation masks are empty, and I couldn't identify any lesions in the images. Therefore, I must:
|
The problem comes from a previous manual labelling after receiving the new M0 batch from Erin (more details in issue #39). To inspect each file's history, I ran:
git log --follow -p ./derivatives/labels/sub-mon006/ses-M0/anat/sub-mon006_ses-M0_PSIR_lesion-manual.nii.gz
What was done:
Changes were pushed to branch Here is the updated zip file with the M0 lesion segmentation: Thanks again for your feedback @zachvav |
@jcohenadad - Yes, I will send you an example subject and analysis script, likely via Dropbox, later today. Thank you all for your generous help! @jcohenadad @plbenveniste @zachvav |
I suggest putting the analysis script in this repository, under e.g. |
@jcohenadad - This is a friendly follow-up re: example code to linearly register SC lesion masks in subject PSIR space to subject MT space, and then calculate MTR for ROIs excluding SC lesions. As a reminder, I sent you the example subject data & analysis script via email on Apr 25. Thank you! |
I've created a specific issue for this #91 (the current issue is about something else) |
Context
The Montreal team would like to share the spinal cord lesion segmentations obtained by @plbenveniste with the Toronto team (@leelisae) for the purpose of linking spinal cord lesion load with additional qMRI measures done in Toronto.
Problem
The Montreal team uses the BIDS structure sent by the UBC team (internal git-annex SHA: a04d89739c769dc03f23fcda183df62c62f586a9), while the Toronto team uses the original CanProCo file names. How can we match files between the two teams?
Solutions
The file that has matched IDs for both datasets should be accessible by both teams. Is this file centralized at a single point on the UBC server? If so, could it be made accessible to all CanProCo researchers without having to ask for it? Sending the file by email means the copy will fall out of sync with the maintained original, which is error-prone (e.g. if either the BIDS or the original dataset is updated).
Related issue #86