Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MHub / IDC - Implementing the nnUNet Task005 Prostate MR (ADC, T2) Model #67

Merged
merged 12 commits into from
Feb 28, 2024

Conversation

LennyN95
Copy link
Member

@LennyN95 LennyN95 commented Nov 20, 2023

I updated my implementation of the nnUNet Prostate MR (Task005) model.

The model takes two input series:

  • ADC
  • T2

The current default config uses a FileStructureImporter that accepts the following input structure (note that study is optional):

├── Patient1/
│   ├── study1
│   │   └── ADC 
│   │   │   └── dicom 
│   │   │       └── *.dcm
│   │   └── T2
│   │   │   └── dicom 
│   │   │       └── *.dcm
│   └── study2 
├── Patient2/
│   └── ADC 
│   │   └── dicom 
│   │       └── *.dcm
│   └── T2 
│   │   └── dicom 
│   │       └── *.dcm

To use the patient/strudy structure only, change to:

FileStructureImporter
    structures:
    - $patientID/$studyID@instance/$part@/dicom@dicom

To use the patient/ structure only use:

FileStructureImporter
    import_id: patientID
    structures:
    - $patientID@instance/ADC$part@/dicom@dicom
    - $patientID/T2$part@/dicom@dicom

To omit the dicom folders and instead organize all dicom files directly under the ADC and T2 folders, the following import structure can be used:

FileStructureImporter:
  outsource_instances: True 
  import_id: patientID/studyID
  structures:
    - $patientID/$studyID@instance/$part@dicom
    - $patientID@instance:studyID=none/ADC$part@dicom
    - $patientID@instance:studyID=none/T2$part@dicom

@LennyN95 LennyN95 changed the title Implementing the nnUNet Task005 Prostate MR (ADC, T2) Model MHub / IDC - Implementing the nnUNet Task005 Prostate MR (ADC, T2) Model Nov 20, 2023
@LennyN95 LennyN95 self-assigned this Nov 20, 2023
@ccosmin97
Copy link
Contributor

@LennyN95 here is the meta.json file : nnunet_task05_meta.json

Regarding the segdb codes, I did not see anything for the prostate zonal regions -- output labels for this models.
You could use these codes :
PROSTATE_PERIPHERAL_ZONE | Peripheral zone of prostate | C_BODY_STRUCTURE | T_PROSTATIC_PZ_STRUCTURE |   |
where T_PROSTATIC_PZ_STRUCTURE = 279706003

PROSTATE_TRANSITION_ZONE | Transition zone of prostate | C_BODY_STRUCTURE | T_PROSTATIC_TZ_STRUCTURE |   | where T_PROSTATIC_TZ_STRUCTURE = 399384005

Here is my dcmqi json file for converting these SEGS to DICOM for reference :
task05_nnunet_dcmqi.json

For licensing purposes, heres the nnUNet task05 dataset.json, being a bit more explicit about their licensing scheme.
dataset_task05.json

@LennyN95
Copy link
Member Author

LennyN95 commented Nov 22, 2023

Thx @ccosmin97,

Following your suggestion, I added the two new substructures to our SegDB with commit 9f922e1f.

Regarding the dicom conversion, which MR scan did you use as reference, I guess T2?

@ccosmin97
Copy link
Contributor

Regarding the dicom conversion, which MR scan did you use as reference, I guess T2?

I am a bit confused here, do you mean the resampling scheme to get identical geometry between the T2 and ADC volumes?
In that case, I used the T2 as a fixed grid to resample the ADC volume.

If you are referring to the dicomseg conversion of potential manual annotations, in my use case (ProstateX, QIN-Prostate-Repeatability collections) they always refer to the T2 volume.

@ccosmin97
Copy link
Contributor

ccosmin97 commented Nov 22, 2023

I think it is worth highlighting that this model was trained on co-registered T2 and ADC volumes, so any resampling scheme on ADC and T2 sequences that are not co-registered is sub-optimal. I did do resampling of ADC to T2 during my analysis because registration of ADC and T2 volumes was out of scope for the collections analyzed.

I did put some notes regarding this in the meta.json

@LennyN95
Copy link
Member Author

I think it is worth highlighting that this model was trained on co-registered T2 and ADC volumes, so any resampling scheme on ADC and T2 sequences that are not co-registered is sub-optimal

Yes @ccosmin97, I therefore build a Resampler Module that will resample ADC to T2's spacing. Is that what you mean?

@ccosmin97
Copy link
Contributor

I think it is worth highlighting that this model was trained on co-registered T2 and ADC volumes, so any resampling scheme on ADC and T2 sequences that are not co-registered is sub-optimal

Yes @ccosmin97, I therefore build a Resampler Module that will resample ADC to T2's spacing. Is that what you mean?

@LennyN95 That's what I did for my analysis, perfect.
For these multi-modal models that assume co-registered inputs, maybe it is a good idea to add a more generic resampling scheme in the future, where the user can choose the fixed resampling grid. In this case, it is the T2 image, but it also could be the ADC one or a fixed arbitrary grid. Motivation is these choices are purely experimental, trying to conform to same-size inputs.
I do not think there is a perfect way to design this resampling scheme, it is case-specific. In my case, resampling to the higher resolution, i.e. T2, makes more sense, also because the T2 image has better prostate tissue contrast compared to ADC, so we do not want to degrade/resample the T2 image.
I believe in your Resample module, it all boils down to the 'fixed' plastimatch argument, where here you specify 'path_to_t2.nii.gz' but it could also be a fixed grid-like '250,250,30' -- x,y,z voxel size

@ccosmin97
Copy link
Contributor

@LennyN95 A quick follow-up on the model testing :

Here is the workflow I wanted to test:

  1. Build the base mhub image
  2. build the model image -- nnunet_prostate_zonal_task05
  3. Try on dicom test data following the structure you mentioned

Even though step 1 and 2 completed 'successfully', i.e images were built, it seems that some paths were not set up properly. Here are some logs about the building of the model image:
image

When I try to run inference, it looks like no models ID were found inside the model container. I will investigate more, the issue is probably on my side.

image

@LennyN95
Copy link
Member Author

Yes, the resampling could be generalized. However, it's a static, model-specific task (e.g., various models might utilize various sampling techniques, but whatever sampling technique is used, it will be fixed for inference), right?
Importing model definition from Mub models repository.

Regarding the error, this is expected; the model has not yet been accepted. Therefore, you need to specify the repository and branch name during the build:

docker build -t dev/nnunet_prostate_zonal_task05 \
   --build-arg MHUB_MODELS_REPO="https://github.com/MHubAI/models::m-nnunet-prostate" .

@ccosmin97
Copy link
Contributor

Yes, the resampling could be generalized. However, it's a static, model-specific task (e.g., various models might utilize various sampling techniques, but whatever sampling technique is used, it will be fixed for inference), right?

It is model specific indeed. However, in this model case, no sampling technique was recommend in particular, since they assumed that the ADC and T2 images are co-registered. Any resampling decisions are outside of the model scope, but necessary in case of non co-registered data, such as my case. That's why I mentioned a more generic sampling scheme(T2 to ADC, ADC to T2, fixed grid applied for both), up to the user. As a start, resampling ADC to T2 image is ok. Overall I agree with a fixed sampling strategy for inference, as long as it was recommended by the author of the model, which is not the case here, this resampling strategy is purely motivated by my specific use-case.

@LennyN95
Copy link
Member Author

Okay in that case, for Mhub we shouldn't apply any resampling in the default case.

If we propose some specific sampling strategy these should always be motivated and evaluated by some analysis. Do you have any statistics of model performance that we could rely on?

I don't like the idea to leave this to the user without guidance, getting this wrong is quite easy and getting it right requires extensive testiness and evaluation. So if we allow for flexibility here, then it should be well documented in the Model Card and outsourced into alternative workflows. What do you think?

@ccosmin97
Copy link
Contributor

ccosmin97 commented Nov 29, 2023

@LennyN95
I tested the current PR with one sample from IDC which can be downloaded using s5mcd using the following command:
~/Downloads/s5cmd_latest/s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com cp 's3://idc-open-data/cf61e842-f0f6-4a3a-9a35-671acf207b91/*' .
It contains 2 sequences, one T2 and one ADC, from QIN-Prostate-Repeatability, patientID equal to PCAMPMRI-0002.

I modified the FileStructureImporter.structure from :

FileStructureImporter
    import_id: patientID
    structures:
    - $patientID@instance/ADC$part@/dicom@dicom
    - $patientID/T2$part@/dicom@dicom

to

FileStructureImporter
    import_id: patientID
    structures:
    - $patientID@instance/ADC$part@dicom/dicom@dicom
    - $patientID/T2$part@dicom/dicom@dicom

Here is the complete default.yml -- renamed to .json for uploading.

Without this change, the individual SOPUID.dcm objects were not copied from the source folder to the expected temporary folders(_global, imported_instances).

With this change, the rest of the converters and runners worked as expected, producing a segmentation mask with the expected geometry and orientation.

Regarding the actual segmentation result, I did not have time to check if it is similar to my previous analysis results, it might be slightly different since I did used --tta False during inference (default is True -- like in the default workflow).

@LennyN95
Copy link
Member Author

LennyN95 commented Dec 1, 2023

@ccosmin97 I updated my implementation and added a new customizable parameter disable_tta, default value false.

You can disable TTA by modifying the default.yml (or try with a custom config):

ProstateRunner:
   disable_tta: true

Or you can set the attribute from the cli:

docker run --rm -t --gpus all \
-v /abs/dicom/in:/app/data/input_data:ro \
-v /abs/out/:/app/data/output_data \
dev/nnunet_prostate_zonal_task05 \
--config:modules.ProstateRunner.disable_tta=false

Can you please try both versions (so we can set the appropriate default value)?

@ccosmin97
Copy link
Contributor

@ccosmin97 I updated my implementation and added a new customizable parameter disable_tta, default value false.

You can disable TTA by modifying the default.yml (or try with a custom config):

ProstateRunner:
   disable_tta: true

Or you can set the attribute from the cli:

docker run --rm -t --gpus all \
-v /abs/dicom/in:/app/data/input_data:ro \
-v /abs/out/:/app/data/output_data \
dev/nnunet_prostate_zonal_task05 \
--config:modules.ProstateRunner.disable_tta=false

Can you please try both versions (so we can set the appropriate default value)?

Thank you @LennyN95 !
It would also be valuable to have the model type as an option: -- model 3d_fullres/2d/...

@LennyN95
Copy link
Member Author

LennyN95 commented Dec 7, 2023

Updated @ccosmin97. Note that I changed the TTA configurable from disable_ttato use_tta (default false) to use the same parameter names as in our NNUnetRunner implementation.

@LennyN95
Copy link
Member Author

NOTE: The model is tested & can be pushed.

@ccosmin97 Any updates or objections from your side before we move ahead?

@ccosmin97
Copy link
Contributor

@LennyN95 Thank you for working on this! I will in the following few days -- this week:

  • I will update the meta.json to include information and justification about the resampling scheme
  • Re-test the branch once again

@LennyN95
Copy link
Member Author

/review

@github-actions github-actions bot added the REQUEST REVIEW Attach this label to your PR when your submission is "in progress" and is ready to be reviewed by us label Jan 18, 2024
@ccosmin97
Copy link
Contributor

@LennyN95 I added evaluation results and the used resampling scheme justification to the meta.json.
Details are in my fork, in this commit. I could create another PR or something else if that helps.

Testing is next.

@ccosmin97
Copy link
Contributor

ccosmin97 commented Jan 23, 2024

@LennyN95 I was testing out the task05 branch and got this error log for NiftiConverter.log :

image

'Read-only file system' --> Did I mess something in the docker configuration? Did this happen before?

By replacing the structure in FileStructureImporter in the default.yml from :

  - $patientID/$studyID@instance/$part@bundle@dicom
  - $patientID@instance:studyID=none/ADC$part@bundle@dicom
  - $patientID@instance:studyID=none/T2$part@bundle@dicom

to :

    - $patientID/$studyID@instance/$part@dicom@dicom
    - $patientID@instance:studyID=none/ADC$part@dicom@dicom
    - $patientID@instance:studyID=none/T2$part@dicom@dicom

I do get the the desired output segmentations without errors with the FileStructureImporter 'fix' mentioned above.

@LennyN95
Copy link
Member Author

You need the newest base image for the @Bundle to work, can you check if you have the latest :)? Your fix works too (just one @dicom would be enough) but that wouldn't hold for some edge cases.

@ccosmin97
Copy link
Contributor

You need the newest base image for the @Bundle to work, can you check if you have the latest :)? Your fix works too (just one @dicom would be enough) but that wouldn't hold for some edge cases.

Thanks, that did it! I built the base image locally from your branch, which had a commit to the base image 2 months ago.

It run successfully, I will finalize testing for this model and the 2 others tomorrow/Thursday -- based on your template for submitting model testing.

@LennyN95 LennyN95 removed the REQUEST REVIEW Attach this label to your PR when your submission is "in progress" and is ready to be reviewed by us label Jan 25, 2024
@ccosmin97
Copy link
Contributor

ccosmin97 commented Jan 26, 2024

@LennyN95

testing one one sample was successfull, results look like what is expected, and DICOM SEG metadata also looks right.
Here is the template:
Unless you have further comments/additions, the only thing left is to incorporate these changes into the m-nnunet-prostate meta.json file. I can submit a code review if that helps.

/test

sample:
  idcv: Data Release 17.0 December 04, 2023
  data:
    - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.3671.4754.165941479363091869002475812419
        aws: s3://idc-open-data/0c341066-954b-4076-aa9d-aef460b6ab12
        path: test_data/PCAMPMRI-00003/T2/
    - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.3671.4754.179470327513803829437549550387
        aws: s3://idc-open-data/6bbd50a9-e7fa-44fd-947c-13fa1d2eb4f8/
        path: test_data/PCAMPMRI-00003/ADC
reference:
    url_output: https://www.dropbox.com/scl/fi/vsgt82id8m712e5wbsby6/output.zip?rlkey=jy7vz011uboyauo0q9e3ds0b0&dl=0
    url_input: https://www.dropbox.com/scl/fi/z4yaunck470pwal70yrck/input.zip?rlkey=rzixexi0ckrk3mkkd7ysr9d9c&dl=0
notes:
This model takes a multi-modality input, namely T2 and ADC sequence. The model input also needs to be sorted by patientID, at the very least. Steps below are included to show how to create the input model case folder : 
mkdir -p test_data/PCAMPMRI-00009/
mkdir -p test_data/PCAMPMRI-00009/T2/
s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com cp 's3://idc-open-data/0c341066-954b-4076-aa9d-aef460b6ab12/*' test_data/PCAMPMRI-00003/T2/
mkdir -p test_data/PCAMPMRI-00009/ADC/
s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com cp 's3://idc-open-data/6bbd50a9-e7fa-44fd-947c-13fa1d2eb4f8/*' test_data/PCAMPMRI-00003/ADC/

mkdir -p out_data

docker run -v /home/exouser/Documents/mhub_task05_exp/test_template/test_data/:/app/data/input_data:ro \
-v /home/exouser/Documents/mhub_task05_exp/test_template/out_data/:/app/data/output_data \
dev/nnunet_prostate_zonal_task05:latest --workflow default

image
image

@LennyN95
Copy link
Member Author

Thx Cosmin! I'll integrate the meta changes and run our new test pipeline.
The template needs to be exactly as in the documentation for our automation to work, I'll post and try below.

@LennyN95
Copy link
Member Author

LennyN95 commented Jan 29, 2024

/test

sample:
  idc_version: "Data Release 17.0 December 04, 2023"
  data:
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.3671.4754.165941479363091869002475812419
    aws_url: s3://idc-open-data/0c341066-954b-4076-aa9d-aef460b6ab12/*
    path: PCAMPMRI-00003/T2
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.3671.4754.179470327513803829437549550387
    aws_url: s3://idc-open-data/6bbd50a9-e7fa-44fd-947c-13fa1d2eb4f8/*
    path: PCAMPMRI-00003/ADC
reference:
  url: https://www.dropbox.com/scl/fi/vsgt82id8m712e5wbsby6/output.zip?rlkey=jy7vz011uboyauo0q9e3ds0b0&dl=0

Test Results (24.01.29_13.36.48_9caE8eLF4g)
checked_files:
- path: /app/test/src/PCAMPMRI-00003/none/nnunet_prostate_zonal_task05.seg.dcm
  checker: DicomsegContentCheck
summary:
  files_missing: 0
  files_extra: 0
  checks:
    DicomsegContentCheck:
      files: 1
conclusion: true

Test Results (24.02.28_12.56.22_MnXlLNinpg)
id: 7779c218-57cf-459c-a988-1c9b398135f5
date: '2024-02-28 13:09:46'
checked_files:
- file: nnunet_prostate_zonal_task05.seg.dcm
  path: /app/test/src/PCAMPMRI-00003/none/nnunet_prostate_zonal_task05.seg.dcm
  checks:
  - checker: DicomsegContentCheck
    notes:
    - label: Segment Count
      description: The number of segments identified in the inspected dicomseg file.
      info: 2
summary:
  files_missing: 0
  files_extra: 0
  checks:
    DicomsegContentCheck:
      files: 1
conclusion: true

@github-actions github-actions bot added the REQUEST TEST Attach this label to your PR when your submission is ready to be tested by us. label Jan 29, 2024
@MHubAI MHubAI deleted a comment from ccosmin97 Feb 28, 2024
@LennyN95 LennyN95 merged commit 2412311 into main Feb 28, 2024
1 check passed
@LennyN95 LennyN95 removed the REQUEST TEST Attach this label to your PR when your submission is ready to be tested by us. label Feb 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants