
BAMF MR Brain tumor segmentation #93

Open
jithenece wants to merge 28 commits into main

Conversation

@jithenece (Author)

Pretrained model for 3D semantic segmentation of brain tumor, necrosis, and edema from MRI scans.

@jithenece (Author)

@LennyN95 Transformation matrices are used by this model. Could you add Transform (.mat) to the list of allowed file types in mhubio/core/FileType.py?
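
For illustration only, a minimal sketch of what such an addition could look like, assuming FileType is a plain Python Enum; the member names and values in the actual mhubio/core/FileType.py may differ.

from enum import Enum

# Hypothetical sketch of mhubio/core/FileType.py; the real enum
# likely contains additional members and different values.
class FileType(Enum):
    NONE = None
    DICOM = "dicom"
    NIFTI = "nifti"
    NRRD = "nrrd"
    # proposed addition: transformation matrix files (.mat)
    MAT = "mat"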

jithenece marked this pull request as ready for review on August 7, 2024 10:30
@jithenece (Author)

/test

sample:
  idc_version: "Data Release 2.0 October 24, 2022"
  data:
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.72111832425535404540752357374191117693
    aws_url: s3://idc-open-data/81967c43-b804-410a-91f0-619cea9e74d6/*
    path: input_data/case_study1/flair
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.179890292978753819584197741746287159209
    aws_url: s3://idc-open-data/92492275-9496-4049-8273-4c6461d75fc9/*
    path: input_data/case_study1/t2
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.337491152254065288399657726162931889194
    aws_url: s3://idc-open-data/951a4b1e-1ed3-4c59-b7d0-0e877b370b03/*
    path: input_data/case_study1/t1ce
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.339743325012852358051228621010996392201
    aws_url: s3://idc-open-data/7facf869-55b9-441f-81ef-a3ec2e63c5a8/*
    path: input_data/case_study1/t1

reference:
  url: https://drive.google.com/file/d/1Xq8H_qmPx-hS15ChciCXi12-EAdyKIhr/view?usp=sharing

@LennyN95 (Member) left a comment

This looks like the most complex workflow added to MHub yet; I would love to hear about your experience in breaking up your work into MHub-IO Modules ;)

Some comments below; several apply to all Modules and aim to simplify them a bit by making them more specific. Great work!!

Comment on lines 65 to 69
structures:
- $patientID@instance/t1@dicom:mod=mr:type=t1
- $patientID/t1ce@dicom:mod=mr:type=t1ce
- $patientID/$type@dicom:mod=mr:type=t2
- $patientID/flair@dicom:mod=mr:type=flair
@LennyN95 (Member)

I see you use a $type meta variable in the third structure directive. I assume this is not intentional. Instead, either use a specific declaration:

structures:
      - $patientID@instance/t1@dicom:mod=mr:type=t1
      - $patientID/t1ce@dicom:mod=mr:type=t1ce
      - $patientID/t2@dicom:mod=mr:type=t2
      - $patientID/flair@dicom:mod=mr:type=flair

This is equivalent to the use of a variable as in the example below, because the $type variable will read the folder name and add it to the type meta key for all files imported under that folder. However, in that case it is not clear from the import directive what the folder structure needs to look like, hence the first option should be used.

structures:
      - $patientID@instance/$type@dicom:mod=mr

@LennyN95 (Member)

@jithenece I see you already updated this to the latter example. However, for transparency we should stick with the first, more verbose variant.

@jithenece (Author)

That explains it. I have reverted this to the earlier, verbose variant.

@LennyN95 (Member)

@jithenece Can you check whether you pushed this already? I still see the simplified version in the most recent commit.

@jithenece (Author)

I have pushed it now. Please check.

@LennyN95 (Member)

There is still a $type variable (instead of t2).

@jithenece (Author)

Missed it. I have updated it now.

models/bamf_mr_brain_tumor/config/default.yml (outdated, resolved)
models/bamf_mr_brain_tumor/utils/NNUnetRunnerV2.py (outdated, resolved)
models/bamf_mr_brain_tumor/utils/ReOrientationRunner.py (outdated, resolved, 2 threads)
"description": "native (T1) MRI scan of a patient.",
"format": "DICOM",
"modality": "MRI",
"bodypartexamined": "Brain",
@LennyN95 (Member)

The convention is all-uppercase, e.g., BRAIN.

models/bamf_mr_brain_tumor/meta.json (outdated, resolved, 4 threads)
@jithenece (Author)

This looks like the most complex workflow added to MHub yet; I would love to hear about your experience in breaking up your work into MHub-IO Modules ;)

Some comments below; several apply to all Modules and aim to simplify them a bit by making them more specific. Great work!!

Thank you for your feedback! It’s great to hear your thoughts on the complexity and structure of this model. Breaking it down into MHub-IO Modules was indeed a thoughtful process. Initially, we had a single module covering all steps, but we found it was difficult to understand.

I will review your suggestions and make the necessary changes.

@fedorov (Member)

fedorov commented Aug 21, 2024

idc_version: "Data Release 2.0 October 24, 2022"

@jithenece there was no data release 2.0 on that date in IDC. IDC v2 was in June 2021, see https://learn.canceridc.dev/data/data-release-notes#v2-june-2021. Where did you get this date from?

@jithenece (Author)

1.3.6.1.4.1.14519.5.2.1.72111832425535404540752357374191117693
    aws_url: s3://idc-open-data/81967c43-b804-410a-91f0-619cea9e74d6/*

Apologies. I used the collection version number for UPENN-GBM instead of the IDC version. I will update it.

models/bamf_mr_brain_tumor/config/default.yml (outdated, resolved)
Comment on lines 65 to 69
structures:
- $patientID@instance/t1@dicom:mod=mr:type=t1
- $patientID/t1ce@dicom:mod=mr:type=t1ce
- $patientID/$type@dicom:mod=mr:type=t2
- $patientID/flair@dicom:mod=mr:type=flair
@LennyN95 (Member)

There is still a $type variable (instead of t2).

models/bamf_mr_brain_tumor/meta.json (outdated, resolved)
@jithenece (Author)

jithenece commented Aug 28, 2024

/test

Uploading the segmentation file: output.zip

sample:
  idc_version: 10
  data:
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.339743325012852358051228621010996392201
    aws_url: s3://idc-open-data/7facf869-55b9-441f-81ef-a3ec2e63c5a8/*
    path: case1/t1
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.179890292978753819584197741746287159209
    aws_url: s3://idc-open-data/92492275-9496-4049-8273-4c6461d75fc9/*
    path: case1/t2
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.337491152254065288399657726162931889194
    aws_url: s3://idc-open-data/951a4b1e-1ed3-4c59-b7d0-0e877b370b03/*
    path: case1/t1ce
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.72111832425535404540752357374191117693
    aws_url: s3://idc-open-data/81967c43-b804-410a-91f0-619cea9e74d6/*
    path: case1/flair

reference:
  url: https://github.com/user-attachments/files/16783893/output.zip

Test Results (24.08.29_15.15.30_AkoraywFNq)
id: 773edbf6-156a-410f-85f9-2f19225a4c46
date: '2024-08-29 15:54:09'
missing_files:
- case_study1/flair.seg.dcm
- case_study1/t1ce.seg.dcm
- case_study1/t2.seg.dcm
- case_study1/t1.seg.dcm
summary:
  files_missing: 4
  files_extra: 0
  checks: {}
conclusion: false

@LennyN95 (Member)

Please note, we updated our base image. All mhub dependencies are now installed in a virtual environment under /app/.venv running Python 3.11. Python, the virtual environment, and dependencies are now managed with uv. If required, you can create custom virtual environments, e.g., uv venv -p 3.8 .venv38, use uv pip install -p .venv38 package-name to install dependencies, and uv run -p .venv38 python script.py to run a Python script.

We also simplified our test routine. Sample and reference data now have to be uploaded to Zenodo and provided in a mhub.toml file at the project root. The process of creating and providing these sample data is explained in the updated testing phase article of our documentation. Under doi.org/10.5281/zenodo.13785615 we provide sample data as a reference.

@jithenece (Author)

@LennyN95 I have updated it. Please let me know if any changes are required.

@LennyN95 (Member)

LennyN95 commented Oct 9, 2024

Test Results
id: 651fbbe3-07a2-419b-813b-046de5412898
name: MHub Test Report (default)
date: '2024-10-09 12:31:58'
missing_files:
- case1/flair.seg.dcm
- case1/t1ce.seg.dcm
- case1/t2.seg.dcm
- case1/t1.seg.dcm
summary:
  files_missing: 4
  files_extra: 0
  checks: {}
conclusion: false

@jithenece (Author)

Thanks @LennyN95 for testing this out. Is it possible to share the Docker logs? It ran fine for me when following the steps mentioned in the documentation.

I had a similar issue while testing locally where Docker ran fine but the output files were not in the mounted output path. It turned out to be a file-write permission issue, and running chmod -R 777 on the mounted output folder resolved it. Could you try this or share the Docker logs, please?

@LennyN95 (Member)

LennyN95 commented Oct 9, 2024

It appears that this is caused by the same error as reported in #92 (comment). I attached a complete run log (run on the provided sample data after downloading it). Could it be that you're running a different version of your NNUnetRunnerV2 module?
mhub.run.log

@jithenece (Author)

We added --ipc=host to the docker run command to fix this error. Could you try this, please?

@LennyN95 (Member)

LennyN95 commented Oct 9, 2024

Why would you require this? The MHub model is meant to run entirely shielded from any host processes. Our testing procedure does not allow any deviation from the default setup (similar to our offline requirement). Can you elaborate on why exactly you specify this flag?

@jithenece (Author)

jithenece commented Oct 9, 2024

The error logs show:

RuntimeError: Background workers died. Look for the error message further up! If there is none then your RAM was full and the worker was killed by the OS. Use fewer workers or get more RAM in that case!

NNUnetRunnerV2 needs more shared memory than Docker's default allocation provides. Adding the --ipc=host flag shares the host's IPC namespace (and thus its /dev/shm), which gives the background workers enough shared memory and lets the execution complete.

@LennyN95 (Member)

LennyN95 commented Oct 9, 2024

This seems to be a problem with NNUnet (see MIC-DKFZ/nnUNet#2452). Can you please open an issue in the NNUnet repository?

I understand that setting --ipc=host seems to be a simple solution. However, for MHub we need to a) make sure that all models run with the same instruction set to keep it simple, b) make sure that all MHub containers are shielded from host processes and online services to address security concerns. For future submissions, it's best if you let us know in advance so we can find a solution in the early stages.

@jithenece (Author)

The issue was already raised. They referred to this:

Please note that PyTorch uses shared memory to share data between processes, so if torch multiprocessing is used (e.g. for multithreaded data loaders) the default shared memory segment size that container runs with is not enough, and you should increase shared memory size either with --ipc=host or --shm-size command line options to nvidia-docker run.

@LennyN95 (Member)

LennyN95 commented Oct 9, 2024

Thanks @jithenece for sharing. I wonder whether multithreaded data loaders are required at all, since we implement sequential execution (in contrast to batch execution) in MHub?
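
For context, the shared-memory pressure described in the PyTorch note above comes from worker subprocesses; in plain PyTorch it goes away when loading stays in the main process. A minimal, generic illustration (plain PyTorch usage, not the nnU-Net code path):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for preprocessed MR volumes.
dataset = TensorDataset(torch.randn(4, 1, 32, 32, 32))

# num_workers=0 keeps loading in the main process: no worker
# subprocesses are spawned, so no /dev/shm segments are needed.
loader = DataLoader(dataset, batch_size=1, num_workers=0)

for (volume,) in loader:
    pass  # run inference on `volume` here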

@jithenece (Author)

I tried to avoid multithreaded data loaders by reducing the nnUNet_def_n_proc and nnUNet_n_proc_DA settings available in nnunetv2/configuration.py, but that does not seem to overcome this.
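
For reference, a sketch of how those settings can be overridden without editing nnunetv2/configuration.py: both are read from environment variables, so they must be set before anything from nnunetv2 is imported (this lowers the worker count but, as noted above, did not resolve the issue here).

import os

# Must be set before importing anything from nnunetv2; the values
# shown are illustrative.
os.environ["nnUNet_def_n_proc"] = "1"  # default worker count for pre-/postprocessing
os.environ["nnUNet_n_proc_DA"] = "1"   # worker count for data augmentation

from nnunetv2.configuration import default_num_processes
print(default_num_processes)  # prints 1 with the override above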

Alternatively,

  1. It worked when I tried the other flag mentioned above, --shm-size=256MB.
  2. We used the following setting in our Kubernetes cluster to fix this error; please check if this helps: mounting an emptyDir at /dev/shm and setting the medium to Memory did the trick.

@jithenece (Author)

@LennyN95 could we test this and merge?

@LennyN95 (Member)

@jithenece the additional CLI flags are not compatible with our test routine, so the submission is on hold until we can fix the problem. We are working on a solution (see MIC-DKFZ/nnUNet#2556) and are writing a PR to enable single-process inference with nnUNet v2. Once this has been resolved, we will continue with the submission process and testing of the model as planned.

Labels: none yet
Project status: Ready for Testing
4 participants