Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEAT]: BAMF nnUNet Breast MR Model #82

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

rahul99
Copy link

@rahul99 rahul99 commented Jun 7, 2024

No description provided.

@rahul99
Copy link
Author

rahul99 commented Jun 7, 2024

@LennyN95 , @fedorov : Hey guys, we would like to add another segmentation model to MHub that segments Breast, FGT, and Breast tumor. Can you please make an entry to SegDB for breast segmentation relation ROIs?

BREAST
FGT
BREAST_TUMOR

@rahul99 rahul99 changed the title [FEAT]: bamf_nnunet_mr_breast [FEAT]: BAMF nnUNet Breast MR Model Jun 7, 2024
@rahul99
Copy link
Author

rahul99 commented Jun 7, 2024

@LennyN95 , it looks like MHubIO only support nnUNet-v1 models at the moment? I am referring to this code here: https://github.com/MHubAI/mhubio/blob/main/mhubio/modules/runner/NNUnetRunner.py#L21

The breast segmentation model was trained on nNUNet-v2. Let me know how to proceed with this. cc. @vanossj

@LennyN95
Copy link
Member

LennyN95 commented Jun 7, 2024

Hi @rahul99, this sounds like an interesting model and another great addition to the MHub model repository!

  • Can you prepare a DCMQI style JSON or provide the codes in an alternative format? I'll then add them to our SegDB. I assume we only need to add BREAST and FGT, you then can use BREAST+TUMOR for the breast neoplasm.
  • Our internal NNUnetRunner, indeed, is based on the v1. I have a v2 version in the works. However, you can implement the model and provide your own NNUnetV2Runner.

@rahul99
Copy link
Author

rahul99 commented Jun 12, 2024

{
  "ContentCreatorName": "BAMFHealth^AI",
  "ClinicalTrialSeriesID": "Session1",
  "ClinicalTrialTimePointID": "1",
  "SeriesDescription": "AIMI breast-tumor AI segmentation",
  "SeriesNumber": "300",
  "InstanceNumber": "1",
  "BodyPartExamined": "BREAST",
  "segmentAttributes": [
    [
      {
        "labelID": 1,
        "SegmentDescription": "Breast",
        "SegmentAlgorithmType": "AUTOMATIC",
        "SegmentAlgorithmName": "BAMF-Breast-Tumor-MRI",
        "SegmentedPropertyCategoryCodeSequence": {
          "CodeValue": "123037004",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Anatomical Structure"
        },
        "SegmentedPropertyTypeCodeSequence": {
          "CodeValue": "76752008",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Breast"
        },
        "recommendedDisplayRGBValue": [56, 125, 255]
      },
      {
        "labelID": 2,
        "SegmentDescription": "Fibroglandular Tissue",
        "SegmentAlgorithmType": "AUTOMATIC",
        "SegmentAlgorithmName": "BAMF-Breast-Tumor-MRI",
        "AnatomicRegionSequence": {
          "CodeValue": "76752008",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Breast"
        },
        "AnatomicRegionModifierSequence": {
          "CodeValue": "51440002",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Right and left"
        },
        "SegmentedPropertyCategoryCodeSequence": {
          "CodeValue": "C35869",
          "CodingSchemeDesignator": "NCIt",
          "CodeMeaning": "Radiologic Finding"
        },
        "SegmentedPropertyTypeCodeSequence": {
          "CodeValue": "C114467",
          "CodingSchemeDesignator": "NCIt",
          "CodeMeaning": "Breast Fibroglandular Tissue"
        },
        "recommendedDisplayRGBValue": [128, 174, 128]
      },
      {
        "labelID": 3,
        "SegmentDescription": "Structural Tumor",
        "SegmentAlgorithmType": "AUTOMATIC",
        "SegmentAlgorithmName": "BAMF-Breast-Tumor-MRI",
        "AnatomicRegionSequence": {
          "CodeValue": "76752008",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Breast"
        },
        "AnatomicRegionModifierSequence": {
          "CodeValue": "51440002",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Right and left"
        },
        "SegmentedPropertyCategoryCodeSequence": {
          "CodeValue": "49755003",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Morphologically Altered Structure"
        },
        "SegmentedPropertyTypeCodeSequence": {
          "CodeValue": "4147007",
          "CodingSchemeDesignator": "SCT",
          "CodeMeaning": "Mass"
        },
        "recommendedDisplayRGBValue": [144, 238, 144]
      }
    ]
  ],
  "ContentLabel": "SEGMENTATION",
  "ContentDescription": "Image segmentation",
  "ClinicalTrialCoordinatingCenterName": "dcmqi"
}

@rahul99
Copy link
Author

rahul99 commented Jun 12, 2024

@LennyN95 , Hey Leo, I have posted the json containing code for ROIs above. Please have a look and let me know if I missed something. I have also added NNUnetRunnerV2. Its an easy extension of the NNUnetRunner that you wrote so beautifully.

Its a slightly different PR as it contains two models so there might be something I probably missed. Please review and let me know :)

@rahul99 rahul99 marked this pull request as ready for review June 12, 2024 14:05
@rahul99 rahul99 marked this pull request as draft June 12, 2024 14:07
@rahul99 rahul99 marked this pull request as ready for review June 13, 2024 08:00
@LennyN95
Copy link
Member

@rahul99, this looks great so far! I noticed that you use Mass (4147007) as a type for the breast tumor. Is there any reason why you choose not to use neoplasm (86049000) here? We can add a new tumor class if required, but I'd like to understand the background first.

@vanossj
Copy link
Contributor

vanossj commented Jun 18, 2024

we can switch it to neoplasm

@rahul99
Copy link
Author

rahul99 commented Jun 18, 2024

@LennyN95 , since we did not have the ROI entries in SegDB, I was testing the container by specifying other ROIs that are available in SegDB. Since the process outputs three ROIs and I see that entry for BREAST label is available already, I tried two random ROIs. Unfortunately, the flow complained about BREAST label too. Please can you look into this?

The flow worked great when I passed three random ROIs that are available in SegDB (Except BREAST label)

I think refreshing the base image might solve this?

@rahul99
Copy link
Author

rahul99 commented Jun 18, 2024

@LennyN95 , a small query on the NNUnetRunner as well. In this line:
https://github.com/MHubAI/mhubio/blob/main/mhubio/modules/runner/NNUnetRunner.py#L165

Comment says add symlink to input dir, but we add symlink to output dir. Also, I don't think that's needed? We have already organised the data into appropriate folder structure and nnunet predict function does not use it. Let me know what your thoughts are on this?

@rahul99
Copy link
Author

rahul99 commented Jun 24, 2024

@LennyN95 , Hey Leo, do you have any updates on the SegDB entries for Breast and FGT?

@LennyN95
Copy link
Member

LennyN95 commented Jun 24, 2024

Hey @rahul99,

I was looking into this and found that it'd be good to discuss briefly with @dclunie if available at PW.
I could add FGT as specified, however, that you only be used in the context of breast, thus BREAST+FGT would be the only valid usage of FGT, LUNG+FGT for instance wouldn't make any sense. We currently have no options to define such bindings (and the question is weather we have to). One solution could be to make FGT more verbose, e.g., by naming the ID BREAST_FGT, however, then the usage is BREAST+BREAST_FGT. I personally don't mind the redundancy in this specific situation but it might cause confusion since BREAST_FGT should not be used standalone.

In the meanwhile, I added an option to specify custom triplets and segments in the config file that will be available as SegDB ID's at runtime.

You can either register custom triplets and custom segments in your runner module class (outside of the class definition) or define them in the config file as explained here.

Note, that custom triplets and segments should mainly be used to simplify development and to accommodate the time until a segdb update propagates into a new base image. However, in special cases where we won't add a segment to our universal SegDB this is a transparent way to define custom segments.

@dclunie
Copy link

dclunie commented Jun 24, 2024

I am not in Boston for PW41 but am available online; FYI, since the subject of codes for FGT came up, see DICOM CP 2226 (currently out for ballot).

@fedorov
Copy link
Member

fedorov commented Jun 24, 2024

@LennyN95 since you introduced that capability of registering custom triplets, it might be good to add some basic validation checks. Coding scheme designator must come from the list in the standard: https://dicom.nema.org/medical/dicom/current/output/chtml/part16/chapter_8.html, or must start with the "99" prefix.

@LennyN95
Copy link
Member

Good point @fedorov. The default is indeed 99segdb, is hardxoding the list and validating against it a valuable solution? I assume this doesn't change frequently if at all?

@fedorov
Copy link
Member

fedorov commented Jun 25, 2024

The default is indeed 99segdb

I believe the recommendation is to have all caps, so it would be 99SEGDB. But private coding scheme should be absolutely last resort!

I assume this doesn't change frequently if at all?

Yes, they don't. But a basic check (ie if it does not start from 99 it must come from the defined list) would prevent someone from using, for example, an invalid value "LUNGMASK" for coding scheme designator.

@LennyN95
Copy link
Member

@fedorov I implemented it all uppercase, see https://github.com/MHubAI/SegDB/blob/6b2db45ecdc711c5dee93990b73594ccd6a77d9a/segdb/classes/Triplet.py#L63C105-L63C112 ;)

Just to make clear, I asked if this changes frequently to get an idea if hardcoding the list to check against is a valuable solution, I assume yes then. This is a simple check we can easily include! Note that in general, this is more thought to be used during development to have a quick placeholder and then to be resolved during the review process.

@rahul99 rahul99 requested a review from LennyN95 August 22, 2024 09:51
@vanossj vanossj mentioned this pull request Aug 22, 2024
@vanossj
Copy link
Contributor

vanossj commented Aug 22, 2024

@rahul99, lets switch the FGT code to SCT 1342375003. That way we can keep codes in snowmed as much as possible.

so change metadata for label 2 to:

{
  "labelID": 2,
  "SegmentDescription": "Fibroglandular Tissue",
  "SegmentAlgorithmType": "AUTOMATIC",
  "SegmentAlgorithmName": "BAMF-Breast-Tumor-MRI",
  "AnatomicRegionSequence": {
    "CodeValue": "76752008",
    "CodingSchemeDesignator": "SCT",
    "CodeMeaning": "Breast"
  },
  "AnatomicRegionModifierSequence": {
    "CodeValue": "51440002",
    "CodingSchemeDesignator": "SCT",
    "CodeMeaning": "Right and left"
  },
  "SegmentedPropertyCategoryCodeSequence": {
    "CodeValue": "123037004",
    "CodingSchemeDesignator": "SCT",
    "CodeMeaning": "Anatomical Structure"
  },
  "SegmentedPropertyTypeCodeSequence": {
    "CodeValue": "1342375003",
    "CodingSchemeDesignator": "SCT",
    "CodeMeaning": "Structure of fibroglandular tissue of breast"
  },
  "recommendedDisplayRGBValue": [128, 174, 128]
}

Copy link
Contributor

@vanossj vanossj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

category: C_RADIOLOGIC_FINDING should change too

Copy link
Member

@LennyN95 LennyN95 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Run mhub.update inside of your image (during development) or re-build with the latest base image docker pull mhubai/base.

models/bamf_nnunet_mr_breast/config/default.yml Outdated Show resolved Hide resolved
@rahul99
Copy link
Author

rahul99 commented Aug 23, 2024

@LennyN95 , thanks for making breast related entries available on SegDB. I've rolled back the config :)

On standby if any more changes required. Thanks!

@rahul99 rahul99 requested a review from LennyN95 August 23, 2024 14:12
@LennyN95
Copy link
Member

@rahul99 I saw you use the BREAST+BREAST_FIBROGLANDULAR_TISSUE roi, however, since we switched to the SCT codes as suggested by @vanossj, this code represents a tissue now (rather than a radiological finding) and thus the BREAST+ context can (and should) be omitted.

@rahul99
Copy link
Author

rahul99 commented Aug 28, 2024

@rahul99 I saw you use the BREAST+BREAST_FIBROGLANDULAR_TISSUE roi, however, since we switched to the SCT codes as suggested by @vanossj, this code represents a tissue now (rather than a radiological finding) and thus the BREAST+ context can (and should) be omitted.

I have made the change. Thanks for educating me on this as well. :)

@rahul99
Copy link
Author

rahul99 commented Sep 4, 2024

@rahul99
Copy link
Author

rahul99 commented Sep 4, 2024

@LennyN95 , following your comment on the other PR, I have uploaded the output segmentation file above. Thanks!

@rahul99
Copy link
Author

rahul99 commented Sep 5, 2024

/test

sample:
  idc_version: 17.0
  data:
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.57828200249886597927204318690632931889
    aws_url: s3://idc-open-data/bf0fc92d-e708-468c-af90-0c02329f636d/*
    path: dicom

reference:
  url: https://www.dropbox.com/scl/fi/o2b26zrb20e080rcc0z5d/bamf_nnunet_mr_breast.zip?rlkey=8m434go1u7zthemx0w245pors&st=wetbolof&dl=0

bamf_nnunet_mr_breast.zip

@LennyN95
Copy link
Member

Please note, we updated our base image. All mhub dependencies are now installed in a virtual environment under /app/.venv running Python 3.11. Python, virtual environment and dependencies are now managed with uv. If required, you can create custom virtual environments, e.g., uv venv -p 3.8 .venv38 and use uv pip install -p .venv38 packge-name to install dependencies and uv run -p .venv3.8 python script.py to run a python script.

We also simplified our test routine. Sample and reference data now have to be uploaded to Zenodo and provided in a mhub.tom file at the project root. The process how to create and provide these sample data is explained in the updated testing phase article of our documentation. Under doi.org/10.5281/zenodo.13785615 we provide sample data as a reference.

@LennyN95
Copy link
Member

@rahul99 thx for adapting to the new test routine! To make the test pass, you can reference the download link to the zip file (https://zenodo.org/records/13843612/files/bamf_nnunet_mr_breast.test.zip?download=1).

@rahul99
Copy link
Author

rahul99 commented Sep 30, 2024

@LennyN95 , thanks for pointing it out. Let me know if you need more information from us. Looking forward to getting this PR over the line

@rahul99
Copy link
Author

rahul99 commented Sep 30, 2024

/test

sample:
  idc_version: 17.0
  data:
  - SeriesInstanceUID: 1.3.6.1.4.1.14519.5.2.1.57828200249886597927204318690632931889
    aws_url: s3://idc-open-data/bf0fc92d-e708-468c-af90-0c02329f636d/*
    path: dicom

reference:
  url: https://zenodo.org/records/13843612/files/bamf_nnunet_mr_breast.test.zip?download=1

@rahul99
Copy link
Author

rahul99 commented Nov 18, 2024

@LennyN95 , just circling back to see if you have an update for us. Thanks!

@LennyN95
Copy link
Member

Since this model relies on NNUnet V2, I paused the submission process until MIC-DKFZ/nnUNet#2556 is resolved. This will lead to an improved UX for MHub models and we're actively working on this now. Let me know if you are running against any deadlines and feel free to share your opinion / feedback in the related issue. Thx @rahul99!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Ready for Testing
Development

Successfully merging this pull request may close these issues.

5 participants