MHub / GC - Add GC nnUNet Pancreas model #39

Merged: 25 commits, Feb 28, 2024

Commits
0d7ec9d
WIP initial pancreas_pdac model commit
silvandeleemput Jul 4, 2023
af79adb
renamed nnunet_pancreas_pdac -> gc_nnunet_pancreas, addded run script…
silvandeleemput Jul 20, 2023
4bb7483
update for new base image, add dseg.json for class labels
silvandeleemput Aug 1, 2023
3be3473
Merge branch 'main' into m-gc-nnunet-pancreas
silvandeleemput Aug 1, 2023
12f30d8
cleanup code runner and run, configure dsegconverter, dataorganizer
silvandeleemput Aug 1, 2023
16d1893
add panimg backend for mhaconverter and cleanup
silvandeleemput Aug 30, 2023
aef205a
Merge branch 'MHubAI:main' into m-gc-nnunet-pancreas
silvandeleemput Sep 14, 2023
19d29d8
Updated and cleaned Dockerfile and Runner and added some comments
silvandeleemput Sep 14, 2023
790133a
Merge branch 'MHubAI:main' into m-gc-nnunet-pancreas
silvandeleemput Oct 10, 2023
737a5d2
add meta.json
silvandeleemput Oct 10, 2023
b04fb7e
Merge branch 'MHubAI:main' into m-gc-nnunet-pancreas
silvandeleemput Nov 23, 2023
334d573
update mhub model definition import Dockerfile
silvandeleemput Nov 23, 2023
4466e56
removed first comment line in Dockerfile
silvandeleemput Nov 23, 2023
2b035fe
Added segdb export, removed dseg.json, added remapped output to runne…
silvandeleemput Dec 7, 2023
a56e860
add cli for running the pdac_detection model
silvandeleemput Dec 7, 2023
713b0f3
added VEIN,ARTERY rois to output segmentation, cleaned config and run…
silvandeleemput Dec 12, 2023
55e34cf
add comments to CLI, add clean method and case-level likelihood extra…
silvandeleemput Jan 15, 2024
9c9508d
meta.json - update analysis section and evaluation data section #39
silvandeleemput Jan 15, 2024
4969367
updated model/algorithm version to latest commit, removed manual code…
silvandeleemput Jan 16, 2024
1b24825
meta.json - match model name #39
silvandeleemput Jan 16, 2024
561883b
meta.json - added disclaimer for output segmentation map
silvandeleemput Jan 18, 2024
2df25f3
updated to lastest version of algorithm, changed to output the raw he…
silvandeleemput Feb 1, 2024
f189937
meta.json - moved segmentation disclaimer to description and modified…
silvandeleemput Feb 1, 2024
726fdcb
fix dependencies conflict new base image #39
silvandeleemput Feb 8, 2024
343c6ab
meta.json - added version 0.1.0 to details
silvandeleemput Feb 26, 2024
1 change: 1 addition & 0 deletions models/gc_nnunet_pancreas/__init__.py
@@ -0,0 +1 @@
from .utils import *
44 changes: 44 additions & 0 deletions models/gc_nnunet_pancreas/config/default.yml
@@ -0,0 +1,44 @@
general:
  version: 1.0
  data_base_dir: /app/data
  description: base configuration for GC NNUnet Pancreas model (dicom to dicom, and json output)

execute:
- DicomImporter
- MhaConverter
- GCNNUnetPancreasRunner
- DsegConverter
- ReportExporter
- DataOrganizer

modules:
  DicomImporter:
    source_dir: input_data
    import_dir: sorted_data
    sort_data: true
    meta:
      mod: '%Modality'

  MhaConverter:
    engine: panimg
    targets: [dicom:mod=ct]

  DsegConverter:
    model_name: 'GC NNUnet Pancreas'
    source_segs: ['mha:mod=seg:src=cleaned']
    target_dicom: dicom:mod=ct
    skip_empty_slices: True

  ReportExporter:
    format: compact
    includes:
      - data: prostate_cancer_likelihood  # name of the ValueOutput declared in utils/GCNNUnetPancreasRunner.py; holds the pancreatic tumor likelihood
        label: prostate_cancer_likelihood
        value: value

  DataOrganizer:
    targets:
    - mha:mod=heatmap-->[i:sid]/nnunet_pancreas_heatmap.mha
    - mha:mod=seg:src=cleaned-->[i:sid]/nnunet_pancreas.seg.mha
    - dicomseg:mod=seg-->[i:sid]/nnunet_pancreas.seg.dcm
    - json:mod=report-->[i:sid]/nnunet_pancreas_case_level_likelihood.json
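Each DataOrganizer target in this config pairs a data-type query with an output path template, separated by `-->`. As a rough illustration of that shape, the sketch below splits one target string into its file type, metadata filters, and path template. The function name `parse_target` and the returned dictionary layout are hypothetical and exist only for this example; MHub's own parsing may well differ.

```python
from typing import Dict


def parse_target(target: str) -> Dict[str, object]:
    """Split 'ftype:key=value:...-->path/template' into its parts (illustrative only)."""
    # Everything left of '-->' is the data query, everything right is the path template.
    query, sep, path_template = target.partition("-->")
    if not sep:
        raise ValueError(f"missing '-->' separator in target: {target!r}")
    # The first ':'-separated token is the file type; the rest are key=value filters.
    ftype, *filters = query.split(":")
    meta = dict(f.split("=", 1) for f in filters)
    return {"ftype": ftype, "meta": meta, "path": path_template}


parsed = parse_target("mha:mod=seg:src=cleaned-->[i:sid]/nnunet_pancreas.seg.mha")
print(parsed)
```

Splitting on `-->` first keeps the colon inside the `[i:sid]` placeholder out of the query parsing.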
45 changes: 45 additions & 0 deletions models/gc_nnunet_pancreas/dockerfiles/Dockerfile
@@ -0,0 +1,45 @@
FROM mhubai/base:latest

# Specify/override authors label
LABEL authors="[email protected]"

# Install PyTorch 2.0.1 (CUDA enabled)
RUN pip3 install --no-cache-dir torch==2.0.1+cu118 -f https://download.pytorch.org/whl/torch_stable.html

# Install git-lfs (required for downloading the model weights)
RUN apt-get update && \
    apt-get install -y --no-install-recommends git-lfs && \
    rm -rf /var/lib/apt/lists/*

# Install the model weights and the algorithm files
# * Pull algorithm from repo into /opt/algorithm (main branch, commit 15dd550beada43a8a55b81a32d9b3904a1cf8d30)
# * Remove .git folder to keep docker layer small
RUN git clone https://github.com/DIAGNijmegen/CE-CT_PDAC_AutomaticDetection_nnUnet.git /opt/algorithm && \
cd /opt/algorithm && \
git reset --hard 15dd550beada43a8a55b81a32d9b3904a1cf8d30 && \
rm -rf /opt/algorithm/.git

# Set this environment variable as a shortcut to avoid nnunet 1.7.0 crashing the build
# by pulling sklearn instead of scikit-learn
# N.B. this is a known issue:
# https://github.com/MIC-DKFZ/nnUNet/issues/1281
# https://github.com/MIC-DKFZ/nnUNet/pull/1209
ENV SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True

# Install nnUNet 1.7.0 and other requirements
RUN pip3 install --no-cache-dir evalutils==0.3.0 nnunet==1.7.0

# Extend the nnUNet installation with custom trainers
RUN SITE_PKG=`pip3 show nnunet | grep "Location:" | awk '{print $2}'` && \
mv /opt/algorithm/nnUNetTrainerV2_Loss_CE_checkpoints.py "$SITE_PKG/nnunet/training/network_training/nnUNetTrainerV2_Loss_CE_checkpoints.py"

# Import the MHub model definition
ARG MHUB_MODELS_REPO
RUN buildutils/import_mhub_model.sh gc_nnunet_pancreas ${MHUB_MODELS_REPO}

# Add algorithm files to python path
ENV PYTHONPATH=/opt/algorithm:/app

# Configure main entrypoint
ENTRYPOINT ["python3", "-m", "mhubio.run"]
CMD ["--config", "/app/models/gc_nnunet_pancreas/config/default.yml"]
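The custom-trainer step in this Dockerfile finds the nnUNet install location by parsing `pip3 show` output with `grep` and `awk`. A similar lookup can be done from Python itself. The sketch below is only an illustration of that idea, not part of the PR: the helper name `package_dir` is hypothetical, and it is demonstrated on the standard-library `json` package so that it runs without nnUNet installed. Note that it returns the package directory itself (for nnUNet that would be `.../site-packages/nnunet`), whereas `pip3 show` reports the enclosing `site-packages` directory.

```python
import importlib.util
from pathlib import Path


def package_dir(name: str) -> Path:
    """Return the directory of an importable package without importing it."""
    spec = importlib.util.find_spec(name)
    # Plain modules have no search locations; only packages do.
    if spec is None or not spec.submodule_search_locations:
        raise ModuleNotFoundError(f"{name!r} is not an installed package")
    return Path(next(iter(spec.submodule_search_locations)))


json_dir = package_dir("json")
print(json_dir)
```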
149 changes: 149 additions & 0 deletions models/gc_nnunet_pancreas/meta.json
@@ -0,0 +1,149 @@
{
"id": "bf7ae4bb-c6f5-4b1e-89aa-a8de246def57",
"name": "gc_nnunet_pancreas",
"title": "Pancreatic Ductal Adenocarcinoma Detection in CT",
"summary": {
"description": "This algorithm produces a tumor likelihood heatmap for the presence of pancreatic ductal adenocarcinoma (PDAC) in an input venous-phase contrast-enhanced computed tomography scan (CECT). Additionally, the algorithm provides the segmentation of multiple surrounding anatomical structures such as the pancreatic duct, common bile duct, veins and arteries. The heatmap and segmentations are resampled to the same spatial resolution and physical dimensions as the input CECT image for easier visualisation.",
"inputs": [
{
"label": "Venous phase CT scan",
"description": "A contrast-enhanced CT scan in the venous phase and axial reconstruction",
"format": "DICOM",
"modality": "CT",
"bodypartexamined": "Abdomen",
"slicethickness": "2.5mm",
"non-contrast": false,
"contrast": true
}
],
"outputs": [
{
"type": "Prediction",
"valueType": "Likelihood map",
"label": "Pancreatic tumor likelihood heatmap",
"description": "Pancreatic tumor likelihood heatmap, where each voxel represents a floating point in range [0,1].",
"classes": []
},
{
"type": "Prediction",
"valueType": "Likelihood",
"label": "Pancreatic tumor likelihood",
"description": "Case-level pancreatic tumor likelihood value with a value in range [0,1].",
"classes": []
},
{
"type": "Segmentation",
"label": "Pancreas segmentation",
"description": "Segmentation of pancreas-related tissues. These segmentation classes were not thoroughly validated; use them at your own risk!",
"classes": [
"veins",
"arteries",
"pancreas",
"pancreatic duct",
"bile duct"
]
}
],
"model": {
"architecture": "nnUnet",
"training": "supervised",
"cmpapproach": "3D"
},
"data": {
"training": {
"vol_samples": 242
},
"evaluation": {
"vol_samples": 361
},
"public": true,
"external": false
}
},
"details": {
"name": "Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography",
"version": "0.1.0",
"devteam": "DIAGNijmegen (Diagnostic Image Analysis Group, Radboud UMC, The Netherlands)",
"type": "The models were developed using nnUnet. All models employed a 3D U-Net as the base architecture and were trained for 250,000 training steps with five-fold cross-validation.",
"date": {
"weights": "2023-06-28",
"code": "2022-07-19",
"pub": "2022-01-13"
},
"cite": "Alves N, Schuurmans M, Litjens G, Bosma JS, Hermans J, Huisman H. Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography. Cancers (Basel). 2022 Jan 13;14(2):376. doi: 10.3390/cancers14020376. PMID: 35053538; PMCID: PMC8774174.",
"license": {
"code": "Apache 2.0",
"weights": "Apache 2.0"
},
"publications": [
{
"title": "Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography",
"uri": "https://www.mdpi.com/2072-6694/14/2/376"
}
],
"github": "https://github.com/DIAGNijmegen/CE-CT_PDAC_AutomaticDetection_nnUnet",
"zenodo": "",
"colab": "",
"slicer": false
},
"info": {
"use": {
"title": "Intended Use",
"text": "This algorithm is intended to be used only on venous-phase CECT examinations of patients with clinical suspicion of PDAC. This algorithm should not be used in different patient demographics.",
"references": [],
"tables": []
},
"analyses": {
"title": "Analysis",
"text": "The study evaluated a medical model's performance for tumor detection by analyzing receiver operating characteristic (ROC) and free-response receiver operating characteristic (FROC) curves, assessing both tumor presence and lesion localization, and compared three configurations using statistical tests and ensemble modeling. The table below lists the model's performance on an external evaluation dataset of 361 cases. Additional analysis details and results can be found in the original paper [1].",
"references": [
{
"label": "Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography",
"uri": "https://www.mdpi.com/2072-6694/14/2/376"
}
],
"tables": [
{
"label": "Evaluation results of the nnUnet_MS model on the external test set of 361 cases.",
"entries": {
"Mean AUC-ROC (95% CI)": "0.991 (0.970-1.0)",
"Mean pAUC-FROC (95% CI)": "3.996 (3.027-4.965)"
}
}
]
},
"evaluation": {
"title": "Evaluation Data",
"text": "This framework was tested in an independent, external cohort consisting of two publicly available datasets of 281 and 80 patients, respectively. The first is the Medical Segmentation Decathlon pancreas dataset (training portion) [1], consisting of 281 patients with pancreatic malignancies (including lesions in the head, neck, body, and tail of the pancreas) and voxel-level annotations for the pancreas and lesion. The second is The Cancer Imaging Archive dataset from the US National Institutes of Health Clinical Center [2], containing 80 patients with normal pancreas and respective voxel-level annotations.",
"references": [
{
"label": "The Medical Segmentation Decathlon pancreas dataset (training portion)",
"uri": "http://medicaldecathlon.com/"
},
{
"label": "The Cancer Imaging Archive dataset from the US National Institutes of Health Clinical Center",
"uri": "https://wiki.cancerimagingarchive.net/display/Public/Pancreas-CT"
}
],
"tables": []
},
"training": {
"title": "Training data",
"text": "CE-CT scans in the portal venous phase from 119 patients with pathology-proven PDAC in the pancreatic head (PDAC cohort) and 123 patients with normal pancreas (non-PDAC cohort), acquired between 1 January 2013 and 1 June 2020, were selected for model development.",
"references": [],
"tables": []
},
"ethics": {
"title": "",
"text": "",
"references": [],
"tables": []
},
"limitations": {
"title": "Before using this model",
"text": "Test the model retrospectively and prospectively on a diagnostic cohort that reflects the target population on which the model will be used, to confirm its validity within a local setting.",
"references": [],
"tables": []
}
}
}
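The sample counts under `summary.data` in this meta.json can be cross-checked against the cohort sizes quoted in the training and evaluation texts (119 + 123 patients for training, 281 + 80 for evaluation). The sketch below does that arithmetic on an excerpt of the file. The excerpt, the cohort dictionaries, and the variable names are illustrative assumptions, not an MHub schema check.

```python
import json

# Excerpt of meta.json containing only the fields being checked.
meta_excerpt = json.loads("""
{
  "summary": {"data": {"training": {"vol_samples": 242},
                       "evaluation": {"vol_samples": 361}}}
}
""")

# Cohort sizes as quoted in the "training" and "evaluation" info texts.
training_cohorts = {"PDAC": 119, "non-PDAC": 123}
evaluation_cohorts = {"Medical Segmentation Decathlon": 281, "TCIA (NIH Clinical Center)": 80}

data = meta_excerpt["summary"]["data"]
training_ok = sum(training_cohorts.values()) == data["training"]["vol_samples"]
evaluation_ok = sum(evaluation_cohorts.values()) == data["evaluation"]["vol_samples"]
print(training_ok, evaluation_ok)
```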
85 changes: 85 additions & 0 deletions models/gc_nnunet_pancreas/utils/GCNNUnetPancreasRunner.py
@@ -0,0 +1,85 @@
"""
-----------------------------------------------------------
GC / MHub - Run Module for the GC NNUnet Pancreas Algorithm
-----------------------------------------------------------

-----------------------------------------------------------
Author: Sil van de Leemput
Email: [email protected]
-----------------------------------------------------------
"""

from mhubio.core import Module, Instance, InstanceData, DataType, Meta, IO, ValueOutput

from pathlib import Path
import SimpleITK
import sys


CLI_PATH = Path(__file__).parent / "cli.py"


@ValueOutput.Name('prostate_cancer_likelihood')  # referenced by ReportExporter in config/default.yml
@ValueOutput.Label('ProstateCancerLikelihood')
@ValueOutput.Meta(Meta(min=0.0, max=1.0, type="likelihood"))
@ValueOutput.Type(float)
@ValueOutput.Description('Likelihood of case-level pancreatic cancer.')
class ProstateCancerLikelihood(ValueOutput):
    pass


class GCNNUnetPancreasRunner(Module):
    @IO.Instance()
    @IO.Input('in_data', 'mha:mod=ct', the="input data")
    @IO.Output('heatmap', 'heatmap.mha', 'mha:mod=heatmap:model=GCNNUnetPancreas', data="in_data",
               the="raw heatmap of the pancreatic tumor likelihood (not masked with any pancreas segmentations).")
    @IO.Output('segmentation_raw', 'segmentation_raw.mha', 'mha:mod=seg:src=original:model=GCNNUnetPancreas:roi=VEIN,ARTERY,PANCREAS,PANCREATIC_DUCT,BILE_DUCT,PANCREAS+CYST,RENAL_VEIN', data="in_data",
               the="original segmentation of the pancreas, with the following classes: "
                   "0-background, 1-veins, 2-arteries, 3-pancreas, 4-pancreatic duct, 5-bile duct, 6-cysts, 7-renal vein")
    @IO.Output('segmentation', 'segmentation.mha', 'mha:mod=seg:src=cleaned:model=GCNNUnetPancreas:roi=VEIN,ARTERY,PANCREAS,PANCREATIC_DUCT,BILE_DUCT', data="in_data",
               the="cleaned segmentation of the pancreas, with the following classes: "
                   "0-background, 1-veins, 2-arteries, 3-pancreas, 4-pancreatic duct, 5-bile duct")
    @IO.OutputData('cancer_likelihood', ProstateCancerLikelihood, the='Case-level pancreatic tumor likelihood. This is equivalent to the maximum of the pancreatic tumor likelihood heatmap.')
    def task(self, instance: Instance, in_data: InstanceData, heatmap: InstanceData, segmentation_raw: InstanceData, segmentation: InstanceData, cancer_likelihood: ProstateCancerLikelihood, **kwargs) -> None:
        # Call the PDAC CLI
        # A CLI was used here to ensure the mhub framework properly captures the nnUNet stdout output
        cmd = [
            sys.executable,
            str(CLI_PATH),
            in_data.abspath,
            heatmap.abspath,
            segmentation_raw.abspath
        ]
        self.subprocess(cmd, text=True)

        # Remove cysts and renal vein classes from the original segmentation.
        # Insufficient training samples were present in the training data for these classes.
        # Hence, these classes should be omitted from the final output, since these are not
        # expected to produce reliable segmentations.
        self.clean_segmentation(
            segmentation_in=segmentation_raw,
            segmentation_out=segmentation
        )

        # Extract case-level cancer likelihood
        cancer_likelihood.value = self.extract_case_level_cancer_likelihood(
            heatmap=heatmap
        )

    def clean_segmentation(self, segmentation_in: InstanceData, segmentation_out: InstanceData):
        self.log("Cleaning output segmentation", level="NOTICE")
        seg_sitk = SimpleITK.ReadImage(segmentation_in.abspath)
        seg_numpy = SimpleITK.GetArrayFromImage(seg_sitk)
        seg_numpy[seg_numpy >= 6] = 0  # remove cysts and renal vein segmentation from original segmentation
        remapped_sitk = SimpleITK.GetImageFromArray(seg_numpy)
        remapped_sitk.CopyInformation(seg_sitk)
        SimpleITK.WriteImage(remapped_sitk, segmentation_out.abspath, True)

    def extract_case_level_cancer_likelihood(self, heatmap: InstanceData):
        self.log("Extracting case-level cancer likelihood", level="NOTICE")
        heatmap_sitk = SimpleITK.ReadImage(heatmap.abspath)
        f = SimpleITK.MinimumMaximumImageFilter()
        f.Execute(heatmap_sitk)
        cancer_likelihood = f.GetMaximum()
        assert 0.0 <= cancer_likelihood <= 1.0, "Cancer likelihood value must be in range [0.0, 1.0]"
        return cancer_likelihood
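The two post-processing steps in this runner (zeroing the cyst and renal vein labels, then taking the heatmap maximum as the case-level likelihood) can be tried on toy arrays without SimpleITK or any image files. The sketch below assumes NumPy is available and uses made-up array values; it mirrors the logic only and leaves out the image I/O and metadata copying that the runner performs.

```python
import numpy as np

# Toy label map covering all eight classes (0-background ... 7-renal vein).
seg = np.array([[0, 1, 2, 3],
                [4, 5, 6, 7]], dtype=np.uint8)

# Work on a copy so the raw segmentation stays available, as in the runner's two outputs.
cleaned = seg.copy()
cleaned[cleaned >= 6] = 0  # drop 6-cysts and 7-renal vein

# Toy heatmap; the case-level likelihood is its maximum voxel value.
heatmap = np.array([[0.0, 0.12],
                    [0.87, 0.40]])
case_level_likelihood = float(heatmap.max())

# Explicit range check instead of a bare assert, which `python -O` would skip.
if not 0.0 <= case_level_likelihood <= 1.0:
    raise ValueError("Cancer likelihood value must be in range [0.0, 1.0]")

print(cleaned.tolist(), case_level_likelihood)
```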
1 change: 1 addition & 0 deletions models/gc_nnunet_pancreas/utils/__init__.py
@@ -0,0 +1 @@
from .GCNNUnetPancreasRunner import *
60 changes: 60 additions & 0 deletions models/gc_nnunet_pancreas/utils/cli.py
@@ -0,0 +1,60 @@
"""
-------------------------------------------------------------
GC / MHub - CLI for the GC nnUnet Pancreas Algorithm
  The model algorithm was wrapped in a CLI to ensure
  the mhub framework is able to properly capture the nnUNet
  stdout/stderr outputs
-------------------------------------------------------------

-------------------------------------------------------------
Author: Sil van de Leemput
Email: [email protected]
-------------------------------------------------------------
"""
import argparse
from pathlib import Path

# Import the algorithm pipeline class from the CE-CT_PDAC_AutomaticDetection_nnUnet repository
from process import PDACDetectionContainer


def run_pdac_detection(
    input_ct_image: Path, output_heatmap: Path, output_segmentation: Path
):
    # Configure the algorithm pipeline class and run it
    algorithm = PDACDetectionContainer(output_raw_heatmap=True)
    algorithm.ct_image = input_ct_image
    algorithm.heatmap_raw = output_heatmap
    algorithm.segmentation = output_segmentation
    algorithm.process()


def run_pdac_detection_cli():
    parser = argparse.ArgumentParser("CLI for the GC nnUNet Pancreas Algorithm")
    parser.add_argument(
        "input_ct_image",
        type=str,
        help="input CT scan (MHA)"
    )
    parser.add_argument(
        "output_heatmap",
        type=str,
        help="raw heatmap of the pancreatic tumor likelihood (MHA)",
    )
    parser.add_argument(
        "output_segmentation",
        type=str,
        help="segmentation map of the pancreas (MHA), with the following classes: "
             "0-background, 1-veins, 2-arteries, 3-pancreas, 4-pancreatic duct, 5-bile duct, "
             "6-cysts, 7-renal vein",
    )
    args = parser.parse_args()
    run_pdac_detection(
        input_ct_image=Path(args.input_ct_image),
        output_heatmap=Path(args.output_heatmap),
        output_segmentation=Path(args.output_segmentation),
    )


if __name__ == "__main__":
    run_pdac_detection_cli()
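The runner launches this CLI as a child process through `sys.executable` and three positional paths, so that the nnUNet output arrives on the child's stdout/stderr. The sketch below shows the same launch pattern using the standard library's `subprocess.run` in place of MHub's `self.subprocess`. The inline child script and the file names are placeholders for illustration, not the real CLI or real data.

```python
import subprocess
import sys

# Stand-in for cli.py: a one-line child script that echoes its positional arguments.
child_script = "import sys; print('args:', *sys.argv[1:])"

cmd = [
    sys.executable,          # same interpreter as the parent process
    "-c", child_script,
    "ct.mha",                # placeholder input CT path
    "heatmap.mha",           # placeholder output heatmap path
    "segmentation_raw.mha",  # placeholder output segmentation path
]

# capture_output collects stdout/stderr; check=True raises on a non-zero exit code.
result = subprocess.run(cmd, capture_output=True, text=True, check=True)
print(result.stdout, end="")
```

Using `sys.executable` rather than a bare `python3` keeps the child in the same environment as the parent, which matters when the algorithm's dependencies are only installed there.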