MHub / GC - Add GC nnUNet Pancreas model #39

Merged 25 commits on Feb 28, 2024
Changes from 16 commits
Commits
0d7ec9d
WIP initial pancreas_pdac model commit
silvandeleemput Jul 4, 2023
af79adb
renamed nnunet_pancreas_pdac -> gc_nnunet_pancreas, addded run script…
silvandeleemput Jul 20, 2023
4bb7483
update for new base image, add dseg.json for class labels
silvandeleemput Aug 1, 2023
3be3473
Merge branch 'main' into m-gc-nnunet-pancreas
silvandeleemput Aug 1, 2023
12f30d8
cleanup code runner and run, configure dsegconverter, dataorganizer
silvandeleemput Aug 1, 2023
16d1893
add panimg backend for mhaconverter and cleanup
silvandeleemput Aug 30, 2023
aef205a
Merge branch 'MHubAI:main' into m-gc-nnunet-pancreas
silvandeleemput Sep 14, 2023
19d29d8
Updated and cleaned Dockerfile and Runner and added some comments
silvandeleemput Sep 14, 2023
790133a
Merge branch 'MHubAI:main' into m-gc-nnunet-pancreas
silvandeleemput Oct 10, 2023
737a5d2
add meta.json
silvandeleemput Oct 10, 2023
b04fb7e
Merge branch 'MHubAI:main' into m-gc-nnunet-pancreas
silvandeleemput Nov 23, 2023
334d573
update mhub model definition import Dockerfile
silvandeleemput Nov 23, 2023
4466e56
removed first comment line in Dockerfile
silvandeleemput Nov 23, 2023
2b035fe
Added segdb export, removed dseg.json, added remapped output to runne…
silvandeleemput Dec 7, 2023
a56e860
add cli for running the pdac_detection model
silvandeleemput Dec 7, 2023
713b0f3
added VEIN,ARTERY rois to output segmentation, cleaned config and run…
silvandeleemput Dec 12, 2023
55e34cf
add comments to CLI, add clean method and case-level likelihood extra…
silvandeleemput Jan 15, 2024
9c9508d
meta.json - update analysis section and evaluation data section #39
silvandeleemput Jan 15, 2024
4969367
updated model/algorithm version to latest commit, removed manual code…
silvandeleemput Jan 16, 2024
1b24825
meta.json - match model name #39
silvandeleemput Jan 16, 2024
561883b
meta.json - added disclaimer for output segmentation map
silvandeleemput Jan 18, 2024
2df25f3
updated to lastest version of algorithm, changed to output the raw he…
silvandeleemput Feb 1, 2024
f189937
meta.json - moved segmentation disclaimer to description and modified…
silvandeleemput Feb 1, 2024
726fdcb
fix dependencies conflict new base image #39
silvandeleemput Feb 8, 2024
343c6ab
meta.json - added version 0.1.0 to details
silvandeleemput Feb 26, 2024
1 change: 1 addition & 0 deletions models/gc_nnunet_pancreas/__init__.py
@@ -0,0 +1 @@
from .utils import *
35 changes: 35 additions & 0 deletions models/gc_nnunet_pancreas/config/default.yml
@@ -0,0 +1,35 @@
general:
  version: 1.0
  data_base_dir: /app/data
  description: base configuration for GC NNUnet Pancreas model (dicom to dicom)

execute:
- DicomImporter
- MhaConverter
- GCNNUnetPancreasRunner
- DsegConverter
- DataOrganizer

modules:
  DicomImporter:
    source_dir: input_data
    import_dir: sorted_data
    sort_data: true
    meta:
      mod: '%Modality'

  MhaConverter:
    engine: panimg
    targets: [dicom:mod=ct]

  DsegConverter:
    model_name: 'GC NNUnet Pancreas'
    source_segs: ['mha:mod=seg']
    target_dicom: dicom:mod=ct
    skip_empty_slices: True

  DataOrganizer:
    targets:
    - mha:mod=heatmap-->[i:sid]/nnunet_pancreas_heatmap.mha
    - mha:mod=seg-->[i:sid]/nnunet_pancreas.seg.mha
    - dicomseg:mod=seg-->[i:sid]/nnunet_pancreas.seg.dcm
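Each `DataOrganizer` target in this config pairs a data query (for example `mha:mod=heatmap`) with an output path template (for example `[i:sid]/nnunet_pancreas_heatmap.mha`), joined by `-->`. The sketch below only illustrates how such a string can be read. `parse_target` and `Target` are hypothetical names, not part of mhubio, and the real parsing in mhubio may differ.

```python
from typing import Dict, NamedTuple


class Target(NamedTuple):
    file_type: str            # e.g. "mha" or "dicomseg"
    filters: Dict[str, str]   # e.g. {"mod": "heatmap"}
    path_template: str        # e.g. "[i:sid]/nnunet_pancreas_heatmap.mha"


def parse_target(spec: str) -> Target:
    """Split '<type>:<key>=<value>[:...]--><path>' into its parts (hypothetical helper)."""
    # Split on the arrow first, because the path template may itself contain ':'.
    query, path_template = spec.split("-->", 1)
    file_type, *pairs = query.split(":")
    filters = dict(pair.split("=", 1) for pair in pairs)
    return Target(file_type, filters, path_template)


for spec in (
    "mha:mod=heatmap-->[i:sid]/nnunet_pancreas_heatmap.mha",
    "mha:mod=seg-->[i:sid]/nnunet_pancreas.seg.mha",
    "dicomseg:mod=seg-->[i:sid]/nnunet_pancreas.seg.dcm",
):
    print(parse_target(spec))
```

Read this way, the three targets export the heatmap, the MHA segmentation and the DICOM SEG file into one folder per `[i:sid]` placeholder.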
53 changes: 53 additions & 0 deletions models/gc_nnunet_pancreas/dockerfiles/Dockerfile
@@ -0,0 +1,53 @@
FROM mhubai/base:latest

# Specify/override authors label
LABEL authors="[email protected]"

# Install PyTorch 2.0.1 (CUDA enabled)
RUN pip3 install --no-cache-dir torch==2.0.1+cu118 -f https://download.pytorch.org/whl/torch_stable.html

# Install git-lfs (required for downloading the model weights)
RUN apt update && \
    apt install -y --no-install-recommends git-lfs && \
    rm -rf /var/lib/apt/lists/*

# Install the model weights and the algorithm files
# * Pull algorithm from repo into /opt/algorithm (main branch, commit e4f4008c6e18e60a79f693448562a340a9252aa8)
# * Remove .git folder to keep docker layer small
# * Replace input images path in process.py with an existing folder to avoid errors
# * Add specific data types and compression options to output data structures in process.py to reduce generated output footprint
RUN git clone https://github.com/DIAGNijmegen/CE-CT_PDAC_AutomaticDetection_nnUnet.git /opt/algorithm && \
    cd /opt/algorithm && \
    git reset --hard e4f4008c6e18e60a79f693448562a340a9252aa8 && \
    rm -rf /opt/algorithm/.git && \
    sed -i 's/Path("\/input\/images\/")/Path("\/app")/g' /opt/algorithm/process.py && \
    sed -i 's/pred_2_np = sitk\.GetArrayFromImage(pred_2_nii)/pred_2_np = sitk\.GetArrayFromImage(pred_2_nii)\.astype(np\.uint8)/g' /opt/algorithm/process.py && \
    sed -i 's/pm_image = np\.zeros(image_np\.shape)/pm_image = np\.zeros(image_np\.shape, dtype=np\.float32)/g' /opt/algorithm/process.py && \
    sed -i 's/segmentation_np = np\.zeros(image_np\.shape)/segmentation_np = np\.zeros(image_np\.shape, dtype=np\.uint8)/g' /opt/algorithm/process.py && \
    sed -i 's/sitk\.WriteImage(segmentation_image, str(self\.segmentation))/sitk\.WriteImage(segmentation_image, str(self\.segmentation), True)/g' /opt/algorithm/process.py && \
    sed -i 's/sitk\.WriteImage(pred_itk_resampled, str(self\.heatmap))/sitk\.WriteImage(pred_itk_resampled, str(self\.heatmap), True)/g' /opt/algorithm/process.py

# Set this environment variable as a shortcut to avoid nnunet 1.7.0 crashing the build
# by pulling sklearn instead of scikit-learn
# N.B. this is a known issue:
# https://github.com/MIC-DKFZ/nnUNet/issues/1281
# https://github.com/MIC-DKFZ/nnUNet/pull/1209
ENV SKLEARN_ALLOW_DEPRECATED_SKLEARN_PACKAGE_INSTALL=True

# Install nnUNet 1.7.0 and other requirements
RUN pip3 install --no-cache-dir -r /opt/algorithm/requirements.txt

# Extend the nnUNet installation with custom trainers
RUN SITE_PKG=`pip3 show nnunet | grep "Location:" | awk '{print $2}'` && \
    mv /opt/algorithm/nnUNetTrainerV2_Loss_CE_checkpoints.py "$SITE_PKG/nnunet/training/network_training/nnUNetTrainerV2_Loss_CE_checkpoints.py"

# Import the MHub model definition
ARG MHUB_MODELS_REPO
RUN buildutils/import_mhub_model.sh gc_nnunet_pancreas ${MHUB_MODELS_REPO}

# Add algorithm files to python path
ENV PYTHONPATH=/opt/algorithm:/app

# Configure main entrypoint
ENTRYPOINT ["python3", "-m", "mhubio.run"]
CMD ["--config", "/app/models/gc_nnunet_pancreas/config/default.yml"]
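The `sed` calls in this Dockerfile patch `process.py` in place: one redirects the input path, three pin array dtypes, and two pass `True` as the third argument of `sitk.WriteImage` to enable compression. As a reading aid, the same substitutions can be written as plain string replacements in Python. This is only a sketch: `apply_patches` is a hypothetical helper, and the sample lines are taken from the `sed` patterns themselves, not from the actual contents of `process.py`.

```python
# (old, new) pairs copied from the sed expressions in the Dockerfile.
PATCHES = [
    ('Path("/input/images/")', 'Path("/app")'),
    ("pred_2_np = sitk.GetArrayFromImage(pred_2_nii)",
     "pred_2_np = sitk.GetArrayFromImage(pred_2_nii).astype(np.uint8)"),
    ("pm_image = np.zeros(image_np.shape)",
     "pm_image = np.zeros(image_np.shape, dtype=np.float32)"),
    ("segmentation_np = np.zeros(image_np.shape)",
     "segmentation_np = np.zeros(image_np.shape, dtype=np.uint8)"),
    ("sitk.WriteImage(segmentation_image, str(self.segmentation))",
     "sitk.WriteImage(segmentation_image, str(self.segmentation), True)"),
    ("sitk.WriteImage(pred_itk_resampled, str(self.heatmap))",
     "sitk.WriteImage(pred_itk_resampled, str(self.heatmap), True)"),
]


def apply_patches(source: str) -> str:
    """Apply every literal replacement once, in order (hypothetical helper)."""
    for old, new in PATCHES:
        source = source.replace(old, new)
    return source


sample = "\n".join(old for old, _ in PATCHES)
print(apply_patches(sample))
```

Like the `sed` version, this is not idempotent: running it twice would append `.astype(np.uint8)` a second time, which is one reason the Dockerfile pins the upstream commit before patching.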
129 changes: 129 additions & 0 deletions models/gc_nnunet_pancreas/meta.json
@@ -0,0 +1,129 @@
{
"id": "bf7ae4bb-c6f5-4b1e-89aa-a8de246def57",
"name": "pdac_detection_in_ct",
"title": "Pancreatic Ductal Adenocarcinoma Detection in CT",
"summary": {
"description": "This algorithm produces a tumor likelihood heatmap for the presence of pancreatic ductal adenocarcinoma (PDAC) in an input venous-phase contrast-enhanced computed tomography scan (CECT). Additionally, the algorithm provides the segmentation of multiple surrounding anatomical structures such as the pancreatic duct, common bile duct, veins and arteries. The heatmap and segmentations are resampled to the same spatial resolution and physical dimensions as the input CECT image for easier visualisation.",
"inputs": [
{
"label": "Venous phase CT scan",
"description": "A contrast-enhanced CT scan in the venous phase and axial reconstruction",
"format": "DICOM",
"modality": "CT",
"bodypartexamined": "Abdomen",
"slicethickness": "2.5mm",
"non-contrast": false,
"contrast": true
}
],
"outputs": [
{
"type": "Segmentation",
"classes": [
"veins",
"arteries",
"pancreas",
"pancreatic duct",
"bile duct",
"cysts",
"renal vein"
]
},
{
"type": "Prediction",
"valueType": "number",
"label": "Pancreatic tumor likelihood",
"description": "Pancreatic tumor likelihood map with values between 0 and 1",
"classes": []
}
],
"model": {
"architecture": "nnUnet",
"training": "supervised",
"cmpapproach": "3D"
},
"data": {
"training": {
"vol_samples": 242
},
"evaluation": {
"vol_samples": 361
},
"public": true,
"external": false
}
},
"details": {
"name": "Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography",
"version": "",
"devteam": "DIAGNijmegen (Diagnostic Image Analysis Group, Radboud UMC, The Netherlands)",
"type": "The models were developed using nnUnet. All models employed a 3D U-Net as the base architecture and were trained for 250,000 training steps with five-fold cross-validation.",
"date": {
"weights": "2023-06-28",
"code": "2022-07-19",
"pub": "2022-01-13"
},
"cite": "Alves N, Schuurmans M, Litjens G, Bosma JS, Hermans J, Huisman H. Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography. Cancers (Basel). 2022 Jan 13;14(2):376. doi: 10.3390/cancers14020376. PMID: 35053538; PMCID: PMC8774174.",
"license": {
"code": "Apache 2.0",
"weights": "Apache 2.0"
},
"publications": [
{
"title": "Fully Automatic Deep Learning Framework for Pancreatic Ductal Adenocarcinoma Detection on Computed Tomography",
"uri": "https://www.mdpi.com/2072-6694/14/2/376"
}
],
"github": "https://github.com/DIAGNijmegen/CE-CT_PDAC_AutomaticDetection_nnUnet",
"zenodo": "",
"colab": "",
"slicer": false
},
"info": {
"use": {
"title": "Intended Use",
"text": "This algorithm is intended to be used only on venous-phase CECT examinations of patients with clinical suspicion of PDAC. This algorithm should not be used in different patient demographics.",
"references": [],
"tables": []
},
"analyses": {
"title": "Analysis",
"text": "The study evaluated a medical model's performance for tumor detection by analyzing receiver operating characteristic (ROC) and free-response receiver operating characteristic (FROC) curves, assessing both tumor presence and lesion localization, and compared three configurations using statistical tests and ensemble modeling.",
"references": [],
"tables": []
},
"evaluation": {
"title": "Evaluation Data",
"text": "This framework was tested in an independent, external cohort consisting of two publicly available datasets.",
"references": [
{
"label": "The Medical Segmentation Decathlon pancreas dataset (training portion) consisting of 281 patients with pancreatic malignancies (including lesions in the head, neck, body, and tail of the pancreas) and voxel-level annotations for the pancreas and lesion.",
"uri": "http://medicaldecathlon.com/"
},
{
"label": "The Cancer Imaging Archive dataset from the US National Institutes of Health Clinical Center, containing 80 patients with normal pancreas and respective voxel-level annotations.",
"uri": "https://wiki.cancerimagingarchive.net/display/Public/Pancreas-CT"
}
],
"tables": []
},
"training": {
"title": "Training data",
"text": "CE-CT scans in the portal venous phase from 119 patients with pathology-proven PDAC in the pancreatic head (PDAC cohort) and 123 patients with normal pancreas (non-PDAC cohort), acquired between 1 January 2013 and 1 June 2020, were selected for model development.",
"references": [],
"tables": []
},
"ethics": {
"title": "",
"text": "",
"references": [],
"tables": []
},
"limitations": {
"title": "Before using this model",
"text": "Test the model retrospectively and prospectively on a diagnostic cohort that reflects the target population that the model will be used upon to confirm the validity of the model within a local setting.",
"references": [],
"tables": []
}
}
}
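The `vol_samples` values in `summary.data` can be cross-checked against the cohort sizes quoted in the `training` and `evaluation` text sections of this file. A minimal sketch of that arithmetic follows; the dictionary keys are shorthand labels chosen here for illustration, not fields of `meta.json`.

```python
# Cohort sizes as stated in the "training" and "evaluation" text of meta.json.
training_cohorts = {"PDAC": 119, "non-PDAC": 123}
evaluation_cohorts = {"Medical Segmentation Decathlon": 281, "TCIA Pancreas-CT": 80}

# Totals declared under summary.data.*.vol_samples.
declared = {"training": 242, "evaluation": 361}

totals = {
    "training": sum(training_cohorts.values()),
    "evaluation": sum(evaluation_cohorts.values()),
}

for split, total in totals.items():
    status = "ok" if total == declared[split] else "MISMATCH"
    print(f"{split}: {total} (declared {declared[split]}) {status}")
```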
40 changes: 40 additions & 0 deletions models/gc_nnunet_pancreas/utils/GCNNUnetPancreasRunner.py
@@ -0,0 +1,40 @@
"""
-----------------------------------------------------------
GC / MHub - Run Module for the GC NNUnet Pancreas Algorithm
-----------------------------------------------------------

-----------------------------------------------------------
Author: Sil van de Leemput
Email: [email protected]
-----------------------------------------------------------
"""

from mhubio.core import Module, Instance, InstanceData, DataType, Meta, IO

from pathlib import Path
import SimpleITK
import numpy as np
import sys


CLI_PATH = Path(__file__).parent / "cli.py"


class GCNNUnetPancreasRunner(Module):
    @IO.Instance()
    @IO.Input('in_data', 'mha:mod=ct', the="input data")
    @IO.Output('heatmap', 'heatmap.mha', 'mha:mod=heatmap:model=GCNNUnetPancreas', data="in_data",
               the="heatmap of the pancreatic tumor likelihood")
    @IO.Output('segmentation', 'segmentation.mha', 'mha:mod=seg:model=GCNNUnetPancreas:roi=VEIN,ARTERY,PANCREAS,PANCREATIC_DUCT,BILE_DUCT,PANCREAS+CYST,RENAL_VEIN', data="in_data",
               the="original segmentation of the pancreas, with the following classes: "
                   "0-background, 1-veins, 2-arteries, 3-pancreas, 4-pancreatic duct, 5-bile duct, 6-cysts, 7-renal vein")
    def task(self, instance: Instance, in_data: InstanceData, heatmap: InstanceData, segmentation: InstanceData, **kwargs) -> None:
        # Call the PDAC CLI
        cmd = [
            sys.executable,
            str(CLI_PATH),
            in_data.abspath,
            heatmap.abspath,
            segmentation.abspath
        ]
        self.subprocess(cmd, text=True)
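The runner does not import the algorithm directly. It starts `cli.py` as a separate process under the same interpreter (`sys.executable`) and passes three positional paths. The sketch below shows only how that argument list is assembled. `build_cli_command` and the example paths are hypothetical, and `subprocess.run` is named in a comment as a standard-library stand-in, on the assumption that `Module.subprocess` in mhubio behaves comparably.

```python
import sys
from pathlib import Path
from typing import List


def build_cli_command(cli_path: Path, ct: str, heatmap: str, segmentation: str) -> List[str]:
    """Return the argv list for the PDAC CLI: interpreter, script, then three paths."""
    return [sys.executable, str(cli_path), ct, heatmap, segmentation]


cmd = build_cli_command(
    Path("utils") / "cli.py",       # hypothetical location of cli.py
    "/tmp/case/ct.mha",             # hypothetical input path
    "/tmp/case/heatmap.mha",        # hypothetical output path
    "/tmp/case/segmentation.mha",   # hypothetical output path
)
print(cmd)

# Outside of mhubio, such a list could be executed with:
#   subprocess.run(cmd, check=True, text=True)
```

Passing a list instead of a single shell string avoids quoting problems with paths that contain spaces.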
1 change: 1 addition & 0 deletions models/gc_nnunet_pancreas/utils/__init__.py
@@ -0,0 +1 @@
from .GCNNUnetPancreasRunner import *
57 changes: 57 additions & 0 deletions models/gc_nnunet_pancreas/utils/cli.py
@@ -0,0 +1,57 @@
"""
----------------------------------------------------
GC / MHub - CLI for the GC nnUnet Pancreas Algorithm
----------------------------------------------------

----------------------------------------------------
Author: Sil van de Leemput
Email: [email protected]
----------------------------------------------------
"""
import argparse
from pathlib import Path

# Import the algorithm pipeline class from the CE-CT_PDAC_AutomaticDetection_nnUnet repository
from process import PDACDetectionContainer


def run_pdac_detection(
    input_ct_image: Path, output_heatmap: Path, output_segmentation: Path
):
    # Configure the algorithm pipeline class and run it
    algorithm = PDACDetectionContainer()
    algorithm.ct_image = str(input_ct_image)  # set as str not Path
    algorithm.heatmap = output_heatmap
    algorithm.segmentation = output_segmentation
    algorithm.process()


def run_pdac_detection_cli():
    parser = argparse.ArgumentParser("CLI for the GC nnUNet Pancreas Algorithm")
    parser.add_argument(
        "input_ct_image",
        type=str,
        help="input CT scan (MHA)"
    )
    parser.add_argument(
        "output_heatmap",
        type=str,
        help="heatmap of the pancreatic tumor likelihood (MHA)",
    )
    parser.add_argument(
        "output_segmentation",
        type=str,
        help="segmentation map of the pancreas (MHA), with the following classes: "
             "0-background, 1-veins, 2-arteries, 3-pancreas, 4-pancreatic duct, 5-bile duct, "
             "6-cysts, 7-renal vein",
    )
    args = parser.parse_args()
    run_pdac_detection(
        input_ct_image=Path(args.input_ct_image),
        output_heatmap=Path(args.output_heatmap),
        output_segmentation=Path(args.output_segmentation),
    )


if __name__ == "__main__":
    run_pdac_detection_cli()
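The CLI takes three positional arguments and documents an eight-value label scheme in its help text. The sketch below reproduces both without importing `process`, so it runs without the algorithm installed. `SEGMENTATION_LABELS`, `build_parser` and the example file names are hypothetical, introduced here for illustration only.

```python
import argparse

# Label values as listed in the help text of the output_segmentation argument.
SEGMENTATION_LABELS = {
    0: "background",
    1: "veins",
    2: "arteries",
    3: "pancreas",
    4: "pancreatic duct",
    5: "bile duct",
    6: "cysts",
    7: "renal vein",
}


def build_parser() -> argparse.ArgumentParser:
    """Parser with the same three positional arguments as cli.py."""
    parser = argparse.ArgumentParser("CLI for the GC nnUNet Pancreas Algorithm")
    parser.add_argument("input_ct_image", type=str, help="input CT scan (MHA)")
    parser.add_argument("output_heatmap", type=str, help="tumor likelihood heatmap (MHA)")
    parser.add_argument("output_segmentation", type=str, help="segmentation map (MHA)")
    return parser


# Passing an explicit list instead of reading sys.argv keeps the example self-contained.
args = build_parser().parse_args(["ct.mha", "heatmap.mha", "segmentation.mha"])
print(args.input_ct_image, args.output_heatmap, args.output_segmentation)
print(SEGMENTATION_LABELS[3])
```

Because the arguments are positional, their order on the command line matters: input first, then heatmap, then segmentation, which matches the order used by the runner module.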