MHub / GC - Add PICAI baseline model/algorithm #60

Merged on Jan 10, 2024 (16 commits)
Changes from 6 commits
34 changes: 34 additions & 0 deletions models/gc_picai_baseline/config/default.yml
@@ -0,0 +1,34 @@
general:
  data_base_dir: /app/data
  version: 1.0
  description: Prostate MRI classification default (dicom to json)

execute:
- FileStructureImporter
- MhaConverter
- PicaiBaselineRunner
- ReportExporter
- DataOrganizer

modules:
  FileStructureImporter:
    input_dir: input_data
    structures:
    - $sid@instance/$type@dicom:mod=mr
    import_id: sid

  MhaConverter:
    engine: panimg
    allow_multi_input: true

  ReportExporter:
    format: compact
    includes:
    - data: prostate_cancer_probability
      label: prostate_cancer_probability
      value: value

  DataOrganizer:
    targets:
    - json:mod=report-->[i:sid]/cspca-case-level-likelihood.json
    - mha:mod=hm-->[i:sid]/cspca-detection-map.mha
31 changes: 31 additions & 0 deletions models/gc_picai_baseline/config/mha-pipeline.yml
@@ -0,0 +1,31 @@
general:
  data_base_dir: /app/data
  version: 1.0
  description: Prostate MRI classification MHA pipeline (mha to json)

execute:
- FileStructureImporter
- PicaiBaselineRunner
- ReportExporter
- DataOrganizer

modules:
  FileStructureImporter:
    input_dir: input_data
    structures:
    - $sid@instance/images/transverse-adc-prostate-mri/adc.mha@mha:mod=mradc
    - $sid/images/transverse-t2-prostate-mri/t2w.mha@mha:mod=mrt2
    - $sid/images/transverse-hbv-prostate-mri/hbv.mha@mha:mod=mrhbv
    import_id: sid

  ReportExporter:
    format: compact
    includes:
    - data: prostate_cancer_probability
      label: prostate_cancer_probability
      value: value

  DataOrganizer:
    targets:
    - json:mod=report-->[i:sid]/cspca-case-level-likelihood.json
    - mha:mod=hm-->[i:sid]/cspca-detection-map.mha
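
The `structures` entries in mha-pipeline.yml imply a fixed on-disk layout per case. A minimal sketch of a pre-flight check for that layout could look as follows. It uses only the Python standard library and does not call any MHub API; the helper names `expected_mha_inputs` and `missing_inputs` are hypothetical, and the relative paths are copied from the three structure patterns above.

```python
from pathlib import Path

# Relative locations copied from the `structures` entries of mha-pipeline.yml.
MHA_LAYOUT = {
    "adc": "images/transverse-adc-prostate-mri/adc.mha",
    "t2w": "images/transverse-t2-prostate-mri/t2w.mha",
    "hbv": "images/transverse-hbv-prostate-mri/hbv.mha",
}


def expected_mha_inputs(input_dir, sid):
    """Return the three MHA paths expected for one case (hypothetical helper)."""
    root = Path(input_dir) / sid
    return {name: root / rel for name, rel in MHA_LAYOUT.items()}


def missing_inputs(input_dir, sid):
    """Return the sorted names of expected scans that are not present on disk."""
    expected = expected_mha_inputs(input_dir, sid)
    return sorted(name for name, path in expected.items() if not path.is_file())
```

Calling `missing_inputs("/app/data/input_data", "case1")` before starting the pipeline would list which of the three scans are absent for the (made-up) case id `case1`.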
46 changes: 46 additions & 0 deletions models/gc_picai_baseline/dockerfiles/Dockerfile
@@ -0,0 +1,46 @@
FROM mhubai/base:latest

# Specify/override authors label
LABEL authors="[email protected]"

# Install PyTorch 2.0.1 (CUDA enabled)
RUN pip3 install --no-cache-dir torch==2.0.1+cu118 -f https://download.pytorch.org/whl/torch_stable.html

# Install git-lfs (required for unpacking model weights)
RUN apt update && apt install -y --no-install-recommends git-lfs && rm -rf /var/lib/apt/lists/*

# Install PICAI baseline algorithm and model weights
# - Git clone the algorithm repository (pinned to the v2.1.1 tag)
# - We remove unnecessary files for a more compact docker layer
# - Subsequently we remove the .git directory to produce a more compact docker layer
RUN git clone --depth 1 --branch v2.1.1 https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm.git /opt/algorithm && \
    rm -rf /opt/algorithm/test && \
    rm -rf /opt/algorithm/.git

# Install additional PICAI requirements
RUN pip3 install --no-cache-dir -r /opt/algorithm/requirements.txt

# Extend the nnUNet installation with custom trainers
RUN SITE_PKG=`pip3 show nnunet | grep "Location:" | awk '{print $2}'` && \
    mv /opt/algorithm/nnUNetTrainerV2_focalLoss.py "$SITE_PKG/nnunet/training/network_training/nnUNet_variants/loss_function/nnUNetTrainerV2_focalLoss.py"
RUN SITE_PKG=`pip3 show nnunet | grep "Location:" | awk '{print $2}'` && \
    mv /opt/algorithm/nnUNetTrainerV2_Loss_CE_checkpoints.py "$SITE_PKG/nnunet/training/network_training/nnUNetTrainerV2_Loss_CE_checkpoints.py"
RUN SITE_PKG=`pip3 show nnunet | grep "Location:" | awk '{print $2}'` && \
    mv /opt/algorithm/nnUNetTrainerV2_Loss_FL_and_CE.py "$SITE_PKG/nnunet/training/network_training/nnUNetTrainerV2_Loss_FL_and_CE.py"

# Two code edits to the __init__ method of the algorithm class in process.py to prevent some of its default behavior
# 1. Skip the forced error caused by using different input locations than expected (we don't use the GC dirs)
# 2. Prevent unnecessary folder creation before input directories have been set (we will set the correct directory later)
RUN sed -i "s|file_paths = list(Path(folder).glob(scan_glob_format))|return|g" /opt/algorithm/process.py && \
    sed -i "s|self.cspca_detection_map_path.parent.mkdir(exist_ok=True, parents=True)||g" /opt/algorithm/process.py

# Import the MHub model definition
ARG MHUB_MODELS_REPO
RUN buildutils/import_mhub_model.sh gc_picai_baseline ${MHUB_MODELS_REPO}

# Add the PICAI algorithm code base to the python path
ENV PYTHONPATH="/app:/opt/algorithm"

# Default entrypoint
ENTRYPOINT ["python3", "-m", "mhubio.run"]
CMD ["--config", "/app/models/gc_picai_baseline/config/default.yml"]
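
The two `sed` substitutions in the Dockerfile above rewrite lines of process.py in place. The sketch below illustrates their effect on a stand-in snippet. `SAMPLE` is a hypothetical excerpt, since the real process.py is not part of this diff, and `str.replace` is used as an approximation of `sed`, which treats the pattern as a basic regular expression (here `.` would match any character, while the parentheses stay literal).

```python
# Hypothetical stand-in for the two affected lines of /opt/algorithm/process.py;
# the real file is not shown in this diff.
SAMPLE = (
    "        file_paths = list(Path(folder).glob(scan_glob_format))\n"
    "        self.cspca_detection_map_path.parent.mkdir(exist_ok=True, parents=True)\n"
)


def apply_patches(source: str) -> str:
    """Approximate the two `sed -i "s|...|...|g"` calls with literal replacement."""
    # 1. Replace the input scan lookup with an early `return`.
    patched = source.replace(
        "file_paths = list(Path(folder).glob(scan_glob_format))", "return"
    )
    # 2. Delete the output folder creation (the line's indentation is left behind).
    patched = patched.replace(
        "self.cspca_detection_map_path.parent.mkdir(exist_ok=True, parents=True)", ""
    )
    return patched


patched = apply_patches(SAMPLE)
```

After patching, the first line becomes a bare `return` and the second becomes a whitespace-only line, which is why the runner module has to set the scan and output paths itself.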
138 changes: 138 additions & 0 deletions models/gc_picai_baseline/meta.json
@@ -0,0 +1,138 @@
{
"id": "c5f886fb-9f54-4555-a954-da02b22d6d3f",
"name": "picai_baseline",
"title": "PI-CAI challenge baseline",
"summary": {
"description": "The PI-CAI challenge is to validate modern AI algorithms at clinically significant prostate cancer (csPCa) detection and diagnosis. This model algorithm provides the baseline for the challenge.",
"inputs": [
{
"label": "Prostate biparametric MRI",
"description": "Prostate biparametric MRI exam",
"format": "DICOM",
"modality": "MR",
"bodypartexamined": "Prostate",
"slicethickness": "",
"non-contrast": false,
"contrast": false
}
],
"outputs": [
{
"type": "Prediction",
"valueType": "Probability",
"label": "Prostate cancer probability",
"description": "Case-level likelihood of harboring clinically significant prostate cancer, in range [0,1]",
"classes": []
},
{
"type": "Prediction",
"valueType": "Probability map",
"label": "Transverse cancer detection map",
"description": "Detection map of clinically significant prostate cancer lesions in 3D, where each voxel represents a floating point in range [0,1]",
"classes": []
}
],
"model": {
"architecture": "3d fullres nnUNet",
"training": "supervised",
"cmpapproach": "3D"
},
"data": {
"training": {
"vol_samples": 1200
},
"evaluation": {
"vol_samples": 300
},
"public": false,
"external": false
}
},
"details": {
"name": "PI-CAI challenge baseline",
"version": "v2.1.1",
"devteam": "Diagnostic Image Analysis Group, Radboud University Medical Center, Nijmegen, The Netherlands",
"type": "Prediction",
"date": {
"weights": "2022-06-22",
"code": "2022-09-05",
"pub": ""
},
"cite": "",
"license": {
"code": "Apache 2.0",
"weights": "Apache 2.0"
},
"publications": [],
"github": "https://github.com/DIAGNijmegen/picai_nnunet_semi_supervised_gc_algorithm",
"zenodo": "",
"colab": "",
"slicer": false
},
"info": {
"use": {
"title": "Intended use",
"text": "Prediction of the likelihood of harboring clinically significant prostate cancer (csPCa) in prostate biparametric MRI exams.",
"references": [
{
"label": "PI-CAI baseline algorithm on grand-challenge",
"uri": "https://grand-challenge.org/algorithms/pi-cai-baseline-nnu-net-semi-supervised/"
}
],
"tables": []
},
"analyses": {
"title": "Evaluation",
"text": "Patient-level diagnosis performance is evaluated using the Area Under Receiver Operating Characteristic (AUROC) metric. Lesion-level detection performance is evaluated using the Average Precision (AP) metric. The overall score used to rank each AI algorithm is the average of both task-specific metrics: Overall Ranking Score = (AP + AUROC) / 2",
"references": [
{
"label": "PI-CAI AI challenge details",
"uri": "https://pi-cai.grand-challenge.org/AI/"
}
],
"tables": []
},
"evaluation": {
"title": "Evaluation data",
"text": "The test sets are two private cohorts of 100 and 1000 biparametric MRI exams respectively. The first was used to tune the algorithms in a public leaderboard, the second was used to determine the top 5 AI algorithms.",
"references": [
{
"label": "PI-CAI data section",
"uri": "https://pi-cai.grand-challenge.org/DATA/"
}
],
"tables": []
},
"training": {
"title": "Training data",
"text": "For the PI-CAI challenge, a publicly available training dataset of 1500 biparametric MRI exams, including 328 cases from the ProstateX challenge, was made available.",
"references": [
{
"label": "PI-CAI publicly available training data",
"uri": "https://zenodo.org/record/6624726"
},
{
"label": "PI-CAI publicly available training data annotations",
"uri": "https://github.com/DIAGNijmegen/picai_labels"
},
{
"label": "ProstateX challenge",
"uri": "https://prostatex.grand-challenge.org/"
}
],
"tables": []
},
"ethics": {
"title": "",
"text": "",
"references": [],
"tables": []
},
"limitations": {
"title": "Limitations",
"text": "This algorithm was developed for research purposes only.",
"references": [],
"tables": []
}
}
}
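
The "Evaluation" text in meta.json above defines the ranking formula as the mean of AP and AUROC. As a minimal sketch, it can be written out like this. The function name `overall_ranking_score` is hypothetical and this is not the official PI-CAI evaluation code; the range check is an added assumption that both metrics lie in [0, 1].

```python
def overall_ranking_score(ap: float, auroc: float) -> float:
    """Overall Ranking Score = (AP + AUROC) / 2, as stated in the meta.json text."""
    for name, value in (("AP", ap), ("AUROC", auroc)):
        # Assumption: both metrics are reported on a 0..1 scale.
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} should be in [0, 1], found: {value}")
    return (ap + auroc) / 2
```

For illustrative inputs (not reported results) such as `overall_ranking_score(0.5, 1.0)`, the function returns the midpoint of the two values.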
68 changes: 68 additions & 0 deletions models/gc_picai_baseline/utils/PicaiBaselineRunner.py
@@ -0,0 +1,68 @@
"""
---------------------------------------------------------
Mhub / DIAG - Run Module for the PICAI baseline Algorithm
---------------------------------------------------------

---------------------------------------------------------
Author: Sil van de Leemput
Email: [email protected]
---------------------------------------------------------
"""

import json
from pathlib import Path

from mhubio.core import Instance, InstanceData, IO, Module, ValueOutput, ClassOutput, Meta

# Import the PICAI Classifier algorithm class from /opt/algorithm
from process import csPCaAlgorithm as PicaiClassifier


@ValueOutput.Name('prostate_cancer_probability')
@ValueOutput.Meta(Meta(key="value"))
@ValueOutput.Label('ProstateCancerProbability')
@ValueOutput.Type(float)
@ValueOutput.Description('Probability of case-level prostate cancer.')
class ProstateCancerProbability(ValueOutput):
    pass


class PicaiBaselineRunner(Module):

    @IO.Instance()
    @IO.Input('in_data_t2', 'mha:mod=mr:type=t2w', the='input T2 weighted prostate MR image')
    @IO.Input('in_data_adc', 'mha:mod=mr:type=adc', the='input ADC prostate MR image')
    @IO.Input('in_data_hbv', 'mha:mod=mr:type=hbv', the='input HBV prostate MR image')
    @IO.Output('cancer_probability_json', 'cspca-case-level-likelihood.json', "json", bundle='model', the='output JSON file with PICAI baseline prostate cancer probability')
    @IO.Output('cancer_detection_heatmap', 'cspca_detection_map.mha', "mha:mod=hm", bundle='model', the='output heatmap indicating prostate cancer probability')
    @IO.OutputData('cancer_probability', ProstateCancerProbability, the='PICAI baseline prostate cancer probability')
    def task(self, instance: Instance, in_data_t2: InstanceData, in_data_adc: InstanceData, in_data_hbv: InstanceData, cancer_probability_json: InstanceData, cancer_detection_heatmap: InstanceData, cancer_probability: ProstateCancerProbability) -> None:
        # Initialize classifier object
        classifier = PicaiClassifier()

        # Specify input files (the order is important!)
        classifier.scan_paths = [
            Path(in_data_t2.abspath),
            Path(in_data_adc.abspath),
            Path(in_data_hbv.abspath),
        ]

        # Specify output files
        classifier.cspca_detection_map_path = Path(cancer_detection_heatmap.abspath)
        classifier.case_confidence_path = Path(cancer_probability_json.abspath)

        # Run the classifier on the input images
        classifier.process()

        # Extract the cancer probability value from the JSON output file
        if not Path(cancer_probability_json.abspath).is_file():
            raise FileNotFoundError(f"Output file {cancer_probability_json.abspath} could not be found!")

        with open(cancer_probability_json.abspath, "r") as f:
            cancer_prob = float(json.load(f))

        if not (isinstance(cancer_prob, (float, int)) and (0.0 <= cancer_prob <= 1.0)):
            raise ValueError(f"Cancer probability value should be a probability value, found: {cancer_prob}")

        # Output the predicted values
        cancer_probability.value = cancer_prob
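
The read-and-validate step at the end of `task` can also be sketched as a standalone function, which makes it easier to test outside the MHub pipeline. This is a sketch under assumptions: `read_case_likelihood` is a hypothetical name, and the file is assumed to contain a single bare JSON number, which is what `float(json.load(f))` in the runner implies. Unlike the runner, it checks the type before converting, so a non-numeric payload raises a `ValueError` rather than whatever `float()` would raise.

```python
import json
from pathlib import Path


def read_case_likelihood(path) -> float:
    """Read a case-level likelihood from a JSON file and check it is in [0, 1]."""
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"Output file {path} could not be found!")

    with path.open("r") as f:
        value = json.load(f)

    # Reject non-numeric payloads (bool is a subclass of int, so exclude it).
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise ValueError(f"Expected a single JSON number, found: {value!r}")

    prob = float(value)
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"Cancer probability value should be in [0, 1], found: {prob}")
    return prob
```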
1 change: 1 addition & 0 deletions models/gc_picai_baseline/utils/__init__.py
@@ -0,0 +1 @@
from .PicaiBaselineRunner import *