Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bring code to this repository for better reproducibility #16

Merged
merged 69 commits into from
Feb 20, 2024
Merged

Conversation

yw7
Copy link
Collaborator

@yw7 yw7 commented May 24, 2023

Add code for registering MPRAGE and spine-generic datasets to the PAM50 template.
Closes #15

@yw7 yw7 requested a review from jcohenadad May 24, 2023 12:27
@yw7 yw7 self-assigned this May 24, 2023
yw7 added 11 commits January 31, 2024 02:01
integration
- Separated dataset preparation and training instructions in the README
for clarity, specifying use of nnUNetv2 structure.
- Clarified the prerequisite of having a trained model before running
inferences in the README.
- Fixed the output directory variable assignment in inference script to
correctly use a second parameter.
- Enhanced the inference script to handle missing `_0000` suffixes and
to support new postprocessing steps.
- Added a new dataset preparation script (`prepare_nnunet_datasets.sh`)
to set up data structure for nnUNetv2.
- Removed dataset preparation steps from the training script
(`train_nnunet.sh`), focusing it solely on model training as per the new
separation of concerns.

The changes improve the accuracy, usability, and maintainability of
TotalSegMRI's implementation with nnUNetv2, facilitating better
segmentation results and a smoother experience for users following the
updated instructions.
Updated the argument flags across various MRI utility scripts to
standardize input directory flags as '-s' instead of the previous '-i'.
This change enhances the consistency of script interfaces, making it
easier to understand and use the tools for MRI data preparation and
processing. Adjusted scripts include those for generating sequential
labels, mapping labels, and fixing CSF labels. Compatibility with
existing conventions in subject subdirectory flags has been maintained
by switching from '-s' to '-u'.
and retains only the largest connected component for each label.
# RAM requirement in GB
RAM_REQUIREMENT=8
# Get the number of CPUs, subtract 1, and ensure the value is at least 1
JOBS_FOR_CPUS=$(( $(($(nproc) - 1 < 1 ? 1 : $(nproc) - 1 )) ))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the training done on GPU?

Copy link
Collaborator Author

@yw7 yw7 Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@valosekj Yes you're right but still some work done on CPU so this way it is mor efficient

Comment on lines 54 to 55
for f in data/bids/spider/*t2_SPACE.nii.gz; do mv $f ${f/t2_SPACE/T2Sw}; done
for f in data/bids/spider/derivatives/labels/*t2_SPACE.nii.gz; do mv $f ${f/t2_SPACE/T2Sw}; done
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a nitpick, but we usually do not use the T2Sw suffix. For spine-generic, we used T2star.nii.gz; see here. BIDS standard then recommends T2starw.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is T2 SPACE so I just use it.. Do you have a better idea?
Thanks!!!

Copy link
Collaborator

@NathanMolinier NathanMolinier Feb 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, but I had to deal with this same issue when adding the data to our servers. What I decided, because the only difference is a change in resolution, is to use:

  • sub-XXX_acq-lowresSag_T2w for the standard T2w images with a resolution ranged from 3.3 x 0.33 x 0.33 mm to 4.8 x 0.90 x 0.90 mm.
  • sub-XXX_acq-highresSag_T2w for the T2w images obtained with the SPACE sequence, with a more isotropic resolution of 0.90 x 0.47 x 0.47 mm

Comment on lines +16 to +17
"disc_O": 2,
"disc_E": 3,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do disc_O and disc_E stand for?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

odd and even

# Parse command line arguments
parser = argparse.ArgumentParser(
description=textwrap.dedent(f'''
Fix csf label to include all non cord spinal canal, this will put the spinal canal label in all the voxels (labeled as a backgroupn) between the spinal canal and the spinal cord.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is slightly hard to understand. Maybe you could consider rephrasing.

Copy link
Collaborator Author

@yw7 yw7 Feb 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you maybe help me with that? Thank you!!!
Sometimes there are voxels between the spinal cord and CSF that are maped by the model to background. This code make them labeled as spinal canal.

yw7 added 2 commits February 7, 2024 00:31
Refactored the dataset preparation steps, now sourcing the new `get_spine_generic_datasets.sh` script to fetch specific datasets from the Spine Generic Project repository. This update clarifies the preparation of SPIDER datasets in the BIDS structure. Revised training and inference instructions in `README.md` to correspond with the new dataset structure and included clear directives for model training and running inference. Removed unnecessary code from the `prepare_spider_bids_datasets.sh` script, further streamlining the process.

These changes make the data setup more intuitive and maintainable, enabling easier replication of the research environment.

Related to issue #4567.
Removed redundant `pairs_dict` function from both generate_labels_sequential.py and generate_largest_labels.py as it was no longer used in the current codebase. Updated generate_largest_labels.py to enhance clarity: renamed the function generate_labels to generate_largest_labels and updated its references to match the new name, ensuring consistency with the module's purpose. Removed unused imports to streamline dependencies and maintain clean and efficient code.
Copy link
Member

@jcohenadad jcohenadad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these big binaries should not be pushed in the main repository:

image

instead, upload them as assets (eg: in issues, or in releases), and link to these images.

git should ONLY be used for code, not binaries

@jcohenadad
Copy link
Member

jcohenadad commented Feb 20, 2024

because of #16 (review) this PR should be squashed on merging-- tag @jcohenadad when is time to do the merge and i'll do it

@yw7 yw7 mentioned this pull request Feb 20, 2024
@yw7
Copy link
Collaborator Author

yw7 commented Feb 20, 2024

I've created an issue for Images here: #21 and deleted all the images from the repository.
@jcohenadad We're ready to merge now.

Ensure the creation of nested directories for the SPIDER dataset by adding the '-p' flag to the 'mkdir' command in the README instructions. This prevents potential errors when users attempt to create subdirectories in a non-existent path.
@jcohenadad jcohenadad merged commit 0300574 into main Feb 20, 2024
@jcohenadad jcohenadad deleted the reg2pam50 branch February 20, 2024 21:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Move code from gdrive into this repository
4 participants