Dataset aggregation #1
I just remembered that we also have a lot of data from UMass (git-annex data: umass-ms-*, 3 datasets).
I updated the new code to aggregate the following datasets, which are labelled:
The command that was run:
python ms-lesion-agnostic/monai/1_create_msd_data.py -pd ~/net/ms-lesion-agnostic/data/ -po ~/net/ms-lesion-agnostic/msd_data/ --lesion-only --canproco-exclude canproco/exclude.yml
The output is the following:
Total number of derivatives in the root directory: 4407
Number of images in train set: 1636
Number of images in validation set: 569
Number of images in test set: 544
Total number of images in the dataset: 2749
The total number of images in the dataset (2749) differs from the total number of derivatives (4407) because we decided to keep only the derivatives that contain lesions. The output is the following file:
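To illustrate the idea of keeping only the derivatives that have lesions, here is a minimal sketch. It is not the code of 1_create_msd_data.py: the `Case` record, the file names, and the representation of a mask as a flat list of voxel values are assumptions made for the example (the real script presumably reads NIfTI files).

```python
from typing import Iterable, List, NamedTuple


class Case(NamedTuple):
    """Hypothetical record: an image name and the voxel values of its lesion mask."""
    image: str
    mask_voxels: List[int]


def has_lesion(mask_voxels: Iterable[int]) -> bool:
    """Return True if at least one voxel of the mask is non-zero."""
    return any(v != 0 for v in mask_voxels)


def keep_lesion_only(cases: Iterable[Case]) -> List[Case]:
    """Drop every case whose lesion mask is entirely empty."""
    return [c for c in cases if has_lesion(c.mask_voxels)]


# Toy data (made up): the second mask is empty and gets filtered out.
cases = [
    Case("sub-01_T2w.nii.gz", [0, 0, 1, 0]),
    Case("sub-02_T2w.nii.gz", [0, 0, 0, 0]),
    Case("sub-03_T2w.nii.gz", [1, 1, 0, 0]),
]
kept = keep_lesion_only(cases)
print(f"{len(kept)} of {len(cases)} derivatives kept")
```

With a filter like this, the number of kept images can only be less than or equal to the number of derivatives, which matches the gap between the two totals reported here.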
for now, but maybe in the future it would be desirable to develop a model that also has good specificity (i.e., a high true negative rate).
There was an issue in the code when gathering segmentations from
python ms-lesion-agnostic/monai/1_create_msd_data.py -pd ~/net/ms-lesion-agnostic/data/ -po ~/net/ms-lesion-agnostic/msd_data/ --lesion-only --canproco-exclude canproco/exclude.yml
This is the output of the code:
Total number of derivatives in the root directory: 4407
Number of images in train set: 1712
Number of images in validation set: 590
Number of images in test set: 569
Total number of images in the dataset: 2871
For the purpose of writing an abstract for ACTRIMS, I am recording here some information about the data that we use.
Sites used for external validation:
I ran the script to analyze the dataset.
EDIT: I fixed the re-orientation problem so that all resolutions are now measured in RPI orientation. Here is the output:
Number of images: 2871
Number of images for training: 1712
Number of images for validation: 590
Number of images for testing: 569
Number of images per contrast: {'UNIT1': 265, 'T2w': 1773, 'STIR': 72, 'PSIR': 286, 'T2star': 474, 'T1w': 1}
Number of images per orientation: {'iso': 272, 'ax': 1693, 'sag': 906}
Average resolution: [1.25201238 0.54277958 2.93960629]
Std resolution: [1.17157561 0.23521095 1.95652873]
Median resolution: [0.57291669 0.5625 3.29999995]
-------------------------------------
Number of images in ms-basel-2018: 46
Contrast in ms-basel-2018: {'T2w', 'T1w'}
Number of images per contrast in ms-basel-2018: {'T2w': 24, 'T1w': 22}
Number of images in ms-basel-2020: 31
Contrast in ms-basel-2020: {'PD'}
Number of images per contrast in ms-basel-2020: {'PD': 31}
-------------------------------------
Number of images in umass: 3516
Contrast in umass: {'T2w', 'PD', 'T1w'}
Number of images per contrast in umass: {'T2w': 1806, 'PD': 537, 'T1w': 1173}
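For reference, here is a minimal sketch of how per-axis resolution statistics like the "Average", "Std" and "Median resolution" lines above could be computed. It assumes these lines are per-axis statistics over the voxel sizes (in mm) of all images; the function name and the toy values are made up, and the analysis script may do this differently.

```python
from statistics import mean, median, pstdev
from typing import List, Sequence, Tuple


def resolution_stats(
    voxel_sizes: Sequence[Tuple[float, float, float]]
) -> Tuple[List[float], List[float], List[float]]:
    """Return per-axis (mean, std, median) for a list of voxel sizes in mm."""
    axes = list(zip(*voxel_sizes))  # transpose: one tuple of values per axis
    return (
        [mean(a) for a in axes],
        [pstdev(a) for a in axes],  # population std (same convention as numpy's default)
        [median(a) for a in axes],
    )


# Toy voxel sizes (made up), one tuple per image.
sizes = [(3.0, 0.5, 0.5), (1.0, 1.0, 1.0), (0.5, 0.5, 3.0)]
avg, std, med = resolution_stats(sizes)
print("Average resolution:", avg)
print("Std resolution:", std)
print("Median resolution:", med)
```

Note that a per-axis mean mixes axial and sagittal acquisitions, so the median is often the more readable summary when slice directions differ between images.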
I added the ms-nmo-beijing dataset, from which we only get some T1w images. I also added the computation of the resolution and the orientation for every dataset.
Here is the updated output:
Number of images: 2871
Number of images for training: 1712
Number of images for validation: 590
Number of images for testing: 569
Number of images per contrast: {'UNIT1': 265, 'T2w': 1773, 'STIR': 72, 'PSIR': 286, 'T2star': 474, 'T1w': 1}
Number of images per orientation: {'iso': 272, 'ax': 1693, 'sag': 906}
Average resolution: [1.25201238 0.54277958 2.93960629]
Std resolution: [1.17157561 0.23521095 1.95652873]
Median resolution: [0.57291669 0.5625 3.29999995]
Minimum pixel dimension: 0.1874999850988388
Maximum pixel dimension: 9.541563034057617
-------------------------------------
Number of images in ms-basel-2018: 46
Contrast in ms-basel-2018: {'T1w', 'T2w'}
Number of images per contrast in ms-basel-2018: {'T1w': 22, 'T2w': 24}
Number of images in ms-basel-2020: 31
Contrast in ms-basel-2020: {'PD'}
Number of images per contrast in ms-basel-2020: {'PD': 31}
Average resolution: [2.43636375 0.61377165 0.61377165]
Std resolution: [0.90967523 0.25884103 0.25884103]
Median resolution: [2.99999976 0.57291669 0.57291669]
Minimum pixel dimension: 0.3385416567325592
Maximum pixel dimension: 3.300001859664917
Number of images per orientation in basel: {'sag': 55, 'iso': 22}
-------------------------------------
Number of images in umass: 3512
Contrast in umass: {'T1w', 'T2w', 'PD'}
Number of images per contrast in umass: {'T1w': 1169, 'T2w': 1806, 'PD': 537}
Average resolution: [2.14845656 0.43490765 1.70391261]
Std resolution: [1.4642299 0.10913737 1.53917671]
Median resolution: [3.29995835 0.42969999 0.42970002]
Minimum pixel dimension: 0.3124999701976776
Maximum pixel dimension: 11.24999713897705
Number of images per orientation in umass: {'sag': 2088, 'ax': 1424}
-------------------------------------
Number of images in beijing: 346
Contrast in beijing: {'T1w'}
Number of images per contrast in beijing: {'T1w': 346}
Average resolution: [1.33011619 0.924434 2.19872953]
Std resolution: [1.05835971 0.13317857 2.83172577]
Median resolution: [1.00000072 1. 1. ]
Minimum pixel dimension: 0.390625
Maximum pixel dimension: 13.799997329711914
Number of images per orientation in beijing: {'sag': 113, 'iso': 174, 'ax': 59}
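One possible way to derive the 'iso' / 'ax' / 'sag' labels from the voxel size is sketched below. This is an assumption, not the rule used by the script: it supposes the image is in RPI orientation (axis 0 = right-left, axis 1 = posterior-anterior, axis 2 = inferior-superior), treats the thickest axis as the slice direction, and uses an arbitrary 10% tolerance for isotropy. The 'cor' label is only there so the function is complete; it does not appear in the outputs above.

```python
from collections import Counter
from typing import Sequence


def classify_orientation(voxel_size: Sequence[float], iso_tol: float = 0.1) -> str:
    """Label an image 'iso', 'sag', 'cor' or 'ax' from its RPI voxel size in mm."""
    smallest, largest = min(voxel_size), max(voxel_size)
    # Isotropic if all three dimensions agree within the relative tolerance.
    if largest - smallest <= iso_tol * smallest:
        return "iso"
    # Otherwise the thickest axis is taken as the slice direction.
    thickest_axis = max(range(3), key=lambda i: voxel_size[i])
    return {0: "sag", 1: "cor", 2: "ax"}[thickest_axis]


# Toy voxel sizes (made up), one per image.
sizes = [(3.0, 0.57, 0.57), (0.5, 0.5, 3.3), (1.0, 1.0, 1.0)]
counts = Counter(classify_orientation(s) for s in sizes)
print("Number of images per orientation:", dict(counts))
```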
I have adapted the code to take only 20 images from umass (5 per site) and 20 images from beijing. I have also computed the orientation more accurately than before.
Number of images: 2871
Number of images for training: 1712
Number of images for validation: 590
Number of images for testing: 569
Number of images per contrast: {'UNIT1': 265, 'T2w': 1773, 'STIR': 72, 'PSIR': 286, 'T2star': 474, 'T1w': 1}
PSIR are 2D sagital images: count PSIR images: 286
STIR are 2D sagital images: count STIR images: 72
UNIT1 are 3D images: count UNIT1 images: 265
T1w are 3D images: count T1w images: 1
For T2w, we have only 2D images: 1234 axial images and 539 sagital images
For T2star, we have only 2D images: 459 axial images and 15 sagital images
Total number of sagital images: 912
Total number of axial images: 1693
Total number of 3D images: 266
Number of subjects: 1541
Average resolution: [1.25201238 0.54277958 2.93960629]
Std resolution: [1.17157561 0.23521095 1.95652873]
Median resolution: [0.57291669 0.5625 3.29999995]
Minimum pixel dimension: 0.1874999850988388
Maximum pixel dimension: 9.541563034057617
-------------------------------------
Number of images in ms-basel-2018: 46
Contrast in ms-basel-2018: {'T1w', 'T2w'}
Number of images per contrast in ms-basel-2018: {'T1w': 22, 'T2w': 24}
Number of images in ms-basel-2020: 31
Contrast in ms-basel-2020: {'PD'}
Number of images per contrast in ms-basel-2020: {'PD': 31}
Total number of images: 77
2D sagital images: 55
3D images: 22
Number of subjects in ms-basel-2018: 23
Number of subjects in ms-basel-2020: 16
Average resolution: [2.43636375 0.61377165 0.61377165]
Std resolution: [0.90967523 0.25884103 0.25884103]
Median resolution: [2.99999976 0.57291669 0.57291669]
Minimum pixel dimension: 0.3385416567325592
Maximum pixel dimension: 3.300001859664917
-------------------------------------
Number of images in umass: 20
Contrast in umass: {'T1w', 'PD', 'T2w'}
Number of images per contrast in umass: {'T1w': 9, 'PD': 2, 'T2w': 9}
Number of subjects in umass: 20
For umass, we have 13 axial images, 7 sagital images and 0 3D images
Average resolution: [1.55297141 0.51953438 2.68904921]
Std resolution: [1.3874418 0.17851401 1.65879995]
Median resolution: [0.78125 0.42969374 3.62499213]
Minimum pixel dimension: 0.35159996151924133
Maximum pixel dimension: 5.000114440917969
-------------------------------------
Number of images in beijing: 20
Contrast in beijing: {'T1w'}
Number of images per contrast in beijing: {'T1w': 20}
For beijing, we have 2 axial images, 2 sagital images and 16 3D images
Number of subjects in beijing: 11
Average resolution: [1.24874925 0.93906249 1.62031279]
Std resolution: [0.69777444 0.12440287 1.96204985]
Median resolution: [1.00000021 1. 1. ]
Minimum pixel dimension: 0.625
Maximum pixel dimension: 7.500004768371582
-------------------------------------
-------------------------------------
Total number of images: 2988
Total number of subjects: 1611
Total number of sagital images: 976
Total number of axial images: 1708
Total number of 3D images: 304
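The per-site subsampling mentioned at the top of this comment (5 images per site for umass) could look like the sketch below. It is only an illustration: the site names, file names, function name and fixed seed are hypothetical, and the actual selection in the script may not be random at all.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def sample_per_site(
    images: List[Tuple[str, str]], n_per_site: int, seed: int = 0
) -> List[str]:
    """Pick up to n_per_site image paths from each site, reproducibly."""
    by_site: Dict[str, List[str]] = defaultdict(list)
    for site, path in images:
        by_site[site].append(path)
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    selected: List[str] = []
    for site in sorted(by_site):
        paths = sorted(by_site[site])
        selected.extend(rng.sample(paths, min(n_per_site, len(paths))))
    return selected


# Toy listing (made up): 4 hypothetical sites with 8 images each.
images = [
    (f"site-{s}", f"site-{s}/sub-{i:02d}_T2w.nii.gz")
    for s in "ABCD"
    for i in range(8)
]
picked = sample_per_site(images, n_per_site=5)
print(len(picked), "images selected")
```

Sorting the sites and paths before sampling keeps the result independent of the order in which files are listed on disk.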
I have aggregated all the annotated data from the following datasets:
This accounts for 4824 MRI scans from 2019 subjects.
After a discussion about how difficult it is to see lesions in PDw images, we decided not to include them in the study.
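Excluding the PD images can be done with a simple filter on the contrast. The sketch below assumes BIDS-style file names, where the contrast is the last underscore-separated token before the extension; the helper names and file names are hypothetical and not taken from the project code.

```python
from typing import Iterable, List

# Contrasts to leave out of the study (PD, per the decision above).
EXCLUDED_CONTRASTS = {"PD"}


def contrast_of(filename: str) -> str:
    """Return the BIDS suffix, e.g. 'T2w' for 'sub-01_acq-ax_T2w.nii.gz'."""
    stem = filename.split(".", 1)[0]  # strip '.nii.gz' or '.nii'
    return stem.rsplit("_", 1)[-1]    # last underscore-separated token


def drop_excluded(filenames: Iterable[str]) -> List[str]:
    """Keep only the files whose contrast is not excluded."""
    return [f for f in filenames if contrast_of(f) not in EXCLUDED_CONTRASTS]


# Toy file names (made up).
files = ["sub-01_T2w.nii.gz", "sub-01_PD.nii.gz", "sub-02_acq-sag_T1w.nii.gz"]
print(drop_excluded(files))
```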
This issue describes the aggregation of the available datasets.
The datasets of interest for this project are:
Labeled datasets:
Unlabeled datasets: