OSIC Pulmonary Fibrosis Progression

My 34th place solution to the OSIC Pulmonary Fibrosis Progression competition hosted on Kaggle by OSIC.

Initial Thoughts

This was an interesting competition for me because I saw that everyone else was using only the tabular data. I have no experience at all with tabular data comps and I like CV, so I was stubborn and forced myself to use the CT scans. I was discouraged by my poor LB score, but I kept pushing and it turned out to be the correct decision!

Overview

My final solution was an ensemble of three 5-fold ResNet50 models. They were trained on windowed lung CT scan images along with the meta data. I used Google Colab Pro to train all models for 30 epochs.

Models

The final model was a simple pretrained ResNet50 with an image part and a meta data part. The image part was simply pretrained model -> pool -> flatten -> dropout -> concat. The meta part was a simple head with features -> linear -> relu -> linear -> relu -> concat. The final models had either 512->1024 or 100->100 features for the head. Finally, a simple linear layer produced the 3 FVC outputs.

Dataset

For each scan I loaded all the DICOM files and converted them to HU (Hounsfield units). I then cropped all the images and reshaped them to the size 50x512x512. After that I windowed each image into three parts. For a detailed explanation of the windowing take a look at this post (I used the same function and window values as shared there, thanks so much Ian Pan!). Once I had the three windowed images I simply saved them as PNGs. Each image looked similar to this:

I made the dataset I used public here: https://www.kaggle.com/greatgamedota/osic-windowed-lung-images
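
The HU conversion and windowing steps described in this section can be sketched roughly as below. This is only an illustration in plain Python, not the function from Ian Pan's post: the slope/intercept defaults stand in for the DICOM RescaleSlope and RescaleIntercept values, and the three (center, width) pairs in WINDOWS are made-up placeholders, since the actual window values are not listed here.

```python
def to_hu(raw, slope=1.0, intercept=-1024.0):
    """Convert stored pixel values to Hounsfield units: HU = raw * slope + intercept."""
    return [[v * slope + intercept for v in row] for row in raw]


def window(hu, center, width):
    """Clip HU values to [center - width/2, center + width/2] and rescale to 0..255."""
    lo = center - width / 2
    hi = center + width / 2
    out = []
    for row in hu:
        out.append([round((min(max(v, lo), hi) - lo) / (hi - lo) * 255) for v in row])
    return out


# Hypothetical (center, width) pairs, NOT the values used for the real dataset.
WINDOWS = [(-600, 1500), (40, 400), (-100, 700)]


def three_channel(hu):
    """Apply each window to the same slice, giving three 8-bit channels."""
    return [window(hu, center, width) for center, width in WINDOWS]


# Tiny 2x3 "slice" of raw stored values, just to exercise the functions.
raw = [[0, 424, 1024], [1064, 1500, 3000]]
hu = to_hu(raw)
channels = three_channel(hu)
```

Stacking the three windowed copies as channels gives an ordinary 3-channel 8-bit image, which is why saving as PNG and feeding an ImageNet-pretrained model works without changing the input layer.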

For meta data I used the same meta data as @ulrich07's baseline kernel, except I only used the base Percent value, as it improved my CV to LB to PB correlation.

Augmentation

Coarse dropout
Shift-scale-rotate (SSR)
Horizontal + Vertical flip
For one model: Random Saturation and Brightness

No tabular/meta augmentation

Training

Adam optimizer with Reduce on Plateau scheduler
Trained with an LR of .003 for 30 epochs
Batch size of 16 (bs of 4 for one model)
Trained using Quantile Regression with 0.8 * qloss + 0.2 * metric loss (same as Ulrich's kernel)
Trained 5 fold for every model
Didn't use any batch accumulation or mixed precision
Saved checkpoint based on best validation score (all weeks)
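
As a rough sketch of the loss in the list above, the snippet below mixes a pinball (quantile) loss with the negated competition metric (Laplace log likelihood) at 0.8 / 0.2. The quantiles (0.2, 0.5, 0.8) are an assumption based on the public baseline, and the clipping constants (70 and 1000) come from the competition metric definition. It is written in plain Python for a single sample, so it is not the batched version used in the notebooks.

```python
import math

QUANTILES = (0.2, 0.5, 0.8)  # assumed quantiles for the 3 FVC outputs


def pinball_loss(y_true, preds):
    """Mean pinball loss over the three quantile outputs."""
    total = 0.0
    for q, p in zip(QUANTILES, preds):
        e = y_true - p
        total += max(q * e, (q - 1) * e)
    return total / len(QUANTILES)


def metric_loss(y_true, preds):
    """Negated Laplace log likelihood; the quantile spread acts as the confidence."""
    low, mid, high = preds
    sigma = max(high - low, 70.0)            # confidence is clipped from below at 70
    delta = min(abs(y_true - mid), 1000.0)   # absolute error is clipped at 1000
    return math.sqrt(2) * delta / sigma + math.log(math.sqrt(2) * sigma)


def combined_loss(y_true, preds, weight=0.8):
    """weight * qloss + (1 - weight) * metric loss."""
    return weight * pinball_loss(y_true, preds) + (1 - weight) * metric_loss(y_true, preds)


# Hypothetical sample: true FVC of 2500 with predicted quantiles 2400 / 2500 / 2600.
example = combined_loss(2500.0, (2400.0, 2500.0, 2600.0))
```

The pinball term pulls each output towards its quantile, while the metric term directly penalises a spread that is too wide or too narrow for the actual error.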

For training I removed 6 entire patients because their CT scans were broken. I then assigned each patient to a fold by randomly shuffling the patients and then using GroupKFold.
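
The idea behind the patient-level split can be sketched as follows. This is a stdlib stand-in, not scikit-learn's GroupKFold: it shuffles the unique patient IDs and deals them out round-robin, so every row of a given patient ends up in the same fold. The function name, seed and patient IDs are made up for the example.

```python
import random


def patient_folds(patient_ids, n_folds=5, seed=0):
    """Return one fold index per row, keeping each patient inside a single fold."""
    unique = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique)  # shuffle patients, not individual rows
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique)}
    return [fold_of[pid] for pid in patient_ids]


# Hypothetical rows: one entry per measurement, several per patient.
rows = ["p1", "p1", "p2", "p3", "p3", "p3", "p4", "p5", "p6", "p6"]
folds = patient_folds(rows)
```

Splitting by patient rather than by row keeps measurements from the same patient out of both the training and validation sides of a fold.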

Then, while training, I picked a random unique patient and randomly selected an image from 10-40 (since the first and last 10 images don't contain any lung info). I then made each training pass last 4 * the number of patients. For validation I only picked the 15th image since it was the middle image that usually contains the most information.
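
One possible reading of this sampling scheme is sketched below. The details are assumptions: slice indices are taken as 10..39 of the 50 slices, the validation slice is fixed at index 15, and "4 * the number of patients" is interpreted as visiting every patient four times per pass in shuffled order. The function names and patient IDs are made up.

```python
import random

TRAIN_SLICES = range(10, 40)  # assumption: skip the first and last 10 of 50 slices
VAL_SLICE = 15                # assumption: fixed slice index used for validation


def training_pass(patients, rng, repeats=4):
    """Build (patient, slice) pairs for one pass: repeats * len(patients) samples."""
    samples = []
    for _ in range(repeats):
        order = list(patients)
        rng.shuffle(order)  # each patient appears once per repeat
        for pid in order:
            samples.append((pid, rng.choice(TRAIN_SLICES)))
    return samples


def validation_samples(patients):
    """Validation always uses the same slice, so scores are comparable across epochs."""
    return [(pid, VAL_SLICE) for pid in patients]


patients = ["ID001", "ID002", "ID003"]  # hypothetical patient IDs
samples = training_pass(patients, random.Random(42))
```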

Ensembling/Blending

Simple mean of the models' FVC predictions and confidence values
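
In code, the blend amounts to something like the sketch below. It assumes each model yields one (FVC, confidence) pair per row; the function name and the numbers are invented for the example.

```python
def blend(predictions):
    """Average FVC and confidence separately across models."""
    n = len(predictions)
    fvc = sum(p[0] for p in predictions) / n
    confidence = sum(p[1] for p in predictions) / n
    return fvc, confidence


# Hypothetical (FVC, confidence) outputs of three models for the same patient-week.
blended = blend([(2500.0, 200.0), (2600.0, 250.0), (2400.0, 150.0)])
```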

Final Submission

My final submission was a blend of 3 ResNet50 models:

One more point: the single model that would have scored gold had the same training parameters as the other models, except for the added brightness/saturation augmentation.

What didn't work

3D ResNets (I tried for at least a month with these)
Linear Decay Regression
Any other type of model besides ResNets and EfficientNets (determinism issues)
EfficientNets
Simple meta data head (just concat)
Random erase augmentation

Final Thoughts

I want to say again that this was an awesome comp that I am so glad I participated in! Very glad to get my third medal and second silver medal!

My previous competition: Melanoma Classification

My next competition: Lyft Motion Prediction
