My 34th place solution to the OSIC Pulmonary Fibrosis Progression Competition hosted on Kaggle 🔬

GreatGameDota/OSIC-Pulmonary-Fibrosis-Prediction

OSIC Pulmonary Fibrosis Progression

My 34th place solution to the OSIC Pulmonary Fibrosis Progression competition hosted on Kaggle by OSIC.

Initial Thoughts

This was an interesting competition for me because I saw that everyone else was using only the tabular data. I have no experience at all with tabular data comps and I like CV, so I was stubborn and forced myself to use the CT scans. I was discouraged by my poor LB score, but I kept pushing and it turned out to be the correct decision!

Overview

My final solution was an ensemble of three 5-fold ResNet50 models. They were trained on windowed lung CT scan images along with the metadata. I used Google Colab Pro to train all models for 30 epochs.

Models

The final model was a simple pretrained ResNet50 with an image part and a metadata part. The image part was simply pretrained model -> pool -> flatten -> dropout -> concat, and the meta part was a simple head with features -> linear -> relu -> linear -> relu -> concat. The final models had either 512->1024 or 100->100 features for the head. Finally, a simple linear layer produced the 3 FVC outputs.
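The two-branch layout described above can be sketched roughly as follows. This is a minimal illustration, not the competition code: it assumes PyTorch, and the class name `ImageMetaNet`, the average pooling, the dropout rate, the number of metadata features and the toy backbone are all hypothetical choices made so the example runs without downloading weights.

```python
import torch
import torch.nn as nn


class ImageMetaNet(nn.Module):
    """Image branch + metadata branch, concatenated into one linear output layer."""

    def __init__(self, backbone, backbone_dim, n_meta, head_dims=(512, 1024),
                 n_out=3, p_drop=0.5):
        super().__init__()
        # Image part: (pretrained) model -> pool -> flatten -> dropout
        self.backbone = backbone
        self.pool = nn.AdaptiveAvgPool2d(1)   # pooling type is an assumption
        self.dropout = nn.Dropout(p_drop)     # dropout rate is an assumption
        # Meta part: features -> linear -> relu -> linear -> relu
        h1, h2 = head_dims
        self.meta = nn.Sequential(
            nn.Linear(n_meta, h1), nn.ReLU(),
            nn.Linear(h1, h2), nn.ReLU(),
        )
        # Final linear layer on the concatenated features: 3 FVC outputs
        self.out = nn.Linear(backbone_dim + h2, n_out)

    def forward(self, image, meta):
        x = self.pool(self.backbone(image)).flatten(1)
        x = self.dropout(x)
        m = self.meta(meta)
        return self.out(torch.cat([x, m], dim=1))


# Tiny stand-in backbone (NOT a ResNet50) so the sketch is cheap to run.
toy_backbone = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
model = ImageMetaNet(toy_backbone, backbone_dim=8, n_meta=4, head_dims=(100, 100))
preds = model(torch.randn(2, 3, 64, 64), torch.randn(2, 4))
print(preds.shape)  # torch.Size([2, 3])
```

To approximate the real setup, `backbone` could be a pretrained `torchvision.models.resnet50` with its last two children (the average pool and the fully connected layer) removed, in which case `backbone_dim` would be 2048.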

Dataset

For each scan I loaded all the DICOM files and converted them to HU (Hounsfield units). I then cropped all the images and reshaped them to the size 50x512x512. After that I windowed each image into three parts. For a detailed explanation of the windowing take a look at this post (I used the same function and window values as shared there, thanks so much Ian Pan!). Once I had the three windowed images I simply saved them as PNGs. Each image looked similar to this:

I made the dataset I used public here: https://www.kaggle.com/greatgamedota/osic-windowed-lung-images
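As a rough illustration of the HU conversion and windowing steps, here is a minimal NumPy sketch. It is not the function from the post referenced above: the helper names are hypothetical, and the window centers and widths in `WINDOWS` are placeholder values chosen only for the example, not the ones actually used.

```python
import numpy as np

# (center, width) pairs in HU. Placeholder values, NOT the ones from the post.
WINDOWS = [(-600, 1500), (100, 700), (40, 400)]


def to_hu(pixels, slope, intercept):
    """Convert raw DICOM pixel values to Hounsfield units (slope/intercept come from the DICOM header)."""
    return pixels.astype(np.float32) * slope + intercept


def window(hu, center, width):
    """Clip HU values to one window and rescale the result to 0..255."""
    lo, hi = center - width / 2, center + width / 2
    clipped = np.clip(hu, lo, hi)
    return ((clipped - lo) / (hi - lo) * 255).astype(np.uint8)


def three_windows(hu):
    """Stack the three windowed versions of one slice along the last axis."""
    return np.stack([window(hu, c, w) for c, w in WINDOWS], axis=-1)


raw = np.array([[0, 500], [1000, 3000]], dtype=np.int16)
hu = to_hu(raw, slope=1.0, intercept=-1024.0)
image = three_windows(hu)
print(image.shape)  # (2, 2, 3)
```

Stacking the three windows as channels gives an array that can be written out as an ordinary RGB PNG.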

For metadata I used the same features as @ulrich07's baseline kernel, except I only used the base Percent value as it increased my CV to LB to PB correlation.

Augmentation

  • Coarse dropout
  • SSR (shift, scale, rotate)
  • Horizontal + Vertical flip
  • For one model: Random Saturation and Brightness

No tabular/meta augmentation
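To make two of the image augmentations listed above concrete, here is a small NumPy-only sketch of random flips and coarse dropout. It is a stand-in for illustration: the function names, probabilities, hole count and hole size are assumptions, and the actual training presumably used an augmentation library rather than hand-written code.

```python
import numpy as np


def random_flip(img, rng, p=0.5):
    """Flip horizontally and/or vertically, each with probability p."""
    if rng.random() < p:
        img = img[:, ::-1]   # horizontal flip
    if rng.random() < p:
        img = img[::-1, :]   # vertical flip
    return img


def coarse_dropout(img, rng, n_holes=8, hole=32):
    """Zero out n_holes square patches at random positions."""
    out = img.copy()
    h, w = out.shape[:2]
    for _ in range(n_holes):
        y = int(rng.integers(0, h - hole + 1))
        x = int(rng.integers(0, w - hole + 1))
        out[y:y + hole, x:x + hole] = 0
    return out


rng = np.random.default_rng(0)
img = np.ones((64, 64, 3), dtype=np.uint8)
augmented = coarse_dropout(random_flip(img, rng), rng, n_holes=4, hole=8)
```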

Training

  • Adam optimizer with a reduce-on-plateau scheduler
  • Trained with an LR of 0.003 for 30 epochs
  • Batch size of 16 (bs of 4 for one model)
  • Trained using quantile regression with 0.8 qloss + 0.2 metric loss (same as Ulrich's kernel)
  • Trained 5 fold for every model
  • Didn't use any batch accumulation or mixed precision
  • Saved checkpoint based on best validation score (all weeks)
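The 0.8 qloss + 0.2 metric loss mix from the list above can be written out for a single sample as follows. This is a framework-free sketch rather than the training code: it assumes the three outputs are the 0.2, 0.5 and 0.8 quantiles and that the confidence is the spread between the outer two, which is my recollection of the referenced kernel and not something stated here. The 70 and 1000 clipping constants come from the competition's Laplace log likelihood metric.

```python
import math

QUANTILES = (0.2, 0.5, 0.8)  # assumed quantile levels for the 3 outputs


def pinball_loss(y_true, preds, quantiles=QUANTILES):
    """Mean pinball (quantile) loss over the three outputs for one sample."""
    total = 0.0
    for q, p in zip(quantiles, preds):
        err = y_true - p
        total += max(q * err, (q - 1) * err)
    return total / len(quantiles)


def metric_loss(y_true, preds, sigma_floor=70.0, delta_cap=1000.0):
    """Negative Laplace log likelihood, with sigma taken as q_high - q_low."""
    sigma = max(preds[2] - preds[0], sigma_floor)
    delta = min(abs(y_true - preds[1]), delta_cap)
    return math.sqrt(2) * delta / sigma + math.log(math.sqrt(2) * sigma)


def combined_loss(y_true, preds, w_q=0.8):
    """0.8 * quantile loss + 0.2 * metric loss."""
    return w_q * pinball_loss(y_true, preds) + (1 - w_q) * metric_loss(y_true, preds)


loss = combined_loss(2000.0, (1900.0, 2000.0, 2100.0))
```

In actual training the same expressions would be applied to batched tensors and averaged, so that gradients flow through all three outputs.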

For training I removed 6 entire patients because their CT scans were broken. I then assigned each patient to a fold by randomly shuffling the patients and then using GroupKFold.
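The idea behind the patient-level split is that all rows of one patient land in the same fold. Below is a minimal standard-library sketch of that idea; it is a simplified stand-in for scikit-learn's GroupKFold (a shuffle followed by round-robin assignment), and the function name, seed and example patient IDs are hypothetical.

```python
import random


def assign_folds(patient_ids, n_folds=5, seed=42, exclude=()):
    """Map each remaining patient to one fold so a patient never spans two folds."""
    patients = sorted(set(patient_ids) - set(exclude))
    random.Random(seed).shuffle(patients)
    return {pid: i % n_folds for i, pid in enumerate(patients)}


# Hypothetical IDs; two of them are dropped as if their scans were broken.
all_patients = [f"ID{i:03d}" for i in range(20)]
folds = assign_folds(all_patients, exclude=["ID003", "ID007"])
```

Each row of the training table would then take the fold of its patient, for example `folds[row_patient_id]`.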

Then while training I picked a random unique patient and randomly selected an image from 10-40 (since the first and last 10 images don't contain any lung info). I made each epoch last 4 * the number of patients. For validation I only picked the 15th image since it was the middle image that usually contains the most information.
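The sampling scheme in the paragraph above might look roughly like this. It is only a sketch: the function names are hypothetical, the 10-40 range is treated as inclusive at both ends, and "the 15th image" is taken to mean slice index 15, both of which are assumptions.

```python
import random


def training_samples(patient_ids, rng, first=10, last=40, repeats=4):
    """Yield one epoch of (patient_id, slice_index) pairs: repeats * number of patients draws."""
    unique_patients = sorted(set(patient_ids))
    for _ in range(repeats * len(unique_patients)):
        # Random patient, then a random slice from the range that shows lung tissue.
        yield rng.choice(unique_patients), rng.randint(first, last)


def validation_samples(patient_ids, slice_index=15):
    """Validation always uses the same fixed slice for every patient."""
    return [(pid, slice_index) for pid in sorted(set(patient_ids))]


rng = random.Random(0)
patients = ["ID001", "ID002", "ID003"]  # hypothetical IDs
epoch = list(training_samples(patients, rng))
val = validation_samples(patients)
```

Because the slice is redrawn on every draw, a patient contributes different images across epochs, while the fixed validation slice keeps the validation score comparable between epochs.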

Ensembling/Blending

Simple mean of the models' FVC predictions and confidence.
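A minimal sketch of that blend, assuming each model yields one (FVC, confidence) pair per test row and that the rows are aligned across models. The function name and the example numbers are made up for illustration.

```python
from statistics import fmean


def blend(model_preds):
    """Average FVC and confidence across models, row by row.

    model_preds: one list per model of (fvc, confidence) pairs in the same row order.
    """
    blended = []
    for rows in zip(*model_preds):
        fvc = fmean(r[0] for r in rows)
        confidence = fmean(r[1] for r in rows)
        blended.append((fvc, confidence))
    return blended


# Three hypothetical models, one test row each.
result = blend([[(2000, 200)], [(2100, 260)], [(2200, 290)]])
print(result)  # [(2100.0, 250.0)]
```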

Final Submission

My final submission was a blend of 3 ResNet50 models.

One more point: the single model that would have scored gold had the same training parameters as the other models, apart from the added brightness/saturation augmentation.

What didn't work

  • 3D ResNets (I tried for at least a month with these)
  • Linear Decay Regression
  • Any other type of model besides ResNets and EfficientNets (determinism issues)
  • EfficientNets
  • Simple metadata head (just concat)
  • Random erase augmentation

Final Thoughts

I want to say again that this was an awesome comp that I am so glad I participated in! Very glad to get my third medal and second silver medal!

My previous competition: Melanoma Classification

My next competition: Lyft Motion Prediction
