Monocular depth estimation with ResNet-50
I collected most of the depth data for this project using Depth Dector, an iOS application I created to obtain depth information via the TrueDepth camera on iOS devices.
The NYU-Depth V2 data set is comprised of video sequences from a variety of indoor scenes as recorded by both the RGB and Depth cameras from the Microsoft Kinect.
Architecture figure from the Iro Laina et al. paper
The images are first fed into a ResNet-50 network and then pass through four upsampling blocks. Unlike the model shown in the figure, our model takes an input of size 320x240x3 (WxHxC) and produces an output of size 160x128x1 (WxHxC).
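The input and output sizes can be checked with a little shape arithmetic. The sketch below is not the project's code: it assumes the standard ResNet-50 layout (a 7x7 stride-2 stem convolution, a 3x3 stride-2 max pool, then three stages that each halve the resolution) and assumes that each of the four upsampling blocks doubles the spatial resolution, as the up-projection blocks do in the Laina et al. paper. The helper names `conv_out`, `resnet50_feature_size`, and `decoder_output_size` are hypothetical.

```python
def conv_out(size, kernel, stride, padding):
    """Output length of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet50_feature_size(size):
    """Spatial size after the ResNet-50 encoder (assumed standard strides)."""
    size = conv_out(size, kernel=7, stride=2, padding=3)  # stem convolution
    size = conv_out(size, kernel=3, stride=2, padding=1)  # max pooling
    for _ in range(3):  # three later stages, each assumed to halve the size
        size = conv_out(size, kernel=3, stride=2, padding=1)
    return size


def decoder_output_size(size, num_blocks=4):
    """Spatial size after num_blocks upsampling blocks (assumed 2x each)."""
    return size * 2 ** num_blocks


width, height = 320, 240  # input size stated above (WxH)
feat_w = resnet50_feature_size(width)
feat_h = resnet50_feature_size(height)
out_w = decoder_output_size(feat_w)
out_h = decoder_output_size(feat_h)

print(f"encoder feature map: {feat_w}x{feat_h}")
print(f"decoder output:      {out_w}x{out_h}")
```

Under these assumptions the encoder reduces 320x240 to a 10x8 feature map (the height rounds up from 7.5 because of padding), and four doublings give 160x128, which matches the output size stated above.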
- Data augmentation
- More loss functions
- Experiments on Upsample/Upconvolution block
Deep Optics for Monocular Depth Estimation and 3D Object Detection by Julie Chang and Gordon Wetzstein
Depth Estimation from Single Image Using CNN-Residual Network by Xiaobai Ma, Zhenglin Geng, and Zhi Bie
Deeper Depth Prediction with Fully Convolutional Residual Networks by Iro Laina et al.