Collection of papers and other resources for object detection and tracking using deep learning
- Mask R-CNN (pdf, arxiv, github) by Facebook AI Research!
- Summary goes here...
- TensorFlow Object Detection API: https://github.com/tensorflow/models/tree/master/object_detection. Only the two SSD nets reach 12.5 FPS on one GTX 1080 Ti (less accurate than YOLO 604x604). The next two models run at 4-5 FPS (4-5% mAP better than YOLO). The best model runs at < 1 FPS. Currently the code only allows inference on one image at a time; speed might improve by 2.5 times once multi-image inference is supported.
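The FPS figures in the note above come from single-image inference. A minimal sketch of how such a number could be timed, and what the estimated 2.5x speedup would amount to, is given here. Both `infer` and the helper names are hypothetical stand-ins for illustration; they are not part of the TensorFlow Object Detection API, and the 2.5x factor is the note's own estimate rather than a measured result.

```python
import time


def measure_fps(infer, images, warmup=1):
    """Time one `infer` call per image and return frames per second.

    `infer` is a hypothetical callable that runs a detector on one image.
    """
    for image in images[:warmup]:
        infer(image)  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for image in images:
        infer(image)
    elapsed = time.perf_counter() - start
    return len(images) / elapsed


def projected_fps(single_image_fps, batch_speedup):
    """Scale a measured single-image FPS by an assumed batching speedup."""
    return single_image_fps * batch_speedup


# Projection for the SSD figure quoted above, using the assumed 2.5x factor.
ssd_projection = projected_fps(12.5, 2.5)
```

With the numbers quoted above, the projection for the SSD nets is 12.5 x 2.5 = 31.25 FPS, which would only hold if the estimated speedup materialises.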
- Multi Object Tracking
- Learning to Track: Online Multi-object Tracking by Decision Making (ICCV 2015) (Stanford) (pdf, github (Matlab), project page)
- Tracking The Untrackable: Learning To Track Multiple Cues with Long-Term Dependencies (arxiv April 2017) (Stanford) (pdf, arxiv, project page)
- Near-Online Multi-target Tracking with Aggregated Local Flow Descriptor (ICCV 2015) (NEC Labs) (pdf, author page)
- A Multi-cut Formulation for Joint Segmentation and Tracking of Multiple Objects (highest MT on MOT2015) (University of Freiburg, Germany) (pdf, arxiv, author page)
- Single Object Tracking
- Deep Reinforcement Learning for Visual Object Tracking in Videos (arxiv April 2017) (UC Santa Barbara, Samsung Research) (pdf, arxiv, author page)
- Visual Tracking by Reinforced Decision Making (arxiv February 2017) (Seoul National University, Chung-Ang University) (pdf, arxiv, author page)
- Action-Decision Networks for Visual Tracking with Deep Reinforcement Learning (CVPR 2017) (Seoul National University) (pdf, project page)
- End-to-end Active Object Tracking via Reinforcement Learning (arxiv 30 May 2017) (Peking University, Tencent AI Lab) (pdf, arxiv)
- Deep Feature Flow for Video Recognition (pdf, arxiv, github) by Microsoft Research
- Summary goes here...
- GRAM Road-Traffic Monitoring
- Stanford Drone Dataset
- Ko-PER Intersection Dataset
- TRANCOS Dataset
- Urban Tracker Dataset
- DARPA VIVID / PETS 2005 dataset (non-stationary camera)
- KIT-AKS Dataset (no ground truth)
- CBCL StreetScenes Challenge Framework (no top-down viewpoint)
- MOT 2015 (mostly street-level camera viewpoint)
- MOT 2016 (mostly street-level camera viewpoint)
- MOT 2017 (mostly street-level camera viewpoint)
- PETS 2009 (no vehicles)
- PETS 2017 (low density; mostly pedestrians)
- KITTI Tracking Dataset (no top-down viewpoint; non-stationary camera)
- List of traffic surveillance datasets
- List of deep learning based tracking papers
- List of multi object tracking papers
- List of single object trackers with results on OTB
- List of Matlab frameworks, libraries and software
- Multi Object Tracking
- Learning to Track: Online Multi-Object Tracking by Decision Making (ICCV 2015) [MATLAB]
- Multiple Hypothesis Tracking Revisited (ICCV 2015) (highest MT on MOT2015 among open source trackers) [MATLAB]
- Joint Tracking and Segmentation of Multiple Targets (CVPR 2015) [MATLAB]
- Single Object Tracking
- DeepTracking: Seeing Beyond Seeing Using Recurrent Neural Networks (AAAI 2016) [Torch 7]
- Hierarchical Convolutional Features for Visual Tracking (ICCV 2015) [MATLAB]
- Learning Multi-Domain Convolutional Neural Networks for Visual Tracking (Winner of The VOT2015 Challenge) [MATLAB/MatConvNet]
- RATM: Recurrent Attentive Tracking Model [Python]
- Visual Tracking with Fully Convolutional Networks (ICCV 2015) [MATLAB]
- Fully-Convolutional Siamese Networks for Object Tracking [TensorFlow]
- ROLO: Spatially Supervised Recurrent Convolutional Neural Networks for Visual Object Tracking [TensorFlow]
- Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking (ECCV 2016) [MATLAB]
- ECO: Efficient Convolution Operators for Tracking (CVPR 2017) [MATLAB]
- End-to-end representation learning for Correlation Filter based tracking (CVPR 2017) [MATLAB]
- A collection of common tracking algorithms
- Object Detection and Matching
- Matchnet
- Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches
- Asynchronous Methods for Deep Reinforcement Learning
- YOLO9000: Better, Faster, Stronger - Real-Time Object Detection. 9000 classes!
- Deformable Convolutional Networks
- R-FCN: Object Detection via Region-based Fully Convolutional Networks
- PVANet: Lightweight Deep Neural Networks for Real-time Object Detection
- Mask R-CNN in TensorFlow
- Optical Flow
- Misc