This project implements three computer vision services: shadow detection and removal (ShadowSight), pose detection, and visual question answering.
pip install -r final_requirements.txt
Note: A GPU is not required, since this project uses pretrained models and checkpoints. If you want to use a GPU, install the corresponding PyTorch and torchvision wheel versions from here: copy the wheel links from the PyTorch website and replace the PyTorch and torchvision entries in final_requirements.txt with them.
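To confirm which device the installed PyTorch build can use, a quick check along these lines may help. This is a minimal sketch: `pick_device` is a hypothetical helper and not part of this project, and it falls back to `"cpu"` when PyTorch is not importable.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if the installed PyTorch build can see a GPU, else "cpu".

    Hypothetical helper for illustration; the project does not ship it.
    """
    # If PyTorch is not installed at all, report "cpu" instead of raising.
    if importlib.util.find_spec("torch") is None:
        return "cpu"

    import torch

    # torch.cuda.is_available() is False for CPU-only wheels
    # and on machines without a usable CUDA device.
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Selected device: {pick_device()}")
```

If this prints `cpu` after you installed GPU wheels, the entries in final_requirements.txt were probably not replaced with the CUDA builds.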
Download the ShadowSight checkpoints from here and place them inside app in a folder named checkpoints.
For the rest of the services, download the pre-trained weights from here (download the entire folder) and place them inside app in a folder named pickles.
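Before starting the server, it can be useful to verify that both weight folders are in place. The sketch below assumes the layout described above (`app/checkpoints` and `app/pickles`); `missing_weight_dirs` is a hypothetical helper, and it only checks that each folder exists and is non-empty, not that the files inside are the right ones.

```python
from pathlib import Path

# Folder names taken from the setup steps above.
REQUIRED_DIRS = ("checkpoints", "pickles")


def missing_weight_dirs(app_dir="app"):
    """Return the names of required folders that are absent or empty."""
    app = Path(app_dir)
    missing = []
    for name in REQUIRED_DIRS:
        folder = app / name
        # A folder counts as missing if it does not exist or holds no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_weight_dirs()
    if problems:
        print("Missing or empty folders inside app:", ", ".join(problems))
    else:
        print("All weight folders found.")
```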
python run.py
All paths in the scripts are relative. If you run into path discrepancies, look through init.py, test.py, and views.py in the app folder and adjust the path variables accordingly.
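One common way to make such path variables independent of the working directory is to build them from the location of the script itself. The following is a sketch of that general technique, not the project's actual code; `resolve_from` is a hypothetical helper.

```python
from pathlib import Path


def resolve_from(anchor_file, *parts):
    """Build an absolute path relative to the folder containing anchor_file."""
    # Resolve the anchor to an absolute path, take its folder,
    # then append the remaining path components.
    return Path(anchor_file).resolve().parent.joinpath(*parts)


# Inside a script such as views.py, this could be used as:
#   CHECKPOINT_DIR = resolve_from(__file__, "checkpoints")
# so the folder is found no matter where `python run.py` is launched from.
```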
- https://github.com/michalfaber/keras_Realtime_Multi-Person_Pose_Estimation
- https://github.com/IsHYuhi/ST-CGAN_Stacked_Conditional_Generative_Adversarial_Networks
- https://github.com/jiasenlu/HieCoAttenVQA
- Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal, Jifeng Wang∗, Xiang Li∗, Le Hui, Jian Yang, Nanjing University of Science and Technology, [arXiv]
- Lu, Jiasen, et al. "Hierarchical question-image co-attention for visual question answering." Advances in neural information processing systems. 2016.
- Z. Cao, T. Simon, S.-E. Wei, and Y. Sheikh, "Realtime multi-person 2D pose estimation using part affinity fields," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Jul. 2017, pp. 1302–1310.