forked from mvcisback/SSLVC

Sound Source Localization using Visual Cues

dsx-ai/SSLVC

 
 

62 Commits
 
 
 
 
 
 
 
 
 
 

CS598ps_project

In this paper, we present an approach to reconstructing 3D audio for multiple sources from a single-channel input by detecting and tracking visual cues with supervised learning methods. We also discuss a related approach to improving speaker classification in a video stream by combining facial and speech likelihoods, that is, multimodal speaker recognition.
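As an illustration of what combining facial and speech likelihoods can look like, here is a minimal sketch of weighted late fusion in log space. It is not taken from this repository: the function names, the `face_weight` parameter, and the example scores are all hypothetical, and the paper may use a different fusion rule.

```python
import math


def fuse_log_likelihoods(face_ll, speech_ll, face_weight=0.5):
    """Weighted sum of per-speaker log-likelihoods from two modalities.

    face_ll / speech_ll map a speaker label to a log-likelihood.
    face_weight is an assumed tuning knob in [0, 1], not a value from the paper.
    """
    if face_ll.keys() != speech_ll.keys():
        raise ValueError("both modalities must score the same set of speakers")
    w = face_weight
    return {s: w * face_ll[s] + (1.0 - w) * speech_ll[s] for s in face_ll}


def posterior(fused):
    """Normalize fused log scores into a distribution (log-sum-exp for stability)."""
    m = max(fused.values())
    log_z = m + math.log(sum(math.exp(v - m) for v in fused.values()))
    return {s: math.exp(v - log_z) for s, v in fused.items()}


def classify(face_ll, speech_ll, face_weight=0.5):
    """Return the speaker label with the highest fused score."""
    fused = fuse_log_likelihoods(face_ll, speech_ll, face_weight)
    return max(fused, key=fused.get)


# Hypothetical per-frame scores for two speakers (illustrative only).
face = {"alice": math.log(0.6), "bob": math.log(0.4)}
speech = {"alice": math.log(0.3), "bob": math.log(0.7)}

equal_weight_label = classify(face, speech, face_weight=0.5)
face_heavy_label = classify(face, speech, face_weight=0.9)
```

With these made-up scores the two modalities disagree, so the decision depends on the weight: trusting the face model more flips the label. In practice such a weight would be tuned on held-out data.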

Video assets are here:


Languages

  • TeX 77.3%
  • Python 20.9%
  • Makefile 1.8%