Pixel Club: Visual Audio Denoising

Dana Segev (EE, Technion)
Tuesday, 24.1.2012, 11:30
EE Meyer Building 1061

Audio denoising is a long-studied problem, with numerous algorithms and a wide body of accumulated knowledge. When the noise is strong and non-stationary (as in a cocktail-party environment), the task becomes very difficult to handle. This work considers such audio denoising problems, where the audio track is accompanied by a video. Furthermore, the disturbing source is generally not visible in the field of view, and its nature is unknown. Overcoming such unknown noise differs from source separation, where the disturbing sources, or at least their priors, are partially accessible. We demonstrate the core ability to use the video information to better filter the audio. The approach we take in this work is example-based, assuming that we have at our disposal a relatively good-quality movie (video + audio) to train on. The noise removal itself is done by processing short temporal segments of video-and-audio, seeking relevant examples from the training set. The video information eliminates non-relevant training examples and improves the algorithm's performance. We demonstrate this approach on two different types of signals. In all the experiments, very significant denoising is achieved, even in cases where audio-only processing methods fail.
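The abstract's core idea of matching short noisy segments against a clean training set, with video features pruning irrelevant candidates, can be sketched roughly as follows. This is a minimal illustration, not the talk's actual algorithm: the function name, the Euclidean distances, and the weighted audio/video combination (`alpha`) are all assumptions made for the example.

```python
import numpy as np

def example_based_denoise(noisy_audio, video_feats, train_audio, train_video, alpha=0.5):
    """Example-based denoising sketch (illustrative, not the method from the talk).

    For each short noisy segment, score every training example by a weighted
    combination of audio distance and video-feature distance, and output the
    clean audio of the best-matching example.  The video term is what lets
    visual information rule out non-relevant training examples.
    """
    denoised = np.empty_like(noisy_audio)
    for i, (a, v) in enumerate(zip(noisy_audio, video_feats)):
        # Distance of the noisy audio segment to every clean training segment.
        d_audio = np.linalg.norm(train_audio - a, axis=1)
        # Distance in video-feature space; large values mark irrelevant examples.
        d_video = np.linalg.norm(train_video - v, axis=1)
        # Combined score; alpha balances audio fidelity against video relevance.
        score = alpha * d_audio + (1 - alpha) * d_video
        denoised[i] = train_audio[np.argmin(score)]
    return denoised
```

In this toy form, when two training segments are equally close in audio (e.g. because noise has corrupted the segment), the video term breaks the tie toward the example whose visual context matches, which is the intuition the abstract describes.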

MSc thesis under the supervision of Michael Elad and Yoav Schechner.
