Pixel Club: Computational Geometric Vision

Speaker:
Aaron Wetzler (CS, Technion)
Date:
Thursday, 22.6.2017, 11:30
Place:
Room 337 Taub Bld.

By combining geometric principles for shape analysis with modern sensing techniques, large-scale datasets and powerful computational architectures, we show various new ways of enabling computers to better perceive, interpret and comprehend the geometry of the world around them. Specifically, we explore the topics of reconstruction, filtering and semantic processing within the context of computational geometric vision.

Reconstruction
We start by discussing the problem of sensing and reconstructing three-dimensional geometry. We develop an efficient photometric stereo method that can model non-linear and near-source lighting setups while avoiding direct computation of normal fields. We then look into two alternative approaches for reconstructing geometry on a smartphone using projected light.

Filtering
Image and shape data obtained through modern cameras and sensors are typically noisy. We contribute a new patch-based denoising framework, called the Beltrami Patch filter, for grayscale and color images, and extend it to depth fields and 3D meshes. We then modify the approach by reformulating its differential operators as trainable kernels in a deep neural network and unrolling the update step through time. We demonstrate state-of-the-art results and highlight the fact that other PDE-based methods could take advantage of the same basic idea.

Semantic processing
In the last part of this talk we discuss the problem of localizing and identifying self-similar geometric objects in a complex visual space. We focus specifically on identifying fingertips on articulating hands observed by depth cameras. We describe how we used high-accuracy magnetic sensors to annotate large quantities of training data for both front-facing and egocentric hand motion. To perform learning efficiently, we turn to random forests and contribute a new approach for mapping the training of a random decision tree on billions of training samples with trillions of features to a single multi-GPU computing node. For inference, we similarly describe a novel pipelined FPGA hardware implementation.
