Seminar: DeepLab to UberNet: from task-specific to task-agnostic deep learning in computer vision

Speaker: Iasonas Kokkinos
Date: Wednesday, 19 Oct 2016
Time: 13:00 - 14:00
Location: Malet Place Engineering Building 1.03
Event series: Microsoft Research CSML Seminar Series

Over the last few years Convolutional Neural Networks (CNNs) have been shown to deliver excellent results in a broad range of low- and high-level vision tasks, spanning effectively the whole spectrum of computer vision problems.

In this talk we will present recent research progress along two complementary directions.

In the first part we will present research efforts on integrating established computer vision ideas with CNNs, thereby allowing us to incorporate task-specific domain knowledge in CNNs. We will present CNN-based adaptations of structured prediction techniques that use discrete (DenseCRF, as in DeepLab) and continuous (Deep Gaussian CRF) energy-based formulations, and will also present methods to incorporate ideas from multi-scale processing, Multiple-Instance Learning and Spectral Clustering into CNNs.

In the second part of the talk we will turn to designing a generic architecture that can tackle a multitude of tasks jointly, with the aim of building a ‘Swiss knife’ for computer vision. We call this network an ‘UberNet’ to underline its overarching nature. We will introduce techniques that allow us to train an UberNet using datasets with diverse annotations while also handling the memory limitations of current hardware. The proposed architecture jointly addresses (a) boundary detection, (b) saliency detection, (c) normal estimation, (d) semantic segmentation, (e) human part segmentation, (f) human boundary detection, and (g) region proposal generation and object detection in 0.7 seconds per frame, with performance comparable to the current state of the art on these tasks.

I. Kokkinos, UberNet: Training a ‘Universal’ CNN for Low-, Mid-, and High-Level Vision using Diverse Datasets and Limited Memory, arXiv, 2016

S. Chandra and I. Kokkinos, Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs, Proc. European Conf. on Computer Vision (ECCV), 2016

L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille, DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, v1: ICLR 2015, v2: arXiv, 2016

I. Kokkinos, Pushing the Boundaries of Boundary Detection using Deep Learning, Int. Conf. on Learning Representations (ICLR), 2016
