Seminar: NIPS Previews

Speaker: Andrew McDonald, Kacper Chwialkowski, Balaji Lakshminarayanan
Affiliation: UCL/Gatsby
Date: Friday, 05 Dec 2014
Time: 13:00 - 14:00
Location: Roberts G08 (Sir David Davies lecture theatre)
Event series: Microsoft Research CSML Seminar Series
Description

A presentation of several NIPS 2014 papers from UCL researchers.

Talk 1

Speaker: Andrew McDonald

Title: Spectral k-Support Norm Regularization

Abstract: The k-support norm has successfully been applied to sparse vector prediction problems. We observe that it belongs to a wider class of norms, which we call the box-norms. Within this framework we derive an efficient algorithm to compute the proximity operator of the squared norm, improving upon the original method for the k-support norm. We extend the norms from the vector to the matrix setting and we introduce the spectral k-support norm. We study its properties and show that it is closely related to the multitask learning cluster norm. We apply the norms to real and synthetic matrix completion datasets. Our findings indicate that spectral k-support norm regularization gives state-of-the-art performance, consistently improving over trace norm regularization and the matrix elastic net. (Joint work with Massimiliano Pontil and Dimitris Stamos)

Talk 2

Speaker: Kacper Chwialkowski

Title: A Wild Bootstrap for Degenerate Kernel Tests

Abstract: A wild bootstrap method for nonparametric hypothesis tests based on kernel distribution embeddings is proposed. This bootstrap method is used to construct provably consistent tests that apply to random processes, for which the naive permutation-based bootstrap fails. It applies to a large group of kernel tests based on V-statistics, which are degenerate under the null hypothesis, and non-degenerate elsewhere. To illustrate this approach, we construct a two-sample test, an instantaneous independence test and a multiple lag independence test for time series. In experiments, the wild bootstrap gives strong performance on synthetic examples, on audio data, and in performance benchmarking for the Gibbs sampler. (Joint work with Dino Sejdinovic and Arthur Gretton)

Talk 3

Speaker: Balaji Lakshminarayanan

Title: Mondrian Forests: Efficient Online Random Forests

Abstract: Ensembles of randomized decision trees, usually referred to as random forests, are widely used for classification and regression tasks in machine learning and statistics. Random forests achieve competitive predictive performance and are computationally efficient to train and test, making them excellent candidates for real-world prediction tasks. The most popular random forest variants (such as Breiman's random forest and extremely randomized trees) operate on batches of training data. Online methods are now in greater demand. Existing online random forests, however, require more training data than their batch counterparts to achieve comparable predictive performance. In this work, we use Mondrian processes (Roy and Teh, 2009) to construct ensembles of random decision trees we call Mondrian forests. Mondrian forests can be grown in an incremental/online fashion and, remarkably, the distribution of online Mondrian forests is the same as that of batch Mondrian forests. Mondrian forests achieve competitive predictive performance comparable with existing online random forests and periodically re-trained batch random forests, while being more than an order of magnitude faster, thus representing a better computation vs accuracy tradeoff. (Joint work with Daniel M. Roy and Yee Whye Teh)

Video of the talks here.
