Seminar: Spectral Learning of Latent-Variable PCFGs

Speaker: Mike Collins
Affiliation: Columbia University
Date: Wednesday, 21 Nov 2012
Time: 13:00 - 14:00
Location: Roberts 508
Event series: Computer Science Seminars
Description

Abstract: Statistical models with hidden or latent variables are of great
importance in natural language processing, speech, and many other
fields. The EM algorithm is a remarkably successful method for
parameter estimation within these models: it is simple, often
relatively efficient, and has well-understood formal properties. It
does, however, have a major limitation: it has no guarantee of finding
the global optimum of the likelihood function. From a theoretical
perspective, this means that the EM algorithm is not guaranteed to
give consistent parameter estimates. From a practical perspective,
problems with local optima can be difficult to deal with.

In this talk I'll describe a spectral algorithm for learning
latent-variable probabilistic context-free grammars (L-PCFGs).
L-PCFGs have been shown to be highly effective models for natural
language parsing (Matsuzaki et al., 2005; Petrov et al., 2006).
Under a separation (singular value) condition, our algorithm
provides consistent parameter estimates; this is in contrast with
previous work, which has used the EM algorithm for parameter
estimation, with the usual problems of local optima.
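To make the separation requirement concrete: roughly speaking, it asks
that a cross-moment matrix associated with the grammar has full rank m,
where m is the number of latent states, i.e. that its m-th singular
value is bounded away from zero. A minimal numpy sketch of such a check
follows; the matrix, tolerance, and helper name are illustrative, not
the paper's exact statement:

    import numpy as np

    def separation_ok(omega, m, tol=1e-8):
        # Hypothetical check: the m-th largest singular value of an
        # empirical cross-moment matrix must be bounded away from zero,
        # where m is the number of latent states.
        s = np.linalg.svd(omega, compute_uv=False)
        return len(s) >= m and s[m - 1] > tol

    # A random rank-2 moment matrix passes for m = 2, fails for m = 3.
    rng = np.random.default_rng(0)
    omega = rng.standard_normal((5, 2)) @ rng.standard_normal((2, 6))
    print(separation_ok(omega, 2))  # True
    print(separation_ok(omega, 3))  # False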

In experiments on parsing with L-PCFGs, we show that the method
matches the accuracy of models estimated using EM, while being one to
two orders of magnitude more efficient: it requires a singular value
decomposition followed by a single pass over the data, in contrast to
the multiple iterations over the data required by EM.
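A minimal numpy sketch of the computational shape of that pipeline,
under heavy simplification: one SVD of an empirical cross-covariance of
(hypothetical) outside and inside feature vectors, then a single pass
over the data accumulating projected statistics. The actual L-PCFG
estimator computes per-rule parameter tensors; this only illustrates
why the cost is so much lower than EM's repeated passes:

    import numpy as np

    def spectral_estimate(outside_feats, inside_feats, m):
        # Stage 1: one SVD of the empirical outside/inside
        # cross-covariance; keep the top-m singular directions.
        N = outside_feats.shape[0]
        omega = outside_feats.T @ inside_feats / N
        U, s, Vt = np.linalg.svd(omega, full_matrices=False)
        U, V = U[:, :m], Vt[:m].T
        # Stage 2: a single pass over the data, accumulating outer
        # products of projected features. (EM would instead run
        # inside-outside over the whole treebank many times.)
        param = np.zeros((m, m))
        for o, i in zip(outside_feats, inside_feats):
            param += np.outer(U.T @ o, V.T @ i)
        return param / N

    # Toy usage with random vectors standing in for tree features.
    rng = np.random.default_rng(1)
    O = rng.standard_normal((1000, 8))
    I = rng.standard_normal((1000, 10))
    print(spectral_estimate(O, I, m=3).shape)  # (3, 3)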

While the focus of this talk is on models for natural language
parsing, the techniques I will describe should be relevant to many
other latent-variable models used in machine learning and natural
language processing. The method requires a significant generalization
of prior work on spectral learning algorithms, in particular the
seminal work of Hsu et al. (2009) on provably consistent learning of
HMMs.
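For context, here is a minimal numpy sketch of the observable-operator
construction from Hsu et al. (2009) for HMMs, which the L-PCFG method
generalizes. The toy HMM, sample size, and helper names are
illustrative, and finite-sample corrections are ignored:

    import numpy as np

    def spectral_hmm(triples, n_obs, m):
        # Estimate low-order moments from i.i.d. observation triples.
        N = len(triples)
        P1 = np.zeros(n_obs)                    # P1[i]     = Pr[x1 = i]
        P21 = np.zeros((n_obs, n_obs))          # P21[i, j] = Pr[x2 = i, x1 = j]
        P3x1 = np.zeros((n_obs, n_obs, n_obs))  # P3x1[x][i, j] = Pr[x3 = i, x2 = x, x1 = j]
        for x1, x2, x3 in triples:
            P1[x1] += 1.0 / N
            P21[x2, x1] += 1.0 / N
            P3x1[x2][x3, x1] += 1.0 / N
        # The "spectral" step: top-m left singular vectors of P21.
        U = np.linalg.svd(P21)[0][:, :m]
        b1 = U.T @ P1
        binf = np.linalg.pinv(P21.T @ U) @ P1
        B = [U.T @ P3x1[x] @ np.linalg.pinv(U.T @ P21) for x in range(n_obs)]
        return b1, binf, B

    def sequence_prob(seq, b1, binf, B):
        # Pr[x1..xt] is approximated by binf . B[xt] ... B[x1] . b1.
        v = b1
        for x in seq:
            v = B[x] @ v
        return float(binf @ v)

    # Toy usage: triples from a 2-state, 3-symbol HMM, uniform start state.
    rng = np.random.default_rng(2)
    T = np.array([[0.7, 0.3], [0.2, 0.8]])            # transitions
    E = np.array([[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]])  # emissions

    def sample_triple():
        h = rng.choice(2)
        xs = []
        for _ in range(3):
            xs.append(int(rng.choice(3, p=E[h])))
            h = rng.choice(2, p=T[h])
        return tuple(xs)

    triples = [sample_triple() for _ in range(20000)]
    b1, binf, B = spectral_hmm(triples, n_obs=3, m=2)
    print(sequence_prob((0, 2), b1, binf, B))  # ~ Pr[x1 = 0, x2 = 2]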

This is joint work with Shay Cohen, Karl Stratos, Dean Foster, and
Lyle Ungar.

Bio: Michael Collins is the Vikram S. Pandit Professor of Computer Science
at Columbia University. He completed a PhD in computer science from
the University of Pennsylvania in December 1998. From January 1999 to
November 2002 he was a researcher at AT&T Labs-Research, and from
January 2003 until December 2010 he was an assistant/associate
professor at MIT. His research is focused on topics including
statistical parsing, structured prediction problems in machine
learning, and applications including machine translation, dialog
systems, and speech recognition.
