Master Class: Machine Learning at Scale: Big Data with Small Clusters (Carlos Guestrin)

Speaker: Carlos Guestrin
Affiliation: University of Washington
Date: Thursday, 02 Jul 2015
Time: 12:00 - 13:00
Location: Torrington (1-19) 115 Galton LT
Event series: Master Class: Carlos Guestrin (2-3 July 2015)

Machine learning has become the hottest topic in computing. Industries are being disrupted by intelligent applications that use ML at their core. From e-commerce, through movie streaming, to taxis, new companies that rely on ML are displacing old incumbents. These applications require training models on ever-larger data sets. As a result, a significant amount of effort has been devoted to running these methods on very large computer clusters, at significant financial cost and with endless headaches.

In this talk, we will build on a series of systems for ML (including GraphLab, GraphChi and SFrames) to describe a design strategy for scaling up machine learning algorithms. In particular, we will demonstrate that a small cluster or even a single machine, with the right systems, data layout and algorithms, can outperform large clusters on very large real-world problems. We will also explore algorithmic designs for ML which, when combined with such systems, can make the techniques accessible to non-ML experts who want to build ML-infused applications and potentially disrupt new markets.

Lunch will be provided after the talk in Room 102.
