CSML Master Class with Carlos Guestrin (additional talk by Emily Fox)

We are pleased to announce that the next CSML Master Class, sponsored by Google DeepMind, will take place on 2-3 July 2015. Carlos Guestrin (University of Washington) will give two talks, with an additional talk given by Emily Fox (University of Washington).

Further details are available at

Next CSML Master Class with Sham Kakade - registration open!

Sham Kakade from Microsoft Research New England will be giving the next CSML Master Class from 3 - 5 November 2014. The topic of the Master Class will be 'Some Algorithmic Questions in Statistics: Estimation and Optimization'.

Please see the events pages for full details and a link to register.

UCL team UCLStats win RSS challenge 2014!

UCL team "UCLStats" of Beate Franke, Alfredo Kalaitzis, Sam Livingstone and Michael Betancourt are the Overall Winners of the Royal Statistical Society Statistical Analytics challenge 2014, organized by The Young Statisticians Section & Research Section of the Royal Statistical Society, sponsored by Select Statistics. The team will be invited to present their findings at the YSS and Research Section Statistical Analytics Challenge Session at the RSS 2014 conference in Sheffield on Wednesday 3rd September 2014.

The 2014 competition was designed to analyse some data from Neuroimaging, namely resting state functional MRI data. Unlike many data prediction competitions, there was no right answer here (at least not one that is known), but care in the statistical analysis was the focus of the challenge. The title of the UCLStats submission was "Long Range Spatio-Temporal Dependence in fMRI Imaging Data".

Next CSML Master Class with Shai Ben-David - Registration now open!

The next CSML Master Class will be given by Shai Ben-David from University of Waterloo. The talks will take place from 21-23 July 2014, and registration is now open. Full details are available at

Data Science Position in Bristol

Research Assistant in Data Science
(Full time - Closing date for applications - 17-Jun-2014)

We are looking for an outstanding data scientist to join the ERC project "ThinkBIG" (led by Nello Cristianini). You will contribute to extending a pre-existing software infrastructure for the analysis of large corpora. This is a position for an experienced data scientist with a strong background in a STEM field and research training, with experience in statistical machine learning applied to textual data and in scaling software for big-data applications.

In addition to your online application, you will need to submit a sample of code (preferably Java) written by you which shows your best work in the creation of reusable, maintainable code in a group environment. Please select a sample no larger than 5 classes. This sample of code does not necessarily need to be related to text mining.

Please send the sample of code (alone) to Tom Welfare at: code.for.vacancy(at) with a short explanation. Please do not send any enquiries to this email address.

Informal enquiries can be made to Nello Cristianini: nello.cristianini(at)

It is expected that interviews will be held in late June 2014.

ThinkBIG page:

Please check all details and specific requirements:

Next CSML Master Class - Registration now open!

The next CSML Master Class will be given by Andrew Gelman from Columbia University. The talks will take place from 14-16 April 2014, and registration is now open. Full details are available at

Postdoc position involving pathbreaking work in MRP, Stan, and the 2014 election!

The vacancy below, in the Applied Statistics Center at Columbia University, may be of interest to CSML members:

We’re working with polling company YouGov to track public opinion, state-by-state and district-by-district, during the 2014 campaign. We’ll be using multilevel regression and poststratification, and implementing it in Stan, and developing the necessary new parts of Stan to get this running scalably and efficiently. And we’ll be making the most detailed, up-to-date election forecasts.

What you’ll be doing if you join us as a postdoc:

  • You’ll be in the midst of the most advanced polling team anywhere;
  • You’ll be doing cutting-edge statistical research on MRP with deep interactions;
  • You’ll be doing basic research in statistical computing, developing fast and scalable deterministic and stochastic algorithms for fitting multilevel models;
  • You’ll be working inside Stan, the most advanced general computational framework for Bayesian analysis. We’re doing research, not just implementing existing methods.

What we need:
  • Stats knowledge. You should know your way around Bayesian data analysis;
  • Serious computing skills. You should be a skilled C++ programmer;
  • Interest in the application area. You should care about public opinion, and it should be important to you that our forecasts are good. When our estimates of opinion for some group in the population don’t make sense, you should notice and be bothered by it.

We have a great team with diverse skills (computing, statistics, political science) that plays nicely together and from whom you can learn a lot. Also lots of opportunity to collaborate with researchers in many different quantitative disciplines through Columbia’s Applied Statistics Center.

The position is a 2-year postdoc, funded by YouGov and situated in the Applied Statistics Center at Columbia University. More details on the position can be found at .

Internships at Amazon

The Amazon Machine Learning Science Team is seeking interns to join the Scalable Machine Learning, Forecasting, Content Linkage, and Computer Vision groups in Berlin, Germany. We focus on data-driven approaches in these areas for Amazon as a whole as well as their applications to particular products.

  • In the Forecasting group, we develop sophisticated algorithms that involve learning from very large amounts of data, such as prices, promotions, similar products, and a product’s attributes, in order to forecast the demand of over 10 million products. These forecasts are used to automatically order more than $200 million worth of inventory weekly, establish labor plans for over 10,000 employees, and predict the overall company’s financial performance. The work is complex and important to Amazon. The better our forecasts, the more we can lower prices for customers and offer in-stock selection.
  • In the Content Linkage group, our goal is to learn to dynamically link digital content on Amazon and the Web, so that customers can discover fresh and related content relevant to books and other media they purchase. Our other task is to enable fast and scalable translation of Amazon's product catalog, customer reviews and other natural language data across a large number of language pairs.
  • In the Computer Vision group, our goal is to develop real-time object recognition and object tracking algorithms that can distinguish between 10+M products and can be used to automate Amazon's fulfillment and product quality systems. The more accurately we can detect products, the more we can speed up delivery to customers.
  • In the Scalable Machine Learning group, we develop learning algorithms able to handle the gigantic data volumes collected by Amazon and able to operate at internet speed. We explore new paradigms for distributed optimisation and inference. We also investigate efficient representations of very large and complex data models. Finally, we automate the process of building and learning data models to enable evidence-driven decision-making with minimal effort from the domain experts.
As an intern, you will have an opportunity to work on complex mathematical problems with a large element of uncertainty. You will develop new scalable algorithms and improve existing approaches based on modern statistical, machine learning, and data mining methods to impact the core business of Amazon. You are an individual with outstanding analytical abilities and excellent communication skills. You will be responsible for researching, developing, and analyzing statistical models. You will also prototype implementations using high-level modelling languages such as MATLAB or R, or software languages such as Java or C++. You should have a strong background in machine learning with domain knowledge and experience in the following areas: data-driven statistical modelling, graphical models, feature extraction and analysis, and supervised learning (in particular, discriminative methods). You should have at least one refereed academic publication in these areas.

Please contact Cedric Archambeau cedrica(at) if you are interested in this opportunity.

Connectionists: Call for Participation: MLSS Machine Learning Summer School, Reykjavik, April 25 - May 4, 2014

Colocated with AISTATS 2014, the Seventeenth International Conference on Artificial Intelligence and Statistics

The Machine Learning Summer School will take place at Reykjavik University in Reykjavik, Iceland, from April 25 to May 4, 2014. The field of machine learning is at the intersection of computer science, statistics, mathematics, and optimization. The Machine Learning Summer School (MLSS) is a great venue for graduate students, researchers, and professionals to learn about fundamental and advanced methods of machine learning, data analysis, and inference, from theory to practice.

The MLSS in Reykjavik features an exciting program with talks from leading experts in the field. MLSS is colocated with the high-profile international conference AISTATS 2014, the Seventeenth International Conference on Artificial Intelligence and Statistics. The MLSS also features a poster session for students, jointly with a poster session of AISTATS.

Limited travel support for students is available; see the website for details.

UK Computational Statistics Researcher wins ISBA Mitchell Prize

Dr Ioanna Manolopoulou and her research collaborators have been awarded the 2012 Mitchell Prize by the International Society for Bayesian Analysis for their paper "Bayesian Spatio-Dynamic Modelling in Cell Motility Studies: Learning Nonlinear Taxic Fields Guiding Immune Response" for its sophisticated development of model-based inference methods in the expanding field of immune-cell dynamics.

The Mitchell Prize is awarded annually in recognition of an outstanding paper that describes how a Bayesian analysis has solved an important applied problem.

CSML host meeting on variable selection and dimension reduction in clustering and classification

On 8-9 November 2013 there will be a joint meeting of the British Classification Society (BCS) and the working group on data analysis and numerical classification (AG DANK) at the Department of Statistical Science, UCL. The meeting is supported by the Centre of Computational Statistics and Machine Learning (CSML). The local organiser is Dr Christian Hennig.

Focus topic of the meeting is "Variable selection and dimension reduction in clustering and classification".

There will be three invited speakers: Gilles Celeux, Paris; Mario Figueiredo, Lisbon; and Roberto Rocci, Rome.

Up-to-date information on the meeting, including deadlines and contact details, can be found here.

Dr Christian Hennig to speak in the president's invited session at IFCS 2013, Tilburg

Dr Christian Hennig was invited by the president of the International Federation of Classification Societies (IFCS), Iven van Mechelen, to give a presentation on "Measurement of quality in cluster analysis" in the president's invited session at the IFCS 2013 conference in Tilburg, the Netherlands, 15-17 July 2013. Dr Hennig will give further invited presentations on the same topic at the Summer Working Group on Mixture Models, Bologna, 22-26 July and the ISI conference in Hong Kong, 26-30 August. Dr Hennig was also invited to the ERCIM conference 2013 in London, 14-16 December (title to be announced).

Dr Christian Hennig awarded EPSRC standard research grant

Dr Christian Hennig was awarded an EPSRC standard research grant for work on "A multicriterion approach to cluster validation". In this research, Dr Hennig will look at the notoriously difficult problem of estimating the number of clusters from a fresh angle, connecting criteria properly to different research aims. The project includes collaborations with the robust clustering group of the University of Valladolid led by Prof Carlos Matran, the expert for biological species delimitation Dr Bernhard Hausdorf, University of Hamburg, the IFCS cluster benchmarking initiative led by IFCS president Prof Iven van Mechelen, KU Leuven and Dr Hennig's former PhD student Dr Pietro Coretto, University of Salerno.

EPSRC grant for Advanced Stochastic Computation for Inference from Tree, Graph and Network Models

PI: M. De Iorio; CO-I A. Beskos, D. Balding, A. Jasra

The main objective is to develop and characterise principled approximations to complex statistical models.

The research work aims to reduce both the variance of simulation-based estimates, and the computational time of stochastic algorithms. This will facilitate the implementation of stochastic models in real biological applications. The grant will support a postdoctoral researcher for three years.

EPSRC grant funding for Gabriel Brostow, Mark Girolami and Kate Jones (#EP/K015664/1)

Funded by EPSRC for 3 years

ENGAGE: Interactive Machine Learning Accelerating Progress in Science, An Emerging Theme of ICT Research

In short, it brings together Machine Learning, Computer Vision, Human-Computer Interaction, and Biodiversity Science to help cope with the global extinction crisis.

The PIs on the project are:
Gabriel Brostow - UCL CS
Mark Girolami - UCL Stats
Kate Jones - UCL Department of Genetics, Evolution and Environment, and ZSL Chair, Ecology and Biodiversity
Mike Terry - University of Waterloo, CS

Our vision is to establish and lead a new theme in ICT research based on Interactive Machine Learning (IML). Our expansion of IML will give scientists and non-ICT specialists unprecedented access to cutting-edge Machine Learning algorithms by providing a human-computer interface through which they can directly interact with large-scale data and computing resources in an intuitive visual environment. In addition, the outcome of this particular project will have a direct transformative impact on the sciences by making it possible for non-programming individuals (scientists) to create systems that semi-automatically detect objects and events in vast quantities of A) audio and B) visual data.

By working together across two parallel, highly interconnected streams of ICT research, we will develop the foundations of statistical methodology, algorithms and systems for IML. As an exemplar, this project partners with world leading scientists grappling with the challenge of analysing enormous quantities of heterogeneous data being generated in Biodiversity Science.

i-like Opening workshop - 31 January 2013, 2:10pm to 5:30pm

Please go to for details of this event.

The day is principally aimed at PhD and postdoc level, although others are more than welcome to attend as well.

Faculty Member In Machine Learning Or Statistics

The Gatsby Computational Neuroscience Unit at University College London is looking to recruit a junior or senior level faculty member in machine learning or statistics. We especially seek candidates who work in probabilistic or statistical machine learning.

The closing date for applications is 20 December 2012.

For further information, please see

Professor Mark Girolami awarded EPSRC funding to lead UK research network on computational statistics and machine learning

EPSRC are providing three years of funding to establish and build a UK-wide research network on Computational Statistics and Machine Learning. This network will be led by the UCL Centre of Computational Statistics and Machine Learning.

RSS Ordinary Meeting - Wednesday, November 14, 2012, 5.00 p.m. - How to find an appropriate clustering for mixed type variables with application to socio-economic stratification

The paper “How to find an appropriate clustering for mixed type variables with application to socio-economic stratification” by Christian Hennig (Department of Statistical Science/CSML, University College London, UK) and Tim F. Liao (the University of Illinois, Champaign, USA) will be read at an Ordinary Meeting at the RSS. It will be presented to the Society on Wednesday, November 14th, 2012, at 5 p.m. Information about Ordinary Meetings, including a PDF file of the paper, can be obtained from the RSS website. Everybody can contribute to the discussion.

CFP: Two NIPS workshops

NIPS 2012 Workshop on the Confluence between Kernel Methods and Graphical Models
This workshop addresses two main research questions: first, how may kernel methods be used to address difficult learning problems for graphical models, such as inference for multi-modal continuous distributions on many variables, and dealing with non-conjugate priors? And second, how might kernel methods be advanced by bringing in concepts from graphical models, for instance by incorporating sophisticated conditional independence structures, latent variables, and prior information?

Submissions due: Sep 26
Notification: Oct 14

NIPS 2012 Workshop on Modern Nonparametric Methods in Machine Learning
Statistical analysis of big, high-dimensional data has become frequent in many scientific fields ranging from biology, genomics and health sciences to astronomy, economics and machine learning. The aim of this workshop is to bring together practitioners who work on specialized applications and theoreticians who are interested in providing sound methodology. We hope to advertise recent successes of nonparametric methods in a number of domains involving large-scale high-dimensional problems, and to dispel the common belief that nonparametric methods are not suitable for dealing with challenges arising from big data.

Submissions due: Sep 16
Notification: Oct 7

CSML Research Network is to be funded

We have just been informed by EPSRC that our proposed CSML Research Network is to be funded. This is really great news for the Computational Statistics and Machine Learning communities in the UK. More details will be released in due course.

Dr Simon Byrne awarded EPSRC Fellowship

Dr Simon Byrne has been awarded a Post-Doctoral Fellowship from the Engineering and Physical Sciences Research Council (EPSRC) for his project entitled "Information geometry for Bayesian hierarchical models". This will provide support for three years, with the aim of developing both a theoretical understanding of the geometric structure of hierarchical models and practical computational tools to make these models feasible for larger and more complex problems.

Michael Epstein wins Best Poster at the Second Annual CLMS Symposium

Michael Epstein won a prize for Best Poster at the recent Second Annual CLMS Symposium. The symposium included sessions in a variety of interdisciplinary approaches to Life Science problems such as Drug Design, Motion Tracking and Immunology. The UCL Computational Life and Medical Sciences (CLMS) Network supports collaboration, communication and co-operation between the various domains that promote the development of computational life and medical sciences at UCL. The poster can be found at

dCSE Funding Awarded To Ben Calderhead and Mark Girolami

Dr Ben Calderhead, Research Fellow in CoMPLEX, and Professor Mark Girolami, Director of CSML, have been awarded dCSE funding from EPSRC and The Numerical Algorithms Group (NAG) to employ a software developer for 1 year for an ambitious plan to develop highly parallelised code implementing differential geometric MCMC statistical methodology for cluster computers in UCL, and for HECToR, the national supercomputer.

Chris Bracegirdle Runner-up for Best Student Paper at ICML

The paper Bayesian Conditional Cointegration by CSML PhD student Chris Bracegirdle and his supervisor David Barber has been selected as runner-up for the best student paper award at the International Conference on Machine Learning (ICML) 2012.

Larry Wasserman's Blog

Larry Wasserman, who recently visited CSML as the first speaker in the Master Class series, just started a blog, where he will share his Thoughts on Statistics and Machine Learning.

Two New Papers by CSML Members

Two papers by CSML members Prof. Mark Girolami and Dr. Arthur Gretton have recently been published. These are the first papers to carry the official CSML affiliation.

CSML on Facebook and Twitter

CSML now has a Facebook page and a Twitter account where news and events are announced. So like CSML on Facebook, follow @uclcsml and spread the word!

Model selection meeting 29th March

The British and Irish region of the International Biometric Society is holding the meeting "Model selection for genetic and epidemiological data" on 29 March 2012, 1:30PM-5PM, in the Manson Lecture Theatre, LSHTM.

Cost and Registration: £20 for International Biometric Society British and Irish region members, £40 for non-members, and free for student members (payment by PayPal, or by cheque on site).

For the program or to register go to:

CSML Annual Open Members Meeting, May 10th

On May 10th there will be the first annual open meeting of the membership of CSML to help define forward plans and strategy. Please mark your calendar with the following details (see event details or subscribe to the CSML events calendar):

Date: 10th May
Time: 12-2pm
Location: Wilkins Haldane Room, Wilkins Building

Lunch will be provided from 12 noon. We look forward to seeing as many of you as possible there!

IT Future of Medicine (ITFoM) to present a “virtual patient” at the FET Flagship Midterm Conference

The IT Future of Medicine (ITFoM) project, involving work by Professor Mark Girolami, is to present a “virtual patient” at the FET Flagship Midterm Conference in Warsaw. The full press release is available here.

Collaboration between ISM and CSML

The Institute of Statistical Mathematics (ISM) in Tokyo and CSML have signed an agreement to undertake academic collaboration to develop mutually beneficial, creative and productive scholarly activities in the field of statistical machine learning. Professor Shiro Ikeda from ISM visited UCL this week to officially sign the agreement, here seen with the Dean of MAPS, Richard Catlow, and Ricardo Silva and Mark Girolami, both from CSML. This is an important agreement and opens up a number of new collaborations between UCL and ISM.

One collaboration currently underway is on the application of kernel methods from machine learning to problems in statistics (hypothesis testing and Bayesian inference), undertaken by Professor Fukumizu at the Department of Statistical Modelling at ISM and Arthur Gretton at the Gatsby Unit.

New Website

Welcome to the new CSML website. We hope the new structure will make it easier to keep up to date with all the exciting things that are happening at CSML!