Seminar: Variational Inference for Reinforcement Learning

Speaker: CANCELLED - Matthew Fellows
Affiliation: University of Oxford
Date: Friday, 22 Mar 2019
Time: 13:00 - 14:00
Location: N/A
Event series: DeepMind CSML Seminar Series
Description:

Casting reinforcement learning (RL) as probabilistic inference makes powerful optimisation tools such as variational inference available for RL. However, existing inference frameworks and their algorithms pose significant challenges for learning optimal policies, for example the absence of mode-capturing behaviour in pseudo-likelihood methods and the difficulty of learning deterministic policies in maximum entropy RL approaches.

In this talk, the speaker will introduce a new framework, known as VIREL, whose likelihood is proportional to the exponential of a parametrised action-value function divided by the mean squared Bellman error. It will then be shown that this form of likelihood affords several useful properties, including 1) an exact reduction of entropy-regularised RL to probabilistic inference with the ability to learn deterministic policies; 2) the reconciliation of existing actor-critic and policy gradient methods, such as A3C and EPG, within a broader theoretical approach; 3) a framework for developing inference-style algorithms for RL that can also learn deterministic policies; and 4) a practical algorithm arising from the framework that adaptively balances exploration-driving entropy with the RL objective and outperforms the current state of the art.
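As a rough sketch of the likelihood form described above (illustrative only; the notation Q_ω for the parametrised action-value function and ε_ω for the mean-squared-Bellman-error term is assumed here, not quoted from the abstract), the likelihood can be written as

    p(\mathcal{O} = 1 \mid s, a) \;\propto\; \exp\!\left( \frac{Q_\omega(s, a)}{\varepsilon_\omega} \right)

Under this reading, a small ε_ω (low Bellman error) sharpens the likelihood towards the greedy, near-deterministic policy, while a large ε_ω keeps it flat, which is one way to view the adaptive balance between exploration-driving entropy and the RL objective mentioned in point 4.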
