Monday, June 28
Optimal Bayesian Sequential Design Using Reinforcement Learning with Policy Gradient Methods
7:25 pm - 7:30 pm


  • Xun Huan (Speaker), Assistant Professor, University of Michigan


We focus on designing a finite sequence of experiments, seeking fully optimal design policies (strategies) that can (a) adapt to newly collected data during the sequence (i.e., feedback) and (b) anticipate future changes (i.e., lookahead). We approach this sequential decision-making problem in a Bayesian setting with information-based utilities, and solve it numerically via policy gradient methods from reinforcement learning. In particular, we directly parameterize the policy and value functions by neural networks—thus adopting an actor-critic approach—and improve them using gradient estimates produced from simulated design and observation sequences. The overall method is demonstrated on an algebraic benchmark and a sensor movement application for source inversion. The results provide intuitive insights on the benefits of feedback and lookahead, and indicate computational advantages compared to previous approaches based on approximate dynamic programming.
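The actor-critic idea described in the abstract can be illustrated with a minimal sketch. This is a hypothetical toy problem (not the talk's benchmark), with a softmax "actor" over two candidate designs and a scalar running-mean "critic" serving as a baseline to reduce gradient variance; the utilities and learning rates are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: two candidate designs, where design 1 has
# higher expected utility (standing in for expected information gain).
TRUE_MEAN_UTILITY = np.array([0.2, 0.8])

def sample_utility(design):
    # Noisy utility observed after simulating the chosen experiment.
    return TRUE_MEAN_UTILITY[design] + 0.1 * rng.standard_normal()

logits = np.zeros(2)   # actor: one logit per design (softmax policy)
baseline = 0.0         # critic: estimate of expected utility under the policy
lr_actor, lr_critic = 0.1, 0.1

for episode in range(2000):
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    d = rng.choice(2, p=probs)        # sample a design from the policy
    u = sample_utility(d)             # simulated experiment outcome
    advantage = u - baseline          # baseline reduces gradient variance
    grad_log = -probs                 # gradient of log pi(d) w.r.t. logits
    grad_log[d] += 1.0
    logits += lr_actor * advantage * grad_log   # policy gradient ascent step
    baseline += lr_critic * (u - baseline)      # critic (running-mean) update

probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(probs)  # policy should concentrate on the higher-utility design
```

In the full method described in the talk, the tabular logits and scalar baseline are replaced by neural networks over design/observation histories, so the policy can exhibit both feedback and lookahead.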
