United States
Kreisfreie Stadt Bonn, Germany
Here and in a follow-on paper, we consider a simple control problem in which the underlying dynamics depend on a parameter $a$ that is unknown and must be learned. In this paper, we assume that $a$ is bounded, i.e., that $|a| \le a_{\mathrm{MAX}}$, and we study two variants of the control problem. In the first variant, Bayesian control, we are given a prior probability distribution for $a$ and we seek a strategy that minimizes the expected value of a given cost function. Assuming that we can solve a certain PDE (the Hamilton–Jacobi–Bellman equation), we produce optimal strategies for Bayesian control. In the second variant, agnostic control, we assume nothing about $a$ and we seek a strategy that minimizes a quantity called the regret. We produce a prior probability distribution $d\mathrm{Prior}(a)$ supported on a finite subset of $[-a_{\mathrm{MAX}}, a_{\mathrm{MAX}}]$ so that the agnostic control problem reduces to the Bayesian control problem for the prior $d\mathrm{Prior}(a)$.
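As an illustration of what learning the parameter under a finitely supported prior can look like, the following minimal sketch performs one Bayes update of the weights on a finite set of candidate values of a. The abstract does not state the dynamics, so the discretized model used here (state q, control u, time step dt, noise level sigma) is purely an assumption made for the example, and the function name `posterior_update` is hypothetical. It is not the construction of the paper, only a generic instance of Bayesian updating on a finite support.

```python
import math


def posterior_update(support, weights, q, u, q_next, dt, sigma=1.0):
    """One Bayes step for a prior supported on finitely many values of a.

    ASSUMED model (not taken from the abstract):
        q_next = q + (a*q + u)*dt + sigma*sqrt(dt)*N(0, 1)
    `support` lists candidate values of a; `weights` are their prior
    probabilities (assumed to sum to 1 with at least one positive entry).
    """
    log_w = []
    for a, w in zip(support, weights):
        if w == 0.0:
            # A candidate with zero prior mass keeps zero posterior mass.
            log_w.append(-math.inf)
            continue
        # Residual between the observed step and the step predicted for this a.
        resid = q_next - q - (a * q + u) * dt
        # Gaussian log-likelihood (up to a constant shared by all candidates).
        log_w.append(math.log(w) - resid ** 2 / (2.0 * sigma ** 2 * dt))
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_w)
    unnorm = [math.exp(lw - m) for lw in log_w]
    total = sum(unnorm)
    return [x / total for x in unnorm]


# Usage: three candidates in [-1, 1] with a uniform prior; the observed step
# matches a = 1 exactly, so the posterior shifts toward that candidate.
support = [-1.0, 0.0, 1.0]
prior = [1.0 / 3.0] * 3
post = posterior_update(support, prior, q=1.0, u=0.0, q_next=1.1, dt=0.1)
```

Working in log space keeps the update stable when many observations are folded in one after another, which is why the sketch normalizes only after subtracting the largest log-weight.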