Random search for constrained Markov decision processes with multi-policy improvement

    1. [1] Sogang University

      Sogang University

      South Korea

  • Source: Automatica: A journal of IFAC, the International Federation of Automatic Control, ISSN 0005-1098, Vol. 58, 2015, pp. 127-130
  • Language: English
  • Abstract
    • This communique first presents a novel multi-policy improvement method which, in finite constrained Markov decision processes (CMDPs), generates a feasible policy at least as good as any policy in a given set of feasible policies. A random search algorithm for finding an optimal feasible policy for a given CMDP is then derived by properly adapting the improvement method. The algorithm alleviates the major drawback of the existing value-iteration and policy-iteration type exact algorithms, which must solve unconstrained MDPs during their iterations. We establish that the sequence of feasible policies generated by the algorithm converges to an optimal feasible policy with probability one and has a probabilistic exponential convergence rate.
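
To make the setting in the abstract concrete, the following is a minimal sketch of a naive random search over deterministic stationary policies in a tiny finite CMDP. It is not the algorithm of the paper: it omits the multi-policy improvement step entirely, and the formulation it uses (infinite-horizon discounted reward, a single discounted-cost budget checked at the start state, deterministic policies only) is an assumption made for illustration. All names (`evaluate`, `random_search`, `P`, `reward`, `cost`, `budget`) and the two-state example data are hypothetical.

```python
import random


def evaluate(policy, P, f, gamma, tol=1e-10):
    """Iterative policy evaluation: discounted sum of f under a fixed policy.

    P[s][a][t] is the probability of moving from state s to t under action a,
    f[s][a] is the per-step reward (or cost). Returns one value per state.
    """
    n = len(P)
    v = [0.0] * n
    while True:
        new_v = [
            f[s][policy[s]]
            + gamma * sum(P[s][policy[s]][t] * v[t] for t in range(n))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def random_search(P, reward, cost, gamma, budget, start, iters, seed=0):
    """Sample deterministic policies uniformly; keep the best feasible one.

    Assumption for this sketch: a policy is feasible when its discounted
    cost from the start state does not exceed the budget.
    """
    rng = random.Random(seed)
    n = len(P)
    best_policy, best_value = None, float("-inf")
    for _ in range(iters):
        policy = [rng.randrange(len(P[s])) for s in range(n)]
        if evaluate(policy, P, cost, gamma)[start] > budget:
            continue  # infeasible: violates the cost budget
        value = evaluate(policy, P, reward, gamma)[start]
        if value > best_value:
            best_policy, best_value = policy, value
    return best_policy, best_value


# Hypothetical two-state, two-action CMDP with deterministic transitions.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to 1
    [[1.0, 0.0], [0.0, 1.0]],  # state 1: action 0 moves to 0, action 1 stays
]
reward = [[1.0, 2.0], [0.0, 3.0]]
cost = [[0.0, 1.0], [0.0, 2.0]]

policy, value = random_search(
    P, reward, cost, gamma=0.5, budget=1.5, start=0, iters=200
)
print(policy, value)
```

In this toy instance the policy with the highest unconstrained reward exceeds the cost budget, so the search has to settle for the best policy among the feasible ones. Uniform sampling of this kind carries no convergence-rate guarantee comparable to the one stated in the abstract.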

