Optimal Open Loop Markov Decision Rules May Require Parametric Excitation

Roger Brockett

doi:cis/1290608951

Commun. Inf. Syst. 10(4): 279-292 (2010).

Abstract

We present here a general theory, and give a specific example, showing that there exist time invariant Markov decision problems, with no time variation in the model, which, when optimized over an infinite interval, have optimal open loop control laws that are time varying. Although similar behavior was observed much earlier for specific problems arising in chemical and aeronautical engineering, that earlier work is not applicable to Markov decision problems because of the specific form of the constraints involving the action of the semigroup of stochastic matrices on the standard simplex and the bilinear structure that goes along with rate control for Markov processes. The results given here are especially interesting insofar as they are analogous to the optimal solutions of stochastic control problems associated with Carnot cycles. As in some earlier work, the conditions under which time varying controls are optimal are characterized in terms of the second variation about a singular solution. In this case the second variation is expressible in terms of a kernel function, and conditions under which the second variation is positive definite can be checked by determining whether the transform of this kernel is positive real.
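
The positive-real test mentioned in the last sentence can be illustrated numerically. The following is only a minimal sketch and is not taken from the paper: it assumes, hypothetically, that the transform of the kernel is a rational function K(s) = num(s)/den(s), and the function name, frequency grid, and tolerance are illustrative choices. It checks that all poles lie strictly in the open left half-plane (a simplification, since positive real functions may also have simple poles on the imaginary axis) and that the real part of K(iω) is nonnegative at sampled frequencies. A grid check of this kind can suggest, but not prove, positive realness.

```python
import numpy as np

def is_positive_real_on_grid(num, den, omegas=None, tol=1e-12):
    """Sampled positive-real check for a rational K(s) = num(s)/den(s).

    num, den: polynomial coefficients, highest power first (hypothetical inputs).
    Returns True if every pole has negative real part and Re K(i*omega) >= -tol
    at every sampled frequency. This is a numerical sketch, not a proof.
    """
    if omegas is None:
        # Assumed grid: omega = 0 plus log-spaced frequencies up to 1e3.
        omegas = np.concatenate(([0.0], np.logspace(-3, 3, 2001)))

    # Simplifying assumption: require strict stability of K(s).
    poles = np.roots(den)
    if np.any(poles.real >= 0):
        return False

    # Evaluate K on the imaginary axis and inspect its real part.
    s = 1j * np.asarray(omegas, dtype=float)
    values = np.polyval(num, s) / np.polyval(den, s)
    return bool(np.all(values.real >= -tol))

# Kernel e^{-t} has transform 1/(s+1), with real part 1/(1+omega^2) > 0.
passes = is_positive_real_on_grid([1.0], [1.0, 1.0])

# Kernel e^{-t}(1-2t) has transform (s-1)/(s+1)^2, which equals -1 at s = 0.
fails = is_positive_real_on_grid([1.0, -1.0], [1.0, 2.0, 1.0])
```

In this sketch the first transform passes the sampled test while the second fails it at zero frequency. Both kernels are made-up examples chosen for illustration and do not come from the paper.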