Journal of Applied Probability

Average optimality for continuous-time Markov decision processes under weak continuity conditions

Yi Zhang

Abstract

This paper considers average optimality for a continuous-time Markov decision process with Borel state and action spaces and an arbitrarily unbounded nonnegative cost rate. The existence of a deterministic stationary optimal policy is proved under conditions that allow the following: the controlled process may be explosive, the transition rates are weakly continuous, and the multifunction defining the admissible action spaces need be neither compact-valued nor upper semicontinuous.
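For context, the long-run average cost criterion studied in this line of work is typically formulated as follows (a standard sketch; the notation is illustrative and not taken from the paper itself):

```latex
% c(x,a) \ge 0 is the cost rate and (x_t, a_t) the controlled state-action
% process under policy \pi started from state x. The average cost of \pi is
W(x,\pi) \;=\; \limsup_{T \to \infty} \frac{1}{T}\,
  \mathbb{E}^{\pi}_{x}\!\left[ \int_{0}^{T} c(x_t, a_t)\, \mathrm{d}t \right],
\qquad
W^{*}(x) \;=\; \inf_{\pi} W(x,\pi).
```

A deterministic stationary policy $\pi^{*}$ is then called average optimal if $W(x,\pi^{*}) = W^{*}(x)$ for every state $x$; the paper's contribution is establishing that such a $\pi^{*}$ exists under the weak continuity conditions described above.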

Article information

Source
J. Appl. Probab., Volume 51, Number 4 (2014), 954-970.

Dates
First available in Project Euclid: 20 January 2015

Permanent link to this document
https://projecteuclid.org/euclid.jap/1421763321

Mathematical Reviews number (MathSciNet)
MR3301282

Zentralblatt MATH identifier
1307.90196

Subjects
Primary: 90C40: Markov and semi-Markov decision processes
Secondary: 60J25: Continuous-time Markov processes on general state spaces

Keywords
Continuous-time Markov decision process; average optimality; weak continuity

Citation

Zhang, Yi. Average optimality for continuous-time Markov decision processes under weak continuity conditions. J. Appl. Probab. 51 (2014), no. 4, 954–970. https://projecteuclid.org/euclid.jap/1421763321


References

  • Berberian, S. K. (1999). Fundamentals of Real Analysis. Springer, New York.
  • Bertsekas, D. P. and Shreve, S. E. (1978). Stochastic Optimal Control. Academic Press, New York.
  • Cavazos-Cadena, R. (1991). A counterexample on the optimality equation in Markov decision chains with the average cost criterion. Syst. Control Lett. 16, 387–392.
  • Cavazos-Cadena, R. and Salem-Silva, F. (2010). The discounted method and equivalence of average criteria for risk-sensitive Markov decision processes on Borel spaces. Appl. Math. Optimization 61, 167–190.
  • Costa, O. L. V. and Dufour, F. (2012). Average control of Markov decision processes with Feller transition probabilities and general action spaces. J. Math. Anal. Appl. 396, 58–69.
  • Feinberg, E. A. (2012). Reduction of discounted continuous-time MDPs with unbounded jump and reward rates to discrete-time total-reward MDPs. In Optimization, Control, and Applications of Stochastic Systems, Birkhäuser, New York, pp. 77–97.
  • Feinberg, E. A. and Lewis, M. E. (2007). Optimality inequalities for average cost Markov decision processes and the stochastic cash balance problem. Math. Operat. Res. 32, 769–783.
  • Feinberg, E. A., Kasyanov, P. O. and Zadoianchuk, N. V. (2012). Average cost Markov decision processes with weakly continuous transition probabilities. Math. Operat. Res. 37, 591–607.
  • Feinberg, E. A., Kasyanov, P. O. and Zadoianchuk, N. V. (2013). Berge's theorem for noncompact image sets. J. Math. Anal. Appl. 397, 255–259.
  • Feinberg, E. A., Kasyanov, P. O. and Zadoianchuk, N. V. (2013). Fatou's lemma for weakly converging probabilities. Preprint, Department of Applied Mathematics and Statistics, State University of New York at Stony Brook. Available at http://arxiv.org/abs/1206.4073v2.
  • Feinberg, E. A., Mandava, M. and Shiryaev, A. N. (2014). On solutions of Kolmogorov's equations for nonhomogeneous jump Markov processes. J. Math. Anal. Appl. 411, 261–270.
  • Gīhman, Ĭ. Ī. and Skorohod, A. V. (1975). The Theory of Stochastic Processes II. Springer, New York.
  • Guo, X. (2007). Continuous-time Markov decision processes with discounted rewards: the case of Polish spaces. Math. Operat. Res. 32, 73–87.
  • Guo, X. and Hernández-Lerma, O. (2003). Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion. IEEE Trans. Automatic Control 48, 236–245.
  • Guo, X. and Hernández-Lerma, O. (2009). Continuous-Time Markov Decision Processes: Theory and Applications. Springer, Berlin.
  • Guo, X. and Liu, K. (2001). A note on optimality conditions for continuous-time Markov decision processes with average cost criterion. IEEE Trans. Automatic Control 46, 1984–1989.
  • Guo, X. and Rieder, U. (2006). Average optimality for continuous-time Markov decision processes in Polish spaces. Ann. Appl. Prob. 16, 730–756.
  • Guo, X. and Ye, L. (2010). New discount and average optimality conditions for continuous-time Markov decision processes. Adv. Appl. Prob. 42, 953–985.
  • Guo, X. and Zhang, Y. (2013). Generalized discounted continuous-time Markov decision processes. Preprint. Available at http://arxiv.org/abs/1304.3314.
  • Guo, X., Hernández-Lerma, O. and Prieto-Rumeau, T. (2006). A survey of recent results on continuous-time Markov decision processes. Top 14, 177–261.
  • Guo, X., Huang, Y. and Song, X. (2012). Linear programming and constrained average optimality for general continuous-time Markov decision processes in history-dependent policies. SIAM J. Control Optimization 50, 23–47.
  • Hernández-Lerma, O. and Lasserre, J. B. (1996). Discrete-Time Markov Control Processes. Springer, New York.
  • Hernández-Lerma, O. and Lasserre, J. B. (2000). Fatou's lemma and Lebesgue's convergence theorem for measures. J. Appl. Math. Stoch. Anal. 13, 137–146.
  • Jaśkiewicz, A. (2009). Zero-sum ergodic semi-Markov games with weakly continuous transition probabilities. J. Optimization Theory Appl. 141, 321–347.
  • Jaśkiewicz, A. and Nowak, A. S. (2006). On the optimality equation for average cost Markov control processes with Feller transition probabilities. J. Math. Anal. Appl. 316, 495–509.
  • Jaśkiewicz, A. and Nowak, A. S. (2006). Optimality in Feller semi-Markov control processes. Operat. Res. Lett. 34, 713–718.
  • Kitaev, M. Yu. and Rykov, V. V. (1995). Controlled Queueing Systems. CRC, Boca Raton, FL.
  • Kitayev, M. Yu. (1986). Semi-Markov and jump Markov controlled models: average cost criterion. Theory Prob. Appl. 30, 272–288.
  • Kuznetsov, S. E. (1981). Any Markov process in a Borel space has a transition function. Theory Prob. Appl. 25, 384–388.
  • Piunovskiy, A. and Zhang, Y. (2011). Discounted continuous-time Markov decision processes with unbounded rates: the convex analytic approach. SIAM J. Control Optimization 49, 2032–2061.
  • Piunovskiy, A. and Zhang, Y. (2012). The transformation method for continuous-time Markov decision processes. J. Optimization Theory Appl. 154, 691–712.
  • Prieto-Rumeau, T. and Hernández-Lerma, O. (2012). Selected Topics on Continuous-time Controlled Markov Chains and Markov Games. Imperial College Press, London.
  • Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York.
  • Zhu, Q. (2008). Average optimality for continuous-time Markov decision processes with a policy iteration approach. J. Math. Anal. Appl. 339, 691–704.