Advances in Applied Probability

Impulsive control for continuous-time Markov decision processes

François Dufour and Alexei B. Piunovskiy

Abstract

In this paper our objective is to study continuous-time Markov decision processes on a general Borel state space with both impulsive and continuous controls for the infinite-horizon discounted cost. The continuous-time controlled process is shown to be nonexplosive under appropriate hypotheses. The so-called Bellman equation associated with this control problem is studied. Sufficient conditions ensuring the existence and uniqueness of a bounded measurable solution to this optimality equation are provided. Moreover, it is shown that the value function of the optimization problem under consideration satisfies this optimality equation. Sufficient conditions are also presented to ensure, on the one hand, the existence of an optimal control strategy and, on the other hand, the existence of an ε-optimal control strategy. A decomposition of the state space into two disjoint subsets is exhibited where, roughly speaking, one should apply a gradual action or an impulsive action, respectively, in order to obtain an optimal or ε-optimal strategy. An interesting consequence of our results is that the set of strategies that allow interventions at time t = 0 and only immediately after natural jumps is sufficient for the control problem under consideration.
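
For orientation, the optimality equation in this kind of mixed gradual/impulsive control problem typically takes the form of a quasi-variational inequality coupling a gradual-control generator term with an intervention operator. The display below is a generic sketch in illustrative notation (discount rate α, gradual action sets A(x), impulse action sets B(x), signed transition-rate kernel q, running cost rate c, intervention cost C, and post-intervention state l(x, b)); it is not necessarily the authors' exact formulation.

% Illustrative notation only: A(x), B(x), q, c, C, and l are generic
% stand-ins for the paper's precise objects, not quoted from it.
\[
\min\Big\{ \inf_{a \in A(x)} \Big[ c(x,a) - \alpha V(x) + \int_X V(y)\, q(dy \mid x, a) \Big],\;
\inf_{b \in B(x)} \big[ C(x,b) + V\big(l(x,b)\big) \big] - V(x) \Big\} = 0, \qquad x \in X.
\]

In this reading, on the subset of the state space where the second (intervention) term attains the minimum an impulsive action is prescribed, while on its complement a gradual action is used; this corresponds to the two-set decomposition described in the abstract.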

Article information

Source
Adv. in Appl. Probab., Volume 47, Number 1 (2015), 106–127.

Dates
First available in Project Euclid: 31 March 2015

Permanent link to this document
https://projecteuclid.org/euclid.aap/1427814583

Digital Object Identifier
doi:10.1239/aap/1427814583

Mathematical Reviews number (MathSciNet)
MR3327317

Zentralblatt MATH identifier
1311.90170

Subjects
Primary: 90C40: Markov and semi-Markov decision processes
Secondary: 60J25: Continuous-time Markov processes on general state spaces

Keywords
Impulsive control; continuous control; continuous-time Markov decision process; discounted cost

Citation

Dufour, François; Piunovskiy, Alexei B. Impulsive control for continuous-time Markov decision processes. Adv. in Appl. Probab. 47 (2015), no. 1, 106–127. doi:10.1239/aap/1427814583. https://projecteuclid.org/euclid.aap/1427814583

