Open Access
January, 1975
A Modified Form of the Iterative Method of Dynamic Programming
Arie Hordijk, Henk Tijms
Ann. Statist. 3(1): 203-208 (January, 1975). DOI: 10.1214/aos/1176343008

Abstract

This paper considers the discrete-time, finite-state Markovian decision problem with the average return criterion. A modified form of the iterative method of dynamic programming is studied. Under the assumption that the maximal average return is independent of the initial state, the asymptotic behaviour of the sequence of functions generated by this modified method is found. It is shown that the modified iterative method supplies both upper and lower bounds on the maximal average return, as well as $\varepsilon$-optimal policies. Moreover, a convergence result is proved for the policies produced by the modified iterative method.
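
The abstract does not spell out the modification itself, so the following is only an illustrative sketch of the general idea it refers to: an iterative dynamic programming scheme that yields lower and upper bounds on the maximal average return together with a policy. It uses plain (unmodified) value iteration, where the smallest and largest componentwise increments of successive iterates bracket the maximal average return. The function name `value_iteration_with_bounds`, the stopping tolerance `eps`, and the two-state example data are assumptions made for illustration and do not come from the paper.

```python
def value_iteration_with_bounds(rewards, trans, eps=1e-8, max_iter=10_000):
    """Plain value iteration for the average return criterion.

    rewards[s][a]  : one-step return for action a in state s
    trans[s][a][t] : probability of moving from s to t under action a
    Returns (lower, upper, policy), where lower <= maximal average
    return <= upper and policy is greedy for the last iterate.
    NOTE: this is the standard scheme, not the paper's modified form.
    """
    n = len(rewards)
    v = [0.0] * n
    lower, upper, policy = float("-inf"), float("inf"), [0] * n
    for _ in range(max_iter):
        new_v, policy = [], []
        for s in range(n):
            # One-step lookahead value of each action in state s.
            q = [
                rewards[s][a] + sum(p * v[t] for t, p in enumerate(trans[s][a]))
                for a in range(len(rewards[s]))
            ]
            best = max(range(len(q)), key=q.__getitem__)
            policy.append(best)
            new_v.append(q[best])
        # Increments of successive iterates bracket the maximal average return.
        diffs = [nv - ov for nv, ov in zip(new_v, v)]
        lower, upper = min(diffs), max(diffs)
        # Subtract a constant to keep the iterates bounded; this leaves
        # the increments of the next iteration unchanged.
        v = [x - new_v[-1] for x in new_v]
        if upper - lower < eps:
            break
    return lower, upper, policy


# Hypothetical two-state, two-action example (illustrative data only).
REWARDS = [[1.0, 2.0], [0.0, 0.5]]
TRANS = [
    [[0.5, 0.5], [0.9, 0.1]],  # state 0: actions 0 and 1
    [[0.5, 0.5], [0.2, 0.8]],  # state 1: actions 0 and 1
]

lower, upper, policy = value_iteration_with_bounds(REWARDS, TRANS)
print(lower, upper, policy)
```

For this toy example, enumerating the four stationary policies by hand gives a best average return of 5/3, attained by action 1 in state 0 and action 0 in state 1, so the printed bounds should enclose that value.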

Citation

Arie Hordijk, Henk Tijms. "A Modified Form of the Iterative Method of Dynamic Programming." Ann. Statist. 3(1): 203-208, January, 1975. https://doi.org/10.1214/aos/1176343008

Information

Published: January, 1975
First available in Project Euclid: 12 April 2007

zbMATH: 0304.90115
MathSciNet: MR378837
Digital Object Identifier: 10.1214/aos/1176343008

Subjects:
Primary: 90C40

Keywords: average return, convergence results, dynamic programming, Markov decision theory, modified iterative method

Rights: Copyright © 1975 Institute of Mathematical Statistics

Vol. 3 • No. 1 • January, 1975