Abstract
Consider a dynamic programming problem with separable metric state space $S$, constraint set $A$, and reward function $r(x, P, y)$ for $(x, P)\in A$ and $y\in S$. Let $Tf$ be the optimal reward in one move, for the reward function $r(x, P, y) + f(y)$. Three results are proved. First, suppose $S$ is compact, $A$ closed, and $r$ upper semi-continuous; then $T^n0$ is upper semi-continuous, and there is an optimal Borel strategy for the $n$-move game. Second, suppose $S$ is compact, $A$ is an $F_\sigma$, and $\{r > a\}$ is an $F_\sigma$ for all $a$; then $\{T^n0 > a\}$ is an $F_\sigma$ for all $a$, and there is an $\varepsilon$-optimal Borel strategy for the $n$-move game. Third, suppose $A$ is open and $r$ is lower semi-continuous; then $T^n0$ is lower semi-continuous, and there is an $\varepsilon$-optimal Borel measurable strategy for the $n$-move game.
Citation
David A. Freedman. "The Optimal Reward Operator in Special Classes of Dynamic Programming Problems." Ann. Probab. 2 (5) 942 - 949, October, 1974. https://doi.org/10.1214/aop/1176996559
Information