Abstract
This paper considers in parallel the scheduling problem for multiclass queueing networks and the optimization of Markov decision processes. It is shown that the value iteration algorithm may perform poorly when it is not initialized properly. The most common choice, in which the initial value function is taken to be zero, may be particularly bad. In contrast, if the value iteration algorithm is initialized with a stochastic Lyapunov function, then the following hold: (i) a stochastic Lyapunov function exists for each intermediate policy, and hence each policy is regular (a strong stability condition); (ii) intermediate costs converge to the optimal cost; and (iii) any limiting policy is average cost optimal. It is argued that a natural choice for the initial value function is the value function for the associated deterministic control problem based upon a fluid model, or the approximate solution to Poisson’s equation obtained from the LP of Kumar and Meyn. Numerical studies show that either choice may lead to fast convergence to an optimal policy.
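To make the role of the initial value function concrete, the following is a minimal sketch of average-cost value iteration, started once from zero and once from a fluid value function. It is not one of the paper's network examples. The model is assumed purely for illustration: a single uniformized queue with arrival probability `a`, service probability `m`, buffer truncated at `N`, and cost c(x) = x. All function names are hypothetical. For this one-queue fluid model, draining at rate m - a from level x costs x^2 / (2(m - a)), which serves as the Lyapunov-type initialization.

```python
import numpy as np

def build_model(N=30, a=0.3, m=0.6):
    """Assumed toy model: uniformized queue on {0..N}; action 0 = idle, 1 = serve."""
    P = np.zeros((2, N + 1, N + 1))
    for u in (0, 1):
        for x in range(N + 1):
            P[u, x, min(x + 1, N)] += a              # arrival (blocked at N)
            down = m if (u == 1 and x > 0) else 0.0  # service completion
            if down:
                P[u, x, x - 1] += down
            P[u, x, x] += 1.0 - a - down             # no event: stay put
    cost = np.arange(N + 1, dtype=float)             # c(x) = x
    return P, cost

def fluid_value(x, a=0.3, m=0.6):
    """Total cost of draining the fluid model x' = a - m from level x."""
    return np.asarray(x, dtype=float) ** 2 / (2.0 * (m - a))

def value_iteration(P, cost, V0, n_iter=2000):
    """Iterate V <- min_u [c + P_u V]; return (V, policy, gain, span)."""
    V = np.array(V0, dtype=float)
    for _ in range(n_iter):
        Q = cost[None, :] + P @ V        # Q[u, x]
        TV = Q.min(axis=0)
        policy = Q.argmin(axis=0)        # ties resolve to action 0 (idle)
        delta = TV - V
        gain = 0.5 * (delta.max() + delta.min())   # average-cost estimate
        span = delta.max() - delta.min()           # convergence measure
        V = TV - TV[0]                   # renormalize to keep values bounded
    return V, policy, gain, span

P, cost = build_model()
states = np.arange(cost.size)
for name, V0 in [("zero", np.zeros(cost.size)), ("fluid", fluid_value(states))]:
    _, policy, gain, span = value_iteration(P, cost, V0)
    print(name, "gain:", gain, "span:", span, "serves when x>0:", bool(policy[1:].all()))
```

In this toy model both starts reach the same policy, so it does not reproduce the poor behaviour the abstract reports for zero initialization in networks. It only shows where the initial value function enters the recursion and how the span of successive differences can be monitored.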
References
A. Arapostathis, V.S. Borkar, E. Fernandez-Gaucherand, M.K. Ghosh and S.I. Marcus, Discrete-time controlled Markov processes with average cost criterion: A survey, SIAM J. Control Optim. 31 (1993) 282–344.
V.S. Borkar, Topics in Controlled Markov Chains, Pitman Research Notes in Mathematics Series, Vol. 240 (Longman Sci. Tech., Harlow, UK, 1991).
V.S. Borkar and S.P. Meyn, Stability and convergence of stochastic approximation using the ODE method, to appear in SIAM J. Control Optim., and IEEE CDC (December 1998).
V.S. Borkar and S.P. Meyn, Risk sensitive optimal control: Existence and synthesis for models with unbounded cost, submitted to SIAM J. Control Optim. (1998).
R. Cavazos-Cadena, Value iteration in a class of communicating Markov decision chains with the average cost criterion, Technical Report, Universidad Autónoma Agraria Antonio Narro (1996).
R. Cavazos-Cadena and E. Fernandez-Gaucherand, Value iteration in a class of average controlled Markov chains with unbounded costs: Necessary and sufficient conditions for pointwise convergence, in: Proc. of the 34th IEEE Conf. on Decision and Control, New Orleans, LA (1995) pp. 2283–2288.
H. Chen and A. Mandelbaum, Discrete flow networks: Bottlenecks analysis and fluid approximations, Math. Oper. Res. 16 (1991) 408–446.
H. Chen and H. Zhang, Stability of multiclass queueing networks under priority service disciplines, Technical Note (1996).
J.G. Dai, On the positive Harris recurrence for multiclass queueing networks: A unified approach via fluid limit models, Ann. Appl. Probab. 5 (1995) 49–77.
J.G. Dai and S.P. Meyn, Stability and convergence of moments for multiclass queueing networks via fluid limit models, IEEE Trans. Automat. Control 40 (November 1995) 1889–1904.
P.W. Glynn and S.P. Meyn, A Lyapunov bound for solutions of Poisson's equation, Ann. Probab. 24 (April 1996).
J.M. Harrison, The BIGSTEP approach to flow management in stochastic processing networks, in: Stochastic Networks Theory and Applications, eds. F.P. Kelly, S. Zachary and I. Ziedins (Clarendon Press, Oxford, 1996) pp. 57–89.
J.M. Harrison and L.M. Wein, Scheduling networks of queues: Heavy traffic analysis of a simple open network, Queueing Systems 5 (1989) 265–280.
O. Hernández-Lerma and J.B. Lasserre, Discrete Time Markov Control Processes I (Springer, New York, 1996).
A. Hordijk, Dynamic Programming and Markov Potential Theory (1977).
J. Humphrey, D. Eng and S.P. Meyn, Fluid network models: Linear programs for control and performance bounds, in: Proc. of the 13th IFAC World Congress, Vol. B, eds. J. Cruz, J. Gertler and M. Peshkin, San Francisco, CA (1996) pp. 19–24.
S. Kumar and P.R. Kumar, Performance bounds for queueing networks and scheduling policies, IEEE Trans. Automat. Control 39 (August 1994) 1600–1611.
P.R. Kumar and S.P. Meyn, Stability of queueing networks and scheduling policies, IEEE Trans. Automat. Control 40(2) (1995) 251–260.
P.R. Kumar and S.P. Meyn, Duality and linear programs for stability and performance analysis of queueing networks and scheduling policies, IEEE Trans. Automat. Control 41(1) (1996) 4–17.
P.R. Kumar and T.I. Seidman, Dynamic instabilities and stabilization methods in distributed real-time scheduling of manufacturing systems, IEEE Trans. Automat. Control 35(3) (1990) 289–298.
L.F. Martins, S.E. Shreve and H.M. Soner, Heavy traffic convergence of a controlled, multiclass queueing system, SIAM J. Control Optim. 34(6) (1996) 2133–2171.
S.P. Meyn, The policy improvement algorithm: General theory with applications to queueing networks and their fluid models, in: 35th IEEE Conf. on Decision and Control, Kobe, Japan (December 1996).
S.P. Meyn, The policy improvement algorithm for Markov decision processes with general state space, IEEE Trans. Automat. Control 42 (1997) 191–196.
S.P. Meyn, Stability and optimization of multiclass queueing networks and their fluid models, Lectures in Applied Mathematics, Vol. 33 (Amer. Math. Soc., Providence, RI, 1997) pp. 175–199.
S.P. Meyn and R.L. Tweedie, Markov Chains and Stochastic Stability (Springer, London, 1993).
I.C. Paschalidis, D. Bertsimas and J.N. Tsitsiklis, Scheduling of multiclass queueing networks: Bounds on achievable performance, in: Workshop on Hierarchical Control for Real-Time Scheduling of Manufacturing Systems, Lincoln, NH (October 16–18, 1992).
M.L. Puterman, Markov Decision Processes (Wiley, New York, 1994).
A.N. Rybko and A.L. Stolyar, On the ergodicity of stochastic processes describing the operation of open queueing networks, Problemy Peredachi Informatsii 28 (1992) 3–26.
L.I. Sennott, A new condition for the existence of optimal stationary policies in average cost Markov decision processes, Oper. Res. Lett. 5 (1986) 17–23.
L.I. Sennott, The convergence of value iteration in average cost Markov decision chains, Oper. Res. Lett. 19 (1996) 11–16.
Cite this article
Chen, RR., Meyn, S. Value iteration and optimization of multiclass queueing networks. Queueing Systems 32, 65–97 (1999). https://doi.org/10.1023/A:1019182903300
DOI: https://doi.org/10.1023/A:1019182903300