The Annals of Applied Probability

Scheduling a multi class queue with many exponential servers: asymptotic optimality in heavy traffic

Rami Atar, Avi Mandelbaum, and Martin I. Reiman



We consider the problem of scheduling a queueing system in which many statistically identical servers cater to several classes of impatient customers. Service times and impatience clocks are exponential while arrival processes are renewal. Our cost is an expected cumulative discounted function, linear or nonlinear, of appropriately normalized performance measures. As a special case, the cost per unit time can be a function of the number of customers waiting to be served in each class, the number actually being served, the abandonment rate, the delay experienced by customers, the number of idling servers, as well as certain combinations thereof. We study the system in an asymptotic heavy-traffic regime where the number of servers $n$ and the offered load $\mathbf{r}$ are simultaneously scaled up and carefully balanced: $n\approx \mathbf{r}+\beta \sqrt{\mathbf{r}}$ for some scalar $\beta$. This yields an operation that enjoys the benefits of both heavy traffic (high server utilization) and light traffic (high service levels).
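The balance $n\approx \mathbf{r}+\beta \sqrt{\mathbf{r}}$ can be illustrated numerically. The following is a minimal Python sketch, not the model analyzed in the paper: it assumes a single customer class, Poisson arrivals, and no abandonment (the classical M/M/n queue), so that the probability of waiting is given by the Erlang C formula. The function names `square_root_staffing` and `erlang_c` are hypothetical and chosen only for this illustration.

```python
import math


def square_root_staffing(offered_load: float, beta: float) -> int:
    """Smallest integer n satisfying n >= r + beta * sqrt(r)."""
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


def erlang_c(n: int, offered_load: float) -> float:
    """Probability that an arrival must wait in an M/M/n queue.

    Assumes offered_load < n (a stable system without abandonment).
    """
    if offered_load >= n:
        raise ValueError("Erlang C requires offered_load < n")
    # Erlang B blocking probability via the standard stable recursion.
    blocking = 1.0
    for k in range(1, n + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    utilization = offered_load / n
    # Convert Erlang B to Erlang C.
    return blocking / (1.0 - utilization * (1.0 - blocking))


if __name__ == "__main__":
    beta = 1.0  # illustrative value only
    for r in (100.0, 400.0, 1600.0):
        n = square_root_staffing(r, beta)
        print(f"r={r:7.0f}  n={n:5d}  utilization={r / n:.4f}  "
              f"P(wait)={erlang_c(n, r):.4f}")
```

Under these simplifying assumptions, running the script with a fixed $\beta$ and growing $\mathbf{r}$ shows server utilization approaching one while the probability of waiting stays away from both zero and one, which is the sense in which the regime combines heavy-traffic and light-traffic behavior.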

We first consider a formal weak limit, through which our queueing scheduling problem gives rise to a diffusion control problem. We show that the latter has an optimal Markov control policy, and that the corresponding Hamilton–Jacobi–Bellman (HJB) equation has a unique classical solution. The Markov control policy and the HJB equation are then used to define scheduling control policies which we prove are asymptotically optimal for our original queueing system. The analysis yields both qualitative and quantitative insights, in particular on staffing levels, the roles of non-preemption and work conservation, and the trade-off between service quality and servers’ efficiency.

Article information

Ann. Appl. Probab., Volume 14, Number 3 (2004), 1084-1134.

First available in Project Euclid: 13 July 2004


Primary: 60K25: Queueing theory [See also 68M20, 90B22]; 68M20: Performance evaluation; queueing; scheduling [See also 60K25, 90Bxx]; 90B22: Queues and service [See also 60K25, 68M20]; 90B36: Scheduling theory, stochastic [See also 68M20]; 49L20: Dynamic programming method

Keywords: Multiclass queues; multiserver queues; queues with abandonment; heavy traffic; Halfin–Whitt (QED) regime; call centers; dynamic control; diffusion approximation; optimal control of diffusion; HJB equation; asymptotic optimality


Atar, Rami; Mandelbaum, Avi; Reiman, Martin I. Scheduling a multi class queue with many exponential servers: asymptotic optimality in heavy traffic. Ann. Appl. Probab. 14 (2004), no. 3, 1084--1134. doi:10.1214/105051604000000233.


