Consider throwing $n$ balls at random into $m$ urns, each ball landing in urn $i$ with probability $p(i)$. Let $S$ be the resulting number of singletons, i.e., urns containing exactly one ball. We give an error bound for the Kolmogorov distance between the distribution of $S$ and the normal distribution, together with estimates of the variance of $S$. These show that if $n$, $m$ and $(p(i))$ vary in such a way that $n p(i)$ remains bounded uniformly in $n$ and $i$, then $S$ satisfies a CLT if and only if $n^2 \sum_i p(i)^2$ tends to infinity, and they demonstrate an optimal rate of convergence in the CLT in this case. In the uniform case, with all $p(i)$ equal and with $m$ and $n$ growing proportionately, we provide bounds with better asymptotic constants. The proof of the error bounds is based on Stein's method via size-biased couplings.
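To make the model concrete, the following is a minimal simulation sketch and not part of the paper. The function names `count_singletons`, `expected_singletons` and `clt_index` are hypothetical helpers introduced here for illustration. The mean formula $E[S] = \sum_i n p(i) (1 - p(i))^{n-1}$ is the standard linearity-of-expectation identity for this scheme, and `clt_index` simply evaluates the quantity $n^2 \sum_i p(i)^2$ that appears in the CLT criterion above.

```python
import random
from collections import Counter


def count_singletons(n, p, rng):
    """Throw n balls into len(p) urns, ball landing in urn i with
    probability p[i]; return the number of urns holding exactly one ball."""
    urns = rng.choices(range(len(p)), weights=p, k=n)
    occupancy = Counter(urns)
    return sum(1 for c in occupancy.values() if c == 1)


def expected_singletons(n, p):
    """Exact mean of S: urn i is a singleton when exactly one of the
    n balls lands in it, which has probability n * p_i * (1 - p_i)^(n-1)."""
    return sum(n * pi * (1 - pi) ** (n - 1) for pi in p)


def clt_index(n, p):
    """The quantity n^2 * sum_i p(i)^2 from the CLT criterion
    (illustrative helper; the name is an assumption)."""
    return n ** 2 * sum(pi ** 2 for pi in p)


def simulate_mean(n, p, trials, seed=0):
    """Monte Carlo estimate of E[S] from independent repetitions."""
    rng = random.Random(seed)
    total = sum(count_singletons(n, p, rng) for _ in range(trials))
    return total / trials
```

For instance, with `p = [1 / 100] * 100` and `n = 200`, the value returned by `simulate_mean(200, p, 2000)` should lie close to `expected_singletons(200, p)`. A histogram of repeated `count_singletons` draws could then be compared informally with a normal curve when `clt_index(n, p)` is large.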

Electron. J. Probab. 14 (2009), 2155-2181. DOI: 10.1214/EJP.v14-699
