Bernoulli, Volume 25, Number 1 (2019), 623-653.

Optimal rates of statistical seriation

Nicolas Flammarion, Cheng Mao, and Philippe Rigollet

Abstract

Given a matrix, the seriation problem consists in permuting its rows in such a way that all its columns have the same shape, for example, monotone increasing. We propose a statistical approach to this problem where the matrix of interest is observed with noise and study the corresponding minimax rate of estimation of the matrices. Specifically, when the columns are either unimodal or monotone, we show that the least squares estimator is optimal up to logarithmic factors and adapts to matrices with a certain natural structure. Finally, we propose a computationally efficient estimator in the monotonic case and study its performance both theoretically and experimentally. Our work is at the intersection of shape constrained estimation and recent work that involves permutation learning, such as graph denoising and ranking.
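
To make the setting concrete, the following is a minimal sketch of the monotone case. It is not the estimator proposed in the article, whose details are not given in the abstract. It is a simple heuristic assumed here for illustration: order the rows by their row sums, then fit each column with isotonic least squares (pool-adjacent-violators). The function names `pava`, `seriate_by_row_sums` and `demo`, the Gaussian noise model, and the noise level are all assumptions of this sketch.

```python
import random


def pava(y):
    """Non-decreasing least-squares fit to y via pool-adjacent-violators."""
    blocks = []  # each block stores [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean
        # (means compared by cross-multiplication; counts are positive).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def seriate_by_row_sums(Y):
    """Heuristic (assumed for illustration): sort rows by row sum, then make
    every column non-decreasing along that order with an isotonic fit.

    Returns (order, estimate): order lists the observed row indices from
    smallest to largest row sum; estimate is in the observed row order.
    """
    n, m = len(Y), len(Y[0])
    order = sorted(range(n), key=lambda i: sum(Y[i]))
    sorted_rows = [Y[i] for i in order]
    cols = [pava([row[j] for row in sorted_rows]) for j in range(m)]
    estimate = [None] * n
    for k, i in enumerate(order):
        estimate[i] = [cols[j][k] for j in range(m)]
    return order, estimate


def demo(n=30, m=10, sigma=0.1, seed=0):
    """Simulate a row-shuffled matrix with increasing columns plus Gaussian
    noise (an assumed noise model) and return the mean squared error."""
    rng = random.Random(seed)
    truth = [[(i + 1) / n * (j + 1) / m for j in range(m)] for i in range(n)]
    rng.shuffle(truth)  # unknown row permutation
    Y = [[x + rng.gauss(0.0, sigma) for x in row] for row in truth]
    _, est = seriate_by_row_sums(Y)
    return sum((est[i][j] - truth[i][j]) ** 2
               for i in range(n) for j in range(m)) / (n * m)
```

Calling `demo()` returns the average squared error of this heuristic on one simulated matrix. Sorting by row sums is used here only because it is easy to state; the rates quoted in the abstract refer to the least squares estimator and to the authors' own efficient estimator, not to this sketch.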

Article information

Source
Bernoulli, Volume 25, Number 1 (2019), 623-653.

Dates
Received: January 2017
Revised: August 2017
First available in Project Euclid: 12 December 2018

Permanent link to this document
https://projecteuclid.org/euclid.bj/1544605258

Digital Object Identifier
doi:10.3150/17-BEJ1000

Mathematical Reviews number (MathSciNet)
MR3892331

Zentralblatt MATH identifier
07007219

Keywords
adaptation; matrix estimation; minimax estimation; permutation learning; shape constraints; statistical seriation

Citation

Flammarion, Nicolas; Mao, Cheng; Rigollet, Philippe. Optimal rates of statistical seriation. Bernoulli 25 (2019), no. 1, 623--653. doi:10.3150/17-BEJ1000. https://projecteuclid.org/euclid.bj/1544605258


Supplemental materials

  • Supplement to “Optimal Rates of Statistical Seriation”. We include additional technical details in this supplement.