Institute of Mathematical Statistics Collections

Using statistical smoothing to date medieval manuscripts

Andrey Feuerverger, Peter Hall, Gelila Tilahun, and Michael Gervers

Full-text: Open access

Abstract

We discuss the use of multivariate kernel smoothing methods to date manuscripts from the English county of Essex that were written between the 11th and 15th centuries. The dataset consists of some 3300 dated and 5000 undated manuscripts, and the former are used as a training sample for imputing dates for the latter. It is assumed that two manuscripts that are “close”, in a sense that may be defined by a vector of distance measures between documents, will have close dates. Under this assumption, “similarity” is assessed statistically by smoothing among the distance measures, and dates for the 5000 undated manuscripts are thus estimated by reference to the dated ones.
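The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Nadaraya-Watson style weighted average, a product of Gaussian kernels, and a shingle-based resemblance distance (one minus the Jaccard overlap of word shingles) computed at two shingle sizes to form the distance vector. The function names, the bandwidth values, and the toy texts are all hypothetical.

```python
import math


def shingles(text, k):
    """Return the set of k-word shingles (runs of k consecutive words)."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance_distance(a, b, k):
    """One minus the Jaccard resemblance of the k-shingle sets of a and b."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 0.0  # two texts too short to shingle are treated as identical
    return 1.0 - len(sa & sb) / len(sa | sb)


def estimate_date(undated, dated, ks=(1, 2), bandwidths=(0.5, 0.5)):
    """Kernel-weighted average of the known dates.

    dated      -- list of (text, year) pairs, the training sample
    ks         -- shingle sizes; each gives one component of the distance vector
    bandwidths -- one smoothing bandwidth per component (assumed values)
    """
    if not dated:
        raise ValueError("need at least one dated document")
    num = den = 0.0
    for text, year in dated:
        weight = 1.0
        for k, h in zip(ks, bandwidths):
            d = resemblance_distance(undated, text, k)
            weight *= math.exp(-0.5 * (d / h) ** 2)  # Gaussian kernel
        num += weight * year
        den += weight
    return num / den


# Invented toy data, for illustration only.
training = [
    ("sciant presentes et futuri quod ego", 1220),
    ("noverint universi per presentes me", 1350),
]
estimate = estimate_date("sciant presentes et futuri quod ego", training)
```

Dated documents that lie close to the undated one in every distance component receive weights near one, while distant documents are down-weighted smoothly, so the estimate is pulled toward the dates of the most similar documents. In practice the bandwidths would need to be chosen from the data, for example by cross-validation on the dated sample.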

Chapter information

Source
N. Balakrishnan, Edsel A. Peña and Mervyn J. Silvapulle, eds., Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen (Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2008), 321-331

Dates
First available in Project Euclid: 1 April 2008

Permanent link to this document
https://projecteuclid.org/euclid.imsc/1207058283

Digital Object Identifier
doi:10.1214/193940307000000248

Mathematical Reviews number (MathSciNet)
MR2462216

Subjects
Primary: 62G99: None of the above, but in this section; 62P99: None of the above, but in this section
Secondary: 62-07: Data analysis; 62H20: Measures of association (correlation, canonical correlation, etc.)

Keywords
bandwidth, calendaring, dating, deeds, document, kernel, resemblance distance, shingle

Rights
Copyright © 2008, Institute of Mathematical Statistics

Citation

Feuerverger, Andrey; Hall, Peter; Tilahun, Gelila; Gervers, Michael. Using statistical smoothing to date medieval manuscripts. Beyond Parametrics in Interdisciplinary Research: Festschrift in Honor of Professor Pranab K. Sen, 321–331, Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2008. doi:10.1214/193940307000000248. https://projecteuclid.org/euclid.imsc/1207058283


