Open Access
October, 1994
Critical Phenomena for Sequence Matching with Scoring
Amir Dembo, Samuel Karlin, Ofer Zeitouni
Ann. Probab. 22(4): 1993-2021 (October, 1994). DOI: 10.1214/aop/1176988492

Abstract

Consider two independent sequences $X_1,\ldots, X_n$ and $Y_1,\ldots, Y_n$. Suppose that $X_1,\ldots, X_n$ are i.i.d. $\mu_X$ and $Y_1,\ldots, Y_n$ are i.i.d. $\mu_Y$, where $\mu_X$ and $\mu_Y$ are distributions on finite alphabets $\Sigma_X$ and $\Sigma_Y$, respectively. A score $F: \Sigma_X \times \Sigma_Y \rightarrow \mathbb{R}$ is assigned to each pair $(X_i, Y_j)$ and the maximal nonaligned segment score is $M_n = \max_{0\leq i, j \leq n - \Delta,\ \Delta \geq 0}\{\sum_{l=1}^{\Delta}F(X_{i+l}, Y_{j+l})\}$. Our result is that $M_n/\log n \rightarrow \gamma^\ast(\mu_X, \mu_Y)$ a.s. with $\gamma^\ast$ determined by a tractable variational formula. Moreover, the pair empirical measure of $(X_{i+l}, Y_{j+l})$ during the segment where $M_n$ is achieved converges to a probability measure $\nu^\ast$, which is accessible by the same formula. These results generalize to $X_i, Y_j$ taking values in any Polish space, to intrasequence scores under shifts, to long quality segments and to more than two sequences.
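
To make the definition of $M_n$ concrete, the following is a minimal sketch of how the maximal nonaligned segment score could be computed for two finite sequences. It is not taken from the paper: it uses the standard diagonal recursion $S_{i,j} = \max\{0,\ S_{i-1,j-1} + F(X_i, Y_j)\}$, whose maximum over all $(i, j)$ equals $M_n$ because the empty segment ($\Delta = 0$) contributes score 0. The names `max_segment_score` and `match_score` are hypothetical, and the $\pm 1$ match/mismatch score is only an illustrative choice of $F$.

```python
def max_segment_score(x, y, score):
    """Return the maximal score of a gap-free segment pairing x[i+1..i+d] with y[j+1..j+d].

    Runs in O(len(x) * len(y)) time. The empty segment (d = 0) is allowed,
    so the result is never negative.
    """
    best = 0.0
    # prev[j] holds the best score of a segment ending at (previous x letter, y[j-1]).
    prev = [0.0] * (len(y) + 1)
    for xi in x:
        curr = [0.0] * (len(y) + 1)
        for j, yj in enumerate(y, start=1):
            # Either extend the segment along the diagonal or restart from the empty segment.
            curr[j] = max(0.0, prev[j - 1] + score(xi, yj))
            best = max(best, curr[j])
        prev = curr
    return best


def match_score(a, b):
    """Hypothetical score F: +1 for a match, -1 for a mismatch (an assumption of this sketch)."""
    return 1.0 if a == b else -1.0
```

For example, `max_segment_score("abcde", "xxbcd", match_score)` picks up the shifted common segment `bcd`. To explore the limit statement numerically, one could draw two independent random sequences of growing length $n$ and track `max_segment_score(x, y, match_score) / math.log(n)`; such an experiment only illustrates the a.s. convergence and does not compute $\gamma^\ast$ from the variational formula.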

Citation

Amir Dembo, Samuel Karlin, Ofer Zeitouni. "Critical Phenomena for Sequence Matching with Scoring." Ann. Probab. 22 (4) 1993-2021, October, 1994. https://doi.org/10.1214/aop/1176988492

Information

Published: October, 1994
First available in Project Euclid: 19 April 2007

zbMATH: 0834.60031
MathSciNet: MR1331213
Digital Object Identifier: 10.1214/aop/1176988492

Subjects:
Primary: 60F10
Secondary: 60F15

Keywords: large deviations, large segmental sums, sequence matching, strong laws

Rights: Copyright © 1994 Institute of Mathematical Statistics

Vol.22 • No. 4 • October, 1994