Title:
Comparison of sequences generated by a hidden Markov model

Author(s)
Kerchev, George Georgiev
Advisor(s)
Houdré, Christian
Abstract
The length $LC_n$ of the longest common subsequence of two strings $X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_n)$ is a way to measure the similarity between $X$ and $Y$. We study the asymptotic behavior of $LC_n$ when the two strings are generated by a hidden Markov model $(Z, (X, Y))$. The latent chain $Z$ is an aperiodic, time-homogeneous, irreducible finite-state Markov chain, and for every $i \geq 1$ the pair $(X_i, Y_i)$ is generated according to a distribution that depends on the state of $Z_i$. The letters $X_i$ and $Y_i$ each take values in a finite alphabet $\mathcal{A}$. The goal of this work is to build upon asymptotic results for $LC_n$ obtained for sequences of iid random variables. Under some standard assumptions on the model, we first prove convergence results with rates for $\mathbb{E}[LC_n]$. Then, versions of concentration inequalities for the transversal fluctuations of $LC_n$ are obtained. Finally, we outline a proof of a central limit theorem by building upon previous work and adapting a Stein's method estimate.
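The setup described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the dissertation: the function names, the uniform initial state, and the two-state toy parameters `TOY_TRANSITION` and `TOY_EMISSION` are all assumptions chosen for illustration. The sketch samples a pair $(X, Y)$ from a hidden Markov model of the kind described and computes $LC_n$ with the classic dynamic program.

```python
import random


def lcs_length(x, y):
    """Length of a longest common subsequence of x and y.

    Classic dynamic program, O(len(x) * len(y)) time, O(len(y)) memory.
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sample_hmm_pair(n, transition, emission, rng):
    """Sample (X, Y) of length n from a hidden Markov model.

    transition[s] lists the probabilities of moving from state s to each state.
    emission[s] maps letter pairs (x, y) to their probability in state s.
    Assumption: the initial state is drawn uniformly at random.
    """
    states = list(range(len(transition)))
    z = rng.choice(states)
    xs, ys = [], []
    for _ in range(n):
        pairs = list(emission[z].keys())
        weights = list(emission[z].values())
        x, y = rng.choices(pairs, weights=weights, k=1)[0]
        xs.append(x)
        ys.append(y)
        # Move the latent chain one step.
        z = rng.choices(states, weights=transition[z], k=1)[0]
    return xs, ys


# Hypothetical toy model: two latent states, alphabet {"a", "b"}.
# All transition probabilities are positive, so the chain is
# irreducible and aperiodic.
TOY_TRANSITION = [[0.9, 0.1], [0.2, 0.8]]
TOY_EMISSION = [
    {("a", "a"): 0.5, ("a", "b"): 0.25, ("b", "a"): 0.25},
    {("b", "b"): 0.5, ("a", "b"): 0.25, ("b", "a"): 0.25},
]
```

Averaging `lcs_length(*sample_hmm_pair(n, TOY_TRANSITION, TOY_EMISSION, rng)) / n` over many samples for increasing `n` gives a rough empirical view of how $\mathbb{E}[LC_n]/n$ behaves in this toy model.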
Date Issued
2019-03-26
Resource Type
Text
Resource Subtype
Dissertation