Characterizing the redundancy of
universal source coding for finite-length sequences

Beirami, Ahmad

Title:

Characterizing the redundancy of universal source coding for finite-length sequences

Files

Beirami_Ahmad_201108_MS.pdf.pdf (391.67 KB)

Author(s)

Beirami, Ahmad

Advisor(s)

Fekri, Faramarz

Advisor(s)

Person

Fekri, Faramarz

Associated Organization(s)

Organizational Unit

School of Electrical and Computer Engineering

Organizational Unit

College of Engineering

Collections

Theses and Dissertations

Permanent Link

http://hdl.handle.net/1853/45750

Abstract

In this thesis, we first study what is the average redundancy resulting from the universal compression of a single finite-length sequence from an unknown source. In the universal compression of a source with d unknown parameters, Rissanen demonstrated that the expected redundancy for regular codes is asymptotically d/2 log n + o(log n) for almost all sources, where n is the sequence length. Clarke and Barron also derived the asymptotic average minimax redundancy for memoryless sources. The average minimax redundancy is concerned with the redundancy of the worst parameter vector for the best code. Thus, it does not provide much information about the effect of the different source parameter values. Our treatment in this thesis is probabilistic. In particular, we derive a lower bound on the probability measure of the event that a sequence of length n from an FSMX source chosen using Jeffreys' prior is compressed with a redundancy larger than a certain fraction of d/2 log n. Further, our results show that the average minimax redundancy provides good estimate for the average redundancy of most sources for large enough n and d. On the other hand, when the source parameter d is small the average minimax redundancy overestimates the average redundancy for small to moderate length sequences. Additionally, we precisely characterize the average minimax redundancy of universal coding when the coding scheme is restricted to be from the family of two--stage codes, where we show that the two--stage assumption incurs a negligible redundancy for small and moderate length n unless the number of source parameters is small. %We show that redundancy is significant in the compression of small sequences. Our results, collectively, help to characterize the non-negligible redundancy resulting from the compression of small and moderate length sequences. Next, we apply these results to the compression of a small to moderate length sequence provided that the context present in a sequence of length M from the same source is memorized. We quantify the achievable performance improvement in the universal compression of the small to moderate length sequence using context memorization.

Date Issued

2011-12-16

Resource Type

Text

Resource Subtype

Thesis

Full item page

Title:

Characterizing the redundancy of universal source coding for finite-length sequences

Files

Author(s)

Authors

Advisor(s)

Advisor(s)

Editor(s)

Associated Organization(s)

Series

Collections

Supplementary to

Permanent Link

Abstract

Sponsor

Date Issued

Extent

Resource Type

Resource Subtype

Rights Statement

Rights URI

Georgia Tech Library

Title: Characterizing the redundancy of universal source coding for finite-length sequences

Files

Author(s)

Authors

Advisor(s)

Advisor(s)

Editor(s)

Associated Organization(s)

Series

Collections

Supplementary to

Permanent Link

Abstract

Sponsor

Date Issued

Extent

Resource Type

Resource Subtype

Rights Statement

Rights URI

Title:

Characterizing the redundancy of universal source coding for finite-length sequences