Exploiting spatial and temporal redundancies for vector quantization of speech and images

Meh Chu, Chu
Anderson, David V.
The objective of the proposed research is to compress data such as speech, audio, and images using a new re-ordering vector quantization approach that exploits the transition probability between consecutive code vectors in a signal. Vector quantization is the process of encoding blocks of samples from a data sequence by replacing every input vector with a reproduction vector drawn from a dictionary of reproduction vectors. Shannon’s rate-distortion theory states that signals encoded as blocks of samples have better rate-distortion performance than signals encoded on a sample-by-sample basis. As such, vector quantization can achieve a lower coding rate for a given distortion than scalar quantization. Standard vector quantization, however, does not take advantage of the inter-vector correlation between successive input vectors in a data sequence, even though real signals have been shown to exhibit significant inter-vector correlation. This correlation has led to vector quantization approaches that encode input vectors based on previously encoded vectors. Predictive vector quantization, dynamic codebook re-ordering, and finite-state vector quantization are examples of schemes proposed in the literature that use inter-vector correlation. Predictive vector quantization and finite-state vector quantization predict the reproduction vector for a given input vector by using past input vectors. Dynamic codebook re-ordering vector quantization has the same reproduction vectors as standard vector quantization. The dynamic codebook re-ordering algorithm is based on the concept of re-ordering indices, whereby existing reproduction vectors are assigned new channel indices according to a structure that orders the reproduction vectors by increasing dissimilarity.
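The two ideas described above, nearest-neighbor vector quantization and distance-based re-ordering of channel indices, can be sketched as follows. This is a minimal illustration and not the implementation used in this work: the squared Euclidean distance, the toy codebook, the `start_index` convention for the first vector, and all function names are assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance (an assumed dissimilarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(vectors, codebook):
    """Standard VQ: map each input vector to the index of its nearest code vector."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(v, codebook[i]))
        for v in vectors
    ]


def reorder_by_distance(codebook, prev_index):
    """Codebook indices sorted by increasing distance from the previous
    reproduction vector; ties are broken by the original index."""
    prev = codebook[prev_index]
    return sorted(
        range(len(codebook)),
        key=lambda i: (squared_distance(codebook[i], prev), i),
    )


def dcr_encode(indices, codebook, start_index=0):
    """Replace each VQ index by its rank in the distance ordering around
    the previously transmitted reproduction vector."""
    prev, channel = start_index, []
    for idx in indices:
        channel.append(reorder_by_distance(codebook, prev).index(idx))
        prev = idx
    return channel


def dcr_decode(channel, codebook, start_index=0):
    """Invert dcr_encode: the decoder rebuilds the same ordering from the
    vector it decoded last, so no side information is needed."""
    prev, indices = start_index, []
    for rank in channel:
        idx = reorder_by_distance(codebook, prev)[rank]
        indices.append(idx)
        prev = idx
    return indices


# Toy data (hypothetical): a 4-entry codebook of 2-dimensional vectors.
codebook = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (9.0, 9.0)]
vectors = [(0.1, 0.2), (0.9, 1.2), (1.1, 0.8), (4.2, 3.9), (8.7, 9.1)]

vq_indices = vq_encode(vectors, codebook)
channel_indices = dcr_encode(vq_indices, codebook)
```

Because the re-ordering depends only on previously decoded vectors, the reproduction vectors and the distortion are unchanged; only the labels sent over the channel differ.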
In dynamic codebook re-ordering, an input vector encoded by the standard vector quantization method is transmitted through the channel with a new index, such that 0 is assigned to the reproduction vector closest to the previous reproduction vector. Larger index values are assigned to reproduction vectors that lie farther from the previous reproduction vector. Dynamic codebook re-ordering assumes that the reproduction vectors of two successive vectors of a real signal are typically close to each other according to a distance metric. Sometimes, however, two successively encoded vectors may be relatively far from each other. Our likelihood codebook re-ordering vector quantization algorithm exploits the structure within a signal through the non-uniformity of the reproduction vector transition probabilities in a data sequence. Code vectors that are more likely to follow a given reproduction vector are assigned indices closer to 0, while less likely ones are assigned larger indices. This re-ordering gives the reproduction dictionary a structure suitable for entropy coding such as Huffman and arithmetic coding. Since such non-uniform transition probabilities are common in real signals, it is expected that our proposed algorithm, when combined with entropy coding algorithms such as binary arithmetic and Huffman coding, will result in lower bit rates for the same distortion than a standard vector quantization algorithm. The re-ordering approach on quantized indices can be useful in speech, image, and audio transmission. By applying our re-ordering approach to these data types, we expect to achieve lower coding rates for a given distortion or perceptual quality. This reduced coding rate makes our proposed algorithm useful for the transmission and storage of larger image and speech streams over their respective communication channels.
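The likelihood-based re-ordering can be illustrated with a small sketch. It assumes that transition probabilities are estimated by counting consecutive index pairs in a training sequence, which the abstract does not specify; the function names, the tie-breaking rule, the `start_index` convention, and the toy sequence are likewise hypothetical. The empirical first-order entropy is used only as a rough proxy for the rate an entropy coder might approach.

```python
import math
from collections import Counter


def transition_counts(training_indices, codebook_size):
    """Count how often code index j follows code index i in the training data."""
    counts = [[0] * codebook_size for _ in range(codebook_size)]
    for prev, cur in zip(training_indices, training_indices[1:]):
        counts[prev][cur] += 1
    return counts


def likelihood_orders(counts):
    """For each previous index, list the code indices from most to least
    frequent successor; ties are broken by the original index."""
    return [sorted(range(len(row)), key=lambda j: (-row[j], j)) for row in counts]


def reorder_encode(indices, orders, start_index=0):
    """Send the rank of each index in the ordering selected by the previous index."""
    prev, channel = start_index, []
    for idx in indices:
        channel.append(orders[prev].index(idx))
        prev = idx
    return channel


def reorder_decode(channel, orders, start_index=0):
    """Invert reorder_encode using the same ordering tables."""
    prev, indices = start_index, []
    for rank in channel:
        idx = orders[prev][rank]
        indices.append(idx)
        prev = idx
    return indices


def empirical_entropy(symbols):
    """First-order entropy of a symbol sequence, in bits per symbol."""
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in Counter(symbols).values())


# Toy index sequence (hypothetical) with strongly non-uniform transitions.
vq_indices = [0, 2, 0, 2, 1, 3, 1, 3, 0, 2, 0, 2, 1, 3, 1, 3]
orders = likelihood_orders(transition_counts(vq_indices, codebook_size=4))
channel_indices = reorder_encode(vq_indices, orders)

entropy_before = empirical_entropy(vq_indices)
entropy_after = empirical_entropy(channel_indices)
```

In this toy case the original indices are uniformly distributed, while the re-ordered indices concentrate on 0 and 1, so their empirical entropy is lower. The sequence is encoded with tables trained on itself, which is optimistic; with separate training and test data the gain would depend on how well the estimated transitions generalize.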
The use of truncation in the likelihood codebook re-ordering scheme results in much lower coding rates without significantly degrading the perceptual quality of the signals. Text and other multimedia signals may also benefit from this additional layer of likelihood re-ordering compression.