Title:
Information retrieval via universal source coding

dc.contributor.advisor Juang, Biing-Hwang
dc.contributor.author Bae, Soo Hyun en_US
dc.contributor.committeeMember Al Regib, Ghassan
dc.contributor.committeeMember Wills, Linda
dc.contributor.committeeMember Mersereau, Russell
dc.contributor.committeeMember Pappas, Thrasyvoulos
dc.contributor.department Electrical and Computer Engineering en_US
dc.date.accessioned 2009-01-22T15:47:00Z
dc.date.available 2009-01-22T15:47:00Z
dc.date.issued 2008-11-17 en_US
dc.description.abstract This dissertation explores the intersection of information retrieval and universal source coding techniques and studies an optimal multidimensional source representation from an information-theoretic point of view. Previous research on information retrieval has focused particularly on learning probabilistic or deterministic source models based primarily on two types of source representations, namely fixed-shape partitions and uniform regions. We study the limitations of these conventional source representations in capturing the semantics of given multidimensional source sequences and propose a new type of primitive source representation generated by a universal source coding technique. We propose a multidimensional incremental parsing algorithm, extended from Lempel-Ziv incremental parsing, and its three component schemes for multidimensional source coding. The properties of the proposed coding algorithm are exploited under two-dimensional lossless and lossy source coding. With the proposed coding algorithm, a given multidimensional source sequence is parsed into a number of variable-size patches. We call this methodology a parsed representation. Based on this source representation, we propose an information retrieval framework that analyzes a set of source sequences using a linguistic processing technique, and we implement content-based image retrieval systems. We examine the relevance of the proposed source representation by comparing it with the conventional representation of visual information. To further extend the proposed framework, we apply a probabilistic linguistic processing technique to model the latent aspects of a set of documents. In addition, beyond the symbol-wise pattern matching paradigm employed in the source coding and the image retrieval systems, we devise a robust pattern matching method that compares the first- and second-order statistics of source patches.
Qualitative and quantitative analyses support the superiority of the proposed information retrieval framework based on the parsed representation. The proposed source representation technique and the information retrieval frameworks encourage future work on a systematic way of understanding multidimensional sources that parallels a linguistic structure. en_US
dc.description.degree Ph.D. en_US
dc.identifier.uri http://hdl.handle.net/1853/26573
dc.publisher Georgia Institute of Technology en_US
dc.subject Image retrieval en_US
dc.subject Image compression en_US
dc.subject Information retrieval en_US
dc.subject Machine learning en_US
dc.subject Universal source coding en_US
dc.subject Data compression en_US
dc.subject Incremental parsing en_US
dc.subject Pattern recognition en_US
dc.subject Natural language processing en_US
dc.subject.lcsh Pattern recognition systems
dc.subject.lcsh Pattern perception
dc.subject.lcsh Computer science
dc.subject.lcsh Information storage and retrieval systems
dc.subject.lcsh Multidimensional databases
dc.title Information retrieval via universal source coding en_US
dc.type Text
dc.type.genre Dissertation
dspace.entity.type Publication
local.contributor.advisor Juang, Biing-Hwang
local.contributor.corporatename School of Electrical and Computer Engineering
local.contributor.corporatename College of Engineering
relation.isAdvisorOfPublication 2818fb2c-1e00-4140-a090-68294889005d
relation.isOrgUnitOfPublication 5b7adef2-447c-4270-b9fc-846bd76f80f2
relation.isOrgUnitOfPublication 7c022d60-21d5-497c-b552-95e489a06569
Files
Original bundle
Name: bae_soohyun_200812_phd.pdf
Size: 3.83 MB
Format: Adobe Portable Document Format