Title:
New paradigms for approximate nearest-neighbor search

dc.contributor.advisor Balcan, Maria-Florina
dc.contributor.advisor Gray, Alexander G.
dc.contributor.author Ram, Parikshit
dc.contributor.committeeMember Lebanon, Guy
dc.contributor.committeeMember Clarkson, Kenneth L.
dc.contributor.committeeMember Vempala, Santosh S.
dc.contributor.department Computational Science and Engineering
dc.date.accessioned 2013-09-20T13:30:03Z
dc.date.available 2013-09-20T13:30:03Z
dc.date.created 2013-08
dc.date.issued 2013-07-02
dc.date.submitted August 2013
dc.date.updated 2013-09-20T13:30:08Z
dc.description.abstract Nearest-neighbor search is a very natural and universal problem in computer science. Often, the problem size necessitates approximation. In this thesis, I present new paradigms for nearest-neighbor search (along with new algorithms and theory in these paradigms) that make nearest-neighbor search more usable and accurate. First, I consider a new notion of search error, the rank error, for an approximate neighbor candidate. Rank error corresponds to the number of possible candidates that are better than the approximate neighbor candidate. I motivate this notion of error and present new efficient algorithms that return approximate neighbors with rank error no more than a user-specified amount. Then I focus on approximate search in a scenario where the user does not specify the tolerable search error (error constraint); instead, the user specifies the amount of time available for search (time constraint). After differentiating between these two scenarios, I present some simple algorithms for time-constrained search with provable performance guarantees. I use this theory to motivate a new space-partitioning data structure, the max-margin tree, for improved search performance in the time-constrained setting. Finally, I consider the scenario where we do not require our objects to have an explicit fixed-length representation (vector data). This allows us to search over a large class of objects, including images, documents, graphs, strings, time series, and natural language. For nearest-neighbor search in this general setting, I present a novel, provably fast exact search algorithm. I also discuss the empirical performance of all the presented algorithms on real data.
dc.description.degree Ph.D.
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/49112
dc.language.iso en_US
dc.publisher Georgia Institute of Technology
dc.subject Similarity search
dc.subject Nearest-neighbor search
dc.subject Computational geometry
dc.subject Algorithms and analysis
dc.subject.lcsh Nearest neighbor analysis (Statistics)
dc.subject.lcsh Approximation algorithms
dc.subject.lcsh Search theory
dc.title New paradigms for approximate nearest-neighbor search
dc.type Text
dc.type.genre Dissertation
dspace.entity.type Publication
local.contributor.corporatename College of Computing
local.contributor.corporatename School of Computational Science and Engineering
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication 01ab2ef1-c6da-49c9-be98-fbd1d840d2b1
thesis.degree.level Doctoral
Files
Original bundle
Name:
RAM-DISSERTATION-2013.pdf
Size:
13.47 MB
Format:
Adobe Portable Document Format