Title:
New paradigms for approximate nearest-neighbor search
dc.contributor.advisor | Balcan, Maria-Florina | |
dc.contributor.advisor | Gray, Alexander G. | |
dc.contributor.author | Ram, Parikshit | |
dc.contributor.committeeMember | Lebanon, Guy | |
dc.contributor.committeeMember | Clarkson, Kenneth L. | |
dc.contributor.committeeMember | Vempala, Santosh S. | |
dc.contributor.department | Computational Science and Engineering | |
dc.date.accessioned | 2013-09-20T13:30:03Z | |
dc.date.available | 2013-09-20T13:30:03Z | |
dc.date.created | 2013-08 | |
dc.date.issued | 2013-07-02 | |
dc.date.submitted | August 2013 | |
dc.date.updated | 2013-09-20T13:30:08Z | |
dc.description.abstract | Nearest-neighbor search is a natural and universal problem in computer science. Often, the problem size necessitates approximation. In this thesis, I present new paradigms for nearest-neighbor search (along with new algorithms and theory in these paradigms) that make nearest-neighbor search more usable and accurate. First, I consider a new notion of search error, the rank error, for an approximate neighbor candidate. Rank error corresponds to the number of possible candidates that are better than the approximate neighbor candidate. I motivate this notion of error and present new efficient algorithms that return approximate neighbors with rank error no more than a user-specified amount. Then I focus on approximate search in a scenario where the user does not specify the tolerable search error (error constraint); instead, the user specifies the amount of time available for search (time constraint). After differentiating between these two scenarios, I present some simple algorithms for time-constrained search with provable performance guarantees. I use this theory to motivate a new space-partitioning data structure, the max-margin tree, for improved search performance in the time-constrained setting. Finally, I consider the scenario where we do not require our objects to have an explicit fixed-length representation (vector data). This allows us to search over a large class of objects that includes images, documents, graphs, strings, time series and natural language. For nearest-neighbor search in this general setting, I present a novel, provably fast exact search algorithm. I also discuss the empirical performance of all the presented algorithms on real data. | |
dc.description.degree | Ph.D. | |
dc.format.mimetype | application/pdf | |
dc.identifier.uri | http://hdl.handle.net/1853/49112 | |
dc.language.iso | en_US | |
dc.publisher | Georgia Institute of Technology | |
dc.subject | Similarity search | |
dc.subject | Nearest-neighbor search | |
dc.subject | Computational geometry | |
dc.subject | Algorithms and analysis | |
dc.subject.lcsh | Nearest neighbor analysis (Statistics) | |
dc.subject.lcsh | Approximation algorithms | |
dc.subject.lcsh | Search theory | |
dc.title | New paradigms for approximate nearest-neighbor search | |
dc.type | Text | |
dc.type.genre | Dissertation | |
dspace.entity.type | Publication | |
local.contributor.corporatename | College of Computing | |
local.contributor.corporatename | School of Computational Science and Engineering | |
relation.isOrgUnitOfPublication | c8892b3c-8db6-4b7b-a33a-1b67f7db2021 | |
relation.isOrgUnitOfPublication | 01ab2ef1-c6da-49c9-be98-fbd1d840d2b1 | |
thesis.degree.level | Doctoral | |
Files
Original bundle
- Name: RAM-DISSERTATION-2013.pdf
- Size: 13.47 MB
- Format: Adobe Portable Document Format