Title:
MMAP: Mining Billion-Scale Graphs on a PC with Fast, Minimalist Approach via Memory Mapping

dc.contributor.author Sabrin, Kaeser Md.
dc.contributor.author Lin, Zhiyuan
dc.contributor.author Chau, Duen Horng
dc.contributor.author Lee, Ho
dc.contributor.author Kang, U.
dc.contributor.corporatename Georgia Institute of Technology. College of Computing en_US
dc.contributor.corporatename Georgia Institute of Technology. School of Computational Science and Engineering en_US
dc.contributor.corporatename Korea Advanced Institute of Science and Technology. Dept. of Computer Science en_US
dc.date.accessioned 2013-10-17T19:43:19Z
dc.date.available 2013-10-17T19:43:19Z
dc.date.issued 2013
dc.description Research area: Graph Mining Algorithms en_US
dc.description.abstract Large graphs with billions of nodes and edges are increasingly common, calling for new kinds of scalable computation frameworks. State-of-the-art approaches such as GraphChi and TurboGraph recently demonstrated that a single PC can efficiently perform advanced computation on billion-node graphs. Although fast, they rely on sophisticated data structures, explicit memory management, and optimization techniques to achieve high speed and scalability. We propose a minimalist approach that forgoes such complexities by leveraging the fundamental memory mapping (MMap) capability found in operating systems. We contribute three major findings: (1) our crucial insight that MMap can be a viable technique for creating fast, scalable graph algorithms that surpass some of the best techniques; (2) a counterintuitive result that we can do less and gain more: MMap enables us to use a much simpler data structure (edge list) and algorithm design, and to defer memory management to the OS, while offering performance that is significantly faster than or comparable to that of highly optimized methods (e.g., 10 times as fast as GraphChi PageRank on the 1.47 billion edge Twitter graph); (3) extensive experiments on real and synthetic graphs, including the 6.6 billion edge YahooWeb graph, which show that MMap’s benefits hold in most conditions. We hope this work will inspire others to explore how memory mapping may help improve other methods or algorithms to further increase their speed and scalability. en_US
dc.embargo.terms null en_US
dc.identifier.uri http://hdl.handle.net/1853/49226
dc.language.iso en_US en_US
dc.publisher Georgia Institute of Technology en_US
dc.relation.ispartofseries CSE Technical Reports ; GT-CSE-13-04 en_US
dc.subject Graph mining en_US
dc.subject Memory mapping en_US
dc.subject Scalable algorithms en_US
dc.subject Single machine en_US
dc.title MMAP: Mining Billion-Scale Graphs on a PC with Fast, Minimalist Approach via Memory Mapping en_US
dc.type Text
dc.type.genre Technical Report
dspace.entity.type Publication
local.contributor.author Chau, Duen Horng
local.contributor.corporatename College of Computing
local.contributor.corporatename School of Computational Science and Engineering
local.relation.ispartofseries College of Computing Technical Report Series
local.relation.ispartofseries School of Computational Science and Engineering Technical Report Series
relation.isAuthorOfPublication fb5e00ae-9fb7-475d-8eac-50c48a46ea23
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication 01ab2ef1-c6da-49c9-be98-fbd1d840d2b1
relation.isSeriesOfPublication 35c9e8fc-dd67-4201-b1d5-016381ef65b8
relation.isSeriesOfPublication 5a01f926-96af-453d-a75b-abc3e0f0abb3
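The abstract describes storing a graph as a plain edge list, memory-mapping the file, and leaving paging to the operating system. The following is a minimal sketch of that idea, not the report's implementation. The binary layout (each edge as two little-endian 32-bit integers) and the names write_edge_list and pagerank_mmap are assumptions made for illustration, and rank from nodes without outgoing edges is not redistributed.

```python
import mmap
import os
import struct
import tempfile

# Assumed on-disk layout: each edge is (source, target) as two little-endian uint32.
EDGE = struct.Struct("<II")


def write_edge_list(path, edges):
    """Write edges to a flat binary file, one fixed-size record per edge."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(EDGE.pack(src, dst))


def pagerank_mmap(path, num_nodes, iterations=20, damping=0.85):
    """Run PageRank by scanning a memory-mapped edge list sequentially.

    The file is never read into a Python list; the OS pages it in on demand.
    Simplification: rank held by nodes with no out-edges is not redistributed.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            size = len(mm)

            # First pass over the mapping: count out-degrees.
            out_degree = [0] * num_nodes
            for offset in range(0, size, EDGE.size):
                src, _ = EDGE.unpack_from(mm, offset)
                out_degree[src] += 1

            rank = [1.0 / num_nodes] * num_nodes
            for _ in range(iterations):
                nxt = [(1.0 - damping) / num_nodes] * num_nodes
                # One sequential scan of the edge file per iteration.
                for offset in range(0, size, EDGE.size):
                    src, dst = EDGE.unpack_from(mm, offset)
                    nxt[dst] += damping * rank[src] / out_degree[src]
                rank = nxt
    return rank


if __name__ == "__main__":
    # Tiny demonstration graph: a directed 3-cycle.
    demo_path = os.path.join(tempfile.mkdtemp(), "edges.bin")
    write_edge_list(demo_path, [(0, 1), (1, 2), (2, 0)])
    print(pagerank_mmap(demo_path, num_nodes=3))
```

Only the per-node arrays live in process memory here; the edge data stays in the mapped file, so pages are loaded and evicted by the OS page cache rather than by application code.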
Files
Original bundle
Name:
GT-CSE-2013-04.pdf
Size:
687.92 KB
Format:
Adobe Portable Document Format