Organizational Unit:
School of Computational Science and Engineering

Publication Search Results

Now showing 1 - 10 of 43
  • Item
    UnMask: Adversarial Detection and Defense in Deep Learning Through Building-Block Knowledge Extraction
    (Georgia Institute of Technology, 2019) Freitas, Scott ; Chen, Shang-Tse ; Chau, Duen Horng
    Deep learning models are being integrated into a wide range of high-impact, security-critical systems, from self-driving cars to biomedical diagnosis. However, recent research has demonstrated that many of these deep learning architectures are highly vulnerable to adversarial attacks—highlighting the vital need for defensive techniques to detect and mitigate these attacks before they occur. To combat these adversarial attacks, we developed UnMask, a knowledge-based adversarial detection and defense framework. The core idea behind UnMask is to protect these models by verifying that an image’s predicted class (“bird”) contains the expected building blocks (e.g., beak, wings, eyes). For example, if an image is classified as “bird”, but the extracted building blocks are wheel, seat and frame, the model may be under attack. UnMask detects such attacks and defends the model by rectifying the misclassification, re-classifying the image based on its extracted building blocks. Our extensive evaluation shows that UnMask (1) detects up to 92.9% of attacks, with a false positive rate of 9.67% and (2) defends the model by correctly classifying up to 92.24% of adversarial images produced by the current strongest attack, Projected Gradient Descent, in the gray-box setting. Our proposed method is architecture agnostic and fast. To enable reproducibility of our research, we have anonymously open-sourced our code and large newly-curated dataset (~5GB) on GitHub (https://github.com/unmaskd/UnMask).
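    The building-block check described in this abstract can be sketched as below. This is a minimal illustration, not the authors' implementation: the class-to-parts table, the use of Jaccard overlap as the similarity measure, the 0.5 threshold, and the names `EXPECTED_PARTS`, `jaccard`, and `check_prediction` are all assumptions made for the example.

```python
# Hypothetical class-to-building-block table; the real mapping is not
# given in the abstract.
EXPECTED_PARTS = {
    "bird": {"beak", "wings", "eyes"},
    "bicycle": {"wheel", "seat", "frame"},
}


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def check_prediction(predicted_class, extracted_parts, threshold=0.5):
    """Return (is_suspicious, final_class).

    The prediction is accepted when the extracted parts overlap enough
    with the parts expected for the predicted class. Otherwise the input
    is flagged and re-classified as the class whose expected parts best
    match what was extracted. The threshold value is an assumption.
    """
    score = jaccard(EXPECTED_PARTS[predicted_class], extracted_parts)
    if score >= threshold:
        return False, predicted_class
    best = max(
        EXPECTED_PARTS,
        key=lambda c: jaccard(EXPECTED_PARTS[c], extracted_parts),
    )
    return True, best
```

    With this table, an image predicted as "bird" whose extracted parts are wheel, seat, and frame is flagged and re-labelled "bicycle", while one showing a beak and wings is accepted.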
  • Item
    Single-tree GMM training
    (Georgia Institute of Technology, 2015-05-27) Curtin, Ryan R.
  • Item
    Leveraging Memory Mapping for Fast and Scalable Graph Computation on a PC
    (Georgia Institute of Technology, 2013-08) Lin, Zhiyuan ; Chau, Duen Horng
    Large graphs with billions of nodes and edges are increasingly common, calling for new kinds of scalable computation frameworks. Although popular, distributed approaches can be expensive to build, or require many resources to manage or tune. State-of-the-art approaches such as GraphChi and TurboGraph have recently demonstrated that a single machine can efficiently perform advanced computation on billion-node graphs. Although fast, they both use sophisticated data structures, memory management, and optimization techniques. We propose a minimalist approach that forgoes such complexities by leveraging the memory mapping capability found on operating systems. Our experiments on large datasets, such as a 1.5 billion edge Twitter graph, show that our streamlined approach runs up to 26 times faster than GraphChi and is comparable to TurboGraph. We contribute our crucial insight that by leveraging memory mapping, a fundamental operating system capability, we can outperform the latest graph computation techniques.
  • Item
    To Gather Together for a Better World: Understanding and Leveraging Communities in Micro-lending Recommendation
    (Georgia Institute of Technology, 2013) Choo, Jaegul ; Lee, Daniel ; Dilkina, Bistra ; Zha, Hongyuan ; Park, Haesun
    Micro-finance organizations provide non-profit lending opportunities to mitigate poverty by financially supporting impoverished, yet skilled entrepreneurs who are in desperate need of an institution that lends to them. In Kiva.org, a widely-used crowd-funded micro-financial service, a vast amount of micro-financial activity is carried out by lending teams, and thus, understanding their diverse characteristics is crucial in maintaining a healthy micro-finance ecosystem. As a first step toward this goal, we model different lending teams by using a maximum-entropy distribution approach based on a rich set of heterogeneous information regarding micro-financial transactions available at Kiva. Based on this approach, we achieved a competitive AUC of 0.84 in predicting the lending activities of the top 200 teams. Furthermore, we provide deep insight into the characteristics of lending teams by analyzing the resulting team-specific lending models. We found that lending teams are generally more careful in selecting loans by a loan’s geolocation, a borrower’s gender, a field partner’s reliability, etc., when compared to lenders without team affiliations. In addition, we identified interesting lending behaviors of different lending teams based on lenders’ backgrounds and interests, such as their ethnic, religious, linguistic, educational, regional, and occupational aspects. Finally, using our proposed model, we tackled the novel problem of lending team recommendation and showed promising results.
  • Item
    PIVE: A Per-Iteration Visualization Environment for Supporting Real-time Interactions with Computational Methods
    (Georgia Institute of Technology, 2013) Choo, Jaegul ; Lee, Changhyun ; Park, Haesun
    Visual analytics has been gaining increasing interest due to its fascinating characteristic that leverages both humans’ visual perception and the power of computing. Although various computational methods are being proposed, they do not properly support visual analytics. One of the biggest obstacles towards their real-time visual analytic integration is their high computational complexity. As a way to tackle this problem, this paper presents PIVE, a Per-Iteration Visualization Environment for supporting real-time interactive visualization with computational methods. The main idea behind PIVE is that most advanced computational methods work by refining the solution iteratively. By visually delivering the result from each iteration to users, the proposed framework lets users quickly acquire the information that the computational method provides and interact with the method continuously in real time. We show the effectiveness of PIVE in terms of real-time visualization and interaction capabilities by customizing various dimension reduction methods such as principal component analysis, multidimensional scaling, and t-distributed stochastic neighbor embedding, and clustering methods such as k-means and latent Dirichlet allocation.
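    The per-iteration idea can be illustrated with a generator that hands back the intermediate result after every refinement step, so a front end could redraw without waiting for convergence. This is only a sketch under stated assumptions: it uses one-dimensional k-means for brevity, and the function name `kmeans_iterations` and its parameters are hypothetical, not part of PIVE.

```python
def kmeans_iterations(points, centroids, max_iter=10):
    """Yield (labels, centroids) after every k-means iteration.

    Works on 1-D points to keep the example short. A caller can render
    each yielded snapshot immediately, or stop consuming the generator
    to interrupt the computation early.
    """
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(p - centroids[k]))
            for p in points
        ]
        # Update step: mean of each cluster; empty clusters keep their centroid.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            new_centroids.append(sum(members) / len(members) if members else old)
        yield labels, new_centroids
        if new_centroids == centroids:
            return  # converged: the last snapshot repeated the previous one
        centroids = new_centroids


for step, (labels, centers) in enumerate(
    kmeans_iterations([1.0, 2.0, 9.0, 10.0], [0.0, 5.0]), start=1
):
    print(step, labels, centers)  # a real system would redraw a plot here
```

    Because the loop body runs once per snapshot, user input gathered between iterations could in principle be fed back into the next step, which is the kind of continuous interaction the abstract describes.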
  • Item
    MMAP: Mining Billion-Scale Graphs on a PC with Fast, Minimalist Approach via Memory Mapping
    (Georgia Institute of Technology, 2013) Sabrin, Kaeser Md. ; Lin, Zhiyuan ; Chau, Duen Horng ; Lee, Ho ; Kang, U.
    Large graphs with billions of nodes and edges are increasingly common, calling for new kinds of scalable computation frameworks. State-of-the-art approaches such as GraphChi and TurboGraph recently demonstrated that a single PC can efficiently perform advanced computation on billion-node graphs. Although fast, they use sophisticated data structures, explicit memory management, and optimization techniques to achieve high speed and scalability. We propose a minimalist approach that forgoes such complexities by leveraging the fundamental memory mapping (MMap) capability found on operating systems. We present several major findings and contributions: (1) our crucial insight that MMap can be a viable technique for creating fast, scalable graph algorithms that surpass some of the best techniques; (2) a counterintuitive result that we can do less and gain more: MMap enables us to use a much simpler data structure (edge list) and algorithm design, and to defer memory management to the OS, while offering performance significantly faster than or comparable to highly-optimized methods (e.g., 10X as fast as GraphChi PageRank on a 1.47 billion edge Twitter graph); (3) extensive experiments on real and synthetic graphs, including the 6.6 billion edge YahooWeb graph, showing that MMap’s benefits hold in most conditions. We hope this work will inspire others to explore how memory mapping may help improve other methods or algorithms to further increase their speed and scalability.
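    To make the memory-mapping idea concrete, the sketch below runs PageRank over a binary edge list that is mapped into memory rather than loaded into a custom data structure, leaving paging to the operating system. It is an illustration only, not the authors' code: the on-disk layout (two little-endian 32-bit integers per edge), the function names, and the parameter defaults are assumptions, and mass from nodes without out-edges is simply dropped.

```python
import mmap
import os
import struct
import tempfile

# Assumed on-disk layout: each edge is (source, target) as two uint32 values.
EDGE = struct.Struct("<II")


def write_edge_list(path, edges):
    """Write edges as a flat binary edge list."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(EDGE.pack(src, dst))


def pagerank_mmap(path, num_nodes, iterations=20, damping=0.85):
    """PageRank that streams over a memory-mapped edge list.

    Only the per-node arrays live in Python memory; the edge data is
    paged in and out by the OS as each sequential scan proceeds.
    """
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        out_degree = [0] * num_nodes
        for src, _ in EDGE.iter_unpack(mm):
            out_degree[src] += 1

        rank = [1.0 / num_nodes] * num_nodes
        for _ in range(iterations):
            nxt = [(1.0 - damping) / num_nodes] * num_nodes
            for src, dst in EDGE.iter_unpack(mm):
                nxt[dst] += damping * rank[src] / out_degree[src]
            rank = nxt
    return rank


def demo():
    """Run PageRank on a three-node cycle stored in a temporary file."""
    edges = [(0, 1), (1, 2), (2, 0)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "edges.bin")
        write_edge_list(path, edges)
        return pagerank_mmap(path, num_nodes=3)
```

    Calling `demo()` on the symmetric three-node cycle gives every node the same rank. The same loop applies unchanged to a file far larger than RAM, since each pass is a sequential scan of the mapped region.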
  • Item
    Mage: Expressive Pattern Matching in Richly-Attributed Graphs
    (Georgia Institute of Technology, 2013) Pienta, Robert ; Tamersoy, Acar ; Tong, Hanghang ; Chau, Duen Horng
    Given a large graph with millions of nodes and edges, say a social graph where both the nodes and edges can have multiple different kinds of attributes (e.g., job titles, tie strengths), how do we quickly find matches for subgraphs of interest (e.g., a ring of businessmen with strong ties)? We propose MAGE, Multiple Attribute Graph Engine, a subgraph matching framework that pushes the envelope of graph matching capabilities and performance, through several major innovations: (i) with line graph transformation, MAGE works for graphs with both node and edge attributes and returns both exact and near matches, whereas other techniques often support only node attributes and return only exact matches; (ii) MAGE supports a plethora of queries, including multiple attributes for each node or edge, wildcards as attribute values (i.e., match any permissible value), and continuous attributes via multiple discretization strategies; (iii) MAGE leverages a novel technique based on memory mapping to compute random walk with restart probabilities, which provides a speedup of more than 2 orders of magnitude on large graphs. We evaluated MAGE’s effectiveness and scalability with real and synthetic graphs with up to 2.3 million edges. Experimental results on the DBLP authorship graph and the Rotten Tomatoes movie graph illustrate the effectiveness and exploratory functionality of our contributions to graph querying. By devising query-centric innovations, our work improves the ease with which a user can explore their graph data.
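    The random walk with restart probabilities mentioned in (iii) can be computed by simple power iteration, sketched below on an in-memory adjacency list. This shows the standard definition only and says nothing about MAGE's memory-mapped implementation; the function name, the restart probability of 0.15, and the choice to send mass from dead-end nodes back to the seed are assumptions.

```python
def random_walk_with_restart(adj, seed, restart=0.15, iterations=50):
    """Approximate RWR scores from `seed` by power iteration.

    adj maps each node to a list of its out-neighbors. At every step the
    walker jumps back to the seed with probability `restart`, otherwise
    moves to a uniformly chosen neighbor.
    """
    nodes = list(adj)
    scores = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iterations):
        nxt = {n: (restart if n == seed else 0.0) for n in nodes}
        for n in nodes:
            neighbors = adj[n]
            if not neighbors:
                # Assumption: a walker stuck at a dead end returns to the seed.
                nxt[seed] += (1.0 - restart) * scores[n]
                continue
            share = (1.0 - restart) * scores[n] / len(neighbors)
            for m in neighbors:
                nxt[m] += share
        scores = nxt
    return scores


# Path graph a - b - c, walking from "a".
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(random_walk_with_restart(path, "a"))
```

    Nodes closer to the seed receive higher scores, which is what makes these probabilities useful for ranking near matches around a query node.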
  • Item
    VisIRR: Interactive Visual Information Retrieval and Recommendation for Large-scale Document Data
    (Georgia Institute of Technology, 2013) Choo, Jaegul ; Lee, Changhyun ; Clarkson, Edward ; Liu, Zhicheng ; Lee, Hanseung ; Chau, Duen Horng ; Li, Fuxin ; Kannan, Ramakrishnan ; Stolper, Charles D. ; Inouye, David ; Mehta, Nishant ; Ouyang, Hua ; Som, Subhojit ; Gray, Alexander ; Stasko, John T. ; Park, Haesun
    We present a visual analytics system called VisIRR, which is an interactive visual information retrieval and recommendation system for document discovery. VisIRR effectively combines two paradigms: passive pull, through query processes for retrieval, and active push, which recommends items of potential interest based on user preferences. Equipped with efficient dynamic query interfaces for a large corpus of document data, VisIRR visualizes the retrieved documents in a scatter-plot form with their overall topic clusters. At the same time, based on interactive personalized preference feedback on documents, VisIRR provides recommended documents reaching out to the entire corpus beyond the retrieved sets. Such recommended documents are represented in the same scatter space as the retrieved documents so that users can perform integrated analyses of both retrieved and recommended documents seamlessly. We describe the state-of-the-art computational methods that make these integrated and informative representations as well as real-time interaction possible. We illustrate the way the system works by using detailed usage scenarios. In addition, we present a preliminary user study that evaluates the effectiveness of the system.
  • Item
    Visualize It-Wise! An Iteration-Wise Computational Framework for Real-Time Visual Analytics
    (Georgia Institute of Technology, 2013) Choo, Jaegul ; Lee, Changhyun ; Park, Haesun
    Visual analytics has been gaining increasing interest due to its fascinating characteristic that leverages both humans’ visual perception and the power of computing. Although various computational methods are being proposed, they do not properly support visual analytics. One of the biggest obstacles towards their real-time visual analytic integration is their high computational complexity. As a way to tackle this problem, this paper presents an iteration-wise computational framework, motivated by the fact that most advanced computational methods work by refining the solution iteratively. By visually delivering the results for each iteration to users, the proposed framework enables users to quickly acquire the information that the computational method provides as well as the ability to interact with them in real time. We show the benefits of the proposed framework by using various dimension reduction and clustering methods.
  • Item
    A Better World for All: Understanding and Promoting Micro-finance Activities in Kiva.org
    (Georgia Institute of Technology, 2013) Choo, Jaegul ; Lee, Changhyun ; Lee, Daniel ; Zha, Hongyuan ; Park, Haesun
    Micro-finance organizations provide non-profit lending opportunities to eradicate poverty by financially equipping impoverished, yet skilled entrepreneurs who are in desperate need of an institution that lends to those who have little. Kiva.org, a widely-used crowd-funded micro-financial service, provides researchers with an extensive amount of openly downloadable data containing a rich set of heterogeneous information regarding micro-financial transactions. Our objective is to identify the key factors that encourage people to make micro-financing donations, and ultimately, to keep them actively involved. As our contribution to promoting a healthy micro-finance ecosystem, we detail our personalized loan recommendation system, which we formulate as a supervised learning problem in which we try to predict how likely a given lender is to fund a new loan. We construct the features for each data item by utilizing the available connectivity relationships in order to integrate all the available Kiva data types. For those lenders with no such relationships, e.g., first-time lenders, we propose a novel method of feature construction by computing joint nonnegative matrix factorization. By using a gradient boosting tree, a state-of-the-art prediction model, we are able to achieve up to 0.92 AUC (area under the curve) value, which shows that our work is ready for use in practice. Finally, we reveal various interesting findings about lenders’ social behaviors in micro-finance activities.
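    For readers unfamiliar with the AUC figure quoted above, the sketch below computes it directly from its pairwise definition: the fraction of (funded, not funded) pairs in which the funded example received the higher predicted score, with ties counted as half. The function name and the toy labels and scores are invented for illustration and are not taken from the Kiva data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels are 1 (positive, e.g. the lender funded the loan) or 0;
    scores are the model's predicted likelihoods.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ordered correctly.
print(auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

    A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives context for the 0.92 reported in the abstract.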