Organizational Unit:
School of Computational Science and Engineering


Publication Search Results

Now showing 1 - 6 of 6
  • Item
    Human-centered AI through scalable visual data analytics
    (Georgia Institute of Technology, 2019-11-01) Kahng, Minsuk Brian
    While artificial intelligence (AI) has led to major breakthroughs in many domains, understanding machine learning models remains a fundamental challenge. How can we make AI more accessible and interpretable, or more broadly, human-centered, so that people can easily understand and effectively use these complex models? My dissertation addresses these fundamental and practical challenges in AI through a human-centered approach, by creating novel data visualization tools that are scalable, interactive, and easy to learn and use. With such tools, users can better understand models by visually exploring how large input datasets affect the models and their results. Specifically, my dissertation focuses on three interrelated parts: (1) Unified scalable interpretation: developing scalable visual analytics tools that help engineers interpret industry-scale deep learning models at both the instance and subset levels (e.g., ActiVis deployed by Facebook); (2) Data-driven model auditing: designing visual data exploration tools that support discovery of insights through exploration of data groups over different analytics stages, such as model comparison (e.g., MLCube) and fairness auditing (e.g., FairVis); and (3) Learning complex models by experimentation: building interactive tools that broaden people's access to learning complex deep learning models (e.g., GAN Lab) and browsing raw datasets (e.g., ETable). My research has had a significant impact on society and industry. The ActiVis system for interpreting deep learning models has been deployed on Facebook's machine learning platform. The GAN Lab tool for learning GANs has been open-sourced in collaboration with Google, with its demo used by more than 70,000 people from over 160 countries.
  • Item
    AI-infused security: Robust defense by bridging theory and practice
    (Georgia Institute of Technology, 2019-09-20) Chen, Shang-Tse
    While Artificial Intelligence (AI) has tremendous potential as a defense against real-world cybersecurity threats, understanding the capabilities and robustness of AI remains a fundamental challenge. This dissertation tackles problems essential to the successful deployment of AI in security settings and comprises the following three interrelated research thrusts. (1) Adversarial Attack and Defense of Deep Neural Networks: We discover vulnerabilities of deep neural networks in real-world settings and develop countermeasures to mitigate the threat. We develop ShapeShifter, the first targeted physical adversarial attack that fools state-of-the-art object detectors. For defenses, we develop SHIELD, an efficient defense leveraging stochastic image compression, and UnMask, a knowledge-based adversarial detection and defense framework. (2) Theoretically Principled Defense via Game Theory and ML: We develop new theories that guide the allocation of defense resources to guard against unexpected attacks and catastrophic events, using a novel online decision-making framework that compels players to employ "diversified" mixed strategies. Furthermore, by leveraging the deep connection between game theory and boosting, we develop a communication-efficient distributed boosting algorithm with strong theoretical guarantees in the agnostic learning setting. (3) Using AI to Protect Enterprise and Society: We show how AI can be used in a real enterprise environment with a novel framework called Virtual Product that predicts potential enterprise cyber threats. Beyond cybersecurity, we also develop the Firebird framework to help municipal fire departments prioritize fire inspections.
    Our work has made multiple important contributions to both theory and practice: our distributed boosting algorithm solved an open problem of distributed learning; ShapeShifter motivated a new DARPA program (GARD); Virtual Product led to two patents; and Firebird was highlighted by the National Fire Protection Association as a best practice for using data to inform fire inspections.
  • Item
    Adaptive visual network analytics: Algorithms, interfaces, and systems for exploration and querying
    (Georgia Institute of Technology, 2017-10-04) Pienta, Robert S.
    Large graphs are now commonplace, amplifying the fundamental challenges of exploring, navigating, and understanding massive data. Our work tackles critical aspects of graph sensemaking to create human-in-the-loop network exploration tools. This dissertation comprises three research thrusts, in which we combine techniques from data mining, visual analytics, and graph databases to create scalable, adaptive, interaction-driven graph sensemaking tools. (1) Adaptive Local Graph Exploration: our FACETS system introduces an adaptive exploration paradigm for large graphs to guide users toward interesting and surprising content, based on a novel measurement of surprise and subjective user interest using feature-entropy and the Jensen-Shannon divergence. (2) Interactive Graph Querying: VISAGE empowers analysts to create and refine queries in a visual, interactive environment, without having to write in a graph querying language, outperforming conventional query writing and refinement. Our MAGE algorithm locates high-quality approximate subgraph matches and scales to large graphs. (3) Summarizing Subgraph Discovery: we introduce VIGOR, a novel system for summarizing graph querying results, providing practical tools and addressing research challenges in interpreting, grouping, comparing, and exploring querying results. This dissertation contributes to visual analytics, data mining, and their intersection through: interactive systems and scalable algorithms; new measures for ranking content; and exploration paradigms that overcome fundamental challenges in visual analytics. Our contributions work synergistically by combining the strengths of visual analytics and graph data mining to advance graph analytics.
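The abstract above names the Jensen-Shannon divergence as one ingredient of its surprise measure. As a minimal sketch of that standard divergence only (not the FACETS scoring method, whose details the abstract does not give), the following compares two discrete distributions. The function names and the example distributions are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits for discrete distributions.

    Terms with p_i == 0 contribute nothing; q_i must be > 0 wherever p_i > 0.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    Uses the mixture m = (p + q) / 2, so the result is symmetric and,
    with base-2 logarithms, bounded between 0 and 1.
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical example: a node's neighborhood feature distribution
# compared with a global distribution over the same three feature values.
local_dist = [0.7, 0.2, 0.1]
global_dist = [0.3, 0.3, 0.4]
surprise = js_divergence(local_dist, global_dist)
```

A larger value indicates that the two distributions differ more, which is the sense in which such a divergence can serve as a signal of how unusual a neighborhood looks.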
  • Item
    Geometric feature extraction in support of the single digital thread approach to detailed design
    (Georgia Institute of Technology, 2016-12-08) Gharbi, Aroua
    Aircraft design is a multi-disciplinary and complicated process that takes a long time and requires a large number of trade-offs among customer requirements, various types of constraints, and market competition. Detailed design in particular is the phase that takes the most time, due to the high number of iterations between component design and structural analysis that need to be run before reaching an optimal design. In this thesis, an innovative approach for detailed design is suggested. It promotes a collaborative framework in which knowledge from the small-scale level of components is shared and transferred to the subsystem and system levels, leading to more robust, real-time decisions that shorten design time. This approach is called the Single Digital Thread Approach to Detailed Design, or STAnDD for short. The implementation of this approach follows a bottom-up plan, starting from the component level and moving up to the aircraft level. At the component level and from a detailed design perspective, three major operations need to be executed in order to deploy the Single Digital Thread approach. The first is the automatic geometric extraction of component features from a solid with no design history, the second is building an optimizer around the design and analysis iterations, and the third is the automatic update of the solid. This thesis suggests a methodology to implement the first operation. Extracting geometric features automatically from a solid with no history (also called a dumb solid) is not an easy process, especially in the aircraft industry, where most of the components have very complex shapes. Innovative machine learning techniques were used, allowing a consistent and robust extraction of the data.
  • Item
    Graph-based algorithms and models for security, healthcare, and finance
    (Georgia Institute of Technology, 2016-04-15) Tamersoy, Acar
    Graphs (or networks) are now omnipresent, permeating many aspects of society. This dissertation contributes unified graph-based algorithms and models to help solve large-scale societal problems affecting millions of individuals' daily lives, from cyber-attacks involving malware to tobacco and alcohol addiction. The main thrusts of our research are: (1) Propagation-based Graph Mining Algorithms: We develop graph mining algorithms to propagate information between the nodes to infer important details about the unknown nodes. We present three examples: AESOP (patented) unearths malware lurking in people's computers with a 99.61% true positive rate at a 0.01% false positive rate; our application of ADAGE to malware detection (patent-pending) enables malware detection in a streaming setting; and EDOCS (patent-pending) flags comment spammers among 197 thousand users on a social media platform accurately and preemptively. (2) Graph-induced Behavior Characterization: We derive new insights and knowledge that characterize certain behavior from graphs using statistical and algorithmic techniques. We present two examples: a study on identifying attributes of smoking and drinking abstinence and relapse from an addiction cessation social media community; and an exploratory analysis of how company insiders trade. Our work has already had an impact on society: deployed by Symantec, AESOP is protecting over 120 million people worldwide from malware; and EDOCS has been deployed by Yahoo, where it guards multiple online communities from comment spammers.
  • Item
    Efficient inference algorithms for network activities
    (Georgia Institute of Technology, 2015-01-08) Tran, Long Quoc
    The real social network and associated communities are often hidden under the declared friend or group lists in social networks. We usually observe the manifestation of these hidden networks and communities in the form of recurrent, time-stamped activities of individuals in the social network. Inferring relationships between users/nodes or groups of users/nodes can be further complicated when activities are interval-censored, that is, when one only observes the number of activities that occurred in certain time windows. The same phenomenon happens in online advertising, where advertisers often offer a set of advertisement impressions and observe a set of conversions (i.e., product/service adoption). In this case, the advertisers want to know which advertisements best appeal to the customers and, most importantly, their conversion rates. Inspired by these challenges, we investigated inference algorithms that efficiently recover user relationships in both cases: time-stamped data and interval-censored data. In the case of time-stamped data, we proposed a novel algorithm called NetCodec, which relies on a Hawkes process that models the intertwined relationship between group participation and between-user influence. Using the Bayesian variational principle and optimization techniques, NetCodec can infer both group participation and user influence simultaneously with an iteration complexity of O((N+I)G), where N is the number of events, I is the number of users, and G is the number of groups. In the case of interval-censored data, we proposed a Monte-Carlo EM inference algorithm in which we iteratively impute the time-stamped events using a Poisson process whose intensity function approximates the underlying intensity function. We show that the proposed simulated approach delivers better inference performance than baseline methods.
    In the advertisement problem, we propose a Click-to-Conversion delay model that uses Hawkes processes to model the advertisement impressions and thinned Poisson processes to model the Click-to-Conversion mechanism. We then derive an efficient Maximum Likelihood Estimator that utilizes the Minorization-Maximization framework. We verify the model against real-life online advertisement logs in comparison with recent conversion rate estimation methods. To facilitate reproducible research, we also developed an open-source software package that focuses on various Hawkes processes proposed in the above-mentioned works and prior works. We provided efficient parallel (multi-core) implementations of the inference algorithms using the Bayesian variational inference framework. To further speed up these inference algorithms, we also explored distributed optimization techniques for convex optimization when the data are distributed. We formulate this problem as a consensus-constrained optimization problem and solve it with the alternating direction method of multipliers (ADMM). It turns out that using a bipartite graph as the communication topology exhibits the fastest convergence.
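The consensus-constrained formulation mentioned at the end of the abstract above can be illustrated with a toy problem. The sketch below is the textbook global-consensus form of ADMM, not the thesis's algorithm: each worker i holds one number a_i and the objective 0.5 * (x - a_i)^2, so the consensus solution is the mean of the a_i. It assumes a central averaging step rather than the bipartite communication topology the abstract reports as fastest, and the function name, `rho`, and the iteration count are illustrative choices.

```python
def consensus_admm_mean(local_values, rho=1.0, iterations=100):
    """Toy consensus ADMM: minimize sum_i 0.5 * (x_i - a_i)^2 subject to x_i = z.

    local_values holds the a_i, one per (hypothetical) worker.
    Returns the consensus variable z, which converges to mean(a_i).
    """
    n = len(local_values)
    x = [0.0] * n  # local copies of the decision variable
    u = [0.0] * n  # scaled dual variables, one per worker
    z = 0.0        # global consensus variable
    for _ in range(iterations):
        # Local step: closed-form minimizer of
        # 0.5 * (x - a_i)^2 + (rho / 2) * (x - z + u_i)^2
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(local_values, u)]
        # Consensus step: average of local estimates plus duals
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each worker's disagreement with the consensus
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z


z_star = consensus_admm_mean([1.0, 2.0, 6.0])
```

For this quadratic toy problem the consensus variable approaches the mean of the local values geometrically, with ratio rho / (1 + rho) per iteration, so smaller `rho` converges faster here. That behavior is specific to this example and says nothing about the convergence results reported in the thesis.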