Organizational Unit:
School of Computational Science and Engineering

Publication Search Results

Now showing 1 - 4 of 4
  • Item
    Adaptive visual network analytics: Algorithms, interfaces, and systems for exploration and querying
    (Georgia Institute of Technology, 2017-10-04) Pienta, Robert S.
    Large graphs are now commonplace, amplifying the fundamental challenges of exploring, navigating, and understanding massive data. Our work tackles critical aspects of graph sensemaking to create human-in-the-loop network exploration tools. This dissertation comprises three research thrusts, in which we combine techniques from data mining, visual analytics, and graph databases to create scalable, adaptive, interaction-driven graph sensemaking tools. (1) Adaptive Local Graph Exploration: our FACETS system introduces an adaptive exploration paradigm for large graphs that guides users toward interesting and surprising content, based on a novel measure of surprise and subjective user interest that uses feature entropy and the Jensen-Shannon divergence. (2) Interactive Graph Querying: VISAGE empowers analysts to create and refine queries in a visual, interactive environment, without having to write in a graph querying language, outperforming conventional query writing and refinement. Our MAGE algorithm locates high-quality approximate subgraph matches and scales to large graphs. (3) Summarizing Subgraph Discovery: we introduce VIGOR, a novel system for summarizing graph querying results, providing practical tools and addressing research challenges in interpreting, grouping, comparing, and exploring querying results. This dissertation contributes to visual analytics, data mining, and their intersection through: interactive systems and scalable algorithms; new measures for ranking content; and exploration paradigms that overcome fundamental challenges in visual analytics. Our contributions work synergistically, combining the strengths of visual analytics and graph data mining to advance graph analytics.
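The abstract names the Jensen-Shannon divergence as an ingredient of the surprise measure but does not give the formula FACETS uses. As a minimal sketch of the divergence itself, the Python below compares a hypothetical neighborhood feature histogram against a hypothetical global one. The function names, the histograms, and the idea of scoring a neighborhood against the whole graph are illustrative assumptions, not the dissertation's definitions.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; zero-probability terms of p contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint distribution m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def normalize(counts):
    """Turn raw feature counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]

# Hypothetical data: counts of a categorical node feature (e.g. three node types).
global_dist = normalize([50, 30, 20])   # feature distribution over the whole graph
neighborhood = normalize([1, 1, 8])     # feature distribution around one node

# Larger values mean the neighborhood looks less like the graph as a whole.
surprise = js_divergence(neighborhood, global_dist)
print(f"illustrative surprise score: {surprise:.3f}")
```

With base-2 logarithms the divergence is symmetric and bounded between 0 (identical distributions) and 1 (disjoint support), which makes it convenient as a ranking score.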
  • Item
    Geometric feature extraction in support of the single digital thread approach to detailed design
    (Georgia Institute of Technology, 2016-12-08) Gharbi, Aroua
    Aircraft design is a complicated, multi-disciplinary process that takes a long time and requires many trade-offs among customer requirements, various types of constraints, and market competition. Detailed design in particular is the phase that takes the most time, due to the high number of iterations between component design and structural analysis that must be run before reaching an optimal design. In this thesis, an innovative approach to detailed design is suggested. It promotes a collaborative framework in which knowledge from the small-scale level of components is shared and transferred to the subsystem and system levels, leading to more robust, real-time decisions that shorten the design time. This approach is called the Single Digital Thread Approach to Detailed Design, or STAnDD for short. Its implementation follows a bottom-up plan, starting at the component level and moving up to the aircraft level. At the component level, and from a detailed design perspective, three major operations need to be executed in order to deploy the Single Digital Thread approach. The first is the automatic extraction of geometric component features from a solid with no design history, the second is building an optimizer around the design and analysis iterations, and the third is the automatic update of the solid. This thesis suggests a methodology to implement the first operation. Extracting geometric features automatically from a solid with no history (also called a dumb solid) is not an easy process, especially in the aircraft industry, where most components have very complex shapes. Innovative techniques from Machine Learning were used, allowing a consistent and robust extraction of the data.
  • Item
    Graph-based algorithms and models for security, healthcare, and finance
    (Georgia Institute of Technology, 2016-04-15) Tamersoy, Acar
    Graphs (or networks) are now omnipresent, permeating many aspects of society. This dissertation contributes unified graph-based algorithms and models to help solve large-scale societal problems affecting millions of individuals' daily lives, from cyber-attacks involving malware to tobacco and alcohol addiction. The main thrusts of our research are: (1) Propagation-based Graph Mining Algorithms: We develop graph mining algorithms that propagate information between nodes to infer important details about unknown nodes. We present three examples: AESOP (patented) unearths malware lurking in people's computers with a 99.61% true positive rate at a 0.01% false positive rate; our application of ADAGE to malware detection (patent-pending) enables malware detection in a streaming setting; and EDOCS (patent-pending) flags comment spammers among 197 thousand users on a social media platform accurately and preemptively. (2) Graph-induced Behavior Characterization: We derive new insights and knowledge that characterize certain behavior from graphs using statistical and algorithmic techniques. We present two examples: a study on identifying attributes of smoking and drinking abstinence and relapse from an addiction cessation social media community; and an exploratory analysis of how company insiders trade. Our work has already made an impact on society: deployed by Symantec, AESOP is protecting over 120 million people worldwide from malware; EDOCS has been deployed by Yahoo, where it guards multiple online communities from comment spammers.
  • Item
    Efficient inference algorithms for network activities
    (Georgia Institute of Technology, 2015-01-08) Tran, Long Quoc
    The real social network and its associated communities are often hidden beneath the declared friend or group lists in social networks. We usually observe the manifestation of these hidden networks and communities in the form of recurrent, time-stamped activities of individuals in the social network. Inferring the relationships between users/nodes or groups of users/nodes can be further complicated when activities are interval-censored, that is, when one only observes the number of activities that occurred in certain time windows. The same phenomenon occurs in online advertising, where advertisers often offer a set of advertisement impressions and observe a set of conversions (i.e., product/service adoption). In this case, advertisers want to know which advertisements best appeal to customers and, most importantly, their conversion rates. Inspired by these challenges, we investigated inference algorithms that efficiently recover user relationships in both cases: time-stamped data and interval-censored data. For time-stamped data, we proposed a novel algorithm called NetCodec, which relies on a Hawkes process that models the intertwined relationship between group participation and between-user influence. Using the Bayesian variational principle and optimization techniques, NetCodec can infer group participation and user influence simultaneously, with an iteration complexity of O((N+I)G), where N is the number of events, I is the number of users, and G is the number of groups. For interval-censored data, we proposed a Monte-Carlo EM inference algorithm in which we iteratively impute the time-stamped events using a Poisson process whose intensity function approximates the underlying intensity function. We show that the proposed simulated approach delivers better inference performance than baseline methods.
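For readers unfamiliar with Hawkes processes, the sketch below shows a univariate version in Python: each event temporarily raises the rate of future events. It is not NetCodec's model, which is multivariate and tied to group participation. The exponential kernel, the parameter names `mu`, `alpha`, and `beta`, and the use of Ogata's thinning algorithm for simulation are all assumptions chosen for illustration.

```python
import math
import random

def hawkes_intensity(t, events, mu, alpha, beta):
    """lambda(t) = mu + sum over past events t_i < t of alpha * exp(-beta * (t - t_i))."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on (0, horizon] with Ogata's thinning algorithm.

    Assumes mu > 0 and alpha / beta < 1 so the process does not explode.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    while True:
        # The exponential kernel only decays between events, so the intensity
        # just after time t (counting every event so far) bounds it from above
        # until the next accepted event.
        bound = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        t += rng.expovariate(bound)          # candidate from the dominating Poisson process
        if t > horizon:
            break
        # Keep the candidate with probability lambda(t) / bound.
        if rng.random() * bound <= hawkes_intensity(t, events, mu, alpha, beta):
            events.append(t)
    return events

# Illustrative parameters only.
events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.0, horizon=50.0, seed=1)
print(f"simulated {len(events)} events")
```

The self-exciting term is what lets one user's activity raise the likelihood of later activity, which is the kind of influence the inference algorithms above try to recover from observed event times.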
    In the advertisement problem, we propose a Click-to-Conversion delay model that uses Hawkes processes to model the advertisement impressions and thinned Poisson processes to model the Click-to-Conversion mechanism. We then derive an efficient Maximum Likelihood Estimator that uses the Minorization-Maximization framework. We verify the model against real-life online advertisement logs, in comparison with recent conversion rate estimation methods. To facilitate reproducible research, we also developed an open-source software package that focuses on the various Hawkes processes proposed in the above-mentioned works and in prior work. We provided efficient parallel (multi-core) implementations of the inference algorithms using the Bayesian variational inference framework. To further speed up these inference algorithms, we also explored distributed optimization techniques for convex optimization when the data are distributed. We formulate this as a consensus-constrained optimization problem and solve it with the alternating direction method of multipliers (ADMM). It turns out that using a bipartite graph as the communication topology yields the fastest convergence.
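The consensus-constrained formulation can be illustrated with a toy problem in Python. In the sketch below each worker i privately holds a number a_i and a local objective 0.5 * (x - a_i)^2, and scaled-form consensus ADMM drives all local copies to a shared value, the mean of the a_i. This uses a central averaging step rather than the bipartite communication topology studied in the thesis, and the objective, the names, and the penalty `rho` are assumptions made for illustration.

```python
def consensus_admm(local_targets, rho=1.0, iterations=100):
    """Global-consensus ADMM (scaled form) for: minimize sum_i 0.5 * (x_i - a_i)^2 s.t. x_i = z."""
    n = len(local_targets)
    xs = [0.0] * n   # local copies of the variable, one per worker
    us = [0.0] * n   # scaled dual variables, one per worker
    z = 0.0          # shared consensus variable
    for _ in range(iterations):
        # Local step: each worker minimizes its own objective plus the
        # quadratic penalty (rho / 2) * (x - z + u_i)^2, in closed form.
        xs = [(a + rho * (z - u)) / (1.0 + rho) for a, u in zip(local_targets, us)]
        # Consensus step: average the local copies plus their duals.
        z = sum(x + u for x, u in zip(xs, us)) / n
        # Dual step: accumulate each worker's disagreement with the consensus.
        us = [u + x - z for u, x in zip(us, xs)]
    return z, xs

# Hypothetical local data held by three workers.
z, xs = consensus_admm([1.0, 2.0, 6.0])
print(f"consensus value: {z:.6f}")
```

Only the averaging step needs communication. Which workers exchange messages in that step is the communication topology, the design choice the thesis compares.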