Organizational Unit:
School of Computational Science and Engineering

Publication Search Results

Now showing 1 - 3 of 3
  • Item
    Methodologies for co-designing supercomputer-scale systems and deep learning software
    (Georgia Institute of Technology, 2024-04-27) Isaev, Mikhail
    This dissertation introduces new methodologies to co-design deep learning software and supercomputer hardware in the setting of large-scale training. The first is an analytical performance model for exploring the co-design space of parallel algorithms for large language models (LLMs) and potential supercomputer architectures during the early phases of the co-design process. On the algorithm side, we consider diverse implementation strategies, including data, tensor, and pipeline parallelism, communication-computation overlap, and memory optimization. On the hardware side, the model covers hierarchical memory systems, multiple interconnection networks, and efficiencies parameterized by operation size. Our open-source tool, Calculon, implements this model. Its analytical nature enables rapid evaluation, estimating performance for billions of strategy and architecture combinations (a simplified, first-order sketch of this style of analytical estimate appears in the code sketches after this results list). This facilitates co-design-space exploration for future LLMs with trillions of parameters, yielding insights into optimal system characteristics and the interplay between algorithmic and architectural decisions. As models scale beyond 100 trillion parameters, two bottlenecks become especially critical to address: memory capacity and network speed. For the former, Calculon suggests a hardware solution that adds a slower capacity-tier memory for intermediate tensors and model parameters, reserving faster memory for the current layer's computation. For the latter, we present novel distributed-memory parallel matrix multiplication algorithms capable of hiding communication entirely, potentially achieving perfect scaling. Looking ahead, we foresee a need to model artificial intelligence (AI) applications beyond LLMs and to perform detailed system simulations in later design stages. To meet these demands, we introduce ParaGraph, a tool that bridges the gap between applications and network hardware simulators. ParaGraph features a high-level graph representation of parallel programs, automatically extracted from compiled applications, and a runtime environment for emulator-based dynamic execution. Case studies on deep learning workloads extracted from JAX and TensorFlow programs illustrate ParaGraph's utility for software-hardware co-design workflows, including communication optimization, hardware bottleneck identification, and simulation validation.
  • Item
    Accurate and Trustworthy Recommender Systems: Algorithms and Findings
    (Georgia Institute of Technology, 2024-04-15) Oh, Sejoon
    The exponential growth of information on the Web has led to the problem of "information overload," which has been addressed through the use of recommender systems. Modern recommender systems use deep learning algorithms trained on user-item interaction data to generate recommendations. However, current recommender systems still face diverse challenges with respect to accuracy, personalization, and robustness. In this thesis, we investigate these challenges and provide insights and solutions to them. The thesis is divided into two parts: (1) making recommender systems accurate and personalized, and (2) making recommender systems robust and trustworthy.
    First, we study session-based recommender systems (SBRSs) and user intent-aware recommender systems, which have been proposed to enhance accuracy and personalization by modeling users' short-term and evolving interests. Existing recommender systems face two significant limitations. First, they cannot incorporate session contexts or user intents (i.e., high-level interests) into their models, which could improve next-item prediction. To address this, we propose a novel SBRS, ISCON, which assigns precise and meaningful implicit contexts to sessions via node embedding and clustering algorithms. By leveraging the session contexts found by ISCON, we can offer more personalized recommendations to end users. We also propose a new recommendation framework, INTENTREC, which predicts a user's intent on Netflix and uses it as one of the input features for the user's next-item prediction. The user intents obtained by INTENTREC can be used for diverse applications such as real-time recommendations, personalized UI and notifications, etc. Second, existing recommender systems cannot scale to large real-world recommendation datasets. To handle the scalability issue, we propose M2TREC, a metadata-aware multi-task Transformer model that uses only item attributes to learn item representations and is completely item-ID free. With M2TREC, we achieve faster convergence, higher accuracy, and robust recommendations with less training data. Sparse training data can also cause recommendation models to produce incorrect and popularity-biased recommendations. It is well known that most recommendation datasets are extremely large and sparse, limiting the ability of models to generate effective representations for cold-start users or items with few interactions. To address the sparsity issue, we devise DAIN, an influence-guided data augmentation technique that augments the original data with data points that are important for reducing training loss. With DAIN, we can enhance the recommendation model's generalization ability and mitigate cold-start and popularity-bias problems.
    Apart from accuracy and personalization, we also analyze the robustness of existing recommender systems against input perturbations and devise solutions to enhance their robustness. Deep learning-based recommender systems have shown sensitivity to arbitrary and adversarial input perturbations, which can drastically alter recommendation lists. This sensitivity disproportionately affects low-accuracy user groups compared to high-accuracy groups, making the models unreliable and detrimental to both users and service providers, particularly in high-stakes applications such as healthcare, education, and housing. Despite its importance, the stability of recommender systems has not been studied thoroughly. Thus, we first introduce two Rank List Sensitivity (RLS) metrics that measure changes in recommendations under perturbations, and we propose two training-data perturbation mechanisms (random and CASPER) for recommender systems; a simplified sketch of this kind of rank-list comparison appears in the code sketches after this results list. We show that existing sequential recommenders are highly vulnerable to CASPER and even to random perturbations. We further introduce a fine-tuning mechanism, FINEST, that stabilizes the predictions of sequential recommender systems against training-data perturbations. FINEST simulates perturbations during fine-tuning and uses a rank-preserving loss function to ensure stable recommendations. With FINEST, sequential recommenders become more robust against interaction-level perturbations. Finally, we investigate the robustness of text-aware recommender systems against adversarial text rewriting. Our proposed text rewriting framework, ATR, generates optimal product descriptions via two-phase fine-tuning of language models. Such rewritten product descriptions can significantly boost the ranks of target items, and attackers can exploit this vulnerability of text-aware recommenders to promote their own items on diverse web platforms such as e-commerce. Our work highlights the importance of studying the robustness of existing recommenders and the need for a defense mechanism against text rewriting attacks such as ATR.
    Overall, we propose next-generation recommendation frameworks along the axes of accuracy, personalization, and robustness. We also outline several ongoing and future directions, including a unified robustness benchmark for existing recommender systems, adversarial attacks and defenses for multimodal recommenders, and leveraging emerging large language models to maximize the accuracy, personalization, and interpretability of recommender systems.
  • Item
    Robust Representation Learning and Real-Time Serving of Deep Models for Health Time Series
    (Georgia Institute of Technology, 2023-04-25) Xu, Yanbo
    Modern Electronic Health Record (EHR) systems provide large amounts of data, enabling machine learning (ML) researchers to develop ML methods that improve healthcare. However, development in a clinical setting presents unique challenges for ML model training and serving. For example, EHR data are usually captured from multiple sources over time in noisy environments such as Intensive Care Units (ICUs). As a result, the data arrive as time series with multiple issues, including heterogeneity, missingness, and irregularity. Although ML methods such as deep neural networks have been successfully developed for many predictive health tasks, improvements are still needed to learn robust and efficient predictive models that can harness such multi-modal, noisy, and massive time series data. In this dissertation, we tackle the following fundamental problems in developing ML models for health time series.
    1. Multiple modalities in time series. Clinical time series are often generated by different devices at different frequencies. A typical ICU monitoring dataset can contain continuous signals like electrocardiograms (ECG), evenly charted tabular data like vital signs, and sparse discrete events like lab tests and medications. Simple value-binning methods can discard rich information in dense data and mask important information in sparse data. To address this, we design an efficient ensembling algorithm that reweights models individualized for each data modality. To better capture the heterogeneity underlying the multimodal data, we further design individualized embeddings per modality and fit a self-attention Transformer on top of them to fuse the EHR time series more robustly (a minimal sketch of this per-modality-embedding fusion idea appears in the code sketches after this results list).
    2. Missing observations at random time steps. Data collection is often noisy in EHR systems: missing or mis-timestamped data arise from random device disconnections, patients' body movements, human error, etc. Models that do not account for input missingness and noise can overfit and produce biased predictions. We incorporate stochastic differential equations into spatio-temporal modeling, enabling imputation of randomly missing fields in structural time series with support for uncertainty quantification. We further propose score-based diffusion models for generating missing data and denoising the observed discrete event sequences.
    3. Large unlabelled data available across different sites. True labels are expensive to obtain in clinical applications. Although input signals can be collected easily in EHR systems, many labels of interest still require retrospective manual annotation and data review by clinical experts. Thus, large amounts of unlabelled data, collected across several different hospitals, become available to researchers, while only a small fraction is labelled. To address this challenge, we investigate self-supervised learning in deep models and learn robust representations from the large unlabelled data that can later be adapted and fine-tuned for downstream tasks.
    4. Timely serving in resource-limited systems. In clinical environments such as ICUs, care practitioners need to make appropriate decisions in a timely manner. Thus far, deep learning models have mainly been developed to increase prediction accuracy in healthcare, but few consider whether or how they can be served in real time in a resource-constrained deployment environment. To bridge this gap, we design cost-aware prediction pipelines that cascade across differently sized models to balance prediction accuracy and serving cost (a minimal sketch of such a cascade appears in the code sketches after this results list).
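
Illustrative Code Sketches

The first item above describes Calculon, an analytical performance model for exploring the co-design space of LLM parallelization strategies and supercomputer architectures. The Python snippet below is a minimal, first-order sketch of that style of estimate: it combines the common approximation of roughly 6 training FLOPs per parameter per token with a ring all-reduce communication estimate and a tunable compute-communication overlap fraction. The formulas, parameter names, and hardware numbers are illustrative assumptions, not Calculon's actual model.

from dataclasses import dataclass

@dataclass
class Hardware:
    flops_per_gpu: float       # sustained FLOP/s per accelerator
    net_bandwidth: float       # bytes/s per accelerator for collectives
    bytes_per_param: int = 2   # fp16/bf16 gradients

def step_time(params, tokens_per_batch, data_parallel, hw, overlap_fraction=0.0):
    """Estimate one training step's time (seconds) under pure data parallelism."""
    # Compute: ~6 FLOPs per parameter per token (forward + backward),
    # split evenly across the data-parallel accelerators.
    compute = 6.0 * params * tokens_per_batch / (data_parallel * hw.flops_per_gpu)
    # Communication: ring all-reduce of the gradients, roughly 2*(d-1)/d of the
    # gradient bytes sent per rank.
    grad_bytes = params * hw.bytes_per_param
    comm = 2.0 * (data_parallel - 1) / data_parallel * grad_bytes / hw.net_bandwidth
    # Part of the all-reduce can be hidden behind the backward pass.
    return compute + (1.0 - overlap_fraction) * comm

if __name__ == "__main__":
    gpu = Hardware(flops_per_gpu=150e12, net_bandwidth=100e9)
    for dp in (64, 256, 1024):   # sweep data-parallel width at a fixed global batch
        t = step_time(70e9, 4e6, data_parallel=dp, hw=gpu, overlap_fraction=0.5)
        print(f"dp={dp:5d}  estimated step time ~ {t:.1f} s")

Sweeping the data-parallel width this way reproduces the qualitative behavior the abstract discusses: compute time shrinks as accelerators are added while the all-reduce term does not, so network speed eventually dominates unless communication is overlapped or reduced.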
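
The second item introduces Rank List Sensitivity (RLS) metrics that quantify how much a recommender's ranked lists change after a training-data perturbation. The sketch below compares each user's top-k list before and after a perturbation using Jaccard overlap and truncated Rank-Biased Overlap; these are standard list-similarity measures used here purely for illustration and are not necessarily the exact RLS metrics defined in the thesis.

def jaccard_at_k(list_a, list_b, k=10):
    """Set overlap of the two top-k lists (1.0 = identical item sets)."""
    a, b = set(list_a[:k]), set(list_b[:k])
    return len(a & b) / len(a | b) if a | b else 1.0

def rbo_at_k(list_a, list_b, k=10, p=0.9):
    """Truncated Rank-Biased Overlap: top-weighted agreement of two rankings."""
    score, seen_a, seen_b = 0.0, set(), set()
    for depth in range(1, k + 1):
        seen_a.add(list_a[depth - 1])
        seen_b.add(list_b[depth - 1])
        score += (p ** (depth - 1)) * len(seen_a & seen_b) / depth
    return (1 - p) / (1 - p ** k) * score   # normalized so identical lists give 1.0

def sensitivity(before, after, metric, k=10):
    """Average (1 - similarity) over users; higher means less stable."""
    scores = [1.0 - metric(before[u], after[u], k) for u in before]
    return sum(scores) / len(scores)

if __name__ == "__main__":
    before = {"u1": [3, 7, 1, 9, 4], "u2": [5, 2, 8, 6, 0]}   # original top-5 lists
    after  = {"u1": [7, 3, 1, 2, 4], "u2": [9, 5, 2, 8, 1]}   # after a perturbation
    print("Jaccard-based sensitivity:", sensitivity(before, after, jaccard_at_k, k=5))
    print("RBO-based sensitivity:", sensitivity(before, after, rbo_at_k, k=5))

Averaging one minus the list similarity over users yields a single instability score, which is the general pattern the abstract describes for comparing recommenders under perturbation mechanisms such as CASPER or random edits.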
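
The third item's first problem, multiple modalities in clinical time series, is addressed by giving each modality its own embedding and fusing them with a self-attention Transformer. The PyTorch sketch below shows that general pattern for a dense waveform channel, regularly charted vitals, and sparse discrete event codes; the module choices, dimensions, and prediction head are hypothetical stand-ins, not the dissertation's architecture.

import torch
import torch.nn as nn

class MultiModalFusion(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2,
                 ecg_channels=1, vitals_dim=6, n_event_types=100):
        super().__init__()
        # Per-modality encoders: a strided 1-D convolution tokenizes the dense
        # waveform, a linear layer projects charted vitals, and an embedding
        # table maps sparse discrete event codes (labs, medications).
        self.ecg_enc = nn.Conv1d(ecg_channels, d_model, kernel_size=16, stride=16)
        self.vitals_enc = nn.Linear(vitals_dim, d_model)
        self.event_enc = nn.Embedding(n_event_types, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.fuser = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)   # e.g., risk of an adverse event

    def forward(self, ecg, vitals, events):
        tokens = torch.cat([
            self.ecg_enc(ecg).transpose(1, 2),   # (batch, ecg_tokens, d_model)
            self.vitals_enc(vitals),             # (batch, vitals_steps, d_model)
            self.event_enc(events),              # (batch, n_events, d_model)
        ], dim=1)
        fused = self.fuser(tokens)               # self-attention across all modalities
        return torch.sigmoid(self.head(fused.mean(dim=1)))

if __name__ == "__main__":
    model = MultiModalFusion()
    ecg = torch.randn(2, 1, 256)              # 2 patients, 256 waveform samples
    vitals = torch.randn(2, 12, 6)            # 12 hourly rows of 6 vital signs
    events = torch.randint(0, 100, (2, 5))    # 5 discrete event codes per patient
    print(model(ecg, vitals, events).shape)   # torch.Size([2, 1])

The design point illustrated here is that each modality keeps its own tokenization (downsampled waveform patches, one token per charted row, one token per event code), so attention can relate observations across modalities without forcing everything onto a single binning grid.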
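
The same dissertation's fourth problem, timely serving under resource constraints, is handled with cost-aware pipelines that cascade across differently sized models. The sketch below shows a generic confidence-thresholded cascade: a cheap model answers easy cases, and a larger model is consulted only when the cheap model is uncertain. The stand-in models, costs, and threshold are illustrative placeholders, not the dissertation's pipeline.

def cascade_predict(x, models, threshold=0.8):
    """models: list of (predict_proba, cost) pairs ordered cheapest-first.
    Returns (predicted probability, total serving cost spent)."""
    prob, spent = 0.5, 0.0
    for predict_proba, cost in models:
        prob = predict_proba(x)
        spent += cost
        confidence = max(prob, 1.0 - prob)   # distance from the decision boundary
        if confidence >= threshold:          # confident enough: stop escalating
            break
    return prob, spent

if __name__ == "__main__":
    # Hypothetical stand-ins for a small, coarse model and a large, accurate one.
    small_model = (lambda x: 0.55 if x["hr"] > 100 else 0.05, 1.0)
    large_model = (lambda x: 0.91 if x["hr"] > 100 else 0.02, 20.0)
    for patient in ({"hr": 80}, {"hr": 120}):
        p, cost = cascade_predict(patient, [small_model, large_model])
        print(patient, "-> risk %.2f at cost %.1f" % (p, cost))

Only the ambiguous case pays for the expensive model, which is how such a cascade trades a small amount of accuracy for a large reduction in average serving cost.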