Organizational Unit:
School of Computer Science


Publication Search Results


A First Look at Autonomous Systems Recurrently Causing BGP Origination Conflicts

2024-05-06 , Bemba, Olivier

The Border Gateway Protocol (BGP) plays a key role in the Internet as it provides the path for packets to travel between independent networks (Autonomous Systems) on the Internet. However, it also allows multiple networks to announce reachability for the same prefix, which makes it vulnerable to attacks and misconfigurations that modify Internet traffic. This is known as an origin conflict. According to the Global Routing Intelligence Platform (GRIP), software that detects this kind of event, less than 1% of all Internet networks are responsible for almost 40% of the most suspicious origin conflicts detected between January 1, 2020, and January 1, 2023. Therefore, it is important to understand whether these networks are causing these conflicts for malicious purposes, whether it is a matter of new routing habits, or whether GRIP is simply misclassifying them. As a first step, we leverage GRIP to isolate autonomous systems (ASes) that have been involved in at least one origin conflict on 50 different days. Then, for all these ASes, we collect data about their organization, their location, their type, and their GRIP events. In parallel, we also retrieve routing data using the RIPE Stat API to provide more context. Finally, we combine these data and analyze them to find indicators of malicious activity, configuration error, or legitimate behavior. Thanks to this first look at GRIP data from an AS perspective, we were able to observe some previously documented legitimate behavior, such as the use of private ASNs or Internet exchange point prefixes. These two use cases need to be added to the GRIP classification system. Next, we also observed several business relationships, such as hosting providers with their customers, IP space lessors with their lessees, and DDoS mitigation providers with their customers. These business relationships need to be studied in greater depth to better characterize and detect them.
We also present two use cases: a global mobile operator that is known to make frequent configuration errors, and a hosting provider that is known to provide services to malicious customers. This study is a first building block towards finding new insights into the routing habits of Internet operators and thus contributes to improving the global routing monitoring system.
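The first filtering step in the abstract above, keeping only ASes involved in at least one origin conflict on 50 different days, can be sketched roughly as follows. The event layout (pairs of AS number and date) and the function name `recurrent_ases` are assumptions made for illustration; they do not reflect GRIP's actual data format or API.

```python
from collections import defaultdict
from datetime import date


def recurrent_ases(events, min_days=50):
    """Return ASNs seen in origin conflicts on at least `min_days` distinct days.

    `events` is an iterable of (asn, day) pairs. This record layout is
    hypothetical and chosen only to keep the sketch self-contained.
    """
    days_by_asn = defaultdict(set)
    for asn, day in events:
        # A set collapses several events on the same day into one entry.
        days_by_asn[asn].add(day)
    return {asn for asn, days in days_by_asn.items() if len(days) >= min_days}
```

Counting distinct days rather than raw events keeps a single noisy day from making an AS look recurrent.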


Impressions: Understanding Visual Semiotics and Aesthetic Impact

2024-04-29 , Kruk, Julia

Is aesthetic impact different from beauty? Is visual salience a reflection of its capacity for effective communication? We present Impressions, a novel dataset through which to investigate the semiotics of images, and how specific visual features and design choices can elicit specific emotions, thoughts and beliefs. We posit that the impactfulness of an image extends beyond formal definitions of aesthetics, to its success as a communicative act, where style contributes as much to meaning formation as the subject matter. However, prior image captioning datasets are not designed to empower state-of-the-art architectures to model potential human impressions or interpretations of images. To fill this gap, we design an annotation task heavily inspired by image analysis techniques in the Visual Arts to collect 1,440 image-caption pairs and 4,320 unique annotations exploring impact, pragmatic image description, impressions, and aesthetic design choices. We show that existing multimodal image captioning and conditional generation models struggle to simulate plausible human responses to images. However, this dataset significantly improves their ability to model impressions and aesthetic evaluations of images through fine-tuning and few-shot adaptation.


Language Models: Generator and Labeler

2023-05-02 , Rungta, Mukund

Over the last few years, there has been remarkable progress in the capabilities of language models. These models have been trained on massive amounts of data using advanced learning algorithms, enabling them to perform a wide range of Natural Language Processing (NLP) tasks with great accuracy. This has made them highly reliable and robust, with state-of-the-art or comparable performance on various NLP benchmarks. In this thesis, I explore two uncharted territories of using large language models: hierarchical text classification and generating training data without human supervision. The proposed approach and design choices for both tasks demonstrate superior performance over different baselines. To support these claims, I examine different components of the model and analyze their contributions to the overall improvements. Although there are several limitations to using language models directly for generating training data, such as ensuring label accuracy and preserving dataset diversity, this work can inspire further research on exploiting dataset-generation-based zero-shot learning using large pre-trained language models.


Flexible access control for campus and enterprise networks

2010-04-07 , Nayak, Ankur Kumar

We consider the problem of designing enterprise network security systems which are easy to manage, robust and flexible. This problem is challenging. Today, most approaches rely on host security, middleboxes, and complex interactions between many protocols. To solve this problem, we explore how new programmable networking paradigms can facilitate fine-grained network control. We present Resonance, a system for securing enterprise networks, where the network elements themselves enforce dynamic access control policies through state changes based on both flow-level information and real-time alerts. Resonance uses programmable switches to manipulate traffic at lower layers; these switches take actions (e.g., dropping or redirecting traffic) to enforce high-level security policies based on input from both higher-level security boxes and distributed monitoring and inference systems. Using our approach, administrators can create security applications by first identifying a state machine to represent different policy changes and then translating these states into actual network policies. Earlier approaches in this direction (e.g., Ethane, Sane) have remained low-level, requiring policies to be written in languages which are too detailed and are difficult for regular users and administrators to comprehend. As a result, significant effort is needed to package policies, events and network devices into a high-level application. Resonance abstracts out all the details through its state-machine-based policy specification framework and presents security functions which are close to the end system and hence more tractable. To demonstrate how well Resonance can be applied to existing systems, we consider two use cases. The first relates to the "Network Admission Control" problem. Georgia Tech dormitories currently use a system called START (Scanning Technology for Automated Registration, Repair, and Response Tasks) to authenticate and secure new hosts entering the network [23].
START uses a VLAN-based approach to isolate new hosts from authenticated hosts, along with a series of network device interactions. VLANs are notoriously difficult to use, requiring much hand-holding and manual configuration. Our interactions with the dorm network administrators have revealed that this existing system is not only difficult to manage and scale but also inflexible, allowing only coarse-grained access control. We implemented START by expressing its functions in the Resonance framework. The current system is deployed across three buildings at Georgia Tech with both wired and wireless connectivity. We present an evaluation of our system's scalability and performance. We consider dynamic rate limiting as the second use case for Resonance. We show how a network policy that relies on rate limiting and traffic shaping can easily be implemented using only a few state transitions. We plan to expand our deployment to more users and buildings and support more complex policies as an extension to our ongoing work. The main contributions of this thesis include the design and implementation of a flexible access control model, evaluation studies of our system's scalability and performance, and a campus-wide testbed setup with a working version of Resonance running. Our preliminary evaluations suggest that Resonance is scalable and can potentially be deployed in production networks. Our work can provide a good platform for more advanced and powerful security techniques for enterprise networks.
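The idea of expressing an access control policy as a state machine, where each host state maps to a switch action such as dropping or redirecting traffic, might look like the minimal sketch below. The state names, event names, and action strings are hypothetical placeholders; the abstract does not list them, and this is not Resonance's actual policy language.

```python
# Hypothetical states, events, and actions, chosen only for illustration.
TRANSITIONS = {
    ("registration", "auth_success"): "scanning",
    ("scanning", "scan_clean"): "operation",
    ("scanning", "scan_infected"): "quarantined",
    ("operation", "alert"): "quarantined",
    ("quarantined", "scan_clean"): "operation",
}

# Action a switch would apply to a host's traffic in each state.
ACTIONS = {
    "registration": "redirect_to_portal",
    "scanning": "redirect_to_scanner",
    "operation": "forward",
    "quarantined": "drop",
}


class HostPolicy:
    """Tracks one host's state and reports the action for its traffic."""

    def __init__(self):
        self.state = "registration"

    def on_event(self, event):
        # Events with no defined transition leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return ACTIONS[self.state]
```

In this sketch a new policy is a change to the two tables rather than to per-device configuration, which is the kind of abstraction the abstract argues for.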


Multiagent debate among vision-language models improves multimodal reasoning

2024-04-29 , Murugappan, Ganesh Meyyappan

We propose a framework for improving the multimodal reasoning capabilities of vision-language models through multiagent debate, where multiple models engage in a structured debate process, taking opposing perspectives and exchanging arguments about a given multimodal input containing text and images. Through this iterative debate, the models can complement each other's strengths, surface relevant evidence across modalities, and arrive at more robust and well-reasoned conclusions compared to using a single model. Evaluated on the ScienceQA dataset, models involved in a debate significantly outperformed their individual baselines, with prompting strategies resulting in further improvement. The debate process allows models to identify flaws, provide additional evidence, and negotiate stronger final answers by combining diverse skills, highlighting the potential of constructive disagreement and debate for overcoming limitations in current multimodal AI systems.
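A rough sketch of an iterative debate loop of the kind described above is given here. Each agent is modeled as a plain callable that receives the question and its peers' latest answers; a real system would wrap a vision-language model and pass the image as well. The function name `debate`, the agent signature, and the final majority vote are assumptions for illustration, since the abstract does not specify how the final answer is selected.

```python
from collections import Counter


def debate(agents, question, rounds=2):
    """Run a simple multi-round debate and return the majority answer.

    Each agent is a callable `agent(question, peer_answers) -> answer`.
    This interface is hypothetical.
    """
    # Round 0: every agent answers independently.
    answers = [agent(question, []) for agent in agents]
    for _ in range(rounds):
        # Each agent revises its answer after seeing the other agents'
        # answers from the previous round.
        answers = [
            agent(question, answers[:i] + answers[i + 1:])
            for i, agent in enumerate(agents)
        ]
    # Aggregation by majority vote is an assumption of this sketch.
    return Counter(answers).most_common(1)[0][0]
```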


Integrating LASCO Corona Images into Spatio-Temporal Attention Model for Improved Solar Wind Speed Prediction During ICMEs

2024-04-29 , Perlman, Zachary Adam

This thesis explores enhancing machine learning-based solar wind speed prediction models by incorporating C2 images from the Large Angle and Spectrometric Coronagraph (LASCO). Solar wind predictions are critical as they significantly impact Earth, disrupting everything from power grids to communication systems. Traditional prediction models have used images from the Atmospheric Imaging Assembly (AIA) and historical wind data. This research introduces integrating an additional data source—LASCO C2 images—which provide a unique perspective of the solar corona and highlight solar anomalies like interplanetary coronal mass ejections (ICMEs). To accomplish this, we augment a baseline multimodal prediction model (which includes convolutional, recurrent, and attentional components) with a new processing branch for LASCO C2 images, which parallels the existing branch for AIA images. This integration aims to leverage the distinct characteristics of LASCO images to boost the predictive capabilities of the model. The performance of this modified model was evaluated against the baseline through the root mean square error of the predicted solar wind speeds during identified ICME events from 2011 to 2016. Results indicate that the LASCO-enhanced model achieves a noticeably lower error compared to the baseline model during ICME-dominated periods. This improvement highlights the model's elevated capability during these critical times and represents a promising step forward in improving the accuracy of predictions during the most impactful solar events, potentially improving how we prepare for and mitigate the hazardous effects of space weather.
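The evaluation described above, root mean square error of predicted solar wind speed restricted to ICME periods, could be computed along the following lines. The function name `rmse_in_windows` and the representation of ICME events as (start, end) time pairs are assumptions for illustration; the thesis's actual evaluation code is not shown in the abstract.

```python
import math


def rmse_in_windows(times, predicted, observed, windows):
    """RMSE over samples whose timestamp falls inside any (start, end) window.

    `windows` stands in for a list of identified ICME intervals; the
    layout is hypothetical.
    """
    squared_errors = [
        (p - o) ** 2
        for t, p, o in zip(times, predicted, observed)
        if any(start <= t <= end for start, end in windows)
    ]
    if not squared_errors:
        raise ValueError("no samples fall inside the given windows")
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Running this once for the baseline predictions and once for the LASCO-augmented predictions, with the same windows, would give the two numbers being compared.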


Effective qubit mapping, routing and scheduling for trapped-ion quantum computers

2023-05-02 , Gupta, Vima

Trapped-Ion Linear Tape (TILT) architectures offer a scalable way to realize ion-trapped quantum computers through tape-based shuttling and routing operations. Modulo a cost model for tape movement and gate application, the quality of qubit mapping and routing (QMR) targeting TILT architectures has a tangible impact on circuit fidelity. State-of-the-art QMR techniques either don’t account for the cost of tape movement or rely on heuristic-based approaches. To address the shortcomings of existing QMR techniques, this thesis introduces and evaluates MALT, a comprehensive extension of a MaxSAT-based QMR technique by Molavi et al. MALT generates efficient swap insertion and tape movement schedules for shuttling-based architectures, beating the state-of-the-art (SOTA) heuristic for all configurations and benchmarks. The thesis also addresses the issues in scaling a pure constraint-based approach and showcases the optimizations made to scale. This technique incorporates noise-awareness when choosing between adding a swap gate and moving the tape, allowing users to customize it for their noise model.


Improving Real-world Aerial Scene Understanding With a Synthetic Dataset

2024-04-29 , Khose, Sahil Santosh

Real-world aerial scene understanding is limited by a lack of datasets that contain densely annotated images curated under a diverse set of conditions. Due to inherent challenges in obtaining such images in controlled real-world settings, this thesis introduces SkyScenes, a synthetic dataset of densely annotated aerial images captured from Unmanned Aerial Vehicle (UAV) perspectives. We carefully curate SkyScenes images from CARLA to comprehensively capture diversity across layouts (urban and rural maps), weather conditions, times of day, pitch angles and altitudes with corresponding semantic, instance and depth annotations. Through experiments using SkyScenes, this thesis demonstrates that (1) models trained on SkyScenes generalize well to different real-world scenarios, (2) augmenting training on real images with SkyScenes data can improve real-world performance, (3) controlled variations in SkyScenes can offer insights into how models respond to changes in viewpoint conditions, and (4) incorporating additional sensor modalities (depth) can improve aerial scene understanding.


Domain generalization in vision models for aerial imagery datasets

2023-05-04 , Agarwal, Aayushi

In this work, we present SKYSCAPES, a large-scale, densely annotated, high-resolution synthetic dataset of aerial images captured from an oblique UAV perspective, for two aerial scene understanding tasks: semantic segmentation and object detection. SKYSCAPES has been designed with several important desiderata specific to aerial images in mind, including scale, diversity, and class representation. Experiments conducted on SKYSCAPES demonstrate that (1) SKYSCAPES coupled with a real dataset can serve as a syn-to-real generalization benchmark, (2) SKYSCAPES synthetic data can augment real data for better generalization, and (3) intra-source weather and daytime variations in SKYSCAPES can be used to systematically study the robustness of aerial vision models. We hope our dataset and subsequent experiments enable the development of improved visual scene understanding models for aerial viewpoint images.


Robust clustering algorithms

2011-04-05 , Gupta, Pramod

One of the most widely used techniques for data clustering is agglomerative clustering. Such algorithms have long been used across many different fields, ranging from computational biology to social sciences to computer vision, in part because they are simple and their output is easy to interpret. However, many of these algorithms lack any performance guarantees when the data is noisy, incomplete or has outliers, which is the case for most real-world data. It is well known that standard linkage algorithms perform extremely poorly in the presence of noise. In this work we propose two new robust algorithms for bottom-up agglomerative clustering and give formal theoretical guarantees for their robustness. We show that our algorithms can be used to cluster accurately in cases where the data satisfies a number of natural properties and where the traditional agglomerative algorithms fail. We also extend our algorithms to an inductive setting with similar guarantees, in which we randomly choose a small subset of points from a much larger instance space, generate a hierarchy over this sample, and then insert the rest of the points into it to generate a hierarchy over the entire instance space. We then perform a systematic experimental analysis of various linkage algorithms, compare their performance on a variety of real-world data sets, and show that our algorithms handle various forms of noise much better than other hierarchical algorithms.
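For context, the traditional bottom-up procedure that the abstract contrasts against can be sketched as plain single-linkage clustering. This is the standard baseline, not either of the robust algorithms proposed in the thesis, and the function name `single_linkage` and the stopping rule (merge until `k` clusters remain) are choices made for this illustration.

```python
import math


def single_linkage(points, k):
    """Naive single-linkage agglomerative clustering down to k clusters.

    `points` is a list of coordinate tuples. Cluster distance is the
    minimum pairwise distance between members.
    """
    clusters = [[p] for p in points]  # start with one cluster per point
    while len(clusters) > k:
        best = None  # (distance, i, j) of the closest pair found so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    math.dist(a, b)
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair; j > i, so deleting j keeps i valid.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Because a merge depends on the single closest pair of points, one stray point lying between two groups can chain them together, which illustrates the sensitivity to noise that the abstract describes.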