Series
ML@GT Seminar Series

Series Type
Event Series

Publication Search Results

Now showing 1 - 10 of 20
  • Item
    ML@GT Lab presents LAB LIGHTNING TALKS 2020
    ( 2020-12-04) AlRegib, Ghassan ; Chau, Duen Horng ; Chava, Sudheer ; Cohen, Morris B. ; Davenport, Mark A. ; Desai, Deven ; Dovrolis, Constantine ; Essa, Irfan ; Gupta, Swati ; Huo, Xiaoming ; Kira, Zsolt ; Li, Jing ; Maguluri, Siva Theja ; Pananjady, Ashwin ; Prakash, B. Aditya ; Riedl, Mark O. ; Romberg, Justin ; Xie, Yao ; Zhang, Xiuwei
    Labs affiliated with the Machine Learning Center at Georgia Tech (ML@GT) will have the opportunity to share their research interests, work, and unique aspects of their lab in three minutes or less with interested graduate students, Georgia Tech faculty, and members of the public. Participating labs include: Yao’s Group – Yao Xie, H. Milton Stewart School of Industrial and Systems Engineering (ISyE); Huo Lab – Xiaoming Huo, ISyE; LF Radio Lab – Morris Cohen, School of Electrical and Computer Engineering (ECE); Polo Club of Data Science – Polo Chau, CSE; Network Science – Constantine Dovrolis, School of Computer Science; CLAWS – Srijan Kumar, CSE; Control, Optimization, Algorithms, and Randomness (COAR) Lab – Siva Theja Maguluri, ISyE; Entertainment Intelligence Lab and Human Centered AI Lab – Mark Riedl, IC; Social and Language Technologies (SALT) Lab – Diyi Yang, IC; FATHOM Research Group – Swati Gupta, ISyE; Zhang's CompBio Lab – Xiuwei Zhang, CSE; Statistical Machine Learning – Ashwin Pananjady, ISyE and ECE; AdityaLab – B. Aditya Prakash, CSE; OLIVES – Ghassan AlRegib, ECE; Robotics Perception and Learning (RIPL) – Zsolt Kira, IC; Eye-Team – Irfan Essa, IC; and Mark Davenport, ECE.
  • Item
    Bringing Visual Memories to Life
    ( 2020-12-02) Huang, Jia-Bin
    Photography allows us to capture and share memorable moments of our lives. However, 2D images appear flat due to the lack of depth perception and may suffer from poor imaging conditions such as taking photos through reflecting or occluding elements. In this talk, I will present our recent efforts to overcome these limitations. Specifically, I will cover our recent work on creating compelling 3D photography, removing unwanted obstructions seamlessly from images or videos, and estimating consistent video depth for advanced video-based visual effects. I will conclude the talk with some ongoing research and research challenges ahead.
  • Item
    Let’s Talk about Bias and Diversity in Data, Software, and Institutions
    ( 2020-11-20) Deng, Tiffany ; Desai, Deven ; Gontijo Lopes, Raphael ; Isbell, Charles L.
    Bias and lack of diversity have long been deep-rooted problems across industries. We discuss how these issues impact data, software, and institutions, and how we can improve moving forward. The panel will feature thought leaders from Google, Georgia Tech, and Queer in AI, who will together answer questions like "What implications and problems exist or will exist if the tech workforce does not become more diverse?" and "How does anyone make sure they are not introducing their bias into a given system? What questions should we be asking or actions should we be taking to avoid this?"
  • Item
    Towards High Precision Text Generation
    ( 2020-11-11) Parikh, Ankur
    Despite large advances in neural text generation in terms of fluency, existing generation techniques are prone to hallucination and often produce output that is unfaithful or irrelevant to the source text. In this talk, we take a multi-faceted approach to this problem from 3 aspects: data, evaluation, and modeling. From the data standpoint, we propose ToTTo, a table-to-text dataset with high-quality, annotator-revised references that we hope can serve as a benchmark for the high-precision text generation task. While the dataset is challenging, existing n-gram based evaluation metrics are often insufficient to detect hallucinations. To this end, we propose BLEURT, a fully learnt end-to-end metric based on transfer learning that can quickly adapt to measure specific evaluation criteria. Finally, we propose a model based on confidence decoding to mitigate hallucinations. (A toy sketch of a learned evaluation metric in this spirit follows the listing below.)
  • Item
    Applying Emerging Technologies In Service of Journalism at The New York Times
    ( 2020-10-30) Boonyapanachoti, Woraya (Mint) ; Dellaert, Frank ; Essa, Irfan ; Fleisher, Or ; Kanazawa, Angjoo ; Lavallee, Marc ; McKeague, Mark ; Porter, Lana Z.
    Emerging technologies, particularly within computer vision, photogrammetry, and spatial computing, are unlocking new forms of storytelling for journalists to help people understand the world around them. In this talk, members of the R&D team at The New York Times talk about their process for researching and developing new capabilities built atop emerging research. In particular, hear how they are embracing photogrammetry and spatial computing to create new storytelling techniques that allow a reader to experience an event as close to reality as possible. Learn about the process of collecting photos, generating 3D models, editing, and the technologies used to scale up to millions of readers. The team will also share their vision for these technologies in journalism, their ethical considerations along the way, and a research wishlist that would accelerate their work. In its 169-year history, The New York Times has evolved with new technologies, publishing its first photo in 1896 with the rise of cameras, introducing the world’s first computerized news retrieval system in 1972 with the rise of the computer, and launching a website in 1996 with the rise of the internet. Since then, the pace of innovation has accelerated alongside the rise of smartphones, cellular networks, and other new technologies. The Times now has the world’s most popular daily podcast, a new weekly video series, and award-winning interactive graphics storytelling. Join us for a discussion about how our embrace of emerging technologies is helping us push the boundaries of journalism in 2020 and beyond.
  • Item
    Reasoning about Complex Media from Weak Multi-modal Supervision
    ( 2020-10-28) Kovashka, Adriana
    In a world of abundant information targeting multiple senses, and increasingly powerful media, we need new mechanisms to model content. Techniques for representing individual channels, such as visual data or textual data, have greatly improved, and some techniques exist to model the relationship between channels that are “mirror images” of each other and contain the same semantics. However, multimodal data in the real world contains little redundancy; the visual and textual channels complement each other. We examine the relationship between multiple channels in complex media, in two domains: advertisements and political articles. First, we collect a large dataset of advertisements and public service announcements, covering almost forty topics (ranging from automobiles and clothing to health and domestic violence). We pose decoding the ads as automatically answering the questions “What should the viewer do, according to the ad?” (the suggested action), and “Why should the viewer do the suggested action, according to the ad?” (the suggested reason). We train a variety of algorithms to choose the appropriate action-reason statement, given the ad image and potentially a slogan embedded in it. The task is challenging because of the great diversity in how different users annotate an ad, even if they draw similar conclusions. One approach mines information from external knowledge bases, but there is a plethora of information that can be retrieved yet is not relevant. We show how to automatically transform the training data in order to focus our approach’s attention on relevant facts, without relevance annotations for training. We also present an approach for learning to recognize new concepts given supervision only in the form of noisy captions. Second, we collect a dataset of multimodal political articles containing lengthy text and a small number of images. We learn to predict the political bias of the article, as well as perform cross-modal retrieval despite large visual variability for the same topic. To infer political bias, we use generative modeling to show how the face of the same politician appears differently at each end of the political spectrum. To understand how image and text contribute to persuasion and bias, we learn to retrieve sentences for a given image, and vice versa. The task is challenging because, unlike image-text pairs in captioning, the images and text in political articles overlap only in a very abstract sense. We impose a loss requiring images that correspond to similar text to lie close together in a projection space, even if they appear very diverse purely visually. We show that our loss significantly improves performance in conjunction with a variety of existing recent losses. We also propose new weighting mechanisms to prioritize abstract image-text relationships during training. (A minimal sketch of such a cross-modal projection loss follows the listing below.)
  • Item
    Active Learning: From Linear Classifiers to Overparameterized Neural Networks
    ( 2020-10-07) Nowak, Robert
    The field of Machine Learning (ML) has advanced considerably in recent years, but mostly in well-defined domains using huge amounts of human-labeled training data. Machines can recognize objects in images and translate text, but they must be trained with more images and text than a person can see in nearly a lifetime. The computational complexity of training has been offset by recent technological advances, but the cost of training data is measured in terms of the human effort in labeling data. People are getting neither faster nor cheaper, so generating labeled training datasets has become a major bottleneck in ML pipelines. Active ML aims to address this issue by designing learning algorithms that automatically and adaptively select the most informative examples for labeling so that human time is not wasted labeling irrelevant, redundant, or trivial examples. This talk explores the development of active ML theory and methods over the past decade, including a new approach applicable to kernel methods and neural networks, which views the learning problem through the lens of representer theorems. This perspective highlights the effect that adding a given training example has on the representation. The new approach is shown to possess a variety of desirable mathematical properties that allow active learning algorithms to learn good classifiers from few labeled examples. (A minimal margin-based active learning sketch follows the listing below.)
  • Item
    Using rationales and influential training examples to (attempt to) explain neural predictions in NLP
    ( 2020-09-09) Wallace, Byron
    Modern deep learning models for natural language processing (NLP) achieve state-of-the-art predictive performance but are notoriously opaque. I will discuss recent work looking to address this limitation. I will focus specifically on approaches to: (i) providing snippets of text (sometimes called "rationales") that support predictions; and (ii) identifying examples from the training data that influenced a given model output. (A toy gradient-saliency sketch of the rationale idea follows the listing below.)
  • Item
    Global Optimality Guarantees for Policy Gradient Methods
    ( 2020-03-11) Russo, Daniel
    Policy gradient methods are perhaps the most widely used class of reinforcement learning algorithms. These methods apply to complex, poorly understood control problems by performing stochastic gradient descent over a parameterized class of policies. Unfortunately, due to the multi-period nature of the objective, policy gradient algorithms face non-convex optimization problems and can get stuck in suboptimal local minima even for extremely simple problems. This talk will discuss structural properties – shared by several canonical control problems – that guarantee the policy gradient objective function has no suboptimal stationary points despite being non-convex. Time permitting, I’ll also discuss (1) convergence rates that follow as a consequence of this theory and (2) consequences of this theory for policy gradient methods with highly expressive policy classes. (A minimal REINFORCE sketch follows the listing below.) * This talk is based on ongoing joint work with Jalaj Bhandari.
  • Item
    Solving the Flickering Problem in Modern Convolutional Neural Networks
    ( 2020-02-12) Sundaramoorthi, Ganesh
    Deep Learning has revolutionized the AI field. Despite this, much progress is needed to deploy deep learning in safety-critical applications (such as autonomous aircraft). This is because current deep learning systems are not robust to real-world nuisances (e.g., viewpoint, illumination, partial occlusion). In this talk, we take a step toward constructing robust deep learning systems by addressing the problem that state-of-the-art Convolutional Neural Network (CNN) classifiers and detectors are vulnerable to small perturbations, including shifts of the image or camera. While various forms of specially engineered “adversarial perturbations” that fool deep learning systems have been well documented, modern CNNs can, surprisingly, change their classification probability by up to 30% even for simple one-pixel shifts of the image. This lack of translational stability seems to be partially the cause of “flickering” in state-of-the-art object detectors applied to video. In this talk, we introduce this phenomenon, propose a solution, prove it analytically, validate it empirically, and explain why existing CNNs exhibit this phenomenon. (A short shift-sensitivity probe follows the listing below.)
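
Illustrative Code Sketches

For "Towards High Precision Text Generation": a toy sketch of a learned evaluation metric in the spirit of BLEURT, where a small regressor is trained on (reference, candidate, human score) triples and then used to score new generations. The hand-crafted features, tiny training set, and hyperparameters are placeholders chosen for brevity; an actual learned metric such as BLEURT fine-tunes a pretrained text encoder instead.

```python
# Toy learned evaluation metric: train a regressor on human quality judgements,
# then score new (reference, candidate) pairs. Features are deliberately crude.
import torch
import torch.nn as nn


def pair_features(reference: str, candidate: str) -> torch.Tensor:
    """Cheap stand-in features: token overlap, length ratio, exact-match flag."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    overlap = len(set(ref) & set(cand)) / max(len(set(ref)), 1)
    length_ratio = len(cand) / max(len(ref), 1)
    exact = float(reference.strip() == candidate.strip())
    return torch.tensor([overlap, length_ratio, exact], dtype=torch.float32)


class LearnedMetric(nn.Module):
    def __init__(self, in_dim: int = 3, hidden: int = 16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.net(feats).squeeze(-1)


# Tiny illustrative training set of human quality judgements in [0, 1].
train = [
    ("the cat sat on the mat", "the cat sat on the mat", 1.0),
    ("the cat sat on the mat", "a cat is sitting on a mat", 0.8),
    ("the cat sat on the mat", "the stock market fell sharply", 0.0),
]

model = LearnedMetric()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

for _ in range(200):  # a few passes over the toy data
    feats = torch.stack([pair_features(r, c) for r, c, _ in train])
    targets = torch.tensor([s for _, _, s in train])
    opt.zero_grad()
    loss = loss_fn(model(feats), targets)
    loss.backward()
    opt.step()

# Score a new (reference, candidate) pair with the learned metric.
with torch.no_grad():
    score = model(pair_features("the cat sat on the mat", "a cat sat on a mat"))
print(f"learned-metric score: {score.item():.3f}")
```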
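
For "Reasoning about Complex Media from Weak Multi-modal Supervision": a minimal sketch of a cross-modal projection loss that pulls images toward the text they correspond to and away from mismatched text. Random tensors stand in for image and sentence features, and the dimensions and margin are assumptions, not the authors' settings.

```python
# Triplet-style cross-modal loss: matched image/text pairs should be closer in
# the joint space than mismatched pairs, by at least a margin.
import torch
import torch.nn as nn
import torch.nn.functional as F

img_dim, txt_dim, joint_dim, batch = 2048, 768, 256, 8

image_proj = nn.Linear(img_dim, joint_dim)  # e.g. on top of CNN features
text_proj = nn.Linear(txt_dim, joint_dim)   # e.g. on top of sentence embeddings

# Stand-ins for precomputed image features and features of their paired sentences.
image_feats = torch.randn(batch, img_dim)
text_feats = torch.randn(batch, txt_dim)

img_z = F.normalize(image_proj(image_feats), dim=-1)
txt_z = F.normalize(text_proj(text_feats), dim=-1)

# Positive pairs share an index; negatives are obtained by rolling the batch.
pos = (img_z * txt_z).sum(dim=-1)                         # cosine similarity of matches
neg = (img_z * txt_z.roll(shifts=1, dims=0)).sum(dim=-1)  # mismatched pairs

margin = 0.2
loss = F.relu(margin - pos + neg).mean()  # hinge: matches should beat mismatches by a margin
loss.backward()
print(f"triplet-style cross-modal loss: {loss.item():.3f}")
```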
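
For "Active Learning: From Linear Classifiers to Overparameterized Neural Networks": a minimal margin-based active learning loop with a linear classifier, which queries the unlabeled point closest to the current decision boundary. This illustrates the generic idea of adaptively selecting informative examples; it is not the representer-theorem approach from the talk, and the synthetic data is illustrative.

```python
# Margin-based (uncertainty) active learning with logistic regression on
# synthetic 2D data: each round queries the point with the smallest |w^T x + b|.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic binary classification data: two Gaussian blobs.
X = np.vstack([rng.normal(-1.0, 1.0, size=(200, 2)),
               rng.normal(+1.0, 1.0, size=(200, 2))])
y = np.array([0] * 200 + [1] * 200)

labeled = [0, 1, 200, 201]  # seed set with two examples from each class
unlabeled = [i for i in range(len(X)) if i not in labeled]

for round_ in range(20):
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    # Query the unlabeled point with the smallest margin to the boundary.
    margins = np.abs(clf.decision_function(X[unlabeled]))
    query = unlabeled[int(np.argmin(margins))]
    labeled.append(query)   # the "oracle" reveals y[query]
    unlabeled.remove(query)

print(f"accuracy after {len(labeled)} labels:",
      LogisticRegression().fit(X[labeled], y[labeled]).score(X, y))
```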
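
For "Using rationales and influential training examples to (attempt to) explain neural predictions in NLP": a toy gradient-saliency sketch that scores tokens by the gradient of the predicted logit with respect to their embeddings and returns the top-scoring tokens as a rationale-like snippet. The model and vocabulary are placeholders, and gradient saliency is only one simple way to extract such snippets, not the talk's specific method.

```python
# Gradient-based saliency "rationale": rank tokens by the gradient norm of the
# predicted class logit with respect to their embeddings.
import torch
import torch.nn as nn

vocab = {"the": 0, "movie": 1, "was": 2, "wonderful": 3, "terrible": 4}
embed = nn.Embedding(len(vocab), 8)
classifier = nn.Linear(8, 2)  # mean-pooled bag-of-embeddings -> 2 classes

tokens = ["the", "movie", "was", "wonderful"]
ids = torch.tensor([vocab[t] for t in tokens])

emb = embed(ids)        # (seq_len, 8)
emb.retain_grad()       # keep gradients on the embedding activations
logits = classifier(emb.mean(dim=0))
pred = logits.argmax().item()
logits[pred].backward()  # gradient of the predicted logit

# Saliency per token: L2 norm of the gradient w.r.t. its embedding.
saliency = emb.grad.norm(dim=-1)
top = saliency.argsort(descending=True)[:2]
print("rationale tokens:", [tokens[int(i)] for i in top])
```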
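
For "Global Optimality Guarantees for Policy Gradient Methods": a minimal REINFORCE sketch that performs stochastic gradient ascent on the expected return of a tabular softmax policy in a tiny chain MDP. The environment and hyperparameters are assumptions chosen for brevity; the talk's structural analysis is not reproduced here.

```python
# REINFORCE on a 5-state chain: actions move left/right, reward 1 for being at
# the rightmost state. The policy is a tabular softmax over per-state logits.
import torch

n_states, n_actions, horizon = 5, 2, 10
logits = torch.zeros(n_states, n_actions, requires_grad=True)  # tabular softmax policy
opt = torch.optim.Adam([logits], lr=0.1)


def step(state, action):
    """Chain dynamics: action 0 moves left, action 1 moves right."""
    nxt = max(0, state - 1) if action == 0 else min(n_states - 1, state + 1)
    reward = 1.0 if nxt == n_states - 1 else 0.0
    return nxt, reward


for episode in range(300):
    state, log_probs, rewards = 0, [], []
    for _ in range(horizon):
        dist = torch.distributions.Categorical(logits=logits[state])
        action = dist.sample()
        log_probs.append(dist.log_prob(action))
        state, r = step(state, action.item())
        rewards.append(r)
    # Reward-to-go returns for each time step (no discounting for simplicity).
    returns = torch.tensor(rewards).flip(0).cumsum(0).flip(0)
    loss = -(torch.stack(log_probs) * returns).sum()  # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()

print("right-action probabilities per state:",
      torch.softmax(logits, dim=-1)[:, 1].detach().numpy().round(2))
```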
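
For "Solving the Flickering Problem in Modern Convolutional Neural Networks": a short probe of translation instability that compares a CNN's softmax outputs on an image and its one-pixel shift. The small untrained network with strided downsampling is a stand-in; swapping in a pretrained classifier and real images would show the effect on actual predictions.

```python
# Measure how much class probabilities change under a one-pixel horizontal shift.
import torch
import torch.nn as nn
import torch.nn.functional as F

cnn = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10),
)
cnn.eval()

image = torch.randn(1, 3, 64, 64)               # stand-in for a real image
shifted = torch.roll(image, shifts=1, dims=-1)  # one-pixel horizontal shift

with torch.no_grad():
    p_orig = F.softmax(cnn(image), dim=-1)
    p_shift = F.softmax(cnn(shifted), dim=-1)

# Largest change in any class probability induced by the shift.
print(f"max probability change under 1-pixel shift: "
      f"{(p_orig - p_shift).abs().max().item():.4f}")
```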