Person:
Essa, Irfan


Publication Search Results

Now showing 1 - 10 of 58
  • Item
    ML@GT Lab presents LAB LIGHTNING TALKS 2020
    ( 2020-12-04) AlRegib, Ghassan ; Chau, Duen Horng ; Chava, Sudheer ; Cohen, Morris B. ; Davenport, Mark A. ; Desai, Deven ; Dovrolis, Constantine ; Essa, Irfan ; Gupta, Swati ; Huo, Xiaoming ; Kira, Zsolt ; Li, Jing ; Maguluri, Siva Theja ; Pananjady, Ashwin ; Prakash, B. Aditya ; Riedl, Mark O. ; Romberg, Justin ; Xie, Yao ; Zhang, Xiuwei
    Labs affiliated with the Machine Learning Center at Georgia Tech (ML@GT) will have the opportunity to share their research interests, work, and unique aspects of their lab in three minutes or less to interested graduate students, Georgia Tech faculty, and members of the public. Participating labs include: Yao’s Group - Yao Xie, H. Milton Stewart School of Industrial and Systems Engineering (ISyE); Huo Lab - Xiaoming Huo, ISyE; LF Radio Lab – Morris Cohen, School of Electrical and Computer Engineering (ECE); Polo Club of Data Science – Polo Chau, CSE; Network Science – Constantine Dovrolis, School of Computer Science; CLAWS – Srijan Kumar, CSE; Control, Optimization, Algorithms, and Randomness (COAR) Lab – Siva Theja Maguluri, ISyE; Entertainment Intelligence Lab and Human Centered AI Lab – Mark Riedl, IC; Social and Language Technologies (SALT) Lab – Diyi Yang, IC; FATHOM Research Group – Swati Gupta, ISyE; Zhang's CompBio Lab – Xiuwei Zhang, CSE; Statistical Machine Learning - Ashwin Pananjady, ISyE and ECE; AdityaLab - B. Aditya Prakash, CSE; OLIVES - Ghassan AlRegib, ECE; Robotics Perception and Learning (RIPL) – Zsolt Kira, IC; Eye-Team - Irfan Essa, IC; and Mark Davenport, ECE.
  • Item
    Applying Emerging Technologies In Service of Journalism at The New York Times
    ( 2020-10-30) Boonyapanachoti, Woraya (Mint) ; Dellaert, Frank ; Essa, Irfan ; Fleisher, Or ; Kanazawa, Angjoo ; Lavallee, Marc ; McKeague, Mark ; Porter, Lana Z.
    Emerging technologies, particularly within computer vision, photogrammetry, and spatial computing, are unlocking new forms of storytelling for journalists to help people understand the world around them. In this talk, members of the R&D team at The New York Times talk about their process for researching and developing new capabilities built atop emerging research. In particular, hear how they are embracing photogrammetry and spatial computing to create new storytelling techniques that allow a reader to experience an event as close to reality as possible. Learn about the process of collecting photos, generating 3D models, editing, and technologies used to scale up to millions of readers. The team will also share their vision for these technologies and journalism, their ethical considerations along the way, and a research wishlist that would accelerate their work. In its 169-year history, The New York Times has evolved with new technologies, publishing its first photo in 1896 with the rise of cameras, introducing the world’s first computerized news retrieval system in 1972 with the rise of the computer, and launching a website in 1996 with the rise of the internet. Since then, the pace of innovation has accelerated alongside the rise of smartphones, cellular networks, and other new technologies. The Times now has the world’s most popular daily podcast, a new weekly video series, and award-winning interactive graphics storytelling. Join us for a discussion about how our embrace of emerging technologies is helping us push the boundaries of journalism in 2020 and beyond.
  • Item
    The New Machine Learning Center at GA Tech: Plans and Aspirations
    (Georgia Institute of Technology, 2017-03-01) Essa, Irfan
    The Interdisciplinary Research Center (IRC) for Machine Learning at Georgia Tech (ML@GT) was established in Summer 2016 to foster research and academic activities in and around the discipline of Machine Learning. This center aims to create a community that leverages true cross-disciplinarity across all units on campus, establishes a home for the thought leaders in the area of Machine Learning, and creates programs to train the next generation of pioneers. In this talk, I will introduce the center, describe how we got here, attempt to outline the goals of this center, and lay out its foundational, application, and educational thrusts. The primary purpose of this talk is to solicit feedback about these technical thrusts, which will be the areas we hope to focus on in the upcoming years. I will also describe, in brief, the new Ph.D. program that has been proposed and is pending approval. We will discuss upcoming events and plans for the future.
  • Item
    Towards Using Visual Attributes to Infer Image Sentiment Of Social Events
    (Georgia Institute of Technology, 2017) Ahsan, Unaiza ; De Choudhury, Munmun ; Essa, Irfan
    Widespread and pervasive adoption of smartphones has led to instant sharing of photographs that capture events ranging from mundane to life-altering happenings. We propose to capture sentiment information of such social event images leveraging their visual content. Our method extracts an intermediate visual representation of social event images based on the visual attributes that occur in the images going beyond sentiment-specific attributes. We map the top predicted attributes to sentiments and extract the dominant emotion associated with a picture of a social event. Unlike recent approaches, our method generalizes to a variety of social events and even to unseen events, which are not available at training time. We demonstrate the effectiveness of our approach on a challenging social event image dataset and our method outperforms state-of-the-art approaches for classifying complex event images into sentiments.
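    The attribute-to-sentiment step described in this abstract can be sketched as a majority vote over a lookup table. This is a minimal illustration only: the attribute names and the table below are invented for the example, and the paper's actual mapping is derived from data rather than hand-written.

    ```python
    from collections import Counter

    # Hypothetical attribute-to-sentiment lookup; these attribute
    # names are made up for illustration.
    ATTRIBUTE_TO_SENTIMENT = {
        "smiling_faces": "positive",
        "balloons": "positive",
        "debris": "negative",
        "crowd": "neutral",
    }

    def dominant_sentiment(top_attributes):
        """Map each predicted visual attribute to a sentiment and
        return the most frequent one for the image."""
        votes = Counter(ATTRIBUTE_TO_SENTIMENT.get(a, "neutral")
                        for a in top_attributes)
        return votes.most_common(1)[0][0]

    # Two "positive" attributes outvote one "neutral" attribute.
    print(dominant_sentiment(["smiling_faces", "balloons", "crowd"]))
    ```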
  • Item
    Selfie-Presentation in Everyday Life: A Large-scale Characterization of Selfie Contexts on Instagram
    (Georgia Institute of Technology, 2017) Deeb-Swihart, Julia ; Polack, Christopher ; Gilbert, Eric ; Essa, Irfan
    Carefully managing the presentation of self via technology is a core practice on all modern social media platforms. Recently, selfies have emerged as a new, pervasive genre of identity performance. In many ways unique, selfies bring us full-circle to Goffman—blending the online and offline selves together. In this paper, we take an empirical, Goffman-inspired look at the phenomenon of selfies. We report a large-scale, mixed-method analysis of the categories in which selfies appear on Instagram—an online community comprising over 400M people. Applying computer vision and network analysis techniques to 2.5M selfies, we present a typology of emergent selfie categories which represent emphasized identity statements. To the best of our knowledge, this is the first large-scale, empirical research on selfies. We conclude, contrary to common portrayals in the press, that selfies are really quite ordinary: they project identity signals such as wealth, health and physical attractiveness common to many online media, and to offline life.
  • Item
    A Practical Approach for Recognizing Eating Moments With Wrist-Mounted Inertial Sensing
    (Georgia Institute of Technology, 2015) Thomaz, Edison ; Essa, Irfan ; Abowd, Gregory D.
    Recognizing when eating activities take place is one of the key challenges in automated food intake monitoring. Despite progress over the years, most proposed approaches have been largely impractical for everyday usage, requiring multiple on-body sensors or specialized devices such as neck collars for swallow detection. In this paper, we describe the implementation and evaluation of an approach for inferring eating moments based on 3-axis accelerometry collected with a popular off-the-shelf smartwatch. Trained with data collected in a semi-controlled laboratory setting with 20 subjects, our system recognized eating moments in two free-living condition studies (7 participants, 1 day; 1 participant, 31 days), with F-scores of 76.1% (66.7% Precision, 88.8% Recall), and 71.3% (65.2% Precision, 78.6% Recall). This work represents a contribution towards the implementation of a practical, automated system for everyday food intake monitoring, with applicability in areas ranging from health research to food journaling.
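    The F-scores reported above follow from the stated precision and recall via the standard harmonic-mean formula; a quick check in Python (small rounding differences from the published figures are expected, since the printed precision/recall values are themselves rounded):

    ```python
    def f_score(precision, recall):
        """F1 score: harmonic mean of precision and recall."""
        return 2 * precision * recall / (precision + recall)

    # Precision/recall pairs from the abstract.
    study_week = f_score(0.667, 0.888)   # 7 participants, 1 day each
    study_month = f_score(0.652, 0.786)  # 1 participant, 31 days
    print(f"{study_week:.3f} {study_month:.3f}")  # close to 0.761 and 0.713
    ```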
  • Item
    Leveraging Context to Support Automated Food Recognition in Restaurants
    (Georgia Institute of Technology, 2015-01) Bettadapura, Vinay ; Thomaz, Edison ; Parnami, Aman ; Abowd, Gregory D. ; Essa, Irfan
    The pervasiveness of mobile cameras has resulted in a dramatic increase in food photos, which are pictures reflecting what people eat. In this paper, we study how taking pictures of what we eat in restaurants can be used for the purpose of automating food journaling. We propose to leverage the context of where the picture was taken, with additional information about the restaurant, available online, coupled with state-of-the-art computer vision techniques to recognize the food being consumed. To this end, we demonstrate image-based recognition of foods eaten in restaurants by training a classifier with images from restaurants’ online menu databases. We evaluate the performance of our system in unconstrained, real-world settings with food images taken in 10 restaurants across 5 different types of food (American, Indian, Italian, Mexican and Thai).
  • Item
    Predicting Daily Activities From Egocentric Images Using Deep Learning
    (Georgia Institute of Technology, 2015) Castro, Daniel ; Hickson, Steven ; Bettadapura, Vinay ; Thomaz, Edison ; Abowd, Gregory D. ; Christensen, Henrik I. ; Essa, Irfan
    We present a method to analyze images taken from a passive egocentric wearable camera along with the contextual information, such as time and day of week, to learn and predict everyday activities of an individual. We collected a dataset of 40,103 egocentric images over a 6 month period with 19 activity classes and demonstrate the benefit of state-of-the-art deep learning techniques for learning and predicting daily activities. Classification is conducted using a Convolutional Neural Network (CNN) with a classification method we introduce called a late fusion ensemble. This late fusion ensemble incorporates relevant contextual information and increases our classification accuracy. Our technique achieves an overall accuracy of 83.07% in predicting a person's activity across the 19 activity classes. We also demonstrate some promising results from two additional users by fine-tuning the classifier with one day of training data.
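    One simple way to realize a late-fusion ensemble of the kind described above is to combine per-class probabilities from the image CNN with those from a context model. The weighted-average rule and the class counts below are illustrative assumptions, not the paper's exact ensemble:

    ```python
    import numpy as np

    def late_fusion(image_probs, context_probs, w=0.7):
        """Combine per-class probabilities from an image model and a
        context model by weighted averaging (one simple late-fusion
        rule), then renormalize to a valid distribution."""
        fused = w * np.asarray(image_probs) + (1 - w) * np.asarray(context_probs)
        return fused / fused.sum()

    # Toy example with 3 activity classes.
    img = [0.6, 0.3, 0.1]   # CNN softmax over activity classes
    ctx = [0.2, 0.5, 0.3]   # classifier over (time, day-of-week) features
    print(late_fusion(img, ctx))  # class 0 remains the top prediction
    ```

    In practice the fusion weight would be tuned on held-out data; more elaborate schemes learn a classifier over the concatenated model outputs instead of averaging.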
  • Item
    Automated Assessment of Surgical Skills Using Frequency Analysis
    (Georgia Institute of Technology, 2015) Zia, Aneeq ; Sharma, Yachna ; Bettadapura, Vinay ; Sarin, Eric L. ; Clements, Mark A. ; Essa, Irfan
    We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. Video analysis techniques for extracting motion quality via frequency coefficients are introduced. The framework is tested on videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the useful information differentiating between different skill levels of the surgeons. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.
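    The frequency-coefficient idea above can be sketched as computing DFT magnitudes of a 1-D motion time series and keeping the first few as features. This is a minimal sketch; the actual framework's windowing, coefficient selection, and DFT-vs-DCT choices are not reproduced here:

    ```python
    import numpy as np

    def frequency_features(signal, k=8):
        """Return magnitudes of the first k DFT coefficients of a
        1-D motion time series -- one way to summarize motion quality."""
        spectrum = np.abs(np.fft.rfft(signal))
        return spectrum[:k]

    # Smooth vs. jerky synthetic motion: the jerky signal carries extra
    # energy at a high-frequency bin that the low-order features miss,
    # while a smooth motion concentrates energy in low frequencies.
    t = np.linspace(0, 1, 256, endpoint=False)
    smooth = np.sin(2 * np.pi * 2 * t)
    jerky = smooth + 0.5 * np.sin(2 * np.pi * 40 * t)
    print(frequency_features(smooth, 4))
    print(frequency_features(jerky, 4))
    ```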
  • Item
    Inferring Meal Eating Activities in Real World Settings from Ambient Sounds: A Feasibility Study
    (Georgia Institute of Technology, 2015) Thomaz, Edison ; Zhang, Cheng ; Essa, Irfan ; Abowd, Gregory D.
    Dietary self-monitoring has been shown to be an effective method for weight loss, but it remains an onerous task despite recent advances in food journaling systems. Semi-automated food journaling can reduce the effort of logging, but often requires that eating activities be detected automatically. In this work we describe results from a feasibility study conducted in-the-wild where eating activities were inferred from ambient sounds captured with a wrist-mounted device; twenty participants wore the device during one day for an average of 5 hours while performing normal everyday activities. Our system was able to identify meal eating with an F-score of 79.8% in a person-dependent evaluation, and with 86.6% accuracy in a person-independent evaluation. Our approach is intended to be practical, leveraging off-the-shelf devices with audio sensing capabilities in contrast to systems for automated dietary assessment based on specialized sensors.
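    A person-independent evaluation like the one mentioned above is commonly run as a leave-one-participant-out protocol: each participant's data is held out once while the remaining participants' data trains the model. A minimal sketch, assuming integer participant IDs (the abstract does not specify the exact split procedure):

    ```python
    def leave_one_subject_out(subject_ids):
        """Yield (train_subjects, test_subjects) splits for a
        person-independent evaluation: each distinct participant is
        held out once while the rest form the training set."""
        subjects = sorted(set(subject_ids))
        for held_out in subjects:
            train = [s for s in subjects if s != held_out]
            yield train, [held_out]

    # One split per distinct participant, regardless of sample counts.
    splits = list(leave_one_subject_out([1, 2, 3, 1, 2]))
    print(splits)  # [([2, 3], [1]), ([1, 3], [2]), ([1, 2], [3])]
    ```

    A person-dependent evaluation, by contrast, trains and tests within each participant's own data, which typically yields higher scores because the model sees that person's idiosyncrasies.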