Person:

Essa, Irfan

Permanent Link

https://hdl.handle.net/1853/71242

Associated Organization(s)

Organizational Unit

School of Interactive Computing

Publication Search Results

Localization and 3D Reconstruction of Urban Scenes Using GPS

(Georgia Institute of Technology, 2008) Kim, Kihwan ; Summet, Jay ; Starner, Thad ; Ashbrook, Daniel ; Kapade, Mrunal ; Essa, Irfan

Using off-the-shelf Global Positioning System (GPS) units, we reconstruct buildings in 3D by exploiting the reduction in signal-to-noise ratio (SNR) that occurs when buildings obstruct the line of sight between the moving units and the orbiting satellites. We measure the size and height of skyscrapers and automatically construct a density map representing the locations of multiple buildings in an urban landscape. If deployed on a large scale, via a cellular service provider’s GPS-enabled mobile phones or GPS-tracked delivery vehicles, the system could provide an inexpensive means of continuously creating and updating 3D maps of urban environments.
Remixing Authorship: Reconfiguring the Author in Online Video Remix Culture

(Georgia Institute of Technology, 2007) Diakopoulos, Nicholas ; Luther, Kurt ; Medynskiy, Yevgeniy (Eugene) ; Essa, Irfan

In an abstract sense, authorship entails the constrained selection or generation of media and the organization and layout of that media in a larger structure. But authorship is more than just selection and organization; it is a complex construct incorporating concepts of originality, authority, intertextuality, and attribution. In this paper we explore these concepts as they relate to authorship and ask how they are changing in light of modes of collaborative authorship in remix culture. A detailed qualitative study of an online video remixing site is presented to help understand how the constraints of that environment are impacting authorial constructs. We discuss users’ self-conceptions as authors, and how values related to authorship are reflected to users through the interface and design of the site’s remixing and community tools. Finally, we present some implications of this work for the design of online communities for collaborative media creation and remixing.
Career: developing and evaluating a spatio-temporal representation for analysis, modeling, recognition and synthesis of facial expressions

(Georgia Institute of Technology, 2005-07-01) Essa, Irfan
Aware Home: Sensing, Interpretation, and Recognition of Everyday Activities

(Georgia Institute of Technology, 2005-03-29) Essa, Irfan

The Aware Home project is a unique living laboratory for exploring ubiquitous computing in a domestic setting. Dr. Essa's talk will present ongoing research on developing technologies within a residential setting that will affect our everyday living, concentrating specifically on the sensing and perception technologies that can enable a home environment to be aware of the whereabouts and activities of its occupants. The discussion will include the use of computer vision, audition work, and other efforts in computational perception to track and monitor the residents, as well as methods being developed to recognize the residents' activities over short and extended periods. The technological, design, and engineering research challenges inherent in this problem domain, and the focus on awareness to help maintain independence and quality of life for an aging population, will also be explored. The project is located in the Georgia Tech Broadband Institute's Residential Laboratory.
A Bayesian View of Boosting and Its Extension

(Georgia Institute of Technology, 2005) Bobick, Aaron F. ; Essa, Irfan ; Shi, Yifan

In this paper, we provide a Bayesian perspective on the boosting framework, which we refer to as Bayesian Integration. Through this perspective, we prove that standard ADABOOST is a special case of the naive Bayesian tree with a mapped conditional probability table and a particular weighting schema. Based on this perspective, we introduce a new algorithm, ADABOOST.BAYES, which takes the dependency between the weak classifiers into account and extends the boosting framework to non-linear combinations of weak classifiers. Compared with standard ADABOOST, ADABOOST.BAYES requires fewer training iterations but exhibits a stronger tendency to overfit. To leverage both ADABOOST and ADABOOST.BAYES, we introduce a simple switching schema, ADABOOST.SOFTBAYES, that integrates the two. Experiments on synthetic data and the UCI data set prove the validity of our framework.
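
For readers unfamiliar with the baseline that this abstract builds on, the following is a minimal sketch of standard AdaBoost only, not of ADABOOST.BAYES or ADABOOST.SOFTBAYES, whose details are not given here. The one-dimensional threshold stumps, the function names (`stump`, `train_adaboost`, `predict`), and the ten-point toy data set are all assumptions made for illustration.

```python
import math

def stump(x, threshold, polarity):
    """Weak classifier: returns polarity when x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity

def train_adaboost(xs, ys, rounds):
    """Standard AdaBoost over 1-D threshold stumps; labels must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n                 # start from a uniform distribution
    ensemble = []                           # list of (alpha, threshold, polarity)
    thresholds = sorted(set(xs))
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump(x, t, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity)
        err, t, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Increase the weight of misclassified points, then renormalize.
        weights = [w * math.exp(-alpha * y * stump(x, t, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): no single threshold separates the two classes.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
model = train_adaboost(xs, ys, rounds=3)
predictions = [predict(model, x) for x in xs]
```

The final classifier here is a linear combination of weak classifiers; the non-linear combinations that the abstract attributes to ADABOOST.BAYES are not reproduced in this sketch.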
TV Watcher: Distributed Media Analysis and Correlation

(Georgia Institute of Technology, 2004-07-08) Hilley, David Byron ; El-Helw, Ahmed ; Wolenetz, Matthew David ; Essa, Irfan ; Hutto, Phillip W. ; Starner, Thad ; Ramachandran, Umakishore

The explosion of available content in broadcast media has created a desperate need for applications and prerequisite system architectures to support automatic capture, filtration, categorization, correlation, and higher level inferencing of streaming data from distributed sources. We present TV Watcher, an archetypical example of such an application. TV Watcher performs user-controlled correlation of live television feeds and allows the user to automatically navigate through the available channels based on content of interest. We introduce the Symphony architecture for distributed real-time media analysis and delivery to meet the system requirements for applications with such needs. TV Watcher is built on top of the Symphony architecture, and currently uses closed-captioning information to correlate television programming. We present the results of a user study that shows the correlation engine is consistently able to pick significantly useful and relevant content.
Spectral Partitioning for Structure from Motion

(Georgia Institute of Technology, 2003-10) Steedly, Drew ; Essa, Irfan ; Dellaert, Frank

We propose a spectral partitioning approach for large-scale optimization problems, specifically structure from motion. In structure from motion, partitioning methods reduce the problem into smaller and better conditioned subproblems which can be efficiently optimized. Our partitioning method uses only the Hessian of the reprojection error and its eigenvectors. We show that partitioned systems that preserve the eigenvectors corresponding to small eigenvalues result in lower residual error when optimized. We create partitions by clustering the entries of the eigenvectors of the Hessian corresponding to small eigenvalues. This is a more general technique than relying on domain knowledge and heuristics such as bottom-up structure from motion approaches. Simultaneously, it takes advantage of more information than generic matrix partitioning algorithms.
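
The idea of partitioning by eigenvectors associated with small eigenvalues can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: the 6 x 6 matrix below is a hypothetical graph-Laplacian stand-in for a Hessian (two tightly coupled groups of variables joined by one weak coupling), only a single eigenvector is used, and "clustering" is reduced to splitting entries by sign. The function names `toy_hessian` and `spectral_bipartition` are assumptions.

```python
def toy_hessian():
    """Hypothetical symmetric stand-in for a Hessian: two triangles of
    strongly coupled variables (weight 1.0) joined by one weak link (0.1)."""
    edges = [(0, 1, 1.0), (0, 2, 1.0), (1, 2, 1.0),
             (3, 4, 1.0), (3, 5, 1.0), (4, 5, 1.0),
             (2, 3, 0.1)]
    n = 6
    H = [[0.0] * n for _ in range(n)]
    for i, j, w in edges:
        H[i][i] += w
        H[j][j] += w
        H[i][j] -= w
        H[j][i] -= w
    return H

def spectral_bipartition(H, iters=200):
    """Split variables by the sign of the eigenvector with the smallest
    non-trivial eigenvalue, found by power iteration on (c*I - H).

    Assumes the constant vector is the trivial zero-eigenvalue mode of H,
    which holds for the toy matrix above; it is projected out each step."""
    n = len(H)
    # Gershgorin bound: every eigenvalue of H is at most c, so small
    # eigenvalues of H become the dominant eigenvalues of (c*I - H).
    c = max(sum(abs(x) for x in row) for row in H)
    v = [i - (n - 1) / 2 for i in range(n)]     # zero-mean starting vector
    for _ in range(iters):
        w = [c * v[i] - sum(H[i][j] * v[j] for j in range(n))
             for i in range(n)]
        mean = sum(w) / n
        w = [x - mean for x in w]               # remove the constant mode
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    part_a = [i for i in range(n) if v[i] < 0]
    part_b = [i for i in range(n) if v[i] >= 0]
    return part_a, part_b

part_a, part_b = spectral_bipartition(toy_hessian())
```

On this toy matrix the split falls across the weak coupling, so each partition keeps its strongly coupled variables together. A fuller treatment would cluster the entries of several such eigenvectors, as the abstract describes.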
Choreography Driven Characters

(Georgia Institute of Technology, 2002) Sternberg, Daniel ; Essa, Irfan

High-level control of an articulated humanoid character for animation is much desired by animators. Current options of key-framing, motion capture, and simulation give either too much or too little control to the animator in dealing with general motions. The main reason for this lack of action-level, higher-order control is that current systems use low-level representations, mostly driven by data or samples. Though high-level representations of motion do exist, it is difficult to incorporate them into systems for animation. To facilitate this, we first introduce a representation based on dance notation. We then introduce a second notation based on L-systems. We show how the latter representation falls in the middle of the range of notations, allowing us to rotate, encode, and synthesize various movements. We then show the applicability of these representations by presenting animations created by an input of dance notation.
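
As background on the L-systems mentioned above, the sketch below shows the basic parallel string-rewriting mechanism. It does not reproduce the paper's notation: the movement symbols (`P` for an unexpanded phrase, `S` for a step, `T` for a turn) and the rules are hypothetical, chosen only to show how a short axiom expands into a longer movement sequence.

```python
def expand(axiom, rules, generations):
    """Apply L-system production rules to every symbol in parallel,
    once per generation. Symbols without a rule are copied unchanged."""
    sequence = axiom
    for _ in range(generations):
        sequence = "".join(rules.get(symbol, symbol) for symbol in sequence)
    return sequence

# Lindenmayer's classic algae system: A -> AB, B -> A.
algae = expand("A", {"A": "AB", "B": "A"}, 4)

# Hypothetical movement grammar: a phrase becomes a step followed by a
# phrase, and each step becomes a step followed by a turn.
movement_rules = {"P": "SP", "S": "ST"}
movement = expand("P", movement_rules, 3)
```

A compact set of rules generating long, structured sequences is what makes this kind of representation attractive as a middle ground between low-level motion data and high-level notation.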
Exemplar Based Non-parametric BRDFs

(Georgia Institute of Technology, 2002) Haro, Antonio ; Essa, Irfan

Realistic rendering of computer-modeled three-dimensional surfaces typically involves building a parameterized model of the bidirectional reflectance distribution function (BRDF) of the desired surface material. We present a technique to render these surfaces with proper illumination and material properties using only a photograph of a sphere of the desired material under the desired lighting conditions. Capitalizing on the fact that the geometry of the material in the photograph is known, we sample pixels of the sphere's reflectance to create photo-realistic renderings of computer models with the same material properties. The reflectance is sampled using texture synthesis techniques that compensate for the fact that very little of the BRDF observed in the photograph is known. The technique uses the limited observations of the function to create a plausible, realistic rendering of the surface that can be composited onto a background plate easily.
Visual Coding and Tracking of Speech Related Facial Motion

(Georgia Institute of Technology, 2001) Reveret, Lionel ; Essa, Irfan

This article presents a visual characterization of facial motions inherent in speaking. We propose a set of four Facial Speech Parameters (FSP): jaw opening, lip rounding, lip closure, and lip raising, to represent the primary visual gestures of speech articulation in a multidimensional linear manifold. This manifold is initially generated as a statistical model, obtained by analyzing accurate 3D data of a reference human subject. The FSP are then associated with the linear modes of this statistical model, resulting in a 3D parametric facial mesh. We have tested the speaker-independent hypothesis of this manifold with a model-based video tracking task applied to different subjects. First, the parametric model is adapted and aligned to a subject's face for a single shape. Then the face motion is tracked by optimally aligning the incoming video frames with the face model, textured with the first image, and deformed by varying the FSP, head rotations, and translations. We show results of the tracking for different subjects using our method. Finally, we demonstrate the encoding of facial activity into the four FSP values to represent speaker-independent phonetic information.