Person:

Essa, Irfan

Permanent Link

https://hdl.handle.net/1853/71242

Associated Organization(s)

Organizational Unit

School of Interactive Computing

Publication Search Results

Localization and 3D Reconstruction of Urban Scenes Using GPS

(Georgia Institute of Technology, 2008) Kim, Kihwan ; Summet, Jay ; Starner, Thad ; Ashbrook, Daniel ; Kapade, Mrunal ; Essa, Irfan

Using off-the-shelf Global Positioning System (GPS) units, we reconstruct buildings in 3D by exploiting the reduction in signal-to-noise ratio (SNR) that occurs when the buildings obstruct the line of sight between the moving units and the orbiting satellites. We measure the size and height of skyscrapers and automatically construct a density map representing the locations of multiple buildings in an urban landscape. If deployed on a large scale, via a cellular service provider’s GPS-enabled mobile phones or GPS-tracked delivery vehicles, the system could provide an inexpensive means of continuously creating and updating 3D maps of urban environments.

Remixing Authorship: Reconfiguring the Author in Online Video Remix Culture

(Georgia Institute of Technology, 2007) Diakopoulos, Nicholas ; Luther, Kurt ; Medynskiy, Yevgeniy (Eugene) ; Essa, Irfan

In an abstract sense, authorship entails the constrained selection or generation of media and the organization and layout of that media in a larger structure. But authorship is more than just selection and organization; it is a complex construct incorporating concepts of originality, authority, intertextuality, and attribution. In this paper we explore these concepts as they relate to authorship and ask how they are changing in light of modes of collaborative authorship in remix culture. A detailed qualitative study of an online video remixing site is presented to help understand how the constraints of that environment are impacting authorial constructs. We discuss users’ self-conceptions as authors, and how values related to authorship are reflected to users through the interface and design of the site’s remixing and community tools. Finally, we present some implications of this work for the design of online communities for collaborative media creation and remixing.

Career: developing and evaluating a spatio-temporal representation for analysis, modeling, recognition and synthesis of facial expressions

(Georgia Institute of Technology, 2005-07-01) Essa, Irfan

A Bayesian View of Boosting and Its Extension

(Georgia Institute of Technology, 2005) Bobick, Aaron F. ; Essa, Irfan ; Shi, Yifan

In this paper, we provide a Bayesian perspective on the boosting framework, which we refer to as Bayesian Integration. Through this perspective, we prove that standard ADABOOST is a special case of the naive Bayesian tree with a mapped conditional probability table and a particular weighting schema. Based on this perspective, we introduce a new algorithm, ADABOOST.BAYES, which takes the dependency between the weak classifiers into account and thereby extends the boosting framework to non-linear combinations of weak classifiers. Compared with standard ADABOOST, ADABOOST.BAYES requires fewer training iterations but exhibits a stronger tendency to overfit. To leverage both ADABOOST and ADABOOST.BAYES, we introduce a simple switching schema, ADABOOST.SOFTBAYES, that integrates the two. Experiments on synthetic data and the UCI data set prove the validity of our framework.

TV Watcher: Distributed Media Analysis and Correlation

(Georgia Institute of Technology, 2004-07-08) Hilley, David Byron ; El-Helw, Ahmed ; Wolenetz, Matthew David ; Essa, Irfan ; Hutto, Phillip W. ; Starner, Thad ; Ramachandran, Umakishore

The explosion of available content in broadcast media has created a desperate need for applications and prerequisite system architectures to support automatic capture, filtration, categorization, correlation, and higher level inferencing of streaming data from distributed sources. We present TV Watcher, an archetypical example of such an application. TV Watcher performs user-controlled correlation of live television feeds and allows the user to automatically navigate through the available channels based on content of interest. We introduce the Symphony architecture for distributed real-time media analysis and delivery to meet the system requirements for applications with such needs. TV Watcher is built on top of the Symphony architecture, and currently uses closed-captioning information to correlate television programming. We present the results of a user study that shows the correlation engine is consistently able to pick significantly useful and relevant content.

Choreography Driven Characters

(Georgia Institute of Technology, 2002) Sternberg, Daniel ; Essa, Irfan

High-level control of an articulated humanoid character for animation is much desired by animators. Current options of key-framing, motion capture, and simulation give either too much or too little control to the animator in dealing with general motions. The main reason for this lack of action-level, higher-form control is that current systems use low-level representations, mostly driven by data or samples. Though high-level representations of motion do exist, it is difficult to incorporate them into systems for animation. To facilitate this, we first introduce a representation based on dance notation. We then introduce a second notation based on L-systems. We show how the latter representation falls in the middle of the range of notations, allowing us to rotate, encode, and synthesize various movements. We then show the applicability of these representations by presenting animations created by an input of dance notation.

Exemplar Based Non-parametric BRDFs

(Georgia Institute of Technology, 2002) Haro, Antonio ; Essa, Irfan

Realistic rendering of computer-modeled three-dimensional surfaces typically involves building a parameterized model of the bidirectional reflectance distribution function (BRDF) of the desired surface material. We present a technique to render these surfaces with proper illumination and material properties using only a photograph of a sphere of the desired material under desired lighting conditions. Capitalizing on the fact that the geometry of the material in the photograph is known, we sample pixels of the sphere's reflectance to create photo-realistic renderings of computer models with the same material properties. The reflectance is sampled using texture synthesis techniques that compensate for the fact that very little of the BRDF observed in the photograph is known. The technique uses the limited observations of the function to create a plausible, realistic rendering of the surface that can be composited onto a background plate easily.

Visual Coding and Tracking of Speech Related Facial Motion

(Georgia Institute of Technology, 2001) Reveret, Lionel ; Essa, Irfan

This article presents a visual characterization of facial motions inherent in speaking. We propose a set of four Facial Speech Parameters (FSP): jaw opening, lips rounding, lips closure, and lips raising, to represent the primary visual gestures of speech articulation in a multidimensional linear manifold. This manifold is initially generated as a statistical model, obtained by analyzing accurate 3D data of a reference human subject. The FSP are then associated with the linear modes of this statistical model, resulting in a 3D parametric facial mesh. We have tested the speaker-independent hypothesis of this manifold with a model-based video tracking task applied to different subjects. First, the parametric model is adapted and aligned to a subject's face for a single shape. Then the face motion is tracked by optimally aligning the incoming video frames with the face model, textured with the first image, and deformed by varying the FSP, head rotations, and translations. We show results of the tracking for different subjects using our method. Finally, we demonstrate the encoding of facial activity into the four FSP values to represent speaker-independent phonetic information.

Real-time, Photo-realistic, Physically Based Rendering of Fine Scale Human Skin Structure

(Georgia Institute of Technology, 2001) Haro, Antonio ; Guenter, Brian K. ; Essa, Irfan

Skin is noticeably bumpy in character, which is clearly visible in close-up shots in a film or game. Methods that rely on simple texture-mapping of faces lack such high frequency shape detail, which makes them look non-realistic. More specifically, this detail is usually ignored in real-time applications, or is drawn in manually by an artist. In this paper, we present techniques for capturing and rendering the fine scale structure of human skin. First, we present a method for creating normal maps of skin with a high degree of accuracy from physical data. We also present techniques inspired by texture synthesis to "grow" skin normal maps to cover the face. Finally, we demonstrate how such skin models can be rendered in real-time on consumer-end graphics hardware.

Mandatory Human Participation: A New Scheme for Building Secure Systems

(Georgia Institute of Technology, 2001) Essa, Irfan ; Sung, Min-Ho ; Lipton, Richard J. ; Xu, Jun

Mandatory Human Participation (MHP) is a novel authentication scheme that asks the question "are you human?" (instead of "who are you?"), and upon the correct answer to this question, can prove a principal to be a human being instead of a computer program. MHP helps solve old and new problems in computer security that existing security measures cannot address properly, including password (or PIN) guessing attacks, automated service and information theft, and denial of service. A key component of this "are you human?" authentication process is a character morphing algorithm that transforms a character string into its graphical form in such a way that a human being won't have any problem recognizing the original string, while a computer program (e.g., an Optical Character Recognition program) will not be able to decipher it or make a correct guess with non-negligible probability. The basic idea of the MHP scheme is to ask an agent to recognize the string before its login attempts or transaction requests can be honored. Here a protocol is needed to send a puzzle to an agent, check if the answer supplied by the agent is correct, and most importantly make sure that the agent cannot cheat in the process. A number of system and security issues that relate to the protocol need to be addressed for the protocol to be secure, efficient, robust, and user-friendly. The MHP scheme contributes to the foundations of computer security by faithfully implementing a novel security semantics, "human," which existing cryptographic measures cannot express accurately. As many real-world security applications involve the interaction between a human and a computer, which naturally contains "human" as a part of its protocol semantics, we believe that the MHP scheme will find many new applications in the future.
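
The abstract above names three protocol steps: send a puzzle, check the answer, and keep the agent from cheating. The following is a minimal illustrative sketch of those steps only, not the protocol from the paper. It assumes a hypothetical design in which the server binds the expected answer to an HMAC-signed, expiring, single-use token; the character morphing step (rendering the string as a distorted image) is omitted, and all names (`issue_puzzle`, `verify_answer`, `SECRET_KEY`, `PUZZLE_TTL_SECONDS`) are invented for illustration.

```python
import hashlib
import hmac
import secrets
import time

# Hypothetical server-side settings (assumptions, not from the paper).
SECRET_KEY = secrets.token_bytes(32)
PUZZLE_TTL_SECONDS = 120

# Nonces of puzzles that have already been attempted (one guess per puzzle).
_used_nonces = set()


def _sign(answer, nonce, expires):
    """Bind an answer to a specific puzzle instance and expiry time."""
    message = f"{answer.lower()}|{nonce}|{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_puzzle(now=None):
    """Create a puzzle; return (answer, token).

    A real server would render `answer` as a distorted image and send only
    the image and the token to the agent, never the plain answer.
    """
    now = int(time.time()) if now is None else int(now)
    alphabet = "abcdefghjkmnpqrstuvwxyz23456789"  # skips look-alike characters
    answer = "".join(secrets.choice(alphabet) for _ in range(6))
    nonce = secrets.token_hex(8)
    expires = now + PUZZLE_TTL_SECONDS
    token = f"{nonce}:{expires}:{_sign(answer, nonce, expires)}"
    return answer, token


def verify_answer(token, guess, now=None):
    """Return True only for a fresh, unexpired puzzle with the right answer."""
    now = int(time.time()) if now is None else int(now)
    try:
        nonce, expires_text, mac = token.split(":")
        expires = int(expires_text)
    except ValueError:
        return False  # malformed token
    if now > expires or nonce in _used_nonces:
        return False  # expired, or this puzzle was already attempted
    _used_nonces.add(nonce)  # consume the puzzle even if the guess is wrong
    expected = _sign(guess, nonce, expires)
    # Constant-time comparison avoids leaking the signature via timing.
    return hmac.compare_digest(mac.encode(), expected.encode())
```

In this sketch, an agent that replays an old token, alters the expiry time, or retries the same puzzle is rejected, which is one simple way to approximate the "cannot cheat" requirement under the stated assumptions.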