Person:
Essa, Irfan

Publication Search Results

Algorithms for Linguistic Robot Policy Inference from Demonstration of Assembly Tasks

2012 , Dantam, Neil , Essa, Irfan , Stilman, Mike

We describe several algorithms used for the inference of linguistic robot policies from human demonstration. First, we track and match objects using the Hungarian Algorithm. Then, we convert Regular Expressions to Nondeterministic Finite Automata (NFA) using the McNaughton-Yamada-Thompson Algorithm. Next, we use Subset Construction to convert the NFA to a Deterministic Finite Automaton (DFA). Finally, we minimize the finite automata using either Hopcroft's Algorithm or Brzozowski's Algorithm.
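The Subset Construction step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the dictionary encoding of the NFA, the `EPS` marker, the function names, and the small hand-built NFA for `(a|b)*abb` are all assumptions made for illustration.

```python
from collections import deque

EPS = None  # marker for epsilon transitions (an assumption of this sketch)


def epsilon_closure(states, delta):
    """Return every NFA state reachable from `states` using epsilon moves only."""
    stack = list(states)
    closure = set(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)


def subset_construction(start, accepting, delta, alphabet):
    """Build a DFA whose states are frozensets of NFA states."""
    d_start = epsilon_closure({start}, delta)
    d_delta = {}
    d_accepting = set()
    seen = {d_start}
    queue = deque([d_start])
    while queue:
        current = queue.popleft()
        if current & accepting:
            d_accepting.add(current)
        for symbol in alphabet:
            moved = set()
            for s in current:
                moved.update(delta.get((s, symbol), ()))
            if not moved:
                continue  # no transition: treated as an implicit dead state
            target = epsilon_closure(moved, delta)
            d_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return d_start, d_accepting, d_delta


def dfa_accepts(dfa, word):
    """Run the DFA on `word`; a missing transition means rejection."""
    state, accepting, delta = dfa
    for ch in word:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in accepting


# Hypothetical hand-built NFA (with one epsilon move) for (a|b)*abb.
NFA_DELTA = {
    (0, EPS): {1},
    (1, "a"): {1, 2},
    (1, "b"): {1},
    (2, "b"): {3},
    (3, "b"): {4},
}
DFA = subset_construction(0, {4}, NFA_DELTA, "ab")
```

The resulting DFA is not necessarily minimal; a minimization pass such as Hopcroft's or Brzozowski's algorithm, as the abstract describes, would follow.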

Career: developing and evaluating a spatio-temporal representation for analysis, modeling, recognition and synthesis of facial expressions

2005-07-01 , Essa, Irfan

Choreography Driven Characters

2002 , Sternberg, Daniel , Essa, Irfan

High-level control of an articulated humanoid character for animation is much desired by animators. Current options of key-framing, motion capture, and simulation give the animator either too much or too little control in dealing with general motions. The main reason for this lack of higher, action-level control is that current systems use low-level representations, mostly driven by data or samples. Though high-level representations of motion do exist, it is difficult to incorporate them into systems for animation. To facilitate this, we first introduce a representation based on dance notation. We then introduce a second notation based on L-systems. We show how the latter representation falls in the middle of the range of notations, allowing us to rotate, encode, and synthesize various movements. We then show the applicability of these representations by presenting animations created from an input of dance notation.
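For readers unfamiliar with L-systems, the core mechanism is parallel string rewriting: every symbol is replaced by its production at each step. The sketch below shows only that generic mechanism. The `expand` function and the movement symbols (`P` for phrase, `S` for step, `T` for turn) are hypothetical and are not taken from the paper's notation.

```python
def expand(axiom, rules, iterations):
    """Apply the production rules to every symbol in parallel, `iterations` times."""
    current = axiom
    for _ in range(iterations):
        # Symbols without a rule are copied unchanged (treated as terminals).
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Lindenmayer's classic algae system: A -> AB, B -> A.
ALGAE_RULES = {"A": "AB", "B": "A"}

# Hypothetical movement grammar: a phrase P unfolds into a step, a turn,
# and a further phrase. These symbols are assumptions for illustration.
MOVE_RULES = {"P": "STP"}

algae = expand("A", ALGAE_RULES, 4)
moves = expand("P", MOVE_RULES, 2)
```

A synthesis stage would then map each terminal symbol to a concrete motion; that mapping is outside the scope of this sketch.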

Real-time, Photo-realistic, Physically Based Rendering of Fine Scale Human Skin Structure

2001 , Haro, Antonio , Guenter, Brian K. , Essa, Irfan

Skin is noticeably bumpy in character, which is clearly visible in close-up shots in a film or game. Methods that rely on simple texture-mapping of faces lack such high frequency shape detail, which makes them look non-realistic. More specifically, this detail is usually ignored in real-time applications, or is drawn in manually by an artist. In this paper, we present techniques for capturing and rendering the fine scale structure of human skin. First, we present a method for creating normal maps of skin with a high degree of accuracy from physical data. We also present techniques inspired by texture synthesis to "grow" skin normal maps to cover the face. Finally, we demonstrate how such skin models can be rendered in real-time on consumer-end graphics hardware.

Localization and 3D Reconstruction of Urban Scenes Using GPS

2008 , Kim, Kihwan , Summet, Jay , Starner, Thad , Ashbrook, Daniel , Kapade, Mrunal , Essa, Irfan

Using off-the-shelf Global Positioning System (GPS) units, we reconstruct buildings in 3D by exploiting the reduction in signal-to-noise ratio (SNR) that occurs when the buildings obstruct the line-of-sight between the moving units and the orbiting satellites. We measure the size and height of skyscrapers and automatically construct a density map representing the location of multiple buildings in an urban landscape. If deployed on a large scale, via a cellular service provider’s GPS-enabled mobile phones or GPS-tracked delivery vehicles, the system could provide an inexpensive means of continuously creating and updating 3D maps of urban environments.

A Bayesian View of Boosting and Its Extension

2005 , Bobick, Aaron F. , Essa, Irfan , Shi, Yifan

In this paper, we provide a Bayesian perspective of the boosting framework, which we refer to as Bayesian Integration. Through this perspective, we prove that the standard ADABOOST is a special case of the naive Bayesian tree with a mapped conditional probability table and a particular weighting schema. Based on this perspective, we introduce a new algorithm, ADABOOST.BAYES, which takes the dependency between the weak classifiers into account and extends the boosting framework to non-linear combinations of weak classifiers. Compared with standard ADABOOST, ADABOOST.BAYES requires fewer training iterations but exhibits a stronger tendency to overfit. To leverage both ADABOOST and ADABOOST.BAYES, we introduce a simple switching schema, ADABOOST.SOFTBAYES, that integrates the two. Experiments on synthetic data and the UCI data set prove the validity of our framework.
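As background for this abstract, the following is a minimal sketch of standard discrete AdaBoost, the baseline that the paper reinterprets. It does not implement ADABOOST.BAYES or ADABOOST.SOFTBAYES. The threshold-stump weak learner, the function names, and the six-point toy data set are assumptions chosen for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: predict `polarity` when x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity


def train_stump(xs, ys, weights):
    """Return (error, threshold, polarity) of the stump with lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds):
    """Standard discrete AdaBoost; labels in `ys` must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = train_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weight of misclassified points, decrease the rest.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted linear combination of weak classifiers."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Toy data (hypothetical): no single stump separates these labels.
XS = [0, 1, 2, 3, 4, 5]
YS = [1, 1, -1, -1, 1, 1]
ENSEMBLE = adaboost(XS, YS, rounds=3)
```

The `predict` function makes the linear combination explicit: the final score is a weighted sum of weak-classifier outputs, which is the structure the abstract says ADABOOST.BAYES generalizes to non-linear combinations.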

Exemplar Based Non-parametric BRDFs

2002 , Haro, Antonio , Essa, Irfan

Realistic rendering of computer-modeled three-dimensional surfaces typically involves building a parameterized model of the bidirectional reflectance distribution function (BRDF) of the desired surface material. We present a technique to render these surfaces with proper illumination and material properties using only a photograph of a sphere of the desired material under the desired lighting conditions. Capitalizing on the fact that the geometry of the material in the photograph is known, we sample pixels of the sphere's reflectance to create photo-realistic renderings of computer models with the same material properties. The reflectance is sampled using texture synthesis techniques that compensate for the fact that very little of the BRDF observed in the photograph is known. The technique uses the limited observations of the function to create a plausible, realistic rendering of the surface that can be composited onto a background plate easily.

Remixing Authorship: Reconfiguring the Author in Online Video Remix Culture

2007 , Diakopoulos, Nicholas , Luther, Kurt , Medynskiy, Yevgeniy (Eugene) , Essa, Irfan

In an abstract sense, authorship entails the constrained selection or generation of media and the organization and layout of that media in a larger structure. But authorship is more than just selection and organization; it is a complex construct incorporating concepts of originality, authority, intertextuality, and attribution. In this paper we explore these concepts as they relate to authorship and ask how they are changing in light of modes of collaborative authorship in remix culture. A detailed qualitative study of an online video remixing site is presented to help understand how the constraints of that environment are impacting authorial constructs. We discuss users’ self-conceptions as authors, and how values related to authorship are reflected to users through the interface and design of the site’s remixing and community tools. Finally, we present some implications of this work for the design of online communities for collaborative media creation and remixing.

TV Watcher: Distributed Media Analysis and Correlation

2004-07-08 , Hilley, David Byron , El-Helw, Ahmed , Wolenetz, Matthew David , Essa, Irfan , Hutto, Phillip W. , Starner, Thad , Ramachandran, Umakishore

The explosion of available content in broadcast media has created a desperate need for applications and prerequisite system architectures to support automatic capture, filtration, categorization, correlation, and higher level inferencing of streaming data from distributed sources. We present TV Watcher, an archetypical example of such an application. TV Watcher performs user-controlled correlation of live television feeds and allows the user to automatically navigate through the available channels based on content of interest. We introduce the Symphony architecture for distributed real-time media analysis and delivery to meet the system requirements for applications with such needs. TV Watcher is built on top of the Symphony architecture, and currently uses closed-captioning information to correlate television programming. We present the results of a user study that shows the correlation engine is consistently able to pick significantly useful and relevant content.

Visual Coding and Tracking of Speech Related Facial Motion

2001 , Reveret, Lionel , Essa, Irfan

This article presents a visual characterization of the facial motions inherent in speaking. We propose a set of four Facial Speech Parameters (FSP): jaw opening, lip rounding, lip closure, and lip raising, to represent the primary visual gestures of speech articulation in a multidimensional linear manifold. This manifold is initially generated as a statistical model, obtained by analyzing accurate 3D data of a reference human subject. The FSP are then associated with the linear modes of this statistical model, resulting in a 3D parametric facial mesh. We have tested the speaker-independent hypothesis of this manifold with a model-based video tracking task applied to different subjects. First, the parametric model is adapted and aligned to a subject's face for a single shape. Then the face motion is tracked by optimally aligning the incoming video frames with the face model, textured with the first image, and deformed by varying the FSP, head rotations, and translations. We show results of the tracking for different subjects using our method. Finally, we demonstrate the encoding of facial activity into the four FSP values to represent speaker-independent phonetic information.