Person:
Essa, Irfan


Publication Search Results

Now showing 1 - 6 of 6
  • Item
    A Bayesian View of Boosting and Its Extension
    (Georgia Institute of Technology, 2005) Bobick, Aaron F. ; Essa, Irfan ; Shi, Yifan
    In this paper, we provide a Bayesian perspective on the boosting framework, which we refer to as Bayesian Integration. Through this perspective, we prove that standard ADABOOST is a special case of a naive Bayesian tree with a mapped conditional probability table and a particular weighting schema. Based on this perspective, we introduce a new algorithm, ADABOOST.BAYES, which takes the dependency between the weak classifiers into account and extends the boosting framework to non-linear combinations of weak classifiers. Compared with standard ADABOOST, ADABOOST.BAYES requires fewer training iterations but exhibits a stronger tendency to overfit. To leverage both ADABOOST and ADABOOST.BAYES, we introduce a simple switching schema, ADABOOST.SOFTBAYES, that integrates the two. Experiments on synthetic data and the UCI data set demonstrate the validity of our framework.
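As a point of reference for the weighting schema the paper maps into its Bayesian view, here is a minimal sketch of standard discrete ADABOOST with decision-stump weak learners. This is generic textbook AdaBoost, not the ADABOOST.BAYES extension described above; the stump learner and array shapes are illustrative assumptions.

```python
import numpy as np

def fit_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump minimizing weighted error."""
    n, d = X.shape
    best = (np.inf, 0, 0.0, 1)                       # (error, feature, threshold, polarity)
    for j in range(d):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = pol * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w[pred != y])
                if err < best[0]:
                    best = (err, j, thr, pol)
    return best

def adaboost(X, y, rounds=20):
    """Standard discrete AdaBoost; labels y must be in {-1, +1}."""
    n = len(y)
    w = np.full(n, 1.0 / n)                          # uniform initial sample weights
    ensemble = []
    for _ in range(rounds):
        err, j, thr, pol = fit_stump(X, y, w)
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)        # weight of this weak classifier
        pred = pol * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)               # up-weight misclassified samples
        w /= w.sum()
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def predict(ensemble, X):
    score = sum(a * p * np.where(X[:, j] <= t, 1, -1) for a, j, t, p in ensemble)
    return np.sign(score)
```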
  • Item
    Choreography Driven Characters
    (Georgia Institute of Technology, 2002) Sternberg, Daniel ; Essa, Irfan
    High-level control of an articulated humanoid character is much desired by animators. The current options of key-framing, motion capture, and simulation give the animator either too much or too little control when dealing with general motions. The main reason for this lack of a higher, action-level form of control is that current systems use low-level representations, mostly driven by data or samples. Though high-level representations of motion do exist, it is difficult to incorporate them into animation systems. To facilitate this, we first introduce a representation based on dance notation. We then introduce a second notation based on L-systems. We show how the latter representation falls in the middle of the range of notations, allowing us to notate, encode, and synthesize various movements. We then show the applicability of these representations by presenting animations created from an input of dance notation.
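To make the L-system idea concrete, here is a minimal, hypothetical sketch of string-rewriting production rules that expand a high-level action symbol into a sequence of lower-level movement tokens. The symbols and rules are invented for illustration and are not the notation defined in the paper.

```python
# Minimal L-system string rewriting: each generation, every symbol is replaced
# by its production (or kept as-is if it has no rule). The movement vocabulary
# below is purely illustrative.
RULES = {
    "PHRASE": ["STEP", "TURN", "STEP"],
    "STEP":   ["lift_foot", "shift_weight", "place_foot"],
    "TURN":   ["pivot_left"],
}

def expand(symbols, generations):
    for _ in range(generations):
        symbols = [s for sym in symbols for s in RULES.get(sym, [sym])]
    return symbols

print(expand(["PHRASE"], 2))
# ['lift_foot', 'shift_weight', 'place_foot', 'pivot_left', 'lift_foot', ...]
```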
  • Item
    Exemplar Based Non-parametric BRDFs
    (Georgia Institute of Technology, 2002) Haro, Antonio ; Essa, Irfan
    Realistic rendering of computer-modeled three-dimensional surfaces typically involves building a parameterized model of the bidirectional reflectance distribution function (BRDF) of the desired surface material. We present a technique to render these surfaces with proper illumination and material properties using only a photograph of a sphere of the desired material under the desired lighting conditions. Capitalizing on the fact that the geometry of the material in the photograph is known, we sample pixels of the sphere's reflectance to create photo-realistic renderings of computer models with the same material properties. The reflectance is sampled using texture synthesis techniques that compensate for the fact that very little of the BRDF observed in the photograph is known. The technique uses these limited observations of the function to create a plausible, realistic rendering of the surface that can easily be composited onto a background plate.
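A rough sketch of the core observation, that a photographed sphere indexes reflectance by surface normal: for each point of a target model one can look up the sphere pixel whose known normal matches the model's normal (a lit-sphere-style lookup). The texture-synthesis step that fills in unobserved parts of the BRDF is omitted, and the function below is an assumption for illustration, not the paper's implementation.

```python
import numpy as np

def lit_sphere_lookup(sphere_img, normals):
    """Shade points by indexing a photographed sphere with view-space normals.

    sphere_img : (H, W, 3) photo of a sphere of the target material, assumed
                 to fill the image and be viewed orthographically.
    normals    : (N, 3) unit view-space normals of the points to shade.
    Returns an (N, 3) array of sampled colors.
    """
    h, w, _ = sphere_img.shape
    # A front-facing unit normal (nx, ny, nz) projects onto the sphere image
    # at offset (nx, ny) from its center, so map [-1, 1] -> pixel coordinates.
    u = ((normals[:, 0] * 0.5 + 0.5) * (w - 1)).astype(int)
    v = ((-normals[:, 1] * 0.5 + 0.5) * (h - 1)).astype(int)  # image y points down
    return sphere_img[v, u]
```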
  • Item
    Visual Coding and Tracking of Speech Related Facial Motion
    (Georgia Institute of Technology, 2001) Reveret, Lionel ; Essa, Irfan
    This article presents a visual characterization of the facial motions inherent in speaking. We propose a set of four Facial Speech Parameters (FSP): jaw opening, lip rounding, lip closure, and lip raising, to represent the primary visual gestures of speech articulation within a multidimensional linear manifold. This manifold is initially generated as a statistical model, obtained by analyzing accurate 3D data of a reference human subject. The FSP are then associated with the linear modes of this statistical model, resulting in a 3D parametric facial mesh. We have tested the speaker-independent hypothesis of this manifold with a model-based video tracking task applied to different subjects. First, the parametric model is adapted and aligned to a subject's face for a single shape. Then the facial motion is tracked by optimally aligning the incoming video frames with the face model, textured with the first image and deformed by varying the FSP, head rotations, and translations. We show tracking results for different subjects using our method. Finally, we demonstrate the encoding of facial activity into the four FSP values to represent speaker-independent phonetic information.
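The manifold amounts to a linear deformation model: a face shape is the mean mesh plus a weighted sum of deformation modes, one per FSP. Below is a minimal sketch under that assumption; the array layouts are illustrative conventions, not the paper's data format.

```python
import numpy as np

def deform_face(mean_vertices, modes, fsp):
    """Linear facial deformation driven by the four FSP values.

    mean_vertices : (V, 3) reference mesh vertices.
    modes         : (4, V, 3) per-vertex displacement for each Facial Speech
                    Parameter (jaw opening, lip rounding, closure, raising).
    fsp           : (4,) parameter values controlling the deformation.
    """
    # Weighted sum of modes contracted over the parameter axis, added to the mean.
    return mean_vertices + np.tensordot(fsp, modes, axes=1)
```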
  • Item
    Real-time, Photo-realistic, Physically Based Rendering of Fine Scale Human Skin Structure
    (Georgia Institute of Technology, 2001) Haro, Antonio ; Guenter, Brian K. ; Essa, Irfan
    Skin is noticeably bumpy in character, which is clearly visible in close-up shots in a film or game. Methods that rely on simple texture mapping of faces lack such high-frequency shape detail, which makes them look unrealistic. This detail is usually ignored in real-time applications or is drawn in manually by an artist. In this paper, we present techniques for capturing and rendering the fine-scale structure of human skin. First, we present a method for creating normal maps of skin with a high degree of accuracy from physical data. We also present techniques inspired by texture synthesis to "grow" skin normal maps to cover the face. Finally, we demonstrate how such skin models can be rendered in real time on consumer graphics hardware.
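As a minimal illustration of why a normal map restores the missing high-frequency detail, this sketch perturbs per-pixel shading with a normal map under simple Lambertian lighting. It is a generic technique for context, not the paper's capture or synthesis pipeline.

```python
import numpy as np

def shade_with_normal_map(albedo, normal_map, light_dir):
    """Lambertian shading driven by a per-pixel normal map.

    albedo     : (H, W, 3) base skin color texture.
    normal_map : (H, W, 3) tangent-space normals encoded in [0, 1].
    light_dir  : (3,) unit light direction in the same space.
    """
    n = normal_map * 2.0 - 1.0                       # decode normals to [-1, 1]
    n /= np.linalg.norm(n, axis=-1, keepdims=True)   # renormalize per pixel
    ndotl = np.clip(n @ light_dir, 0.0, None)        # Lambert term, clamped at 0
    return albedo * ndotl[..., None]
```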
  • Item
    Machine Learning for Video-Based Rendering
    (Georgia Institute of Technology, 2000) Schodl, Arno ; Essa, Irfan
    We recently introduced a new paradigm for computer animation, video textures, which allows us to use recorded video to generate novel animations by replaying the video samples in a new order. Video sprites are a special type of video texture: instead of storing whole images, the object of interest is separated from the background and the video samples are stored as a sequence of alpha-matted sprites with associated velocity information. They can be rendered anywhere on the screen to create a novel animation of the object. To create such an animation, we have to find a sequence of sprite samples that is both visually smooth and shows the desired motion. In this paper, we address both problems. To estimate visual smoothness, we train a linear classifier to estimate visual similarity between video samples. If the motion path is known in advance, we use a beam search algorithm to find a good sample sequence. We can also specify the motion interactively by precomputing a set of cost functions using Q-learning.
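A minimal sketch of the beam-search step for picking a visually smooth sample sequence, assuming a precomputed pairwise transition-cost matrix. The cost matrix, path length, and beam width here are illustrative; the paper's classifier-based similarity and Q-learning control are not reproduced.

```python
import numpy as np

def beam_search_sequence(cost, length, beam_width=10):
    """Find a low-cost sequence of sprite samples.

    cost : (S, S) transition cost between consecutive samples
           (e.g. visual dissimilarity plus motion-path error).
    Returns a sequence of sample indices of the given length.
    """
    n = cost.shape[0]
    beam = [(0.0, [s]) for s in range(n)]            # (total cost, partial path)
    for _ in range(length - 1):
        candidates = [(c + cost[path[-1], nxt], path + [nxt])
                      for c, path in beam for nxt in range(n)]
        candidates.sort(key=lambda x: x[0])
        beam = candidates[:beam_width]               # keep only the cheapest expansions
    return min(beam, key=lambda x: x[0])[1]
```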