    Automated Assessment of Surgical Skills Using Frequency Analysis
    (Georgia Institute of Technology, 2015) Zia, Aneeq ; Sharma, Yachna ; Bettadapura, Vinay ; Sarin, Eric L. ; Clements, Mark A. ; Essa, Irfan
    We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. Video analysis techniques for extracting motion quality via frequency coefficients are introduced. The framework is tested on videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the useful information differentiating between different skill levels of the surgeons. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.
    Decoding Children’s Social Behavior
    (Georgia Institute of Technology, 2013-06) Rehg, James M. ; Abowd, Gregory D. ; Rozga, Agata ; Romero, Mario ; Clements, Mark A. ; Sclaroff, Stan ; Essa, Irfan ; Ousley, Opal Y. ; Li, Yin ; Kim, Chanho ; Rao, Hrishikesh ; Kim, Jonathan C. ; Presti, Liliana Lo ; Zhang, Jianming ; Lantsman, Denis ; Bidwell, Jonathan ; Ye, Zhefan
    We introduce a new problem domain for activity recognition: the analysis of children’s social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1–2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly available dataset containing over 160 sessions of a 3–5 minute child-adult interaction. In each session, the adult examiner followed a semistructured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.
    ITR-(NHS+ASE) automatic speech attribute transcription (ASAT):
    (Georgia Institute of Technology, 2011-02-28) Lee, Chin-Hui ; Clements, Mark A.
    Talker-to-listener distance effects on speech production and perception
    (Georgia Institute of Technology, 2008-06) Clements, Mark A. ; Kalgaonkar, Kaustubh ; Kim, Jonathan
    Simulating talker-to-listener distance (TLD) in virtual audio environments requires mimicking natural changes in vocal effort. Studies have identified several acoustic parameters manipulated by talkers when varying vocal effort. However, no systematic study has investigated vocal effort variations due to TLD, under natural conditions, and their perceptual consequences. This work examined the feasibility of varying the vocal effort cues for TLD in synthesized speech and real speech by (a) recording and analyzing single word tokens spoken at a range between 1 and 32 meters, (b) creating synthetic and modified speech tokens that vary in one or more acoustic parameters associated with vocal effort, and (c) conducting perceptual tests on the reference, synthetic, and modified tokens to identify salient cues for TLD perception. Measured changes in fundamental frequency, intensity, and formant frequencies of the reference tokens across TLD were similar to other reports in the literature. Perceptual experiments that asked listeners to estimate TLD showed that TLD estimation is most accurate with real speech; however, significant standard deviations in the responses suggest that reliable judgments can only be made for gross changes in TLD.
    Identification and application of acoustic cues of vocal effort changes
    (Georgia Institute of Technology, 2007-02-12) Clements, Mark A.
    Improvements to the ABS/OLA sinusoidal model
    (Georgia Institute of Technology, 1995-07) Clements, Mark A.
    Research in digital signal processing
    (Georgia Institute of Technology, 1986) Clements, Mark A.