Person:
Clements, Mark A.


Publication Search Results


Automated Assessment of Surgical Skills Using Frequency Analysis

2015; Zia, Aneeq; Sharma, Yachna; Bettadapura, Vinay; Sarin, Eric L.; Clements, Mark A.; Essa, Irfan

We present an automated framework for visual assessment of the expertise level of surgeons using the OSATS (Objective Structured Assessment of Technical Skills) criteria. Video analysis techniques for extracting motion quality via frequency coefficients are introduced. The framework is tested on videos of medical students with different expertise levels performing basic surgical tasks in a surgical training lab setting. We demonstrate that transforming the sequential time data into frequency components effectively extracts the useful information differentiating between different skill levels of the surgeons. The results show significant performance improvements using DFT and DCT coefficients over known state-of-the-art techniques.


Talker-to-listener distance effects on speech production and perception

2008-06; Clements, Mark A.; Kalgaonkar, Kaustubh; Kim, Jonathan

Simulating talker-to-listener distance (TLD) in virtual audio environments requires mimicking natural changes in vocal effort. Studies have identified several acoustic parameters manipulated by talkers when varying vocal effort. However, no systematic study has investigated vocal effort variations due to TLD, under natural conditions, and their perceptual consequences. This work examined the feasibility of varying the vocal effort cues for TLD in synthesized speech and real speech by (a) recording and analyzing single word tokens spoken at a range between 1 and 32 meters, (b) creating synthetic and modified speech tokens that vary in one or more acoustic parameters associated with vocal effort, and (c) conducting perceptual tests on the reference, synthetic, and modified tokens to identify salient cues for TLD perception. Measured changes in fundamental frequency, intensity, and formant frequencies of the reference tokens across TLD were similar to other reports in the literature. Perceptual experiments that asked listeners to estimate TLD showed that TLD estimation is most accurate with real speech; however, significant standard deviations in the responses suggest that reliable judgments can only be made for gross changes in TLD.


Research in digital signal processing

1986; Clements, Mark A.


Decoding Children’s Social Behavior

2013-06; Rehg, James M.; Abowd, Gregory D.; Rozga, Agata; Romero, Mario; Clements, Mark A.; Sclaroff, Stan; Essa, Irfan; Ousley, Opal Y.; Li, Yin; Kim, Chanho; Rao, Hrishikesh; Kim, Jonathan C.; Presti, Liliana Lo; Zhang, Jianming; Lantsman, Denis; Bidwell, Jonathan; Ye, Zhefan

We introduce a new problem domain for activity recognition: the analysis of children’s social and communicative behaviors based on video and audio data. We specifically target interactions between children aged 1–2 years and an adult. Such interactions arise naturally in the diagnosis and treatment of developmental disorders such as autism. We introduce a new publicly-available dataset containing over 160 sessions of a 3–5 minute child-adult interaction. In each session, the adult examiner followed a semistructured play interaction protocol which was designed to elicit a broad range of social behaviors. We identify the key technical challenges in analyzing these behaviors, and describe methods for decoding the interactions. We present experimental results that demonstrate the potential of the dataset to drive interesting research questions, and show preliminary results for multi-modal activity recognition.


Identification and application of acoustic cues of vocal effort changes

2007-02-12; Clements, Mark A.


Adaptation and development of speech recognition techniques as applied to improving communication for individuals with hearing impairments

1985; Clements, Mark A.


ITR-(NHS+ASE) automatic speech attribute transcription (ASAT):

2011-02-28; Lee, Chin-Hui; Clements, Mark A.


Improvements to the ABS/OLA sinusoidal model

1995-07; Clements, Mark A.