CHAT Junior - Dolphin Whistle Recognition on the Edge

Author(s)
Mantri, Amogh Krish
Abstract
The Cetacean Hearing Augmentation and Telemetry (CHAT) and Unsupervised Harvesting and Utilization of Recognizable Acoustics (UHURA) project studies communication between humans and wild dolphins. CHAT/UHURA researchers communicate with each other and with dolphins by playing and recognizing audio signals that resemble the whistles dolphins might emit. A platform for this function, Wear-A-CUDA (referred to as CHAT Senior), already exists, but it suffers from high power consumption, mobility restrictions, and a poor user interface. This paper addresses these issues with an audio classification model that can be deployed in a mobile application. The solution, CHAT Junior, uses a convolutional neural network trained to recognize the whistle signals assigned to various objects that researchers and dolphins interact with underwater. The model takes advantage of powerful mobile devices that contain tensor processing units to compute predictions quickly. The paper experiments with four model architectures of various sizes. Each one preprocesses incoming audio into spectrograms using a short-time Fourier transform and then classifies the spectrogram with a convolutional neural network. After the architectures were evaluated on both accuracy and prediction speed, the largest model was chosen. It achieved 80 percent accuracy on a continuous stream of audio data while executing each prediction in 40 milliseconds, half the initial target time of 80 milliseconds. Although it still has limitations in extremely noisy environments and when whistle signals look very similar to each other, it performs well in the field and consistently differentiates between whistle signal classes and background noise. The model is now deployed in an Android mobile application whose intuitive user interface provides a better experience for CHAT/UHURA researchers in the field.
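The spectrogram preprocessing step mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not give the frame length, hop size, window, or sample rate, so the values below (64-sample frames, a 32-sample hop, a Hann window, an 8 kHz sample rate, and a 1 kHz test tone) are assumptions chosen only to keep the example small. The sketch uses a naive discrete Fourier transform from the Python standard library; a real deployment would most likely rely on an optimized FFT routine.

```python
import cmath
import math


def hann_window(n):
    """Return an n-point periodic Hann window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]


def stft_magnitude(samples, frame_len=64, hop=32):
    """Magnitude spectrogram from a naive short-time Fourier transform.

    Returns a list of frames; each frame holds frame_len // 2 + 1
    magnitudes (the non-negative frequency bins).
    frame_len and hop are illustrative defaults, not values from the thesis.
    """
    window = hann_window(frame_len)
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        # Taper the frame so its edges do not leak energy across bins.
        chunk = [samples[start + i] * window[i] for i in range(frame_len)]
        frame = []
        for k in range(n_bins):
            # Direct DFT sum for bin k (O(N^2); fine for a small sketch).
            acc = sum(
                chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            frame.append(abs(acc))
        frames.append(frame)
    return frames


# Hypothetical input: a pure tone standing in for a whistle recording.
SAMPLE_RATE = 8000  # Hz, assumed
TONE_HZ = 1000      # Hz, assumed
FRAME_LEN = 64

tone = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(256)]
spec = stft_magnitude(tone, frame_len=FRAME_LEN, hop=32)

# The strongest bin in the first frame should sit at the tone frequency.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_LEN
print(len(spec), len(spec[0]), peak_hz)
```

Each row of `spec` is one time slice of the spectrogram. A stack of such rows forms the two-dimensional, image-like input that a convolutional classifier of the kind described in the abstract would consume.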
Resource Type
Text
Resource Subtype
Undergraduate Research Option Thesis