Multi-object Tracking from the Classics to the Modern

Kim, Chan Ho
Rehg, James M.
Clements, Mark A.
Visual object tracking is a computer vision problem that has been researched extensively over the past several decades. Many computer vision applications, such as robotics, autonomous driving, and video surveillance, require the capability to track multiple objects in videos. The most popular approach to tracking multiple objects follows the tracking-by-detection paradigm, in which the tracking problem is divided into object detection and data association. In data association, track proposals are often generated by extending the object tracks from the previous frame with new detections in the current frame. The association algorithm then uses a track scorer or classifier to evaluate these track proposals in order to estimate the correspondence between object detections and object tracks. The goal of this dissertation is to design track scorers and classifiers that accurately evaluate the track proposals generated during the association step. In this dissertation, I present novel track scorers and track classifiers that make predictions based on long-term object motion and appearance cues, and I demonstrate their effectiveness in tracking by using them within existing data association frameworks. First, I present an online learning algorithm that can efficiently train a track scorer based on a long-term appearance model for the classical Multiple Hypothesis Tracking (MHT) framework. I show that the classical MHT framework achieves competitive tracking performance even in modern tracking settings in which strong object detectors and strong appearance models are available. Second, I present a novel Bilinear LSTM model as a deep, long-term appearance model that serves as the basis for an end-to-end learned track classifier. The architectural design of Bilinear LSTM is inspired by insights drawn from the classical recursive least squares framework. I incorporate this track classifier into the classical MHT framework in order to demonstrate its effectiveness in object tracking. Third, I present a novel multi-track pooling module that enables the Bilinear LSTM-based track classifier to consider all the objects in the scene simultaneously in order to better handle appearance ambiguities between different objects. I use this track classifier in a simple, greedy data association algorithm and achieve real-time, state-of-the-art tracking performance. I evaluate the proposed methods on public multi-object tracking datasets that capture challenging object tracking scenarios in urban areas.
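The association step described in the abstract can be illustrated with a minimal Python sketch of greedy data association. This is not the dissertation's implementation: the names `Track`, `score_proposal`, `greedy_associate`, and the `min_score` threshold are hypothetical, and a simple bounding-box overlap (IoU) stands in for the learned track scorer or classifier that the dissertation proposes.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    track_id: int
    boxes: list = field(default_factory=list)  # history of (x1, y1, x2, y2) boxes


def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def score_proposal(track, detection):
    """Stand-in scorer (assumption): overlap with the track's most recent box.

    A learned scorer based on long-term motion and appearance cues would
    replace this function.
    """
    return iou(track.boxes[-1], detection)


def greedy_associate(tracks, detections, min_score=0.3):
    """Extend tracks with detections, best-scoring proposals first.

    Returns a list of (track_id, detection_index) matches. Detections left
    unmatched start new tracks.
    """
    # A track proposal is an existing track extended by one new detection.
    proposals = [
        (score_proposal(t, d), ti, di)
        for ti, t in enumerate(tracks)
        for di, d in enumerate(detections)
    ]
    proposals.sort(key=lambda p: p[0], reverse=True)

    used_tracks, used_dets, matches = set(), set(), []
    for score, ti, di in proposals:
        if score < min_score:
            break  # all remaining proposals score lower still
        if ti in used_tracks or di in used_dets:
            continue  # each track and each detection is matched at most once
        tracks[ti].boxes.append(detections[di])
        used_tracks.add(ti)
        used_dets.add(di)
        matches.append((tracks[ti].track_id, di))

    next_id = max((t.track_id for t in tracks), default=-1) + 1
    for di, d in enumerate(detections):
        if di not in used_dets:
            tracks.append(Track(next_id, [d]))
            next_id += 1
    return matches
```

Calling `greedy_associate(tracks, detections)` once per frame extends the matched tracks in place. Swapping `score_proposal` for a stronger scorer changes which proposals are accepted without changing the association loop itself.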