Representing and Recognizing Temporal Sequences

Shi, Yifan
Bobick, Aaron F.
Activity recognition falls within the general area of pattern recognition, but it resides mainly in the temporal domain, which gives it distinctive characteristics. We provide an extensive survey of existing tools, including FSM, HMM, BNT, DBN, SCFG and the Symbolic Network Approach (PNF-network). These tools cannot efficiently meet many of the requirements of activity recognition, which motivates this work to develop a new graphical model: the Propagation Net (P-Net). Many activities can be represented by a partially ordered set of temporal intervals, each of which corresponds to a primitive motion. Each interval has both temporal and logical constraints that control the duration of the interval and its relationship with other intervals. P-Net takes advantage of these fundamental constraints to provide both a graphical conceptual model for describing human knowledge and an efficient computational model that facilitates recognition and learning. P-Nets define an exponentially large joint distribution that standard Bayesian inference cannot handle. We devise two approximation algorithms that interpret a multi-dimensional observation sequence of evidence as a multi-stream propagation process through the P-Net. First, we construct the Local Maximal Search Algorithm (LMSA), which has polynomial complexity. Second, we introduce a particle-filter-based framework, the Discrete Condensation (D-Condensation) algorithm, which samples the discrete state space more efficiently than the original Condensation algorithm. A P-Net based system requires two parts: the P-Net itself and the corresponding detector set. Given topology information and a detector library, P-Net parameters can be extracted easily from a relatively small number of positive examples. To avoid the tedious process of manually constructing the detector library, we introduce a semi-supervised learning framework that builds the P-Net and the corresponding detectors together.
Furthermore, we introduce the Contrast Boosting algorithm, which forces the detectors to be as different as possible, though not necessarily non-overlapping. The classification and learning abilities of P-Nets are verified on three data sets: (1) a vision-tracked indoor activity data set; (2) a vision-tracked glucose monitor calibration data set; and (3) a sensor data set on a simple weight-lifting exercise. Comparisons with standard SCFG and HMM show that a P-Net based system is easier to construct and has a superior ability to classify complex human activity and detect anomalies.
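
As a rough illustration of the representation described above (an activity as a partially ordered set of intervals, each with a duration constraint and ordering constraints on other intervals), the following Python sketch may help. It is not the P-Net model itself: it is deterministic rather than probabilistic, and the class `Interval`, the function `is_consistent`, the interval names, and all duration bounds are hypothetical choices made only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Interval:
    """One primitive motion (hypothetical structure, not from the thesis)."""
    name: str
    min_len: int  # shortest allowed duration, in frames (assumed unit)
    max_len: int  # longest allowed duration, in frames (assumed unit)
    # Names of intervals that must have ended before this one starts.
    parents: list = field(default_factory=list)


def is_consistent(intervals, times):
    """Check a candidate labeling against duration and ordering constraints.

    `times` maps an interval name to a (start, end) pair.
    """
    for iv in intervals:
        start, end = times[iv.name]
        # Temporal constraint: the duration must lie within the allowed range.
        if not (iv.min_len <= end - start <= iv.max_len):
            return False
        # Ordering constraint: every parent must end no later than this start.
        for parent in iv.parents:
            if times[parent][1] > start:
                return False
    return True


# A partial order: "look" is unordered relative to "reach" and "grasp",
# but "lift" requires both "grasp" and "look" to have finished.
activity = [
    Interval("reach", 1, 5),
    Interval("grasp", 1, 3, parents=["reach"]),
    Interval("look", 1, 10),
    Interval("lift", 2, 8, parents=["grasp", "look"]),
]

ok = {"reach": (0, 3), "grasp": (3, 5), "look": (1, 4), "lift": (5, 9)}
bad = {"reach": (0, 3), "grasp": (2, 5), "look": (1, 4), "lift": (5, 9)}

print(is_consistent(activity, ok))
print(is_consistent(activity, bad))
```

In the second labeling, "grasp" starts before "reach" has ended, so the ordering constraint fails. A probabilistic model would instead score such labelings against noisy detector outputs, which is where approximate inference becomes necessary.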