School of Computer Science Technical Report Series

Series Type
Publication Series
Associated Organization(s)
Associated Organization(s)
Organizational Unit
Organizational Unit

Publication Search Results

Now showing 1 - 1 of 1
  • Item
    Deep Segments: Comparisons between Scenes and their Constituent Fragments using Deep Learning
    (Georgia Institute of Technology, 2014-09) Doshi, Jigar ; Mason, Celeste ; Wagner, Alan ; Kira, Zsolt
    We examine the problem of visual scene understanding and abstraction from first person video. This is an important problem and successful approaches would enable complex scene characterization tasks that go beyond classification, for example characterization of novel scenes in terms of previously encountered visual experiences. Our approach utilizes the final layer of a convolutional neural network as a high-level, scene specific, representation which is robust enough to noise to be used with wearable cameras. Researchers have demonstrated the use of convolutional neural networks for object recognition. Inspired by results from cognitive and neuroscience, we use output maps created by a convolutional neural network as a sparse, abstract representation of visual images. Our approach abstracts scenes into constituent segments that can be characterized by the spatial and temporal distribution of objects. We demonstrate the viability of the system on video taken from Google Glass. Experiments examining the ability of the system to determine scene similarity indicate ρ (384) = ±0:498 correlation to human evaluations and 90% accuracy on a category match problem. Finally, we demonstrate high-level scene prediction by showing that the system matches two scenes using only a few initial segments and predicts objects that will appear in subsequent segments.