Title:
Weakly Supervised Learning of Object Segmentations from Web-Scale Video
dc.contributor.author | Hartmann, Glenn | |
dc.contributor.author | Grundmann, Matthias | |
dc.contributor.author | Hoffman, Judy | |
dc.contributor.author | Tsai, David | |
dc.contributor.author | Kwatra, Vivek | |
dc.contributor.author | Madani, Omid | |
dc.contributor.author | Vijayanarasimhan, Sudheendra | |
dc.contributor.author | Essa, Irfan | |
dc.contributor.author | Rehg, James M. | |
dc.contributor.author | Sukthankar, Rahul | |
dc.contributor.corporatename | Georgia Institute of Technology. College of Computing | en_US |
dc.contributor.corporatename | Georgia Institute of Technology. School of Interactive Computing | en_US |
dc.contributor.corporatename | Georgia Institute of Technology. Center for Robotics and Intelligent Machines | en_US |
dc.contributor.corporatename | University of California, Berkeley | en_US |
dc.contributor.corporatename | Google Research | en_US |
dc.date.accessioned | 2013-08-28T16:10:23Z | |
dc.date.available | 2013-08-28T16:10:23Z | |
dc.date.issued | 2012-10 | |
dc.description | ©2012 Springer-Verlag Berlin Heidelberg. The original publication is available at www.springerlink.com | en_US |
dc.description | DOI: 10.1007/978-3-642-33863-2_20 | |
dc.description.abstract | We propose to learn pixel-level segmentations of objects from weakly labeled (tagged) internet videos. Specifically, given a large collection of raw YouTube content, along with potentially noisy tags, our goal is to automatically generate spatiotemporal masks for each object, such as "dog", without employing any pre-trained object detectors. We formulate this problem as learning weakly supervised classifiers for a set of independent spatio-temporal segments. The object seeds obtained using segment-level classifiers are further refined using graphcuts to generate high-precision object masks. Our results, obtained by training on a dataset of 20,000 YouTube videos weakly tagged into 15 classes, demonstrate automatic extraction of pixel-level object masks. Evaluated against a ground-truthed subset of 50,000 frames with pixel-level annotations, we confirm that our proposed methods can learn good object masks just by watching YouTube. | en_US |
dc.embargo.terms | null | en_US |
dc.identifier.citation | Hartmann, G.; Grundmann, M.; Hoffman, J.; Tsai, D.; Kwatra, V.; Madani, O.; Vijayanarasimhan, S.; Essa, I.A.; Rehg, J.M.; & Sukthankar, R. (2012). “Weakly Supervised Learning of Object Segmentations from Web-Scale Video”. Computer Vision – ECCV 2012. Workshops and Demonstrations 7-13 October 2012. Proceedings, Part I. In Lecture Notes in Computer Science, 2012, Vol. 7583, pp. 198-208. | en_US |
dc.identifier.doi | 10.1007/978-3-642-33863-2_20 | |
dc.identifier.isbn | 978-3-642-33862-5 (Print) | |
dc.identifier.isbn | 978-3-642-33863-2 (Online) | |
dc.identifier.issn | 0302-9743 | |
dc.identifier.uri | http://hdl.handle.net/1853/48736 | |
dc.language.iso | en_US | en_US |
dc.publisher | Georgia Institute of Technology | en_US |
dc.publisher.original | Springer-Verlag Berlin / Heidelberg | |
dc.subject | Object masks | en_US |
dc.subject | Spatiotemporal segmentation | en_US |
dc.subject | Video segmentation | en_US |
dc.subject | Video stabilization | en_US |
dc.title | Weakly Supervised Learning of Object Segmentations from Web-Scale Video | en_US |
dc.type | Text | |
dc.type.genre | Book Chapter | |
dc.type.genre | Proceedings | |
dspace.entity.type | Publication | |
local.contributor.author | Essa, Irfan | |
local.contributor.author | Rehg, James M. | |
local.contributor.author | Hoffman, Judy | |
local.contributor.corporatename | Institute for Robotics and Intelligent Machines (IRIM) | |
relation.isAuthorOfPublication | 84ae0044-6f5b-4733-8388-4f6427a0f817 | |
relation.isAuthorOfPublication | af5b46ec-ffe2-4ce4-8722-1373c9b74a37 | |
relation.isAuthorOfPublication | 403cff3c-8f25-4db5-978b-ef617a9f8b6a | |
relation.isOrgUnitOfPublication | 66259949-abfd-45c2-9dcc-5a6f2c013bcf |