Title:
Behind the Scenes: Decoding Intent from First Person Video

dc.contributor.author Park, Hyun Soo
dc.contributor.corporatename Georgia Institute of Technology. Institute for Robotics and Intelligent Machines en_US
dc.contributor.corporatename University of Minnesota. Department of Computer Science and Engineering en_US
dc.date.accessioned 2017-02-08T12:02:34Z
dc.date.available 2017-02-08T12:02:34Z
dc.date.issued 2017-02-01
dc.description Presented on February 1, 2017 from 12:00 p.m.-1:00 p.m. in the Marcus Nanotechnology Building, Rooms 1116-1118 on the Georgia Tech campus. en_US
dc.description Hyun Soo Park is an assistant professor in the Department of Computer Science and Engineering at the University of Minnesota. He is interested in understanding human visual sensorimotor behaviors from first-person cameras. Prior to joining UMN, he was a postdoctoral fellow in the GRASP Lab at the University of Pennsylvania. Park earned his Ph.D. from Carnegie Mellon University. en_US
dc.description Runtime: 56:30 minutes en_US
dc.description.abstract A first-person video records not only what is out in the environment but also what is in our head (intention and attention) at the time, via social and physical interactions. Intention is invisible, but it can be revealed by fixation, camera motion, and visual semantics. In this talk, I will present a computational model to decode our intention and attention from first-person cameras when interacting with (1) scenes and (2) people. A person exerts his/her intention by applying physical force and torque to scenes and objects, which results in visual sensation. We leverage this first-person visual sensation to precisely compute the force and torque that the first person experienced by integrating visual semantics, 3D reconstruction, and inverse optimal control. Such visual sensation also allows association with our past experiences, which eventually provides a strong cue for predicting future activities. When interacting with other people, social attention is a medium that controls group behaviors, e.g., how they form a group and move. We learn the geometric and visual relationship between group behaviors and social attention measured from first-person cameras. Based on the learned relationship, we derive a predictive model to localize social attention from a third-person view. en_US
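dc.description To make the force-and-torque idea in the abstract concrete, the following is a minimal illustrative sketch only, not the speaker's pipeline (which integrates visual semantics, 3D reconstruction, and inverse optimal control): given a camera trajectory estimated by 3D reconstruction, simple finite-difference inverse dynamics (F = m * a) recovers the net force the camera wearer experienced. The mass value, trajectory, and function name below are hypothetical.

import numpy as np

def net_force_from_trajectory(positions, dt, mass):
    """positions: (T, 3) camera positions from 3D reconstruction;
    dt: time step between frames (s); mass: wearer's mass (kg)."""
    velocity = np.gradient(positions, dt, axis=0)      # first time derivative
    acceleration = np.gradient(velocity, dt, axis=0)   # second time derivative
    return mass * acceleration                         # Newton's second law

# Hypothetical example: a camera accelerating upward at g for one second at 30 fps.
t = np.linspace(0.0, 1.0, 30)
trajectory = np.stack([np.zeros_like(t), np.zeros_like(t), 0.5 * 9.8 * t**2], axis=1)
forces = net_force_from_trajectory(trajectory, dt=1.0 / 30.0, mass=70.0)
print(forces[15])  # approx. [0, 0, 686] N at mid-sequence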
dc.format.extent 56:30 minutes
dc.identifier.uri http://hdl.handle.net/1853/56436
dc.language.iso en_US en_US
dc.publisher Georgia Institute of Technology en_US
dc.relation.ispartofseries IRIM Seminar Series
dc.subject Computer vision en_US
dc.subject Wearable camera en_US
dc.title Behind the Scenes: Decoding Intent from First Person Video en_US
dc.type Moving Image
dc.type.genre Lecture
dspace.entity.type Publication
local.contributor.corporatename Institute for Robotics and Intelligent Machines (IRIM)
local.relation.ispartofseries IRIM Seminar Series
relation.isOrgUnitOfPublication 66259949-abfd-45c2-9dcc-5a6f2c013bcf
relation.isSeriesOfPublication 9bcc24f0-cb07-4df8-9acb-94b7b80c1e46
Files
Original bundle
Name:
park.mp4
Size:
453.74 MB
Format:
MP4 Video file
Description:
Download Video
Name:
park_videostream.html
Size:
962 B
Format:
Hypertext Markup Language
Description:
Streaming Video
License bundle
Name:
license.txt
Size:
3.13 KB
Format:
Item-specific license agreed upon to submission
Description: