IRIM Seminar Series

Series Type
Event Series
Associated Organization(s)

Publication Search Results

Now showing 1 - 10 of 13
  • Item
    Perceiving the 3D World from Images
    (Georgia Institute of Technology, 2013-11-15) Savarese, Silvio
    When we look at an environment such as a coffee shop, we don't just recognize the objects in isolation, but rather perceive a rich scenery of the 3D space, its objects, and all the relations among them. This allows us to effortlessly navigate through the environment, or to interact with and manipulate objects in the scene with amazing precision. The past several decades of computer vision research have, on the other hand, addressed the problems of 2D object recognition and 3D space reconstruction as two independent problems. Tremendous progress has been made in both areas. However, while methods for object recognition attempt to describe the scene as a list of class labels, they often make mistakes due to the lack of a coherent understanding of the 3D spatial structure. Similarly, methods for 3D scene modeling can produce accurate metric reconstructions but cannot put the reconstructed scene into a semantically useful form. A major line of work from my group in recent years has been to design intelligent visual models that understand the 3D world by integrating 2D and 3D cues, inspired by what humans do. In this talk I will introduce a novel paradigm whereby objects and 3D space are modeled jointly to achieve a coherent and rich interpretation of the environment. I will start by giving an overview of our research on detecting objects and determining their geometric properties such as 3D location, pose, or shape. Then, I will demonstrate that these detection methods play a critical role in modeling the interplay between objects and space, which, in turn, enables simultaneous semantic reasoning and 3D scene reconstruction. I will conclude this talk by demonstrating that our novel paradigm for scene understanding is potentially transformative in application areas such as autonomous or assisted navigation, robotics, automatic 3D modeling of urban environments, and surveillance.
  • Item
    Algebraic Methods for Nonlinear Dynamics and Control
    (Georgia Institute of Technology, 2013-10-23) Tedrake, Russ
    Some years ago, experiments with passive dynamic walking convinced me that finding efficient algorithms to reason about the nonlinear dynamics of our machines would be the key to turning a lumbering humanoid into a graceful ballerina. For linear systems (and nearly linear systems), these algorithms already exist: many problems of interest for design and analysis can be solved very efficiently using convex optimization. In this talk, I'll describe a set of relatively recent advances using polynomial optimization that are enabling a similar convex-optimization-based approach to nonlinear systems. I will give an overview of the theory and algorithms, and demonstrate their application to hard control problems in robotics, including dynamic legged locomotion, humanoids, and robotic birds. Surprisingly, this polynomial (a.k.a. algebraic) view of rigid-body dynamics also extends naturally to systems with frictional contact, a problem that intuitively feels very discontinuous.
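To make the idea concrete: the polynomial methods Tedrake describes certify stability properties (such as a region of attraction) of polynomial dynamics via convex, sum-of-squares programs. The sketch below is only a toy numerical stand-in for that machinery, using a made-up scalar system and a hand-chosen Lyapunov function; the system, the candidate function, and the sampling check are illustrative assumptions, not the talk's actual algorithms.

```python
# Toy illustration of Lyapunov-based region-of-attraction reasoning for a
# polynomial system.  (The talk's sum-of-squares machinery replaces this
# sampling check with a convex optimization that proves the condition exactly.)
import numpy as np

def xdot(x):
    return -x + x**3       # assumed toy polynomial dynamics; origin is stable

def V(x):
    return x**2            # hand-chosen candidate Lyapunov function

def Vdot(x):
    return 2 * x * xdot(x)  # dV/dt along trajectories = -2x^2 (1 - x^2)

# Look for the largest sublevel set {V(x) < c} on which Vdot < 0 (for x != 0).
# Analytically, Vdot < 0 exactly when 0 < |x| < 1, so the certified region of
# attraction is |x| < 1, i.e. c = 1.  We verify this on a dense sample grid.
xs = np.linspace(-0.999, 0.999, 2001)
xs = xs[np.abs(xs) > 1e-6]          # exclude the equilibrium itself
assert np.all(Vdot(xs) < 0)
print("Vdot < 0 for all sampled 0 < |x| < 1; certified ROA: V(x) < 1")
```

The point of the polynomial view is that this "is Vdot negative on a set?" question becomes a semidefinite program when dynamics and Lyapunov candidates are polynomial, which is what makes the approach scale beyond toy examples.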
  • Item
    Software-enabled Cost-effective Robotics
    (Georgia Institute of Technology, 2013-10-09) Tardella, Neil ; English, James ; Enowaki, Takeshi
    Traditional industrial robots are built for heavy payloads, high accuracy, and fast execution. Though these robots are well suited for many automation tasks, the next generation of robots will be smaller, lighter weight, easier to use, and more cost-effective. This talk will explore the ways Energid Technologies is advancing control and machine-vision software to compensate for limitations in lower-cost hardware. We will discuss a new lightweight kinematically redundant robot arm being developed through Energid subsidiary Robai that takes advantage of these software advances.
  • Item
    Characterizing and Improving the Performance of Teleoperated Mobile Manipulators
    (Georgia Institute of Technology, 2013-09-11) Tilbury, Dawn M.
    Vehicles in racing simulation video games speed down virtual racecourses at over 100 mph. However, teleoperated mobile manipulators in search and rescue operations inch along at an excruciatingly slow pace, even though time is of the essence. In both cases, the human operator is in the loop, giving control input to the vehicle. In the first case, however, the driver only needs to control the direction of the vehicle through a steering wheel or joystick; in the second case, the additional degrees of freedom of the manipulator arm are added. Of course, the environments are also different: a structured simulated world as opposed to an uncertain real disaster area. For multiple reasons, including communications latency, actuator limitations, and inefficient human-robot interaction strategies, even basic robot teleoperation tasks are painfully slow, both in robot mobility and in manipulator arm control. For robots to become more useful tools for humans in the future, the speed at which robot-assisted tasks can be completed must be increased. In this talk, I present a framework we have developed for characterizing and understanding the key factors that limit the performance of teleoperated mobile manipulators, where performance is defined as a combination of speed, accuracy, and safety (lack of collisions). Our analysis framework depends on having models of delay and performance for the different components of the system, and I describe some models that we have created based on user testing. We consider operator feedback using video and virtual reality, and compare gamepad user input to a master-slave manipulator. Since adding semi-autonomous behaviors to a teleoperated robot can improve performance, we describe our results in rollover prevention. I conclude with a discussion of future work in the area.
  • Item
    Magnetic Capsule Robots for Gastrointestinal Endoscopy and Abdominal Surgery
    (Georgia Institute of Technology, 2013-09-04) Valdastri, Pietro
    The talk will move from capsule robots for gastrointestinal endoscopy toward a new generation of surgical robots and devices, with a substantial reduction in invasiveness as the main driver for innovation. Wireless capsule endoscopy has already been extremely helpful for the diagnosis of diseases in the small intestine. Specific wireless capsule endoscopes have been proposed for colon inspection, but have never reached the diagnostic accuracy of standard colonoscopy. In the first part of the talk, we will discuss enabling technologies that have the potential to transform colonoscopy into a painless procedure. These technologies include magnetic manipulation of capsule endoscopes, real-time pose tracking, and intermagnetic force measurement. The second part of the talk will provide an overview of the development of novel robotic solutions for single-incision robotic surgery. In particular, a novel surgical robotic platform based on local magnetic actuation will be presented as a possible approach to further minimize access trauma. The final part of the talk will introduce the novel concept of intraoperative wireless tissue palpation, presenting a capsule that can be directly manipulated by the surgeon to create a stiffness distribution map in real time. This stiffness map can then be used to guide tissue resection with the goal of minimizing the healthy tissue removed along with the tumor.
  • Item
    From Learning Movement Primitives to Associative Skill Memories
    (Georgia Institute of Technology, 2013-08-21) Schaal, Stefan
    Skillful and goal-directed interaction with a dynamically changing world is among the hallmarks of human perception and motor control. Understanding the mechanisms of such skills and how they are learned is a long-standing question in both neuroscience and technology. This talk develops a general framework for how motor skills can be learned. At the heart of our work is a general representation of motor skills in terms of movement primitives as nonlinear attractor systems, the ability to generalize a motor skill to novel situations and to adjust it to sudden perturbations, and the ability to employ imitation learning, trial-and-error learning, and model-based learning to improve planning and control of a motor skill. Our framework has close connections to known phenomena in the behavioral and neural sciences, and it also intuitively bridges dynamic systems theory and optimization theory in motor control, two rather disjoint approaches. We evaluate our approach in various behavioral and robotic studies with anthropomorphic and humanoid robots. Finally, we discuss how to go beyond simple movement primitives to a more complete perception-action-learning system, and speculate on the concept of Associative Skill Memories as an interesting approach.
  • Item
    Fundamentals of Walking and Running: From Animal Experiments to Robot Demonstrations
    (Georgia Institute of Technology, 2013-04-17) Hurst, Jonathan W.
    Dynamic walking and running are high-dimensional, highly dynamic, self-stable behaviors that are best implemented by a system composed of passive elements, such as springs, and active control from sensors and computing. Animals, our best example of this dynamical system, are able to negotiate terrain that varies widely in height as well as firmness, with excellent energy economy. Robots cannot yet approach animal performance, and we contend that this gap results from a lack of scientific understanding of the fundamental principles of legged locomotion rather than from any technological limitation. We seek to answer these fundamental questions of how legged locomotion works, and to demonstrate discoveries by building robots and implementing principled controllers. We have found that simple controllers of the swing leg during flight can replicate observed behavior from animals, including the prioritization of injury avoidance over a steady gait in uneven terrain. Further, we have shown that a simple stance-phase force control method can explain observed biological features such as apparent leg-stiffness changes or energy insertion on dissipative ground. The combination of these straightforward controllers allows a simple model to handle surprisingly variable terrain with no terrain knowledge. We are currently implementing these controllers on ATRIAS, our bipedal robot.
  • Item
    Integrated Task and Motion Planning in Belief Space
    (Georgia Institute of Technology, 2013-04-03) Lozano-Pérez, Tomás
    This talk describes an integrated strategy for planning, perception, state estimation, and action in complex mobile manipulation domains, based on planning in the belief space of probability distributions over states using hierarchical goal regression (pre-image back-chaining). We develop a vocabulary of logical expressions that describe sets of belief states, which serve as goals and subgoals in the planning process. We show that a relatively small set of symbolic operators can give rise to task-oriented perception in support of the manipulation goals. An implementation of this method is demonstrated in simulation and on a real PR2 robot, showing robust, flexible solution of mobile manipulation problems with multiple objects and substantial uncertainty.
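The "belief space" in this abstract is the space of probability distributions over world states, rather than the states themselves. The minimal sketch below only illustrates that underlying notion with a one-step discrete Bayes update; the three-location world, the sensor model, and all numbers are invented for illustration and are not from the talk's planner.

```python
# Minimal sketch of a "belief state": a probability distribution over discrete
# world states, updated by Bayes' rule after a noisy observation.
import numpy as np

def bayes_update(belief, likelihood):
    """Posterior proportional to prior * P(observation | state), renormalized."""
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Uniform prior belief over three possible object locations (assumed toy world).
belief = np.array([1/3, 1/3, 1/3])
# Assumed sensor model: "object seen at location 0", with some chance the
# detection fires even when the object is elsewhere.
likelihood = np.array([0.8, 0.1, 0.1])
belief = bayes_update(belief, likelihood)
print(belief)  # probability mass concentrates on location 0
```

A belief-space planner chains operators whose preconditions and effects are expressed over distributions like this one (e.g. "probability of object at location 0 exceeds 0.95"), which is what lets perception actions and manipulation actions be planned in one framework.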
  • Item
    Towards Visual Route Following for Mobile Robots…Forever!
    (Georgia Institute of Technology, 2013-03-20) Barfoot, Tim
    In this talk I will describe a particular approach to visual route following for mobile robots that we have developed, called Visual Teach & Repeat (VT&R), and what I think the next steps are to make this system usable in real-world applications. We can think of VT&R as a simple form of simultaneous localization and mapping (without the loop closures) combined with a path-tracking controller; the idea is to pilot a robot manually along a route once and then be able to repeat the route (in its own tracks) autonomously many, many times using only visual feedback. VT&R is useful for such applications as load delivery (mining), sample return (space exploration), and perimeter patrol (security). Although we have demonstrated this technique over more than 300 km of driving on several different robots, there are still many challenges to meet before we can say it is ready for real-world applications. These include (i) visual scene changes such as lighting, (ii) physical scene changes such as path obstructions, and (iii) vehicle changes such as tire wear. I'll discuss our progress to date in addressing these issues and the next steps moving forward. There will be lots of videos.
  • Item
    Visual Search and Summarization
    (Georgia Institute of Technology, 2013-03-06) Grauman, Kristen
    Widespread visual sensors and unprecedented connectivity have left us awash with visual data, from online photo collections, home videos, and news footage to medical images and surveillance feeds. How can we efficiently browse image and video collections based on semantically meaningful criteria? How can we bring order to the data, beyond manually defined keyword tags? We are exploring these questions in our recent work on interactive visual search and summarization. I will first present a novel form of interactive feedback for visual search, in which a user helps pinpoint the content of interest by making visual comparisons between the envisioned target and reference images. The approach relies on a powerful mid-level representation of interpretable relative attributes to connect the user's descriptions to the system's internal features. Whereas traditional feedback limits input to coarse binary labels, the proposed "WhittleSearch" lets a user state precisely what about an image is relevant, leading to more rapid convergence to the desired content. Turning to issues in video browsing, I will then present our work on automatic summarization of egocentric videos. Given a long video captured with a wearable camera, our method produces a short storyboard summary. Whereas existing summarization methods define sampling-based objectives (e.g., to maximize diversity in the output summary), we take a "story-driven" approach that predicts the high-level importance of objects and their influence across subevents. We show this leads to substantially more accurate summaries, allowing a viewer to quickly understand the gist of a long video. This is work done with Adriana Kovashka, Yong Jae Lee, Devi Parikh, and Lu Zheng.
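The core mechanic of relative-attribute feedback can be shown with a tiny filtering sketch. The attribute name, the images, and the scores below are all made up for illustration; the real WhittleSearch system learns these attribute scores from data, while here they are hard-coded.

```python
# Toy sketch of WhittleSearch-style relative-attribute feedback: the user says
# the target is, e.g., "shinier than reference image r", and the system keeps
# only candidates whose (precomputed) attribute score exceeds r's score.
shininess = {"img_a": 0.2, "img_b": 0.7, "img_c": 0.9, "ref": 0.5}

def whittle(candidates, scores, ref, more=True):
    """Keep candidates whose attribute score is above (or below) the reference."""
    if more:
        return [c for c in candidates if scores[c] > scores[ref]]
    return [c for c in candidates if scores[c] < scores[ref]]

remaining = whittle(["img_a", "img_b", "img_c"], shininess, "ref", more=True)
print(remaining)  # -> ['img_b', 'img_c']
```

Each comparison the user supplies intersects another half-space of the attribute score space with the current candidate set, which is why this feedback converges on the target faster than binary relevant/irrelevant labels.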