So I'm Zsolt Kira, an assistant professor here in the School of Interactive Computing; I joined in 2018. My research is really centered around the intersection of machine learning, deep learning, and robotics. Specifically, we're tackling some of the limitations of machine learning that are bottlenecks to deployment on robotic systems. I have a large group of PhD students, as well as master's students, and I typically teach the deep learning class in the spring.

We're really focusing on a number of questions, all addressing the limitations of machine learning. The first question is: how can robots deal with changing environments? This is a key bottleneck, because essentially you train a machine-learning model and then you deploy it on a robot, and inevitably things will change. My example is always self-driving cars: scooters appeared at some point, and now I think they're disappearing as well, or at least cluttering the streets. So really the question is, how can we deal with that, and how can we deal with it without essentially gathering all of the world's data and annotating it through human labor? Can we leverage unlabeled data to address these generalization issues? There's a whole bunch of buzzwords here, including semi-supervised and self-supervised learning, which essentially cover learning from a mixture of labeled and unlabeled data. How can we build more generalizable machine-learning systems? Few-shot learning: can a human or someone else provide just a few examples of new things, and the robot automatically learns them (I'll show a small sketch of this idea below)? And how can we do this in a continuous way? That's continual learning, where you're continuously adapting your representations.

The second aspect is how we can scale. In deep learning, scale, as you've seen in the media, has really been the key to driving a lot of the new, cool things we can do. Robotics is much more difficult to scale because we have hardware: real-world experiments take a long time, they fail a lot, and so on. Some of the areas we've been working on, including with Dhruv Batra, Meta, and other collaborators, include better simulation, both more photo-realistic and physics-enabled. Can we use all of these self-supervised and semi-supervised methods in situ, on an embodied robot? And I'm starting to get more interested in how we can combine this with language; clearly there's been a lot of progress in NLP and related areas. There's a whole other area I also work in, distributed robotics, but I won't really focus on it in this talk.

So the key message for this talk is that the key bottleneck of perception as it's applied to robotics is really generalization. This includes things like the long tail: things that are very rare. I focus more on perception; there are certainly still challenges on the actuation side, but on the perception side these are really the key bottlenecks. And the artificial train/test splits that we use in machine learning are unable to really test these things or push models to generalize. How can we develop these algorithms without actually having to deploy robots in the real world?
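To make the few-shot piece concrete, here is a minimal sketch of nearest-prototype classification in the spirit of prototypical networks. The encoder, shapes, and function names are my own illustrative placeholders, not a specific method from our group.

```python
# A minimal sketch of few-shot recognition via nearest-prototype classification
# (hypothetical names and shapes; any pretrained feature encoder could be used).
import torch
import torch.nn.functional as F

def few_shot_classify(encoder, support_x, support_y, query_x, num_classes):
    """Classify queries against class prototypes built from a few labeled examples.

    support_x: (N, ...) few labeled examples; support_y: (N,) integer labels.
    query_x:   (M, ...) new observations to classify.
    """
    with torch.no_grad():
        s = F.normalize(encoder(support_x), dim=-1)  # embed the support set
        q = F.normalize(encoder(query_x), dim=-1)    # embed the queries
    # Prototype = mean embedding of each class's support examples.
    protos = torch.stack([s[support_y == c].mean(0) for c in range(num_classes)])
    # Assign each query to the nearest prototype by cosine similarity.
    return (q @ protos.T).argmax(dim=-1)
```

The appeal for robotics is that a new category costs only a handful of labeled examples: one mean embedding per class, no retraining.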
And so again, the two things that I posit can drive a lot of this are better simulation, and better perception algorithms that deal with the real-world distributions of labeled and unlabeled data that robots actually get.

On the simulation side, we have some work with a co-advised student, Andrew, and collaborators at Meta, looking at Habitat 2.0. This is a simulation environment that has physics in it, so you can actually have a mobile manipulator picking things up, and it runs at tens of thousands of frames per second on a server or distributed cluster. That really allows us to push the boundaries of what we're exploring, both on the perception side and on the decision-making side. There are several elements to this; it's a NeurIPS paper, so feel free to check it out. One aspect is something called ReplicaCAD: Meta has Replica, a real-world 3D reconstruction dataset that's super dense and high fidelity across a number of home environments, and they've taken those assets and made them all usable in a simulation environment where you can throw them in, along with a bunch of engineering tricks to really speed things up, such that it's hundreds of times faster than what we could do before. And now we can actually train robots in simulation to do these cool long-horizon tasks.

So we tried several things. The simplest is just a pick task: picking up objects. The general problem that we're interested in, though, is the rearrangement problem: you're given a bunch of objects in an environment and you want to put them into a different configuration. You're given where all the objects are in the beginning and where you want them to be; cleaning up a home is an example of this, and that's what you want the robot to do. We can apply a very simple reinforcement learning architecture where you take the visual information, maybe your start state and your current position, feed it through some neural network (LSTMs, and these days transformers and so on), and output both a value function and a policy that decides how to act (see the sketch below). You can now train these at very large scale. This is again just the pick task; we've also done much more complicated long-horizon tasks in the paper, but I won't talk too much about that. You can train these for a long time: that paper actually took 6 billion steps. Of course, this is a limitation of reinforcement learning and an opportunity to fix it, but it also allows us to investigate the perceptual aspects.

What's cool is that we also compare this reinforcement learning approach to a traditional sense-plan-act robotics approach, where you essentially have a goal sampler and a sampling-based motion planner that generates a motion plan through RRTs and then executes it through normal control-based methods. So we can compare these two. For example, the robot can have immediate perception in the form of a point cloud from RGB-D, feed that to the RRT to actually plan a trajectory, and then execute it; that's one example. So now we can compare the two: a learning-based method and a traditional method. And by the way, we also have sense-plan-act with privileged information, which is literally the simulator state rather than perception from RGB-D. What you can see here is that reinforcement learning is actually super awesome when it's applied to seen data.
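To illustrate the kind of policy architecture I'm describing, here is a minimal PyTorch sketch of a recurrent actor-critic network: a small CNN encodes the egocentric image, an LSTM fuses it with goal and proprioceptive state, and two heads output action logits and a value estimate. The layer sizes and names are illustrative assumptions, not the exact Habitat 2.0 networks.

```python
# A minimal sketch of a recurrent actor-critic visuomotor policy
# (hypothetical sizes and names; real architectures differ in detail).
import torch
import torch.nn as nn

class VisuomotorPolicy(nn.Module):
    def __init__(self, num_actions, state_dim=6, hidden_dim=512):
        super().__init__()
        # Small CNN encoder for the egocentric RGB(-D) observation.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(hidden_dim), nn.ReLU(),
        )
        # Recurrent core fuses visual features with goal/proprioceptive state.
        self.rnn = nn.LSTM(hidden_dim + state_dim, hidden_dim, batch_first=True)
        self.policy_head = nn.Linear(hidden_dim, num_actions)  # action logits
        self.value_head = nn.Linear(hidden_dim, 1)             # state-value estimate

    def forward(self, rgb, state, hidden=None):
        # rgb: (B, 3, H, W); state: (B, state_dim), e.g. goal plus joint readings.
        feats = self.encoder(rgb)
        x = torch.cat([feats, state], dim=-1).unsqueeze(1)  # add a time dimension
        out, hidden = self.rnn(x, hidden)
        out = out.squeeze(1)
        return self.policy_head(out), self.value_head(out), hidden
```

In practice you would train something like this with an on-policy algorithm such as PPO, carrying the LSTM hidden state forward across environment steps.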
However, it doesn't really generalize very well to new layouts, to new objects, and to new types of receptacles, that is, containers. So clearly, perceptual generalization, aside from all the other research you can do on the actuation side, is really still a bottleneck in applying these things to real-world situations. And so again, what we want is what's on the right side: to develop algorithms that we can iterate on quickly through these fast simulators, and specifically to improve them using unlabeled data, continual learning methods, and so on. I'm going to show a few examples of work we've done toward this, but it's still pretty early work that we can certainly push a lot further.

One of the key limitations of the way we do things in machine learning, as I mentioned, is that we design some architecture, throw it at some large training dataset, and then deploy it. But if robots are going to be in the real world for long periods of time, they're going to continuously encounter new objects, new tasks, and so on. So the key thing we want instead is for the robot to actually learn in an embodied sense as it's acting and exploring the world. This is called continual learning. It's studied in the machine-learning literature, including by our group, but in very contrived settings, and we're trying to push it toward more realistic robotic settings. One of the key problems is that if a neural network continues to learn over a sequence of things, it tends to forget the things it learned in the beginning. So the question is, how can we do this without catastrophic forgetting?

We have some cool work looking at this. One of the key things we do is take the very contrived machine-learning settings and add unlabeled data, because in robotics you always have unlabeled data for free, often multi-modal. We also make the data distributions correlated: for example, in a kitchen you always see the same types of objects. This sounds like it won't matter, but what we show is that it actually does matter: contrived machine-learning settings don't account for such correlations, and when we apply those algorithms to correlated data, they break, and we need to develop new methods. So the questions are: how can we leverage unlabeled data, and how can we deal with this correlated unlabeled data? I'm not going to go into the methods, but we have an architecture that does out-of-distribution detection to deal with new, unknown things, and then feeds into a bunch of different loss functions inspired by self-supervised and semi-supervised learning. We can show, on a forgetting-based metric where higher is better, that our method does much better compared to existing methods, which actually store the data in order to prevent forgetting (I'll sketch a typical forgetting measure below).

So that's one example. Another example is how to actually discover new things. What I showed before was continual learning, where you're presented labeled data in a sequence. Can we instead discover new things in unlabeled data that we haven't seen before, without being given labels? This is something like open-world learning, where you're not given the list of categories beforehand.
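As a concrete illustration of how forgetting is typically quantified, here is a small sketch. This is a standard formulation from the continual-learning literature, not necessarily the exact metric reported in our paper.

```python
# A minimal sketch of a standard continual-learning forgetting measure
# (my own illustrative formulation).
import numpy as np

def average_forgetting(acc):
    """acc[i, j] = accuracy on task j after training on task i (tasks 0..T-1).

    Forgetting for task j is the best accuracy it ever reached, minus its
    accuracy after the final task; we average over all tasks but the last.
    """
    acc = np.asarray(acc)
    best = acc[:-1, :-1].max(axis=0)   # best accuracy per earlier task
    final = acc[-1, :-1]               # accuracy after the last task
    return float((best - final).mean())

# Example: accuracy on task 0 drops from 0.9 to 0.6 after learning task 1.
print(average_forgetting([[0.9, 0.0], [0.6, 0.8]]))  # prints 0.3
```

Methods that store and replay old data lower this number by construction; the interesting question is how far you can get without storing it.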
We have a bunch of older work that looked at this: how to learn similarity functions on training data and then apply them to cluster and classify, that is, classifying known things and clustering unknown things jointly. One of the key things we're really working on these days is how to combine this with language, because even if you're clustering unknown things, how do you actually name them without a human in the loop? These days NLP has developed really awesome multi-modal models that bridge vision and language, something called CLIP, which, if you're not familiar with it, is trained on roughly 400 million paired image-text examples. It essentially trains to align the features of the positive pairs and separate the negative pairs. We can actually use this to discover new categories much better if we leverage large-scale multimodal training (there's a small sketch of the naming idea after the Q&A below). I'm not going to go too much into it, but essentially I think language is a key capability that we can leverage in robotics to deal with unknown things, because language inherently represents the structure of the world: how categories are connected to each other, and so on.

Okay, with that, I will wrap up. In conclusion, there's lots of cool progress in learning for robotics, but I think there are still a lot of challenges. Again, these are investigated in the machine-learning community, but often in very contrived ways. What I'm really pushing for is to investigate them in the embodied setting, where you actually have an agent in simulation wandering through and exploring a bunch of different environments (offices, homes, outdoors, and so on), and to really deal with issues such as: how can we continually learn, and how can we learn in an open world by discovering new things with very small amounts of labeled data? All of this work, of course, was done by my PhD students, and I should acknowledge our funding agencies as well. Cool, thank you.

[Audience question, partly inaudible, about reward hacking in these systems.] So the question is: in the rearrangement problem, where we have a bunch of objects in a particular configuration and we want to reconfigure them, say for cleaning, do we have to deal with reward hacking? Again, we're dealing mostly with the perception issue, but to use reinforcement learning you of course need to develop a large set of rewards to train these types of things. We do have some work on inverse reinforcement learning, where you can take demonstrations from humans and back out the reward function. But I certainly agree that it's an unsolved problem: how to avoid that whenever you deal with reinforcement learning. And of course, on the traditional sense-plan-act robotics side, there's a lot of hacking there too, right? To get the RRT stuff working, there's a lot of effort, and you need a PhD student who knows these kinds of things. So how do you balance these? There's no free lunch here so far, but there are some methods, like inverse reinforcement learning and learning from demonstration, that make it easier.
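Coming back to the CLIP-based naming idea from earlier: here is a minimal sketch of labeling a discovered cluster by matching its images against candidate text prompts, assuming the open-source OpenAI clip package. The candidate names and image paths are hypothetical.

```python
# A minimal sketch of naming a discovered (unlabeled) cluster with CLIP
# (hypothetical vocabulary and file paths; requires pip install clip from
# the openai/CLIP repository).
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

candidate_names = ["a photo of a mug", "a photo of a scooter", "a photo of a plant"]
text = clip.tokenize(candidate_names).to(device)

# Images belonging to one discovered cluster.
paths = ["obj1.jpg", "obj2.jpg"]
images = torch.stack([preprocess(Image.open(p)) for p in paths]).to(device)

with torch.no_grad():
    img_feat = model.encode_image(images)
    txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    # Average the cluster's image embeddings, then pick the closest name.
    sims = (img_feat.mean(dim=0, keepdim=True) @ txt_feat.T).squeeze(0)

print(candidate_names[sims.argmax().item()])
```

The candidate vocabulary here is fixed for simplicity; in an open-world setting you would draw it from a much larger lexicon, which is exactly where language's internal structure helps.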