All right. Thank you, Seth, for that introduction. It's wonderful to be here. This is one of my very first live presentations of any kind; I think it's number three in the last two years. So it's a unique experience. It's good to see so many of you back; this is very well attended. Even some of my students are here, which is great.

By way of introduction: I'm an assistant professor in the School of Interactive Computing. Today I'd like to share with you new research we're conducting to characterize the human in human-autonomy and human-robot interaction, so that we can better understand how to design intelligent machines that understand, support, and team with humans.

Pictured here is the USS Bunker Hill, a guided-missile cruiser. Researchers have been attempting to develop automated decision-making tools for ships like this, for example, tools recommending how to coordinate which defenses (chaff, decoys, flares, etc.) to deploy, and when, where, and how, to protect the ship from enemy anti-ship missiles. It's a complex scheduling problem. However, transitioning autonomy into autonomy-enabled decision support is not as easy as showing that the underlying algorithms make accurate decisions. It's not good enough to show that the systems have good performance; accuracy on MNIST is insufficient. As I've seen firsthand aboard this ship and others, many of the tools researchers deploy aboard ships are run in the OFF mode. In our time aboard ships, we literally found tape over the buttons of automated decision-support systems, because the warfighters did not trust the system, they did not understand it, and they were not going to use it.

Unfortunately, this happens because autonomy researchers typically fail to understand the humans they seek to support, and this lack of human-centered design results in systems that are ultimately abandoned by their operators. This is not a Navy-specific problem. Time and time again, we see autonomy designed with the autonomy researcher in mind rather than the end operator. Whether it's autonomous vehicles (unfortunately, there have been a number of famous, or infamous, Tesla incidents; I will actually defend Tesla if you want to argue about it later, but right now I'm going to criticize them), or aircraft that hand over manual control at the worst possible time (for example, the Airbus crash over the Atlantic a few years back; you might remember the incident with the icing on the pitot tubes), or automated planners that still aren't trusted by NASA rover operators, who don't actually have planning tools they use, or automated control algorithms that surgeons find unusable: we consistently find that autonomy fails to reach its potential because we did not keep our focus on the human and on human-machine interaction.

In my lab, the Cognitive Optimization and Relational Robotics Lab, we focus on understanding the human in human-autonomy interaction, empowering humans through robot learning, and developing explainable artificial intelligence techniques to close the loop on human-machine interaction, all so that we can reach our goals for autonomy on Earth and, hopefully, on Mars. We literally have a research project working with NASA JPL to help with rover planning.

So how do we go about understanding human-robot interaction? Well, let's consider one form of human-robot interaction:
Teaching a robot skills through learning from demonstration, that is, through a human demonstration of a task. Learning from demonstration seeks to enable a robot to learn a policy, pi, that maps the state of the world to the action the robot should take, so that over time it performs the task and accomplishes the goal. The robot's objective is to maximize the similarity between the human's policy (what actions the human would take in each state) and what the robot will do in those states, given a Markov decision process without a known reward function. The robot does not know the goal; it has to infer it.

In our lab, we've designed learning-from-demonstration algorithms that enable robots to learn from real end users' suboptimal demonstrations. Pictured here, and I don't mean to knock on Rohan, but he's not Roger Federer, and it gets even worse when he's trying to teach a robot how to play ping pong. He has gotten pretty good at it over the years of teaching this Sawyer to play, though. We developed a new state-of-the-art algorithm called self-supervised reward regression; Rohan Paleja and Letian Chen worked together on it, and it got a best-paper nomination at the Conference on Robot Learning. The system was mostly autonomous, and it's a pretty cool algorithm.

But when we go to actually deploy these systems in the real world, maybe we have the technology to accommodate suboptimality from real users, but the robot is not always going to succeed. I hate to say it: we still have to stay up late at night fiddling with hyperparameters to get the robot to work, and it still takes a few iterations. Or maybe the demonstrations were just bad for some reason, and the algorithm is unable to overcome that. And while playing table tennis is fun, we want work such as our collaborations with Lockheed Martin to end with robots that workers can actually use and teach to perform nonrecurring engineering work and other tasks in factories. So when a worker uses our system, what happens when the system fails to learn from the teacher? They're going to shove it in a closet; that's actually happened at Boeing plenty of times, even when the robot succeeds. But what happens when the robot actually fails?

To answer this question, we designed a set of tasks for humans to teach a robot via demonstration. In our human-subject experiments, we considered three modes of teaching. The first is kinesthetic teaching, where the human physically moves the robot arm to perform the task. The second is teleoperation, where you use something like an Xbox or PlayStation controller to manipulate the robot. And the third is a motion-capture system, where the human wears AprilTags, or maybe there's some more sophisticated sensing to tell where the person is. All of these modes of teaching have pros and cons. Our experiment was a two-by-three within- and between-subjects design, and we sought to understand three key questions. First, how does the mode of teaching impact the teacher in the context of success or failure? Second, how do teachers react to robots that fail to learn? And finally, what are the predictors, demographic or otherwise, of a teacher's resilience to the robot's failure to learn the demonstrated task?
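Before the results, a quick technical aside on the learning-from-demonstration objective I described: at its simplest, matching the human's policy reduces to behavior cloning, fitting a policy pi(s) -> a to demonstrated state-action pairs. This is only a minimal illustrative sketch, not self-supervised reward regression, and all names and dimensions in it are hypothetical.

```python
# Minimal behavior-cloning sketch of learning from demonstration (illustrative only).
# The robot learns a policy pi(state) -> action that imitates the human's
# demonstrated state-action pairs. All names and dimensions are hypothetical.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 7, 7  # e.g., joint state/command of a 7-DoF arm (assumed)

policy = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, ACTION_DIM),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def train_step(states: torch.Tensor, actions: torch.Tensor) -> float:
    """One gradient step that increases similarity to the human's policy by
    minimizing action-prediction error on the demonstration data."""
    loss = nn.functional.mse_loss(policy(states), actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Stand-in demonstration data (random here; real data would come from
# kinesthetic teaching, teleoperation, or motion capture):
demo_states = torch.randn(256, STATE_DIM)
demo_actions = torch.randn(256, ACTION_DIM)
for _ in range(100):
    train_step(demo_states, demo_actions)
```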
Importantly, we wanted to design an experiment with high internal validity, meaning that if we get a result, we can be sure it's the result we intended to measure, that it's actually causal. So rather than using an actual learning-from-demonstration algorithm, the flavor of the day, like some kind of Baskin-Robbins ice cream with so many people coming up with different algorithms, we said: forget that. We're going to pretend the robot learns. We flip a coin, and the robot either plays a prerecorded successful trajectory or a prerecorded failure trajectory. We also varied the types of failures, and we can talk later about the different ways that might affect the outcome. The point is that we're controlling, through randomization, whether the human succeeded or failed, independent of their own ability and independent of the robot's ability to learn.

First, we found that kinesthetic teaching significantly degrades the teacher's confidence in her ability to teach the robot. This finding is worrisome, because kinesthetic teaching, the mode where you physically grab the robot, is actually one of the easiest ways for the robot to learn. Resorting to motion capture is undesirable, because then you have a challenging computational sensing problem: the correspondence problem. The distance between my wrist and my elbow, and between my elbow and my shoulder, is not the same as the distances between the robot's joints, and its joints might not even bend in the same way. So how do you map the human body onto the robot? And with teleoperation, it's just really hard controlling one degree of freedom at a time. That's how bomb-disposal robots work, and it takes hours and hours; bomb-disposal techs will often just go dismantle the bomb themselves rather than use the robot, because they hate the robot. Second, we found that teaching by teleoperation resulted in significantly higher workload than motion capture. So you end up in a situation with some real trade-offs that you have to wrestle with.

Turning our attention to the impact of the robot's failure to learn on workload: you might not be surprised that participants stated they subjectively experienced higher workload when teaching a robot that failed. That alone isn't surprising; you fill out some questionnaire afterward and think, yeah, the whole thing sucks because it failed, and maybe that's why you report higher workload. But we used the NASA TLX, and one of its components is actually a performance measure. When we removed that performance item from the survey, people still rated their workload significantly higher in the condition where the robot failed. So this completely decouples the amount of effort you actually put in from the amount of effort you perceive, and that difference seems to be caused by observing the robot fail, which I think says something pretty interesting about how we perceive workload. Maybe it's anticipatory workload, how much more work I think I'm going to have to put in. I don't know; that's not the question we asked, but it is interesting.
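Concretely, the "remove the performance item" analysis amounts to recomputing the raw TLX composite without that subscale. A minimal sketch, assuming the standard six subscales each rated 0 to 100; the dictionary keys and example ratings here are hypothetical:

```python
# Raw NASA TLX composite, with and without the Performance subscale
# (a minimal sketch; ratings assumed on the standard 0-100 scale,
# and the dictionary keys are my own hypothetical naming).
SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

def tlx_composite(ratings: dict, include_performance: bool = True) -> float:
    """Average the subscale ratings into a single workload score."""
    keys = [s for s in SUBSCALES if include_performance or s != "performance"]
    return sum(ratings[k] for k in keys) / len(keys)

# A hypothetical participant in the robot-failure condition:
ratings = {"mental": 70, "physical": 40, "temporal": 55,
           "performance": 85, "effort": 75, "frustration": 80}
print(tlx_composite(ratings))                             # full composite
print(tlx_composite(ratings, include_performance=False))  # performance item removed
```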
And then finally, we found that the robot's success or failure in learning from the participant did degrade the participant's trust in their own ability to teach, but it was way more complicated than that. So let's break down this complicated figure. First, 91% of the effect of success or failure was actually mediated by your belief about whether the system trusted you. The driver wasn't really whether you trusted yourself; it was what you believed about whether the robot, the robot learning system, trusted you. And beyond that direct path, your belief about whether the robot trusted you also fed into your estimation of your standing as its teacher. So there's all this theory-of-mind reasoning happening inside the subject that seems to drive the subject's own confidence as a teacher, more than just a self-assessment of their own teaching, which I think is really cool. We also found that individual factors mattered: agreeableness and the degree to which you anthropomorphize Sawyer affected some of this, and your openness, your trust in automation, and your age also had important impacts, which is pretty interesting.

These results provide clear guidance on how we need to rethink designing robot form factors and autonomy to support human teachers. First, researchers developing robot autonomy and learning from demonstration cannot obviate the need to solve challenging sensing and computation-efficiency problems. We really need to solve the correspondence problem, because people do not like physically grabbing the robot and moving it around, and they don't like teleoperation. We need robots that can watch humans and understand what they want, understand their intent, just from watching them. We also need robots that can anticipate how much effort, cognitive or physical, a person is exerting and say, "Hey, human, let's go take a break, because you're about to get really frustrated: I'm not learning, and you're not succeeding in teaching." Robots need that ability to estimate human workload. And finally, there are a lot of very important demographic factors here that we need to understand; perhaps we can give people feedback, or have robots encourage or enhance certain personality traits, to help them be better robot teachers.
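Before moving on, a quick technical aside: a claim like "91% of the effect was mediated by your belief about whether the system trusted you" typically comes out of a mediation analysis. Here is a minimal regression-based sketch of that style of analysis, a rough Baron-and-Kenny-style illustration rather than the exact analysis we ran; the data file and every column name are hypothetical.

```python
# Minimal regression-based mediation sketch: does "belief the robot trusts me"
# mediate the effect of robot failure on the teacher's self-confidence?
# The CSV file and every column name below are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("teaching_study.csv")  # hypothetical per-participant data

# Total effect c: outcome ~ treatment
c = smf.ols("self_confidence ~ robot_failed", data=df).fit().params["robot_failed"]

# Path a: mediator ~ treatment
a = smf.ols("belief_robot_trusts_me ~ robot_failed", data=df).fit().params["robot_failed"]

# Path b: outcome ~ treatment + mediator (this fit also yields the direct effect c')
fit = smf.ols("self_confidence ~ robot_failed + belief_robot_trusts_me", data=df).fit()
b = fit.params["belief_robot_trusts_me"]

indirect = a * b
print(f"proportion of the effect mediated: {indirect / c:.2f}")
```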
In the last couple of minutes, I want to talk about operationalizing these insights, which is very important to me, through some of the work I've done in health care and during my time at MIT Lincoln Laboratory before I came here. I learned a lot there, and some of that work, our apprenticeship-learning work with my former PhD advisor, Julie Shah, I've shared with a lot of people in the Navy to try to transition it. In one of our studies, we had healthcare professionals assess whether they would trust a robotic decision-support system as a function of how anthropomorphic the robot was. We actually had the robot purposefully give bad advice sometimes, to make sure people weren't over-trusting the robot or over-relying on it. And when I shared this idea with certain captains in the surface Navy, or with the aviation community, I got screamed at. It was something like, "Thou shalt never give my warfighters bad advice." And I'm like, well, then how do I make sure your warfighters aren't taking stupid advice? I don't know; I need some way of injecting some reliance-calibration or mitigation strategy here. Interestingly, though, when I went to share this with the submarine community, they were like, yeah, we do that to each other all the time. Junior and senior officers give each other bad advice, like "go launch some missile" or "go blow up the nuclear reactor," to see if the person is going to catch it. And if you don't catch it, if you don't stop them before they endanger the world, you're in trouble, and you don't want to be that person; you're definitely not getting your Twinkie. And I thought, you know, there are a lot of reasons why the submarine community is the best. They've never had a nuclear disaster in the modern nuclear Navy, which really goes back to Admiral Rickover, how he changed the culture there and built a culture of accountability. And that's what I want for my robots: accountability.

So in work I did with Manisha Natarajan, and we have a paper on this, studying this idea of trust and accountability, we actually designed robot behaviors that would be apologetic if the robot screwed up, or that would say, "Hey, I caught you taking my bad advice; don't do that." Here's a quick example. There's no audio, but you can see it; it's pretty fun. The robot shakes its fist: ha, caught you taking my bad advice. What's really interesting is that we were actually able to modulate trust, reliance, and compliance by manipulating the robot's behavior and how it gives decision support. And the best condition was when the robot held you accountable. We can also have the robot give you a prompt up front: "I want to help you succeed. I may sometimes give you bad advice, just to make sure you're paying attention, and if you make a mistake, I'll point it out to you." The robot can be friendly at the beginning and explain this behavior, and that actually builds trust. So you're able to build trust with this relationship-building mechanism while decreasing inappropriate reliance, getting some of the best of both worlds, which is pretty cool.

I don't have time to cover all the work we're doing to get robots to Mars, some of the work I have with Mariah on teaching planes how to fly if they break in mid-air, before they crash, or our explainable AI work; I'll have to talk about that later. I want to thank Erin Hedlund and Michael Johnson, who are the first authors on the learning-from-demonstration failure study, and Manisha Natarajan on the anthropomorphization study, and to thank my sponsors. And with that, I'm out of time, so thank you. Yes, please.

Yeah, so the gold standard for measuring workload is going to be the NASA TLX, but there are a lot of problems people have identified with it. Not only does it take a long time, but I also think people are just bad at self-reporting. You could try to look at physiological stress, whether it's some form of galvanic skin response, heart rate, pupillometry, or EEG, measuring certain waveforms. But then again, people have such incredible variability that you have to baseline people for a long time and then measure them relative to their baseline.
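To make that concrete, here is a minimal sketch of baseline-relative measurement, using pupil diameter as the example stream; the function names and numbers are entirely hypothetical.

```python
import statistics

# Minimal sketch of baseline-relative physiological measurement,
# using pupil diameter (mm) as the example stream. All numbers are made up.

def baseline_stats(resting_samples):
    """Per-person resting baseline: mean and spread."""
    return statistics.mean(resting_samples), statistics.stdev(resting_samples)

def workload_index(sample, baseline_mean, baseline_std):
    """Express a task-time sample as a deviation from this person's own baseline."""
    return (sample - baseline_mean) / baseline_std

resting = [3.10, 3.05, 3.20, 3.12, 3.08, 3.15]  # collected before the task
mu, sigma = baseline_stats(resting)
print(workload_index(3.60, mu, sigma))          # task-time sample, in baseline units
```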
Even then, it's hard to compare across people. So it is, I think, an ongoing area of research. But that's a great question.

Yeah, so the question is: instead of the robot giving bad advice to make sure you're not being tricked, that you're actually still engaged, what about asking the person to explain the decision back? It's an interesting thought. Here's what people have done, and the FDA has, as I understand it, mandated something like this for at least one system I'm aware of, where an AI looks at MRI images and says, here's the tumor, or there is no tumor. They force the doctor to go first: the doctor decides without decision support, then the AI gives its recommendation, and then the doctor resolves any conflict. That's another approach that I think has some merit. But yes, that's a very interesting idea we could explore. Yes, please.

Yeah, I'd say we're trying to tackle that in two ways. The first, directly in our lab, is that we're developing explainable artificial intelligence methods that allow a person to literally simulate the decision-making. So it's a decision tree: you can actually learn decision trees through gradient descent for reinforcement learning problems and learn policies that way, and then you can hand them to a person: here are literally the instructions the robot is going to follow; do you buy it? Some of our work shows that this is beneficial, particularly for novices, as long as the policy isn't too complicated; if the tree is too big to hold in your mind, I can't just hand it to you.

The other way, which I think is really important, and which the military and I think a lot of communities are getting totally wrong, is that you need to develop a relationship. You don't typically marry someone, by choice, in the first second of meeting them; there's a courtship process. You date them, you slowly give them more trust, and you gradually make yourself more vulnerable to them as you understand more about them. But when we deploy an automated decision-making system on a warship, you don't use it until the war happens. You're being told to trust it, which I think is akin to marrying a system, handing over all your money and everything, the first second of meeting somebody. So I think we need better ways of supporting longitudinal interaction and developing those trusting relationships through slowly increasing vulnerability to the system, hopefully in virtual training environments, like what doctors might do with robotic surgery on fake patients before going to real patients. So I think we need to do that.
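To give a flavor of the "decision trees through gradient descent" idea from a moment ago, here is a minimal sketch of a depth-one soft decision tree policy in PyTorch. It is an illustrative toy under my own naming, not our actual architecture: a sigmoid gate softly routes each state between two learnable leaf action distributions, so the whole thing trains by gradient descent and can be hardened into a readable if/else rule afterward.

```python
import torch
import torch.nn as nn

# Depth-one "soft" decision tree policy (illustrative toy, hypothetical names):
# a sigmoid gate routes each state between two learnable leaf action
# distributions, so the tree is trainable end-to-end by gradient descent.
class SoftDecisionTreePolicy(nn.Module):
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.gate = nn.Linear(state_dim, 1)              # learned split: w^T s + b
        self.left_leaf = nn.Parameter(torch.zeros(n_actions))
        self.right_leaf = nn.Parameter(torch.zeros(n_actions))

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        p_left = torch.sigmoid(self.gate(state))         # soft routing in [0, 1]
        logits = p_left * self.left_leaf + (1 - p_left) * self.right_leaf
        return torch.softmax(logits, dim=-1)             # action distribution

# After training, the sigmoid gate can be hardened into a crisp if/else rule
# ("if w^T s + b > 0, follow the left leaf") that a person can read and
# simulate by hand.
policy = SoftDecisionTreePolicy(state_dim=4, n_actions=2)
print(policy(torch.randn(1, 4)))
```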