Hello, can you hear me? Okay, good afternoon, everybody. Thanks for organizing this wonderful event; it feels great to see so many old and new faces today. Today I'll be talking about some of the research going on in our lab over the last five to ten years. Due to the time constraint, I won't go into any details; you are invited to read our papers if you're interested in any of them, and I'm also happy to discuss more offline.

Okay, so this is basically an outline of what we do in our lab. We carry out research in three different areas: control, machine learning, and robotics. I think, just like many of you in this room, we are doing similar things. The long-term goal we are trying to achieve is to find better ways to understand machine learning and why it works, and, based on this, to enable machine learning in real-world applications, so that it is adaptive, robust, and can be used in robotics safely. These are some of the tools that we utilize very often in our research; these are just some examples. And this is a word cloud of our research; a word cloud is a very good way to rediscover what you have done.

Okay, so the title of this talk is a distributional framework, or a distributional perspective, on control, estimation, and learning. What do we mean by this? It basically means that the several projects we have been doing are all related to the concept of distributions. So what is a distribution? The most basic kind is a probability distribution, which captures the statistics of a random variable; this is probably the one most of you are familiar with, and a histogram is a simple example. But distributions show up in many other areas. For instance, in signal processing you have power spectra. A power spectrum captures the statistics of a stochastic signal, a time series, as a value at each frequency: at a specific frequency, it gives you the energy level of the signal at that frequency. That is just one example.

Another one is data distributions. Everything in machine learning is about data distributions. In supervised and unsupervised learning it's a lot easier, because you usually assume the data are i.i.d.; in reinforcement learning you need to worry about distribution shift. In many of these settings, what we are eventually doing is dealing with distributions. A lot of this also applies to image processing, especially grayscale images: each image can be viewed as a two-dimensional distribution, because at every pixel you have a nonnegative scalar, the pixel intensity.

In systems and control, we also study collective dynamics, just like this flock of birds here: you have a large number of birds moving simultaneously, and if you want to study the behavior of this type of system, it is really complicated; modeling the behavior of each individual is not realistic. Very often you really just care about the group behavior, and then you can model the population as a distribution. Of course, these are just some examples; there are more. What I am going to present next is related to uncertainty (probability distributions), data distributions, and population dynamics. This is the outline of the talk; I'm going to go as fast as I can. The sketch below makes the "population as a distribution" idea concrete.
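As a warm-up, here is a minimal sketch of the "population as a distribution" viewpoint. Everything in it, the dynamics, the population size, the numbers, is invented for illustration and is not taken from any of the papers mentioned in the talk.

```python
import numpy as np

# Simulate many identical agents driven by simple stochastic dynamics and
# track only their empirical distribution (histogram), not individual paths.
rng = np.random.default_rng(0)

n_agents = 1000    # size of the population
n_steps = 50       # number of time steps
dt = 0.1           # step size

# Each agent follows dx = -x dt + dw: a pull toward the origin plus noise.
x = rng.normal(loc=5.0, scale=1.0, size=n_agents)  # uncertain initial states

for _ in range(n_steps):
    noise = rng.normal(scale=np.sqrt(dt), size=n_agents)
    x = x - x * dt + noise

# The object we care about is the distribution, summarized by a histogram.
counts, edges = np.histogram(x, bins=20)
print("empirical mean:", x.mean(), "empirical std:", x.std())
```

The point is that the identities of the agents never enter the computation; only the histogram does.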
We'll cover six related topics. In covariance control, distributional reinforcement learning, and our dual approach to optimal control, the distribution captures uncertainty. In distribution control and in inference from aggregated measurements, the distribution captures population dynamics. And in the last one, diffusion models, it captures the data distribution.

Okay, so let's start with covariance control. This was one focus of my PhD study, quite a long time ago. The problem setup is that you want to control a dynamical system from one state to another state, or rather, from one uncertain state to another uncertain state. This scenario happens in many applications, for instance in the landing of a spacecraft: if we want to land on Mars, there is a big region such that as long as you land inside this region, you are okay. You can capture this by some type of uncertainty. There are many other examples.

Mathematically, the formulation is quite simple. Assume you have a dynamical system driven by noise, which looks like this: x is your state, u is your action, and dw is white noise. The assumption is that we don't know the exact initial state, but we know its statistics: the mean is m_0 and the covariance is Sigma_0. Our goal is to find a feedback control policy that minimizes a cost function and, in the meantime, does its job of driving the state to another state: as long as the terminal state has the target statistics, it is fine. Sometimes you can relax this to an inequality, so that as long as the terminal covariance is less than or equal to the target, this is also okay.

This looks very much like a standard optimal control problem. It turns out it is not, because the terminal constraint is at the level of statistics, and, to our surprise, this was new at that time. Therefore you have to develop different methods to deal with these problems; I'm not going to go into the details. This is what you would observe if you solve this problem: the horizontal axis is time, and you have a two-dimensional state space, starting from the uncertain initial state. Once you apply the feedback control, the closed-loop trajectories really follow a specific covariance tube and then achieve what you want at the end. You can also use this in robotics, for motion planning that regulates uncertainty, which is quite interesting. This line of research has attracted quite a bit of attention in systems and control and in robotics, especially at Georgia Tech. So that's a really good start. A minimal numerical sketch of the setup follows.
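To illustrate the covariance-control setup (this is not the algorithm from the talk, just the mechanics of how feedback shapes the state covariance), here is a small sketch. The system matrices, noise level, and gain below are all made-up numbers.

```python
import numpy as np

# Discrete-time linear system x_{k+1} = A x_k + B u_k + w_k with feedback
# u_k = K x_k. In closed loop, the state covariance evolves as
#     Sigma_{k+1} = (A + B K) Sigma_k (A + B K)^T + W.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
W = 0.01 * np.eye(2)             # process-noise covariance
K = np.array([[-8.0, -4.0]])     # a hand-tuned stabilizing gain

Acl = A + B @ K                  # closed-loop dynamics
Sigma = np.eye(2)                # uncertain initial state: Sigma_0 = I

for _ in range(100):
    Sigma = Acl @ Sigma @ Acl.T + W   # covariance propagation

print("terminal covariance:\n", Sigma)
```

In covariance control one would instead optimize the gains (generally time-varying, plus a feedforward term) so that the covariance hits a prescribed terminal value; here a fixed gain just shows how feedback steers the covariance.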
The next one, also one focus of my PhD study, is distribution control. Here we shift to population dynamics. You can think of the previous task as controlling one robot; now we have many of them, say a hundred or a thousand, and we assume they have homogeneous dynamics, so they are all identical and very simple. Our goal is to drive them from one configuration to another configuration, possibly with an optimality condition. Mathematically, it looks like this: they have an initial distribution and they have some dynamics, and you want to develop a joint control policy such that, as long as they follow this policy, you reach whatever configuration you want; sometimes you also want some optimality.

Originally, in this framework, we assumed these individuals are decoupled, meaning they are independent of each other; this is also the most popular paradigm. Recently we generalized this to the case with interaction, which looks like this (I use a slightly different notation here for each robot; dw is white noise). We have a lot of robots; each of them has its own dynamics and independent control, but they also have an interaction term, so they interact with each other. The goal is the same: find a joint feedback control strategy that drives them to a specific target configuration, or target distribution. Under proper assumptions you can reformulate this as an optimization problem, sometimes a non-convex one, and we have developed some interesting algorithms to solve it.

This is a very high-level picture of a simple one-dimensional example. This is the distribution of the robots at the initial time, and we want to drive it to a configuration like this at the terminal time. If there is no interaction, this is what the distribution looks like under the optimal control policy. If you add some repulsive interaction, which is very reasonable because that is how you model collision avoidance, you see behavior like this: a pattern where the robots try to spread out before they eventually converge to the target configuration, just for safety.

Okay, let's move to the estimation problem. Everyone knows what an estimation problem is, especially in controls: you have sensors, you have measurements, and you are trying to estimate the underlying state. Here we move this paradigm to population dynamics. What that means is that we have, let's say, a thousand homogeneous dynamical systems, and we cannot make measurements of each individual. What we can measure is the distribution, basically the histogram: for each cell of a partition, we know how many robots are in it. One application of this is, for instance, a bird-migration problem: if you are trying to track the migration of birds, there is no way you can know the identity of each bird. In this type of application, the measurement you have is aggregated, and you lose the identity of each individual. So how do we solve this problem and estimate the underlying state distribution? That is what this project is about. We formulated it as an optimal transport problem, a multi-marginal optimal transport problem, and developed a specific efficient algorithm to solve it. These are some results; I'll just skip them. To give a flavor of the transport machinery involved, a tiny sketch follows.
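The actual method in the papers is a multi-marginal optimal transport solver; purely as an illustration of the basic machinery, here is a minimal two-marginal entropic optimal transport sketch using Sinkhorn iterations. The grid, the two aggregate histograms, and the regularization strength are all invented for illustration.

```python
import numpy as np

# Entropic optimal transport between two aggregate measurements (histograms)
# of a population on a 1-D grid, solved with Sinkhorn fixed-point iterations.
grid = np.linspace(0.0, 1.0, 50)

# Two made-up histograms: the population observed at two different times.
mu = np.exp(-((grid - 0.2) ** 2) / 0.01); mu /= mu.sum()
nu = np.exp(-((grid - 0.7) ** 2) / 0.02); nu /= nu.sum()

C = (grid[:, None] - grid[None, :]) ** 2   # quadratic transport cost
eps = 1e-2                                 # entropic regularization
K = np.exp(-C / eps)                       # Gibbs kernel

u = np.ones_like(mu)
for _ in range(500):                       # Sinkhorn iterations
    v = nu / (K.T @ u)
    u = mu / (K @ v)

P = u[:, None] * K * v[None, :]            # transport plan between histograms
print("transport cost:", np.sum(P * C))
```

The transport plan P says how much mass of the population moves from each cell to each other cell, without ever referring to individual identities, which is exactly the situation with aggregated measurements.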
Then the next one is reinforcement learning. In reinforcement learning, a very important object is the Q-function. This is the setup (I use x notation instead of s notation). The Q-function estimates the expected cumulative reward, but this gives you limited information. The key idea of distributional reinforcement learning is to model the full return: this is a random variable, because you don't take the expectation. You then also have a similar Bellman equation at the distribution level. In this framework you have the advantage of keeping not only the first-order information; you can also model higher-order information, so that you can take risk into account. It is quite interesting that you can develop an actor-critic framework for this: it has almost the same complexity as the standard actor-critic algorithm, but it has better robustness.

The next topic, on which I'll spend one minute, is very easy to state. We have a dynamical system, a standard setting for control people, and the question is stability. The standard tool is Lyapunov stability: if you can find a Lyapunov function, then you are stable and you are done. What is not that well known is that there is a dual formulation of this. If you can find a density, which we call a Lyapunov density, that satisfies a certain condition, you are also okay; the two formulations are almost equivalent to each other. So what is the advantage of the dual formulation over the primal one? Say you want to solve a synthesis problem: you not only want to verify stability, you also want to find a controller that does the job, so you want to find a control u and a certificate that satisfy the condition at the same time. In the dual formulation, if you do a change of variables, this becomes a convex optimization problem, whereas if you do the same thing with the Lyapunov function, it doesn't work. That is the advantage, and with this you can use the condition in an optimal control problem, get a convex formulation of optimal control, and then use various methods to solve it.

Okay, almost done. The last topic is active research we have been doing over the last year or so. This could be the hottest topic in machine learning these days: diffusion models, a type of generative modeling technique that is extremely powerful. These days, everything you hear about image generation is about diffusion models. So what is a diffusion model? The idea is very simple, and it is also about distributions. You have a data distribution, and if you keep adding noise to it, eventually it becomes Gaussian noise; you lose all the information. This forward process drives you from the data distribution to an uninformative noise distribution. Now suppose we can develop a magical tool that reverses this process. That would be wonderful, because then you could just draw some random Gaussian samples, follow the backward process, and generate very fancy images. That is the key idea of diffusion models; a toy sketch of this forward/backward picture appears below. There is a very simple and very interesting cost function such that, if you train with it, the learned process really reverses the forward process at the statistical level; the solution is called the score function, and that is what you see here. The work we did is to develop methods to accelerate this: the bottleneck of the algorithm is that it requires a lot of iterations to solve this SDE. We developed two methods: one is called DiffFlow, diffusion normalizing flow, and the other is called DEIS.
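To make the forward/backward idea concrete, here is a toy sketch in which the data distribution is a one-dimensional Gaussian, so the score of every intermediate marginal is available in closed form and no training is needed. All parameters are invented for illustration; real diffusion models learn the score with a neural network.

```python
import numpy as np

# Toy diffusion: data ~ N(2, 0.5^2). The forward (Ornstein-Uhlenbeck) process
#   dx = -0.5*beta*x dt + sqrt(beta) dw
# has Gaussian marginals p_t, so the score grad log p_t(x) is exact here.
rng = np.random.default_rng(0)
data_mean, data_std = 2.0, 0.5
beta, dt, n_steps = 1.0, 0.01, 500

def score(x, t):
    decay = np.exp(-0.5 * beta * t)                 # signal attenuation
    mean = data_mean * decay                        # marginal mean of p_t
    var = data_std**2 * decay**2 + 1.0 - decay**2   # marginal variance of p_t
    return -(x - mean) / var                        # grad_x log p_t(x)

# Backward process: start from pure noise, integrate the reverse-time SDE
#   dx = [-0.5*beta*x - beta*score(x, t)] dt + sqrt(beta) dw  (backward in t).
x = rng.normal(size=10_000)                         # samples from N(0, 1)
for k in range(n_steps, 0, -1):
    t = k * dt
    drift = -0.5 * beta * x - beta * score(x, t)
    x = x - drift * dt + np.sqrt(beta * dt) * rng.normal(size=x.shape)

print("recovered mean/std:", x.mean(), x.std())     # should approach 2.0 / 0.5
```

Running the backward process on Gaussian noise recovers samples close to the original data distribution, which is exactly the "magical tool" role played by the learned score function in a real diffusion model.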
These are very interesting papers, and in them we develop more efficient algorithms for this. As far as I know, these are, at this stage, the best algorithms for the fast-sampling problem of diffusion models, by a large margin. These are just some pictures of what you can do. Okay, so that's it. I'd like to thank my wonderful students and collaborators, and ASL for the support. Thank you.