So, I'm Michael, and I'm really happy to be here talking in this colloquium. I'm going to tell you about some work we've been doing over the past couple of years on using machine learning to think about the solutions of PDEs.

When we teach, we tell our students that partial differential equations pervade the physical sciences. The reason is that whenever you impose that an equation be satisfied at every point in space, you are of course left with an equation at each point in space, which leads to partial differential equations. There are many examples; the canonical example for this talk will be the Navier-Stokes equations, which give the equation for the velocity field at every point in space. What we teach our students is that PDEs nominally represent an infinite number of degrees of freedom, because there is one at every point in space. But we know very well, from looking at the world or from our own experience, that there really aren't an infinite number of degrees of freedom in any situation: there are patterns, there are things that recur in the solutions, and the number of solutions that actually occur is very finite. For decades, really for more than a century, we have been trying to figure out ways to represent the solutions of equations in a fashion that makes them computable, and the most accurate thing we have at the moment is numerical simulation. The picture on the left is a simulation of the weather; the picture on the right is a simulation of fluid convection. In these simulations the authors take the Navier-Stokes equations at every point in space, at as high a resolution as they can compute, and try to capture as fine a set of features as they can. What we are used to seeing is that even fairly simple situations, like the Rayleigh-Bénard convection on the right, can develop enormous complexity, so the number of points required to describe the phenomenon is enormous: there are 10^8 grid points in this simulation. The problem, of course, is that it is tremendously expensive to solve problems like the one on the right, or the problems on the left; supercomputers were really born out of the need to do that.

The numerical methods we use to solve partial differential equations to this day were developed during World War II. They basically work by taking the original equation and pretending that the solution exists on a mesh parameterized by some index i; the second derivative, for example, can then be written as a difference. This reduces the PDE to the set of ordinary differential equations I've written here, a different ordinary differential equation for every point in space.
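To make that concrete, here is a minimal sketch (my illustration, not code from the talk) of this "method of lines" for the 1D heat equation u_t = nu * u_xx on a periodic domain: a centered difference turns the PDE into one ODE per mesh point, stepped here with forward Euler.

```python
import numpy as np

# Discretize u_t = nu * u_xx on a periodic mesh: one ODE per mesh point.
nu, L, N = 0.01, 2 * np.pi, 64
dx = L / N
x = np.arange(N) * dx
u = np.sin(x)                    # initial condition
dt = 0.4 * dx**2 / nu            # explicit stability needs nu*dt/dx^2 <= 1/2

for _ in range(1000):
    # centered difference: u_xx(x_i) ~ (u[i+1] - 2*u[i] + u[i-1]) / dx^2
    u_xx = (np.roll(u, -1) - 2 * u + np.roll(u, 1)) / dx**2
    u = u + dt * nu * u_xx       # forward Euler step for every mesh point at once
```

Notice that the stable time step scales like dx squared: refining the mesh makes the time step shrink even faster, which is exactly the annoyance that comes up next.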
Now, the resolution that is chosen, the delta x, has to be small enough to resolve the smallest features of the solution, and the time steps must be small enough that the discrete dynamics don't destabilize. The annoying thing is that both of these requirements force the space and time scales of the simulation to be much smaller than those that actually underlie the physics. This is an inefficiency, and it gets back to the PDEs nominally having an infinite number of degrees of freedom: we don't really know how to parameterize them for particular problems.

In any case, this is Kolmogorov's argument. The smallest eddy in the system is obtained by setting u(ell) times ell, a quantity with the dimensions of viscosity, equal to the kinematic viscosity. When one does that, one derives the length scale Kolmogorov derived, called the Kolmogorov scale, which is basically the size of the smallest eddy and how it depends on Reynolds number. If you then want to count the number of degrees of freedom there actually are in a turbulent flow, a simple estimate is to assume that all the eddies are independent of each other: you compute L, the macroscopic scale, divided by the smallest eddy scale, and cube it, and that gives you Reynolds number to the nine-quarters. So the number of degrees of freedom isn't infinite; it's bounded by this number. But of course this argument assumes that the eddies are all uncorrelated with each other, that each eddy is an independent degree of freedom. If you go with that for a moment, you take this law, the one I mentioned, and start putting in numbers. Say that every eddy requires a thousand grid points to resolve in a computer, because maybe you need ten grid points in each direction; then the total number of grid points looks like this. The formula is an overestimate, maybe it's not a thousand, maybe it's a hundred, maybe it's fifty, but it's something like this. And if you start plugging in "I want to solve for a baseball pitch" or "I want to solve for a blue whale," you end up with so many grid points that the calculation is hopeless.
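As a back-of-envelope version of that estimate (the Reynolds numbers below are my rough order-of-magnitude guesses, not figures from the talk):

```python
# Kolmogorov-style count: eddies span the integral scale L down to
# eta ~ L * Re^(-3/4); if each eddy were independent there would be
# (L/eta)^3 = Re^(9/4) of them, at ~10 grid points per direction per eddy.
def grid_points(reynolds, points_per_eddy=1_000):
    return points_per_eddy * reynolds ** (9 / 4)

for name, Re in [("baseball pitch", 2e5), ("blue whale", 3e8)]:
    print(f"{name}: Re ~ {Re:.0e} -> ~{grid_points(Re):.0e} grid points")
# baseball pitch: Re ~ 2e+05 -> ~8e+14 grid points
# blue whale:     Re ~ 3e+08 -> ~1e+22 grid points
```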
Not only is it impossible to do this today; it will always be impossible, functionally. It's not just a problem within our lifetime: this is an inane way of trying to understand what is going on in a really rather simple situation. This notion that it was essentially impossible was discovered in the '50s and '60s, and the situation is so severe that, although we've made progress, this is never going to be a viable approach. For that reason, if you go look at the resolution of simulations of the weather (here's a hurricane, and here's a granular view of it with about a 100-kilometer mesh, the typical horizontal scale used to resolve it) you see the relationship between what is happening and our ability to resolve it. Of course, there's a hell of a lot of physics happening in each of these boxes that we're not able to resolve. So this is a fundamental problem in science that I think we, the physics community, and we, the world, really haven't figured out how to think about.

Now, for a moment, I want to channel Predrag and Roman and people at Georgia Tech; if I channel you wrong, you should correct me. How many degrees of freedom are there, actually? You could go with this Kolmogorov estimate, but that assumes things are uncorrelated, and of course they aren't: if you look at flows, if you look at the world, there are actually correlations in the motion of the smallest scales. For these correlations Predrag has long used a word I really like, "bricks." He says there are space-time patterns of some sort, in flows like this, in situations like this, that repeat throughout the phase space of the flow, and what we really need to do as a community is figure out how to think about these bricks, how to piece them together, and how to actually compute with them. Whether or not you can, the argument stands: if you go and look at the details of what's happening in flows, the number of degrees of freedom is clearly lower than these estimates. But for practical purposes we've never figured out how to use that to solve, say, engineering problems. That's the frontier that this talk, and I think many of us, are trying to go after.
So the question I want to ask, and this is where machine learning starts to come in, is this: if you look at the weather, if you look at a flow, if you look at the solutions of a nonlinear PDE, no matter how complicated it is or how many degrees of freedom there are, there are patterns that occur in the solution, and one would think there should be a way to exploit those patterns to better solve the equations and to better understand the physics. That is the problem statement from which I now want to tell you about machine learning.

I want to say that I think this is a really important problem; it would have huge consequences to be able to do this. If we could parameterize Predrag's bricks (and I should say this is a huge technical challenge, not a small one) it would revolutionize numerical computation: the estimate I put at the beginning for the number of mesh points needed really should be far fewer, if only we could figure out how to do this practically. It would lead to a revolution in understanding: how do we think about dynamical systems as they become very large? And I also think it has the opportunity to change the way experimental measurements are done in fields like this. The analog of having to compute at every point in space to determine the state of a turbulent flow is that, at the moment, you have to measure at every point in space as well, and that's clearly impossible. Weather stations exist where they exist, but we don't have them everywhere, and one would really like to be able to sample the state of a complicated system in such a way that we can understand more from the sparse measurements that we do make. These questions are all interrelated, and all related to this problem.

Okay, that's my introduction. Maybe before I get to the next part I should pause and see if anyone has comments. Roman, Predrag, I'm sure you have something; maybe you can correct my butchering of your field. Or not; you don't have to say anything, I can just keep talking.

No? Okay, I'll keep going then. A drink of water.

"Michael, I have a question. Dan Goldman here."

Yep.

"If it's question time: what about the chunks of the world that don't have partial differential equations?"

Okay, I think that's a very interesting question,
and I'm going to answer it in the most definitive way: it is not the subject of this talk. This talk is carefully designed around partial differential equations, and there's a reason for that, which you will see. Why don't you ask that again afterwards, if people are still interested. I think there are opportunities there too, of course, but there's something special about situations where you actually know the equations you're studying; it gives opportunities that aren't otherwise available, and that's what we're trying to exploit here.

Okay. So the thesis of this talk, and indeed of this research program, is that machine learning gives a new way of discovering these patterns for specific problems, and of using them for computation. What I'm interested in here is, for example, developing numerical methods that are problem-specific rather than general; that is, I'm interested in figuring out ways of parameterizing the solution manifolds of equations. And then the goal is to use this. It's a lofty goal, and just to be very clear, this is an unsuccessful research program as of now; but that is where we're going. We really would like to see whether this can be used to advance the state of the field. The point is that the key innovation of machine learning, the one that changed the world we live in, is precisely pattern discovery, and what got me interested is that this set of questions seems completely and naturally synergistic with what the technology has done.

So my talk has four parts, and I will not get through them all; I am always too ambitious. First, I want to give a brief introduction to what I mean by machine learning in this context, because people mean lots of different things by it, and I'm cognizant that this is a colloquium and I have no idea who the audience is. In particular I want to point out where I think the major opportunities are, or at least the way we've been going after them. Then I have three topics, which I will get through or not. First, using these ideas to develop algorithms for faster computation of nonlinear PDEs. Second, a part about deriving approximate equations using these types of ideas. And the last part, which will be very brief, is very recent work
I've been doing with Shmuel Rubinstein, on using and developing algorithms to interpret experimental measurements in a new way; the plots in that fourth part were literally made in the last week. We'll see how far I get, but that's the plan.

Okay, a brief intro to machine learning. First I want to show you the one idea that inspired us to go down this path for PDEs. There's a subject called single-image super-resolution, and the problem is the following. You take a picture of something with your phone. The picture could be captured at high resolution, but you don't want to store it at high resolution, because that takes a lot of storage, so a natural thing to do when you take the picture is to downsample it. For example, suppose this is the original; you could downsample it by a factor of four. When you do that, the picture looks rather blurry. So now the problem is: can I invent an algorithm that takes this blurry picture and turns it back into a sharp one? A picture, of course, is just three matrices, R, G, and B, with numerical values typically between 0 and 255 in each matrix. One simple thing you could do is interpolate: go back up to the original resolution using something like cubic-spline interpolation in both directions. That's the second picture, which you'll notice is also blurry.

Machine learning, and in particular neural networks (though neural networks are actually not the only way to do this), has produced interpolation methods that are much better than that. The paper referenced here used a neural network to create an interpolation going from this picture to this picture, and what you see is quite similar to the original. The way the neural network worked is that it was given large numbers of training examples, pictures of the world of the kind people would want to upscale. It looked at little local bits and invented an interpolant to go from here to here based on its experience. It wasn't based on the idea of cubic-spline interpolation; it was based on its experience of a large database of images.
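A minimal sketch of the non-learned baseline being described, downsampling by a factor of four and upsampling back with generic cubic interpolation (this is the blurry middle picture; the learned method replaces the generic `zoom` with an interpolant fit to a database of photos):

```python
import numpy as np
from scipy.ndimage import zoom

def downsample(img, factor=4):
    # average each factor-by-factor patch (one simple way to downsample)
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def bicubic_upsample(small, factor=4):
    # generic cubic interpolation in both directions, blind to image content
    return zoom(small, (factor, factor, 1), order=3)

img = np.random.default_rng(0).random((128, 128, 3))   # stand-in for an RGB photo
restored = bicubic_upsample(downsample(img))
print(np.abs(restored - img).mean())   # the error a learned interpolant would beat
```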
Now, if you think about the images that exist in the world, if you were actually going to do this for a phone, there are tons of images people could take, in different lighting conditions, in different situations, of different types of objects, and to me it's a bit surprising that there would exist local transformations general enough to be used robustly in this fashion. But in fact they do exist, and this is part of the technology that underlies our phones.

Now think of these not as images but as solutions of PDEs; I'd like to move you into PDE world. Our PDE problems, our laws-of-physics problems, are actually much simpler. As complicated as you might think turbulence is, the parameterization of turbulent flows is infinitely simpler than this, because, as I already argued, there's a finite-dimensional solution manifold for turbulent solutions of the Navier-Stokes equations, or pick your favorite problem. So surely I could invent a training algorithm that would let me interpolate from this to this without distortion. From the point of view of numerical analysis, if these were solutions to PDEs, every standard numerical method is essentially an interpolant of this kind. It might be more complicated than bicubic, it might be more accurate, but in any case it uses an interpolant that is insensitive to the type of structure being interpolated, whereas what this example demonstrates is that by incorporating experience you can do much better. Our dream was to use these types of ideas to solve PDEs better. So that's an example of machine learning as it relates to this.

Taking a step further back, I want to point out the way I have been thinking about this lately. If you read the tabloids or whatever, machine learning means lots of different things; it often means the use of neural networks to solve a problem, which is what's happening here. I think a more productive way to think about it, at least from the point of view of physicists, is to ask what the technological enablers of our ability to do things like this were. It turns out there are two technological enablers which are actually very close to physics, and which, at least for me, helped clarify where the opportunities might be. These enablers really have nothing to do with machine learning except that they enabled it, and I want to emphasize what they are, in the hopes that it will be helpful for everyone.
There's a classical method for solving inverse problems called the adjoint method, and it works as follows. Imagine you have a function y = f(x, w): y is the output, x is the input, and w are the weights. Imagine you are given the job of optimizing the weights so that y has certain characteristics, for example so that it minimizes some loss function, which I'll call L(y). L(y) could, for instance, be the function that makes the interpolation on the last slide very accurate. Now suppose this function f is difficult to evaluate and has many parameters; f could be the Navier-Stokes equations, say, starting from an initial velocity field and ending with a final one. The idea of the adjoint method is that you take the loss function, the thing you want to minimize, and add to it Lagrange multipliers whose job is to enforce the constraint that the equations are satisfied. When you do this, the derivatives of the resulting functional with respect to the lambdas give you back the equations of motion, the original f; and solving for the lambdas is what lets you enforce those constraints while you optimize. To solve this Lagrange-multiplier problem, what always happens is that you have to solve the equations twice: first forward, starting with y_0 and getting y_N, and then, to solve for the lambdas, you actually integrate backwards in time. This is a classical method. It's used, for example, if you want to know the concentration of methane in the atmosphere given the satellites above it: you take the equations of the atmosphere, treat them as constraints, and solve the adjoint problem to do the optimization. It's been known in physics for a long time. Traditionally, though, writing codes to do this was extraordinarily time-consuming.
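Here is a minimal hand-written version of that forward/backward structure (my toy, not the speaker's) for a discrete dynamics y_{n+1} = f(y_n, w) with loss L(y_N): solve forward for the y's, then integrate the adjoint variable lambda_n = dL/dy_n backwards in time, accumulating dL/dw along the way.

```python
import numpy as np

def f(y, w):      # one step of the dynamics; could be a PDE step or a network layer
    return w * np.tanh(y)

def f_y(y, w):    # df/dy, needed to propagate the adjoint backwards
    return w * (1.0 - np.tanh(y) ** 2)

def f_w(y, w):    # df/dw, needed to accumulate the parameter gradient
    return np.tanh(y)

def loss_and_grad(y0, w, target, N=20):
    ys = [y0]
    for _ in range(N):                    # forward solve: y_0 -> y_N
        ys.append(f(ys[-1], w))
    loss = 0.5 * (ys[-1] - target) ** 2

    lam = ys[-1] - target                 # lambda_N = dL/dy_N
    dL_dw = 0.0
    for n in reversed(range(N)):          # backward (adjoint) solve, in reverse time
        dL_dw += lam * f_w(ys[n], w)      # gradient contribution of step n
        lam = lam * f_y(ys[n], w)         # lambda_n = lambda_{n+1} * df/dy
    return loss, dL_dw

print(loss_and_grad(y0=0.3, w=1.1, target=0.5))
```

Writing that backward pass by hand for a real solver is the part that used to take forever; automatic differentiation generates it for you.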
I should note that the problem I've described applies equally to neural networks and to PDEs: this y = f(x, w) could be the layers of a neural network, or it could be the Navier-Stokes equations. And one of the things that's happened because of the machine learning revolution is that, just in terms of code, there are now extremely high-quality software packages with the adjoint method encoded automatically. For example, here is a package named autograd, developed by Ryan Adams when he was at Harvard, and this is a problem from their website that you can look at. It simulates the Navier-Stokes equations starting from an initial condition, and what this GIF is doing is solving for the initial condition that makes the final solution be a peace sign. If you look at the code, it says "simulate": vx_updated and vy_updated are doing the advection step, there's a projection step, and then "advect smoke" takes the velocity field and advects the smoke, so that's simulating the smoke. And if you look down here, this is sort of amazing (sorry, I'm going to make you look at code): there's an objective function, the thing you're trying to minimize. The parameters here are the parameters associated with the initial conditions, and the function returns the distance from the target image: it tells you how far the smoke pattern at the end of the simulation is from the peace sign you're trying to reach. What these automatic packages do is let you literally take the gradient of that objective function: you just apply value-and-grad to this little piece of Python code, and it gives you the gradient back. What's inside that box used to take a graduate student a year; now the language is written in such a way that you can code anything you want, compute gradients, and optimize. That, I feel, is an enormous technical advance, and it has made it possible to play with lots of things. In fact, everything I'm going to tell you about in this talk was gotten because we were able to write such code, and then we just experimented a lot. So I wanted to point this out to you.
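In the spirit of the demo on the screen (though this is a toy of mine, not the smoke-simulation code), the same gradient computed with autograd's value_and_grad, which builds the backward pass above automatically:

```python
import autograd.numpy as np            # numpy wrapper that traces operations
from autograd import value_and_grad

def objective(w, y0=0.3, target=0.5, N=20):
    y = y0
    for _ in range(N):                 # an arbitrary simulation loop
        y = w * np.tanh(y)
    return 0.5 * (y - target) ** 2     # distance from the target state

loss, dloss_dw = value_and_grad(objective)(1.1)
print(loss, dloss_dw)                  # matches the hand-written adjoint above
```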
This is exactly the same thing that's going on when you optimize the weights of a neural network. The other thing about these languages, TensorFlow, JAX, PyTorch, is that they are thought of as machine learning languages, but they're really languages for automatic differentiation. They are high-quality languages, open source and maintained; you no longer have to pay MATLAB or whoever (sorry, MATLAB aficionados in the audience) or Mathematica. And they were built to work efficiently on hardware accelerators like GPUs and TPUs, which essentially means we can do larger problems without having to worry about the nonsense in the background. So this is the major technological enabler we've been using in our own work, and I just wanted to highlight it. Okay, I've now reached another moment to pause. Questions, comments?

It's okay if there are no questions or comments; maybe I've put everyone to sleep, or hopefully you're having a nice coffee.

"So, Michael, maybe a quick comment. Edward is asking what happens if you run the algorithm on the original. When you discussed taking an image, coarse-graining it, and then recovering the original: what happens if you run the same machine learning algorithm on the original itself?"

Oh, well, the way the algorithm is set up is that you first downsample: it takes a mesh that is downscaled by, say, a factor of four, and then it upscales it, so it's literally filling in the points in between. You can't actually run it on the original, because the original already has the points in between; you have to delete some of the points and try to recover them.

"But couldn't you say that you originally recorded it at some finite resolution, and try to guess what it would have looked like if you had obtained it at an even higher resolution?"

Oh, you could do that, but these things tend to be trained for the resolution they're at; they're trained specifically for that resolution. So if you say "I'm going to discover atoms this way," you won't. It's an interpolator, and no more than an interpolator. That's true for the photography case, and it's also true for the PDE version I'm about to show you. There's nothing deep about the interpolation step. The point, though, is that to the extent that the functions occurring in the problems you're looking at have regularities, we should use them to do what we were actually trying to do. That's the only point, really.
And this obsession we've had for centuries, that everything is calculus, so we write down general algorithms that work for any smooth function: well, that's fine, but you might know more about the function than that. From the point of view of this part of the talk, if you know the equation, that the function obeys the Navier-Stokes equations, that's a huge constraint, and we might as well assume it does, because that's what we're looking for. When we do, it gives a big boost.

Other comments? Okay, sorry, I guess that was all thirty minutes of introduction. I was once told that if you give a talk with thirty minutes of introduction, people feel there's no content; I apologize if you think that's true. I just thought I would go slowly. There may indeed be no content, I don't know, but anyway, now I will start the content.

Okay, we're going to start with PDEs. I want you to imagine, for the sake of argument, that these curves are solutions to some PDE, and I don't tell you which PDE. The typical way of solving PDEs, as I mentioned at the beginning, is to use a bunch of grid points; those are the red points. You then get rid of the solid lines, because you don't know what they are. Now, if we want to advance the partial differential equation, we need to compute some number of derivatives of the solution with respect to space at every mesh point. That's the game we have to play, and of course, to do that we have to somehow interpolate; we're only given this information. Typically you might try some sort of polynomial interpolation, and that is classically bad, because of Runge's-phenomenon-type overshoots and the like: the derivatives that the blue curve gives you don't agree with the derivatives of the black curve. This is why, when you solve the discrete equations numerically, as I mentioned at the beginning, you need to choose a resolution that is small compared to the scale of the features you're trying to capture, because otherwise the interpolants don't work when you're trying to compute derivatives.
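A quick illustration of that overshoot problem, in my own toy setup: interpolate a sharp, shock-like profile with a single high-degree polynomial through equispaced points, and the interpolant oscillates wildly, so its derivatives are useless.

```python
import numpy as np

x = np.linspace(-1.0, 1.0, 13)
u = np.tanh(20 * x)                     # a sharp, shock-like profile with |u| <= 1
coeffs = np.polyfit(x, u, deg=12)       # degree-12 polynomial through the 13 points

x_fine = np.linspace(-1.0, 1.0, 400)
print(np.max(np.abs(np.polyval(coeffs, x_fine))))   # overshoots above 1
```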
On the other hand, suppose I tell you that these points actually come from solutions to Burgers' equation. Here is Burgers' equation: a sort of simple 1D version of the Navier-Stokes equations that people like me like as an example problem. And suppose at the same time I give you an enormous library of solutions to Burgers' equation; I tell you that I've been doing this for a while, I've seen a lot of solutions, and they all look like this. Now, when you try to interpolate, you won't use polynomials; you will use some regressor that is based on these solutions as examples. If you build such a regressor (and there are millions of ways of doing it), you do much better. That is the yellow curve against the black curve: you can actually reproduce the solution quite well. So this is the idea: we can learn the structure of solutions at small scales and use it to estimate derivatives.

Okay, I'm going to show you this sort of thing on solutions of various types of equations, so let's get our terms right. On the left is a high-resolution simulation, in this case of a passive scalar being advected in a turbulent flow. If I coarse-grain it, that means I take every cell, break it into boxes, and average the concentration in every box; this is an exact coarse-graining of the solution. That is, we're not going to try to reproduce the solution at the full resolution, but we are going to try to make sure we accurately capture what happens on small scales, on average. If one instead takes a simple numerical scheme, coarse-grains the equation, and integrates on that mesh, there's a classical thing that occurs, an instability, which you can see from these little squares: the numerics go unstable and take over the whole thing. Our goal is to invent numerical methods that give the exact coarse-graining. Actually, it's worth mentioning another problem of classical numerical analysis: even when the coarse problem isn't unstable, using a coarser mesh tends to smear the whole thing out. If you compare this method with this one, you'll see it gets smeared out a bit, by what we call numerical diffusion. All of this comes from estimating derivatives, and what I'm going to talk about is replacing the local rules, like this one for estimating a diffusive flux, with machine learning.
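For concreteness, a minimal sketch of the exact coarse-graining operation just described: break the fine-grid field into boxes and average the concentration in each box.

```python
import numpy as np

def coarse_grain(field, factor):
    """Exact coarse-graining: average the field over factor-by-factor boxes."""
    nx, ny = field.shape
    return field.reshape(nx // factor, factor, ny // factor, factor).mean(axis=(1, 3))

fine = np.random.default_rng(1).random((256, 256))  # stand-in for a fine-grid field
print(coarse_grain(fine, 8).shape)                  # (32, 32): the coarse target
```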
So here is the method. I already said this in pictures, but I'll repeat it. You give me your problem; I don't care what problem it is. What I'm going to do is run very high-resolution simulations of small pieces of the solution: take your big domain, break it into little boxes, and collect lots of examples of what can happen on small scales. Then I'm going to train a regression algorithm to machine-learn the solution manifold, and use that to estimate spatial derivatives from the low-resolution data in order to integrate the equation. To be specific, here, for example, are the Navier-Stokes equations. Ordinarily you would do what I was describing: you would interpolate and compute fluxes, with standard methods for the advection term. What we're going to do is replace only this part, the interpolation part, with a machine learning alternative. The price we pay is that the method will no longer be general: it won't work on any equation other than the ones we've trained it on. But as I said at the beginning, in practice we really only care about a couple of equations, and so the thought is that this could be a robust way of helping numerical analysis.

More specifically, the way we do this is the following. You're given the solution values u(x_n), the values of the field at every mesh point, and normally you would compute a derivative as a weighted sum of these u's within some stencil, where the alphas are derived from calculus or from some finite-element-type approximation; they are derivable using applied mathematics. In contrast, what we're going to do here is take the u's as input and feed them through a neural network; the neural network produces the alphas, and we use those to produce the derivatives. Of course this neural network comes with a large number of parameters, and the way we find them is to use the automatic differentiation capabilities within the code: with a differentiable code, we optimize the parameters so that the stencils produce accurate answers on training examples, and then we check whether they generalize outside of that. That is the key thing we're going to be doing in what follows. Just to give you an example up front, this is the discretization for the first derivative that we get out of this method for Burgers' equation. Ordinarily, to compute a derivative, you take the difference between two numbers; this is instead a six-point stencil, and the magnitudes shown here indicate that it is computing the derivative, or it looks like minus the derivative, at the point in the middle.
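Here is a hedged sketch of that construction in JAX, a simplification of the published method with made-up sizes: a tiny network maps the six local solution values to six stencil coefficients, which are then projected so they at least satisfy the consistency constraints of any first-derivative stencil (the alphas sum to zero, and weighted by the stencil offsets they sum to one).

```python
import jax
import jax.numpy as jnp

dx = 0.1
offsets = (jnp.arange(6) - 2.5) * dx        # six stencil positions around the target point
A = jnp.stack([jnp.ones(6), offsets])       # consistency constraints: A @ alpha = b
b = jnp.array([0.0, 1.0])                   # sum(alpha) = 0 and sum(alpha * x) = 1

def project(alpha):
    # least-squares projection of the raw network output onto the constraint set,
    # so the learned stencil is exact on all linear functions by construction
    return alpha - A.T @ jnp.linalg.solve(A @ A.T, A @ alpha - b)

def init_params(key, width=32):
    k1, k2 = jax.random.split(key)
    return {"W1": 0.1 * jax.random.normal(k1, (width, 6)),
            "b1": jnp.zeros(width),
            "W2": 0.1 * jax.random.normal(k2, (6, width))}

def learned_derivative(params, u_stencil):
    # the network looks at the six local values and proposes coefficients...
    h = jax.nn.relu(params["W1"] @ u_stencil + params["b1"])
    alpha = project(params["W2"] @ h)
    # ...and the derivative estimate is the usual weighted sum, with learned weights
    return alpha @ u_stencil
```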
For smooth parts of the solution, the stencil recovers basically something that looks like a normal difference. But if you look at the same stencil in the shock part, where the solution is changing very rapidly, it's asymmetric, a very different beast. The notion is that we train the computer to learn these things so that it can compute at lower resolution than it could before. This is a pictorial diagram repeating what I just said: we take the function, feed it through a neural network, get the coefficients, get the spatial derivatives, feed those in to compute the fluxes, and then use mass conservation on the fluxes to compute the time derivative. We train this thing by going around this loop, choosing the weights within it to minimize the loss so that the whole thing works; once it's trained and working, we can just integrate the equations by going around the same loop. So this is the idea.
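And a sketch of going around that loop in training, assuming a differentiable one-step solver `step(params, u)` built from `learned_derivative` above (a hypothetical name of mine): jax.grad pushes the rollout error back through every solver step into the network weights.

```python
import jax
import jax.numpy as jnp

def loss(params, u0, targets, step):
    # roll the learned scheme forward and compare each step against the
    # exact solution coarse-grained onto the model's mesh (targets[n])
    u, err = u0, 0.0
    for target in targets:
        u = step(params, u)                        # NN -> derivatives -> fluxes -> step
        err = err + jnp.mean(jnp.abs(u - target))  # accumulate mean absolute error
    return err / len(targets)

def train_step(params, u0, targets, step, lr=1e-3):
    grads = jax.grad(loss)(params, u0, targets, step)
    # plain gradient descent for clarity; any optimizer would do here
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)
```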
between [00:35:42.19] [00:35:42.19] the integrated solution and the ground truth which is defined as the space and time average for [00:35:48.12] [00:35:48.12] 15 total time units which is you know the shocks are like bouncing around for a long time and [00:35:54.06] [00:35:54.06] this is also trained it's tested on an inference domain that's 10 times larger than the training [00:35:59.18] [00:35:59.18] data set so this is actually operating on a much larger domain and what you see is that with a [00:36:04.17] [00:36:04.17] standard method that you would use that's just based on calculus then these are the these are [00:36:10.08] [00:36:10.08] these two the two triangles these are first and third order methods you know which have some power [00:36:15.18] [00:36:15.18] law dependence of error on mesh spacing whereas the neural network um you know beats them all and [00:36:22.08] [00:36:22.08] although i don't really have time to basically belabor this point the sort of state-of-the-art [00:36:26.13] [00:36:26.13] method for solving these equations numerically is called the we know method it's a weighted [00:36:31.12] [00:36:31.12] um oscillatory scheme and the um the computer actually is able to essentially derive that scheme [00:36:38.02] [00:36:38.02] when the the right constraints are put on it without human intervention and this is sort of one [00:36:42.17] [00:36:42.17] of the powerful things i think about this whole automatic differentiation on steroids thing is [00:36:47.20] [00:36:47.20] that you can you know demand what you would like to be true and try to find if there are solutions [00:36:53.12] [00:36:53.12] to it and um and you know it it's a it makes things more efficient so um let's see model [00:37:00.21] [00:37:00.21] learns upwinding i don't really want to go through this this is more we've done this on lots of 1d [00:37:05.01] [00:37:05.01] equations um with sort of similar um features um you know here's an example of a of a simple [00:37:13.09] [00:37:13.09] advection equation actually i put this on here because if any of you who are watching i know i'm [00:37:17.12] [00:37:17.12] going through this much too quickly but if anyone is interested in this there's a um so there's a [00:37:22.06] [00:37:22.06] github link which i can send a roman or somebody or you can find by just googling and there was um [00:37:29.03] [00:37:29.03] by a student at harvard named xiaoweiang who did this during an internship at google and so it's a [00:37:34.19] [00:37:35.12] it's a um and in this but this is in jaway's and you can there's a there's a wonderful tutorial of [00:37:41.20] [00:37:41.20] solving one d of x equations this way that is a good introduction to this method so now the goal [00:37:47.22] [00:37:47.22] of course is to try to use this to go towards the problem that i talked about at the beginning and [00:37:52.04] [00:37:52.04] i'm going to just make a couple of remarks about it and then maybe i will move on to the next topic [00:37:57.14] [00:37:57.14] in the um in the in the light of time so um what one would really like to do is use this to solve [00:38:04.17] [00:38:04.17] a uh you know you know tur you know equation with the complexity of turbulence and really try to [00:38:09.20] [00:38:09.20] to both speed up the um the solutions of equations and to even find better models for parameterizing [00:38:17.16] [00:38:17.16] turbulence although at the moment what we're focused on here is really speeding up equations [00:38:22.10] [00:38:22.10] i should point out that one 
We've done this on lots of 1D equations, with similar features; here's an example with a simple advection equation. Actually, I put this up because, for any of you watching who are interested, there's a GitHub link, which I can send to Roman or somebody, or which you can find by googling. This was done by a student at Harvard named Jiawei Zhuang during an internship at Google, and there's a wonderful tutorial there on solving 1D advection equations this way, which is a good introduction to the method.

Now the goal, of course, is to try to use this to go toward the problem I talked about at the beginning. I'm going to make just a couple of remarks about it and then maybe move on to the next topic, in the interest of time. What one would really like is to use this to solve an equation with the complexity of turbulence, both to speed up the solution of the equations and even to find better models for parameterizing turbulence, although at the moment we're focused on the speedup. I should point out one thing that is maybe noteworthy about the approach I just described: we used machine learning not to average the local solution, but to find a scheme that reproduces the same accuracy on a larger mesh. This is one of the reasons I mentioned Predrag's bricks at the beginning; I think we're literally trying to learn the actual bricks. There's another sort of dream, in which you average locally and argue that, because you're averaging locally, you can find an equation that doesn't require as many resources; in turbulence the state-of-the-art gold standard for that is called large-eddy simulation. But this is different: this is trying to ask whether we can deduce the pattern and upscale. And of course, in what I've shown you, we're doing it in one very particular way, which is only one of many possible ways; we could talk about others at the end of this talk if you want.

In any case, we've extended these ideas both to advection-diffusion in a turbulent flow and to turbulence itself, and I'm just going to show you a couple of quick results before moving on. What I'm going to show are simulations of advection-diffusion in a turbulent flow: turbulent flows with different seeds, with a particular type of velocity field that is canonical for 2D turbulence. Here what we did is try to learn advection in a turbulent flow, and of course every time you change u, the velocity field, you change the equation. If we had a numerical method that worked only for a single u, it would be essentially useless, because you would have to retrain every time somebody gave you a different velocity field. On the other hand, think about the physics: if this is the velocity field and you put a dye in it, the dye is going to be stretched locally, and because the stretching is local, there's universality in the local behavior of the flow. So you would think that if you could create flows from an ensemble with the same local properties, it should be possible to train a model that is effectively independent of the particular flow, as long as it lives in that ensemble. That's indeed what we did. This is the same picture as before, except that for the physical state we feed both the concentration and the velocities into the neural network to get the coefficient alphas.
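One common recipe for building such an ensemble (my sketch, not necessarily what they did): draw a random, smooth stream function and take its curl, which gives a divergence-free 2D velocity field; every seed is a new flow with the same statistics.

```python
import numpy as np

def random_velocity_field(n, seed, k_max=8):
    """Divergence-free 2D field from a random low-wavenumber stream function."""
    rng = np.random.default_rng(seed)
    k = np.fft.fftfreq(n, d=1.0 / n)                  # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    psi_hat = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    psi_hat[np.sqrt(kx**2 + ky**2) > k_max] = 0.0     # keep only large scales
    psi_hat = np.fft.fft2(np.real(np.fft.ifft2(psi_hat)))  # coefficients of a real field
    u = np.real(np.fft.ifft2(1j * ky * psi_hat))      # u =  d(psi)/dy
    v = -np.real(np.fft.ifft2(1j * kx * psi_hat))     # v = -d(psi)/dx, so div(u, v) = 0
    return u, v

u, v = random_velocity_field(64, seed=0)   # a different seed gives a new ensemble member
```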
We then proceed as before, and when we train it, we train with an ensemble of u's, given by different random seeds, or different types of flows, or different whatever. When we do this, we find the same story I told you before: the neural network is able to integrate for much longer times, at much higher accuracy, than any of the baseline solutions. This, for example, is the reference solution on a fine 256 by 256 grid; this is the reference solution downsampled to a coarse grid, the exact coarse-graining; and this is what the neural network produces, whereas the baseline solutions dissipate to nothing on a 32 by 32 grid. So basically that's the deal, and here are lots of plots of errors and things like that, though it looks like I'm running long. The same model also works on different flows. It doesn't even have to be a turbulent flow; it just has to be a flow with the same local statistics. Here's an example of a flow that starts out with a blob, twists it, and then untwists it, a shear one way and then the other, and the neural network is able to recover the original with only a little bit of dispersion, whereas the baseline method smears it out considerably.

Finally, one other thing I want to say: you might worry about speed. What is true is that, if we go back to this picture, this box is expensive. Normally, if you want to compute a derivative, you know the alphas, you've pre-programmed them, and you just compute the sum; here you have to compute this whole network on the fly as you're computing, which, depending on how big it is, can be a lot of operations. But what we found is that, especially in two and higher dimensions, this is not a problem, and there are several reasons. One is that if you gain a factor of eight to ten in each direction, the number of mesh points goes down considerably, so you have considerable room for this extra compute to fill in; you've already banked a huge cost savings.
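The rough accounting (my arithmetic, using the factors quoted in the talk): coarsening by a factor f in each of d space dimensions cuts the mesh points by f^d, and, for advection-limited time steps, the CFL condition also lets the time step grow by a factor of f.

```python
# rough work saved per simulated time unit by upscaling a factor f in d dimensions
def speed_budget(f, d):
    return f ** (d + 1)   # f^d fewer points, times f larger time steps

print(speed_budget(8, 2))    # 2D, 8x coarser:  ~512x headroom for the network
print(speed_budget(10, 3))   # 3D, 10x coarser: ~10000x headroom
```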
[00:43:41.05] [00:43:41.05] relatively small percentage of the total flops [00:43:45.14] [00:43:45.14] because most of the time actually goes to reading numbers in and out of memory [00:43:50.04] [00:43:50.17] not to the floating point operations on the numbers but once you've read them in and out [00:43:54.23] [00:43:54.23] of memory you can do whatever computations you want on them for [00:43:59.11] [00:43:59.11] free and so our algorithms for advection-diffusion are running at about 80 percent of total peak flops [00:44:05.01] [00:44:05.01] whereas the typical high order baselines at least in our hands are running at only a couple [00:44:09.09] [00:44:09.09] percent and so there's a huge factor that you get here which essentially saves you from [00:44:14.06] [00:44:14.06] much of the neural network compute cost just by using the computer more efficiently [00:44:18.23] [00:44:18.23] and there are more ways of pushing this sort of thing so yeah we are able to get [00:44:25.14] [00:44:26.13] the same accuracy at much higher speeds so finally and it looks like i've totally [00:44:33.18] [00:44:33.18] mistimed this talk in typical fashion we've been working and this is with a group [00:44:37.16] [00:44:37.16] within google on a fluid turbulence version of this and there we've written a [00:44:42.21] [00:44:42.21] fully differentiable flow solver within jax this is the sort of snippet of code from the solver [00:44:48.17] [00:44:48.17] it basically solves the navier-stokes equations and produces the trajectories but the thing [00:44:53.09] [00:44:53.09] is that this is differentiable in the same way as what i showed you before and so there's just [00:44:57.14] [00:44:57.14] a lot of experimenting that can be done to try to find efficient ways of computing and [00:45:02.15] [00:45:02.15] that is what we've been doing and i think i'm going to skip this [00:45:07.14] [00:45:07.14] because i have too many slides other than to say that things are going well with this and [00:45:14.04] [00:45:14.04] i'm still quite bullish on it so i now have nine minutes left roman i could stop now [00:45:21.01] [00:45:21.01] under the theory that i've spoken too much and everyone's tired or i could keep going [00:45:26.12] [00:45:27.20] so can you give us a glimpse of number four in another five six minutes [00:45:34.12] [00:45:34.12] something like that yeah that sounds good so i will give you a [00:45:39.03] [00:45:39.03] glimpse of number four number three is technical okay so number [00:45:46.02] [00:45:46.02] four okay so my slides for number four are really lousy because i literally [00:45:52.19] [00:45:52.19] just started putting them together yesterday because it's something we're doing and excited about [00:45:56.15] [00:45:56.15] so okay this is an experiment that is actually by shmuel rubinstein [00:46:03.01] [00:46:03.01] who basically built this setup for a turbulence experiment [00:46:09.03] [00:46:09.03] and the way the experiment works is that there's a piston that produces a vortex ring [00:46:15.03] [00:46:16.02] and then what shmuel does is he uses a scanning laser sheet [00:46:19.16] [00:46:19.16] to scan the flow field within this vortex ring
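Going back for a moment to the differentiable JAX solver mentioned just above: the slide's code snippet is not in the transcript, so as a stand-in, here is a minimal sketch of why writing the solver in JAX pays off, with a toy one-dimensional viscous Burgers step in place of the real Navier-Stokes step. The names are hypothetical and this is not the actual solver's API.

```python
# Because the time step is jax-traceable, gradients flow through an
# entire rolled-out trajectory "for free" via reverse-mode autodiff.
import jax
import jax.numpy as jnp

def step(v, dt=1e-3, nu=1e-3):
    # one explicit step of 1d periodic viscous Burgers (illustrative only)
    dx = 1.0 / v.shape[0]
    dvdx = (jnp.roll(v, -1) - jnp.roll(v, 1)) / (2 * dx)
    lap = (jnp.roll(v, -1) - 2 * v + jnp.roll(v, 1)) / dx**2
    return v + dt * (-v * dvdx + nu * lap)

def final_state_error(v0, target, n_steps=100):
    v = jax.lax.fori_loop(0, n_steps, lambda i, v: step(v), v0)
    return jnp.mean((v - target) ** 2)

v0 = jnp.sin(2 * jnp.pi * jnp.linspace(0.0, 1.0, 64, endpoint=False))
# gradient of the end-of-trajectory error with respect to the initial
# condition, at a small constant multiple of one forward solve
g = jax.grad(final_state_error)(v0, jnp.zeros(64))
```

The only point of the sketch is that jax.grad differentiates through the whole rollout; the real solver does the same thing for the full equations.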
so you're supposed to imagine and i have a picture [00:46:26.00] [00:46:26.00] so imagine you have a vortex ring and then what shmuel does is he has a scanning laser sheet that [00:46:30.21] [00:46:30.21] scans it it looks at different slices and it very rapidly scans it and [00:46:36.10] [00:46:36.10] if there are tracer particles in it you can see how they're moving and you can measure [00:46:39.16] [00:46:39.16] the velocity field on each of the laser sheets and then of course you can integrate all of those if [00:46:44.02] [00:46:44.02] you want to get the full 3d velocity field of the entire thing now this is limited of course by how [00:46:50.06] [00:46:50.06] quickly and how accurately you can move your laser sheet so we had the following thought which i will [00:46:56.19] [00:46:56.19] just explain in words which is that in principle if i tell you the velocity field [00:47:02.19] [00:47:02.19] on a single slice of this laser sheet on a single 2d slice and if we also just agree [00:47:10.02] [00:47:10.02] that whatever the velocity field is it obeys the navier-stokes equations i mean after all [00:47:15.14] [00:47:15.14] it's a fluid so it must obey the navier-stokes equations the question we were [00:47:19.22] [00:47:19.22] wondering was is that in itself sufficient to basically reconstruct the full 3d velocity [00:47:26.23] [00:47:26.23] that is how much of the flow do i actually have to [00:47:32.02] [00:47:32.02] measure until i can reconstruct the whole thing that's a fundamental question now what this has [00:47:36.13] [00:47:36.13] to do with machine learning is again i would like to emphasize that this is really about having a [00:47:41.09] [00:47:41.09] differentiable code if we go back to that jax code that i showed or the smoke thing that i [00:47:46.17] [00:47:46.17] showed at the very beginning in that setting if i tell you the velocity on a plane [00:47:51.03] [00:47:51.16] and you have a navier-stokes solver then you can basically find the full 3d solution at [00:47:56.19] [00:47:56.19] this time such that the velocity agrees with what you have that's actually an optimization [00:48:00.23] [00:48:00.23] problem that one can solve and because of this technology that comes from machine learning [00:48:05.07] [00:48:05.07] it's really not complicated to do that and so we started doing that this was with a postdoc [00:48:12.02] and also with ryan and joel who are students of shmuel's we started doing this on just purely [00:48:17.14] [00:48:17.14] numerical simulations first and these slides aren't as good as they could be [00:48:22.23] [00:48:22.23] to demonstrate how this works but what i want to report to you is that this actually [00:48:27.01] [00:48:27.01] works remarkably well
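Schematically, the optimization being described might look like the following cartoon, with hypothetical shapes and with incompressibility standing in for the full Navier-Stokes constraint that the real version enforces through the differentiable solver:

```python
# Reconstruction as optimization: find a full 3d field that matches the
# measured 2d slice while obeying a physics constraint. (Cartoon only;
# the divergence-free condition is a stand-in for the full equations.)
import jax
import jax.numpy as jnp

N = 32
measured = jax.random.normal(jax.random.PRNGKey(0), (3, N, N))  # stand-in PIV slice

def divergence(v, dx=1.0 / N):
    # central-difference div(v) on a periodic grid, v has shape (3, N, N, N)
    return sum(
        (jnp.roll(v[i], -1, axis=i) - jnp.roll(v[i], 1, axis=i)) / (2 * dx)
        for i in range(3)
    )

def loss(v):
    data = jnp.mean((v[:, :, :, N // 2] - measured) ** 2)  # agree with the slice
    physics = jnp.mean(divergence(v) ** 2)                 # obey the constraint
    return data + 10.0 * physics

grad_loss = jax.jit(jax.grad(loss))
v = jnp.zeros((3, N, N, N))      # the unknown full 3d field
for _ in range(500):             # plain gradient descent on the field itself
    v = v - 0.1 * grad_loss(v)
```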
so we did this starting at reynolds [00:48:32.06] [00:48:32.06] numbers in the range of 2500 to 7000 for a vortex ring this is a high enough reynolds number [00:48:39.07] [00:48:39.07] that there are kelvin waves [00:48:42.12] [00:48:42.12] for the aficionados and things happening to the ring but what we were able to show [00:48:46.08] [00:48:46.08] from just numerical simulations as a start is that by just taking [00:48:53.03] [00:48:53.03] a snapshot through the middle and then demanding that the 3d velocity field agreed [00:48:58.21] [00:48:58.21] with it demanding that the solution obey the navier-stokes equation that was [00:49:03.11] [00:49:03.11] sufficient to very accurately reproduce the full 3d velocity field so this is the [00:49:09.03] [00:49:09.03] real dns and this is what was reproduced from an algorithm that just reconstructs from a single [00:49:14.08] [00:49:14.08] slice and then with this experimental system that shmuel built with this rapid scanning piv [00:49:20.00] [00:49:20.00] we were able to do the same thing with experimental data and really show that from a [00:49:24.13] [00:49:24.13] single slice one can construct the 3d field and in the fitting [00:49:31.18] [00:49:31.18] there's a neural network in here though it's sort of silly it's not really machine learning [00:49:34.23] [00:49:34.23] the neural network was there to parameterize the velocity field which was demanded to obey the [00:49:39.22] [00:49:39.22] navier-stokes equation to do the reconstruction it was simply playing the role of a way of [00:49:44.04] [00:49:44.04] representing the function and it's simply a convenient way of using these [00:49:48.02] [00:49:48.02] automatic differentiation tools so that's basically what i wanted to say [00:49:52.17] [00:49:52.17] about that but i think there's huge opportunity here basically to change [00:49:57.20] [00:49:58.19] the way that measurements are done because it gives us ways of imposing constraints [00:50:02.15] [00:50:02.15] that otherwise would be quite difficult to impose so i guess i'm going to stop there so the work [00:50:08.06] [00:50:08.06] on pdes was done the navier-stokes part which i was discussing at the end there's [00:50:13.12] [00:50:13.12] a team at google that's basically been building the code to do this and i can report back about [00:50:18.04] [00:50:18.04] that another time yohai bar-sinai was a postdoc at harvard who with stephan hoyer really came up with [00:50:24.08] [00:50:24.08] the original idea for how to do this and yohai has now moved to tel aviv the approximate equations [00:50:31.07] [00:50:31.07] part i didn't talk about the experiments are ongoing work with xiao zhu shmuel [00:50:36.21] [00:50:37.16] joel and ryan and i thank you for your attention i don't even know if [00:50:42.13] [00:50:42.13] this talk was chaotic because i can't see any of your faces maybe that's the saving grace of our [00:50:48.00] [00:50:48.00] electronic-based world but anyway hopefully this was not totally useless so thank you [00:50:53.20] [00:51:06.23] one downside of virtual talks is that the applause doesn't come out quite right oh no that was really [00:51:11.18] [00:51:11.18] great actually yeah if you want to see the faces just stop sharing and move the bar at the bottom [00:51:17.22] [00:51:18.23] oh i could stop sharing how do i do that [00:51:21.20] [00:51:23.22] click on that thing go to the top and click again on the share screen you know the [00:51:29.22] [00:51:30.13] third button from the left oh it's there okay [00:51:33.11] [00:51:34.15] cool all right there's still 80 people but you know they're all very shy [00:51:41.05] [00:51:41.05] so you see only a few faces in there i'm used to that [00:51:46.00] [00:51:48.12] all right so let me go through the questions that people put in the chat
so a question from [00:51:58.02] [00:51:58.02] michael zykowski should we think of the ml part of this as finding the best interpolation scheme [00:52:05.18] [00:52:05.18] such that the physics is preserved if the original equations are run on the interpolated grid [00:52:10.13] [00:52:12.12] yeah that's right that's basically all it is it's finding a basis [00:52:16.08] [00:52:16.08] it's like finding in an equation-specific and even problem-specific [00:52:19.20] [00:52:19.20] way bases that accurately reflect what's going on on small scales [00:52:24.02] [00:52:26.02] anyway that's what i showed you next question [00:52:34.19] [00:52:35.14] i'll probably mangle the name never mind neural networks are universal approximators so maybe it's not a [00:52:41.05] [00:52:41.05] surprise they do well at approximating but the equation has an interesting feature in it shocks [00:52:47.03] [00:52:47.16] would it be possible to use neural networks to discover them if we did not know that they exist [00:52:54.04] [00:52:55.05] right so i think that's a good question so i guess what i would say is that for the [00:53:02.10] [00:53:02.10] algorithms that i just described we were really only trying to take a highly resolved situation [00:53:08.02] [00:53:08.15] and scale it up a bit right to basically lower the computational requirement and [00:53:15.11] [00:53:15.11] a typical number that this seems to work for is 10 like you can [00:53:20.08] [00:53:20.08] do it with 10 times less resolution that's not the sort of bricks that predrag [00:53:26.21] [00:53:26.21] has talked about in terms of coherent structures if you want to think about [00:53:30.17] [00:53:30.17] coherent structures as being the driver of whatever the problem is which i think is the [00:53:36.13] [00:53:36.13] right way to think about it that is typically at a scale that is larger than this sort of [00:53:42.00] [00:53:42.00] 10 grid point scale that i'm talking about i'm sort of talking about [00:53:45.16] [00:53:45.16] under-resolving the kolmogorov scale if you want to talk about turbulence but i do think [00:53:50.23] [00:53:50.23] that coherent structure detection such as for example detecting shocks is something that's possible to [00:53:56.17] [00:53:56.17] do with machine learning but that would require different ideas than what i discussed [00:54:02.21] [00:54:03.14] in this talk and i'm happy to chat about that i think that's also [00:54:09.20] [00:54:10.12] possible and interesting and in fact to really realize the possibilities here of [00:54:17.12] [00:54:17.12] predrag's dream of learning the dynamic bricks that's what's going to have to [00:54:21.09] [00:54:21.09] happen because otherwise there's a limit i think what i described lowers computational [00:54:27.12] [00:54:27.12] costs but there's a limit to how far it can go [00:54:32.00] [00:54:35.07] okay next question what is the typical size of training data and how long does it take to [00:54:40.15] [00:54:40.15] train the network right so basically you need to give the machine [00:54:46.19] [00:54:46.19] so many examples that it doesn't see new examples as you're training it [00:54:51.03] [00:54:51.03] if you do this if you play the game as
i've just described and you happen through your [00:54:56.10] [00:54:56.10] integration of the equations with these new discretizations to come upon a solution that [00:55:02.04] [00:55:02.04] has not appeared in the training data before or that's not similar then the whole thing will blow [00:55:06.19] [00:55:06.19] up because it's a dumb regressor and we do our best to put constraints [00:55:12.04] [00:55:12.04] based on physics to try to get rid of that but that's the situation typically though [00:55:17.12] [00:55:17.12] the universality of solutions on small scales in equations like this is real so it doesn't [00:55:21.22] [00:55:21.22] really require that much data to train it i mean how much data depends on [00:55:28.19] [00:55:28.19] what problem you're solving but at least for [00:55:36.06] [00:55:36.06] 1d and 2d flows it's sort of reasonable for 3d we're doing this in 3d and that starts to [00:55:41.20] [00:55:41.20] get computationally quite demanding but the hope [00:55:48.15] [00:55:48.15] again is that because the physics is local the sort of universality of [00:55:54.15] [00:55:54.15] solutions to equations that we've long admired in physics departments should [00:55:59.22] [00:55:59.22] buy you quite a bit in limiting the amount of training data that you need and i would [00:56:05.03] [00:56:05.03] say compared to consumer photographs which is the first example that i showed you [00:56:10.00] [00:56:10.00] this is a far more structured data set [00:56:14.12] [00:56:19.11] the next question is from john weiss could these neural networks be trained on the [00:56:23.20] [00:56:23.20] compressible navier-stokes equations and a riemann solver to capture shocks [00:56:28.04] [00:56:28.23] i believe that's possible we haven't done this but there's nothing about [00:56:33.16] [00:56:33.16] what i said that to me wouldn't apply to compressible shocks [00:56:43.20] [00:56:45.07] okay another question from eleniman you mentioned that your training makes the [00:56:51.16] [00:56:51.16] approach equation specific is it possible to train the derivative approximator on data that mixes [00:56:57.01] [00:56:57.01] results from more than one pde would this result in a more generalizable approximation or is such [00:57:03.07] [00:57:03.07] mixed training a harder problem so i think that's a really good question [00:57:09.12] [00:57:09.12] i don't completely know my knowledge of the answer to this is very limited [00:57:14.04] [00:57:14.04] i would say that for one thing for example in order [00:57:20.06] [00:57:20.06] to make solvers that work for various reynolds numbers we do this we train on many reynolds numbers [00:57:24.08] [00:57:25.12] so that's one thing you can do right which is sort of different equations it's not as different as [00:57:30.00] [00:57:31.01] you know there's some differences but we haven't so i don't know how far this can be pushed [00:57:36.19] [00:57:37.11] on the other hand there's got to be some balance here [00:57:40.21] [00:57:40.21] right i mean the whole point was that we were learning about
[00:57:44.13] [00:57:44.13] the patterns in a particular equation if you start combining too many then [00:57:48.02] [00:57:48.02] you could easily see it getting confused and there will be a degradation i mean again my hope [00:57:53.11] [00:57:53.11] here is that we don't really care about that many equations like seriously we really don't [00:57:58.06] [00:57:58.06] and so it wouldn't be the worst thing if there were only a way of speeding up individual equations [00:58:04.19] [00:58:06.08] so a question from zeb rocklin [00:58:12.06] [00:58:14.04] do you see a connection between ml-enabled coarse-graining that utilizes repeating features [00:58:19.16] [00:58:19.16] and the more traditional physics concept of renormalization-group-style coarse-graining [00:58:24.10] [00:58:25.09] yeah i think that's really interesting i'm quite interested in this question [00:58:29.14] [00:58:29.14] i do see a connection but i don't see what it is that's my answer that's the best i can do [00:58:37.03] [00:58:40.21] it feels like it has to be there [00:58:45.12] [00:58:48.08] but i don't know what to do with it i think it's interesting but it's got to be there [00:58:51.22] [00:58:52.19] sorry that's not a good answer hopefully we'll figure it out eventually [00:58:56.08] [00:58:57.16] a question from chris marcotte isn't this effectively a take on traditional [00:59:02.12] [00:59:02.12] variational data assimilation it's not strictly necessary to have a differentiable [00:59:09.03] [00:59:09.03] flow solver to reconstruct the dynamics one can use an ensemble of models no i agree with this [00:59:15.22] [00:59:15.22] i assume that chris is talking about the last part the [00:59:20.12] [00:59:20.12] last example and i think that's true so i don't know the answer to this but [00:59:26.00] [00:59:27.11] having something that's differentiable being able to take the gradient is really a great thing [00:59:33.03] [00:59:33.03] now how much greater it is than just having ensembles and using the ensemble to [00:59:38.08] [00:59:38.08] sort of figure out which direction to go i don't know but it seems much more efficient because again [00:59:43.01] [00:59:43.01] if you compute the gradient then basically you compute the gradient of [00:59:47.16] [00:59:47.16] the solution with respect to the entire parameter space that you're in at a cost which is like twice [00:59:52.23] [00:59:52.23] the cost of a single solve instead of needing to do an ensemble to search and it feels that [00:59:59.05] [00:59:59.05] it should be better and i think it is but i don't know of theorems to nail that [01:00:04.13] [01:00:05.14] down but empirically it just is more efficient in our hands for doing many different [01:00:10.17] [01:00:10.17] types of things with this differentiable code thing we've been [01:00:14.21] [01:00:14.21] using differentiable molecular dynamics simulations to do material design and you [01:00:19.20] [01:00:19.20] can compare that to what you would do if you were to do a monte carlo parameter search [01:00:24.19] [01:00:24.19] and it's just much more efficient
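That cost claim is essentially the standard reverse-mode automatic differentiation guarantee; a toy illustration, with a hypothetical stand-in for an expensive solve:

```python
# Reverse mode gives the gradient with respect to all P parameters for a
# small constant multiple of one forward solve, whereas finite-difference
# or ensemble-style probing needs one extra solve per direction.
import jax
import jax.numpy as jnp

def solve(p):
    return jnp.sum(jnp.sin(p) ** 2)  # pretend this is expensive

p = jnp.ones(1000)

# one reverse pass: the full 1000-dimensional gradient
g = jax.grad(solve)(p)

# finite differences: O(P) solves for the same information
eps, base = 1e-4, solve(p)
g_fd = jnp.array([(solve(p.at[i].add(eps)) - base) / eps for i in range(p.size)])
```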
but actually maybe the other thing to say chris is that i think [01:00:34.19] [01:00:34.19] this is really the technical advance that we as physicists [01:00:40.13] [01:00:41.03] were given by the machine learning community and the question of what we should [01:00:45.18] [01:00:45.18] take from this to do physics as opposed to some other subject is something that i [01:00:50.08] [01:00:50.08] guess i've been really quite curious about and at least this is my current answer [01:00:55.22] [01:00:55.22] the notion of building complex regressors is very interesting and can be used for things but i think [01:01:00.06] [01:01:00.06] the core thing that can be brought to physics is just the ease of use of this which is sort [01:01:05.20] [01:01:05.20] of much less sexy perhaps than many of the other things that people talk about but [01:01:12.00] [01:01:12.00] on the other hand that's sort of what's been driving the way that i've been thinking about this [01:01:17.22] [01:01:21.11] okay another question are there formal connections with homogenization [01:01:28.06] [01:01:28.06] theory yeah so that's also a good question i mean again what i would point out with [01:01:34.10] [01:01:34.10] respect to the discretization is that we actually are not averaging so in what [01:01:42.04] [01:01:42.04] i described to you we were basically demanding that there was an l2 loss between the solution [01:01:48.15] [01:01:49.20] on the coarse mesh and the solution on the fine mesh which essentially means that [01:01:53.18] [01:01:53.18] we were demanding that the coarse dynamics had pointwise agreement [01:01:58.08] [01:01:58.08] with the average of the fine dynamics the average being not an ensemble average or [01:02:04.19] [01:02:04.19] anything like that so it's sort of not really homogenization at least not as i think [01:02:09.05] [01:02:09.05] about it leon it's just basically the flux average of the cell there's nothing [01:02:14.17] [01:02:14.17] statistical happening in it and so that's what i described to you now on the other hand the same [01:02:19.03] [01:02:19.03] formalism could be used with any type of averaging kernel that you want there's [01:02:23.12] [01:02:23.12] nothing that prevented us we were using this l2 pointwise loss just because we were [01:02:29.01] [01:02:29.01] being pigheaded or something and we wanted to see how far we could go without sacrificing [01:02:34.10] [01:02:34.10] accuracy but there are a lot of other things one could do with this that are much more like [01:02:39.18] [01:02:39.18] homogenization a sort of numerical-experimental version of homogenization [01:02:45.01] [01:02:45.01] and i think that's very interesting but i don't really have anything concrete to say about it
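A minimal sketch, with hypothetical shapes, of the loss just described: pointwise L2 agreement between the coarse rollout and the cell-averaged fine reference, where the "average" is just the flux or cell average, nothing statistical:

```python
# Pointwise-L2 coarse-graining loss (hypothetical shapes, not the
# authors' code): compare the learned coarse-mesh rollout against the
# block-averaged fine-mesh reference, step by step.
import jax.numpy as jnp

def coarsen(fine, factor=8):
    # plain block / cell average over factor-by-factor cells
    n = fine.shape[0] // factor
    return fine.reshape(n, factor, n, factor).mean(axis=(1, 3))

def training_loss(coarse_traj, fine_traj, factor=8):
    # coarse_traj: (T, n, n) learned-model rollout on the coarse mesh
    # fine_traj:   (T, N, N) reference rollout on the fine mesh, N = n * factor
    ref = jnp.stack([coarsen(f, factor) for f in fine_traj])
    return jnp.mean((coarse_traj - ref) ** 2)
```

As the answer notes, swapping the block average for any other averaging kernel leaves the formalism unchanged.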
[01:02:52.06] all right so these are all the questions in the chat does anyone else have any questions if you [01:02:58.13] [01:02:59.09] want feel free to just unmute your mic and ask [01:03:01.20] [01:03:06.21] yeah michael dan again now i understand why [01:03:10.15] [01:03:11.18] you didn't want to answer my question about the lack of pdes for most of the world [01:03:19.12] [01:03:19.12] i will say though that if you're looking for maybe an easier set of pdes it turns out [01:03:24.15] [01:03:24.15] that as ken kamrin is now showing granular materials are described by very local i guess hyperbolic pdes and i wonder if finding the bricks as you say in those flows [01:03:38.21] [01:03:38.21] would somehow be easier yeah okay good dan so i agree that that's an interesting [01:03:48.04] [01:03:48.04] problem and i guess what i would say is i do think that that is a different problem [01:03:54.15] [01:03:54.15] you to me are raising a different problem which is suppose you don't know the equation [01:03:58.00] [01:03:58.00] of motion but you see the objects as they're moving around can you parameterize local rules [01:04:05.03] [01:04:05.03] for how the objects are moving so that you can use them to generalize and i think that's [01:04:09.14] [01:04:10.10] a totally reasonable thing to do it's something that many groups around the world [01:04:15.03] [01:04:15.03] are working on using machine learning and different types of neural network architectures [01:04:19.01] [01:04:19.16] and it might be an interesting thing to try with granular dynamics i don't [01:04:27.03] [01:04:27.03] know i hear you the reason that i sort of pushed this off to the end is that [01:04:32.02] [01:04:32.17] this talk was about suppose i know the equations and what i'm trying to do is learn the patterns [01:04:38.21] [01:04:38.21] then how can i use this there's another question though suppose i don't know the equations let me [01:04:42.21] [01:04:42.21] then learn the rules and use those to generalize of course in that case your worry [01:04:48.13] [01:04:48.13] is always that you haven't accurately parameterized the rules and thus you will make errors [01:04:52.00] [01:04:52.13] because your rules are inaccurate which is a different issue than the one i was focusing [01:04:56.12] [01:04:56.12] on but i do think there's a lot of room there and there has been a lot of progress [01:05:01.09] [01:05:02.19] in the area that you've been talking about i could [01:05:06.00] [01:05:06.17] point you to some papers if you are interested they're not actually by the physics [01:05:11.07] [01:05:11.07] community by and large but there are ideas in computer graphics for example that i think [01:05:16.10] [01:05:16.10] could be used for that type of physics that's interesting you know because this raises [01:05:22.10] [01:05:22.10] and i guess since it's the end of the talk i can say this i was thinking when you made the [01:05:26.13] [01:05:26.13] calculation and claimed we will never be able to fully whatever that means simulate navier [01:05:31.16] [01:05:31.16] stokes i guess that's one assuming we use digital computation in its current form and two it gets to [01:05:39.16] [01:05:39.16] the idea of the simulation hypothesis is our universe a simulation and do you [01:05:46.12] [01:05:46.12] need a full simulation or can you cheat and i was curious because [01:05:51.14] [01:05:51.14] your calculation basically shows that everything we see is not simply [01:05:56.13] [01:05:56.13] a simulation if that hypothesis holds true yeah i would really think at least [01:06:05.07] [01:06:05.07] within a lyapunov time we should be able to solve the equation we should be able to describe it [01:06:09.09] [01:06:09.09] qualitatively at least or even quantitatively by an equation of motion i don't know i've asked [01:06:16.00]
[01:06:16.00] predrag he's the one who's been persistent what do you think should we be able to [01:06:19.22] [01:06:22.19] you're staring around i don't know anything right as a bricklayer this is above my pay grade [01:06:32.04] [01:06:32.04] but now that you have drawn me out i just want to say bricklayers would like an algorithm [01:06:37.22] [01:06:37.22] that doesn't solve pdes but produces entire bricks one at a time and that's done by a [01:06:46.17] [01:06:46.17] global solution not evolving in time or space or anything but requiring that [01:06:53.01] [01:06:53.22] a brick satisfies the equations of motion at every point meaning that the constraint imposed [01:07:02.15] [01:07:02.15] is a locally satisfied navier-stokes equation it's very different from what you're saying yeah but you know [01:07:09.18] [01:07:09.18] it's very close because variational methods are the methods to solve this problem yeah no i agree [01:07:16.06] [01:07:16.06] what i was describing was only in space there were no temporal patterns and that [01:07:20.10] [01:07:20.10] i agree is a missed opportunity the right way to do this is the way that you've described [01:07:25.12] [01:07:25.12] which is that it needs to be in both space and time and treat the patterns together which will [01:07:29.22] [01:07:29.22] clearly lead to more efficient algorithms if we could figure out a way of formulating it [01:07:33.22] [01:07:36.06] i don't know i'm a believer on the other hand i don't have to do it but i am a believer [01:07:41.20] [01:07:46.21] okay all right since there are no more questions let's thank michael one more time [01:07:55.20]