So he's doing some really interesting work, really pushing the boundaries of how we write down and solve these problems. So, without further ado.

Thanks for the invitation, Rich. Thank you. So this is a snapshot, a culmination of some work that I've been doing over the last many years, six or seven probably. I'm not the only one doing it, but let me give some acknowledgments, give you an outline, and then move on.

There are a lot of problems that motivate the work I'm doing, where you would like to write down a high-level description of a problem and then get extra information from the problem beyond just the forward solution. I'll talk about some motivating applications, and then some of the very basic framework for finite elements. The way we attack the problem is that we look for structure in the problem. There's a kind of structure here that is not the kind used to prove stability or error estimates, but it's there and it's worth exploiting: it's a kind of generality, a way of seeing how all of the different methods are alike. And once you see how they're all alike, you can write a computer code that takes care of all of them. I'll give some examples showing that, even though we're not actually generating code right now, our code is very fast just interpreting the description of the problem and running with it. I also work some with Vicki Howle, who is also at Texas Tech and has done a lot of work on block-structured preconditioning for Navier-Stokes. At the end of the day, you don't want just to build your big systems of algebraic equations; you want to do some extra calculations with them, maybe some block-structured preconditioning. So at the end I'll talk some, just as an example, about how we can use the Sundance project not only to write down our models and feed them to a black-box solver, but to do more advanced kinds of solver algorithms with it — and hopefully this will be an inspiration for people thinking about what they might be able to do with it.

This work has been supported over the years by DOE — we've got some current funding through SciDAC — and Kevin Long and I have also done some related work through Sandia National Labs. The people working most closely with me on this right now are Kevin Long and colleagues at Sandia; the preconditioning work is with Vicki Howle; Jeff Dillon is a Ph.D. student who is just starting to do research on some of the solver techniques I'm looking at; and we've also had some interactions with Mike Parks at Sandia on novel methods that I'll say a little about in the talk.

OK, so let me go through a few classical or important sorts of calculations you might want to do, all of which have a wrinkle that makes the problem interesting from a computational and software standpoint as well as from a scientific or engineering one. This is a problem from incompressible flow, a Bénard convection problem. You've got your incompressible Navier-Stokes equations here, and then a convection-diffusion equation that moves heat around the domain, so the velocity and the temperature are nonlinearly coupled. And the temperature itself creates a buoyant body force, where we've made the Boussinesq approximation here.
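For reference, a standard way to write the Bénard convection system under the Boussinesq approximation — my notation and nondimensionalization, which may differ from the slide's — is:

```latex
% Bénard convection under the Boussinesq approximation (one standard
% nondimensionalization; the constants on the slide may differ):
\begin{align*}
-\nu\,\Delta \mathbf{u} + (\mathbf{u}\cdot\nabla)\mathbf{u} + \nabla p
   &= \mathrm{Ra}\; T\,\mathbf{e}_{y}, \\
\nabla\cdot\mathbf{u} &= 0, \\
-\kappa\,\Delta T + \mathbf{u}\cdot\nabla T &= 0.
\end{align*}
```

The buoyancy term on the right of the momentum equation is where the temperature feeds back into the flow.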
So these are interesting to solve because it's a pretty tight nonlinear coupling, and it's relatively large: you've got four nonlinear PDEs that you want to discretize and solve. So this is an example of a problem where you want to write down your problem, you want to get your Jacobian, and you might need to do some block-structured work with the equations to get a good solver for them.

More generally, preconditioning. Your linear system is Ax = b, and a preconditioner is something that, psychologically at least, you pre-multiply the system through by, and that's supposed to make your method converge faster. The catch is that for a lot of problems, the good preconditioners require operators that you didn't think to program at the beginning. So if you coded up convection-diffusion, maybe with a variable diffusion coefficient, and the preconditioner wants a constant-coefficient diffusion operator, you've got to go write a little more code. Some of the block preconditioners we have for Navier-Stokes require extra operators that you didn't program, and if those operators take a couple of weeks or a couple of months to get running, that's a big bottleneck to using these good preconditioning methods. There are lots of methods with this flavor. I wrote a paper that appeared this year tying together, at a very abstract level, some of these preconditioners based on differential operators with Riesz representation, and it was a lot of fun — I got to review my functional analysis and learn a lot about why the methods work. So from a software standpoint, the important question is: what if we don't have the code for the preconditioners?

Another problem we might want to tackle is this — this is a one-dimensional problem, but problems in higher dimensions have a similar flavor. I've got some differential equation where maybe my forcing term depends on some parameter p. After you do the calculation, the question is: how important is that parameter p? How sensitive is the calculation I just did to this parameter? If I tweak it, is the answer robust, or am I in an interesting regime where I need to make sure I know exactly what that parameter is? So you want to do the calculation and then back out a derivative, and you might not natively have code that is capable of doing that sort of calculation.

OK, another problem. Suppose you've got a PDE solver for some fabricated academic nonlinear elliptic differential equation, and now suppose I want to let my right-hand side alpha be unknown, and I want to determine the right-hand side so that I can match some experimental or otherwise known values of the solution. OK, so this turns into a PDE-constrained optimization problem. And the problem is that the operators for the good kinds of optimization algorithms for solving it weren't things you thought to code when you were just writing the PDE solver for the simulation — being able to build your Lagrangian, take adjoints, and so on. This appears in a lot of important kinds of simulations, and there's always this tension between the people who come up with the good optimization algorithms saying "just use some of them" and the people saying "well, we never implemented that, so let's use some other optimization algorithm that works with the code we've got in hand."
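A generic inverse-source problem of this flavor — my notation; the specific nonlinearity and regularization here are illustrative stand-ins, not necessarily what was on the slide — looks like:

```latex
% Inverse-source model problem (illustrative notation):
\min_{u,\,\alpha}\;\;
  \tfrac12\,\|u - u^{\ast}\|_{L^2(\Omega)}^{2}
  \;+\; \tfrac{\gamma}{2}\,\|\alpha\|_{L^2(\Omega)}^{2}
\quad \text{subject to} \quad
-\Delta u + u^{3} = \alpha \ \ \text{in } \Omega,
\qquad u = 0 \ \ \text{on } \partial\Omega.
```

The good algorithms want derivatives of the Lagrangian of this system with respect to u, α, and the multiplier — exactly the operators a plain forward solver never needed.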
So these algorithms all go under the name of embedded, or intrusive, algorithms. You have an engineering analysis calculation you want to do — maybe you've got the form of the differential equations and they're coded up — but other things you sometimes need are these adjacent operators, as I'll call them. Take Newton's method: once upon a time, people just coded the residual evaluation and did successive approximations, and then people said "we can do Newton's method," and that required people to go do more code development to get the Jacobians. If you have these physics-based preconditioners, those need other operators that weren't in the original code; derivative information for sensitivity and optimization algorithms, likewise. These are all things you would like to do for your engineering analysis, but the simulator doesn't provide the code for them. — Oops. I'll just jump to the end. OK.

So, mathematically: we're going to do all of our calculations with finite elements, so we work with weak forms of PDEs. You do the standard thing — multiply by a test function, integrate by parts — and if you look at your PDE, there's a common flavor. This is Poisson's equation (written out below), which is useful in all areas and not terribly useful in any of them, because it's a model for everything, right? But even if you go to solid mechanics or fluid mechanics problems, there's a wide range of problems that all have the form: an integral over some pieces of the domain, where the piece of the domain may change, of some derivatives of some test and trial functions. So this structure is all over the place. For example, if you write down the Bénard convection operator, it's a little more complicated, but it has the same structure: integrals over pieces of the domain of products of functions and their derivatives.

Mathematically, one of the beautiful things about finite elements is how amenable they are to the kinds of stability and error estimates that prove the methods are going to work — if you can ever get them to run on the computer. Using these piecewise polynomial spaces, the differential equation becomes a nonlinear algebraic equation. The thing that set people up to do the kinds of estimates they liked is that you inherit all of the theory you develop in PDEs and functional analysis: with finite elements, you apply the real weak operator, but on the "wrong" space, meaning these piecewise polynomials; with finite differences, you take the right functions and do the wrong thing to them — you replace derivatives with difference quotients. It turns out you can think of the basic abstract error estimates — in fact, all error estimates for finite elements — as backward error estimates, but I digress. So it's very easy to use this functional-analytic structure, or at least it's very well understood how to use this top-level structure, to say: yes, if only we can compute the solution, it will be a good one.
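To pin down the prototype I keep pointing at, here is the weak form of Poisson's equation:

```latex
% Weak form of Poisson's equation: multiply -\Delta u = f by a test
% function v and integrate by parts.
\text{Find } u \in V \ \text{such that} \quad
\int_{\Omega} \nabla u \cdot \nabla v \; dx
  \;=\; \int_{\Omega} f\, v \; dx
\qquad \text{for all } v \in V.
```

Everything that follows — Stokes, convection-diffusion, Bénard convection — keeps this shape: integrals over pieces of the domain of products of coefficients with test and trial functions and their derivatives.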
But when you go to doing actual calculations that are going to be meaningful, that work in higher dimensions, there are all these things you have to tie together. You have to be able to represent your possibly unstructured tetrahedral meshes; you may have different kinds of basis functions for different approximation schemes; you have to deal with numerical integration rules; and all of these give you matrices, or maybe nonlinear operators that map your vectors to vectors. And then, to do a lot of meaningful calculations for some problem areas, you have to be able to do all of this on a great big distributed parallel computer. And there are just so many kinds of operators that you want to have. So what you don't want is to write a library that says "here's my Poisson operator, here's my Stokes operator," because operators are like integers: there's one after every one. You give people a set of operators and they'll think of some extra stabilization term or some extra physical process that maybe you didn't write down — and besides, you don't want to write all those things by hand anyway. To code one variational form, you have to tie together all of this information, so there's a lot of mnemonic overhead for the programmer, and coding all of them is very hard.

This was named Kirby's recursive conundrum by my colleague Kevin Long. Here's what we noticed. The early generation of numerical analysts in the forties and fifties were thinking: we want to do these important simulations — weather, nuclear reactions, whatever — and the amount of arithmetic to do all of this is huge (and huge meant a couple thousand operations or something), so let's build a computer so that we can automate all of these tedious and error-prone tasks. OK. Then we discovered the next fact: programming a computer is a tedious and error-prone task. So there's a recursive resolution to the problem: let's just get a computer to do that too. OK. So that's our goal.

So this brings me to the idea of tying together all of this domain-specific knowledge about finite elements and being able to deploy all of these different algorithms. What we'd like are computer programs that use the mathematical structure that's there to reason about, manipulate, and maybe even generate new numerical algorithms. I'm not the only one pursuing this idea, and Sundance is not the only project: I also collaborate on the FEniCS code, which is another high-level finite element code with some different implementation technologies. And outside the PDE world, the FLAME project is an example of using formal methods to derive all of the different loop variants and algorithmic variants in dense linear algebra, figure out whether they're correct and whether they'll terminate, and find a good way of mapping them onto the architecture — so they're using high-level structure to derive linear algebra. Markus Püschel has done similar things in signal processing: he can take an abstract representation of a signal transform, go through some very interesting mathematics where he derives low-complexity algorithms that look like FFTs for different signal transforms, and then generate very low-level code that is highly optimized for a particular architecture. So all the way from a problem description: derive an algorithm and map it onto hardware — and it's had a lot of success.
So there are two basic parts to this, which tie together two of my big interests: one of them is math and one is computer science. In terms of the computer science, there are some well-worn paths, such as domain-specific languages — these are well-known ideas that we need to push into new domains. However, I've found that a lot of the time the mathematical structure we need in order to automate the methods is different, in subtle and not-so-subtle ways, from the kind of structure it takes to analyze the methods — the kind you learn in a mathematical finite element class.

So here's the structure. This is going to look like abstract math, but let's see what emerges when we look at a bunch of different examples. (Is there a clock somewhere? OK, just tell me when I've got about ten minutes left, or five.) The thing is, after you go through the weak form of the PDE, you get a variational form. Here u and v may themselves be tuples of functions, and G(v, u) is a sum over subdomains — they could be subdomains, they could be boundaries, they could be the whole domain itself — of integrals of some function g of the test function and a set of its derivatives, and the unknown function and a set of its derivatives, possibly with some explicit spatial dependence.

In this structure we will assume a few things. One: any derivative we feel like taking of this is going to exist — so think formally for the moment. Two: G is linear in v and any of its derivatives, because that's what appears when you multiply by a test function and integrate by parts; the test functions only appear linearly, although G can be as nonlinear as you like in the unknown functions.

For the discretization, you express u as a linear combination of some basis functions in terms of some unknown coefficients, and anything in the test space can be written as a linear combination of basis functions, maybe in some slightly different space. So then the discrete method is the same as the continuous method restricted to the subspace. You have a solution to this method if and only if certain derivatives of this functional G happen to be zero: remember that G is linear in the test function, so to say G(v, u) = 0 for all v is the same as to say that it holds on a basis for the test space — that's what this is saying. So now you have a very compact representation of the nonlinear system of equations, and you can do what you like with it. What do you do when you're given nonlinear equations? Newton's method. If you write down Newton's method for this, what you have to do is compute the Jacobian matrix: you take each equation and differentiate it with respect to each of the variables. Here is the Jacobian acting on a vector — you get the second derivative of G with respect to the coefficients of the test and unknown functions, acting on your Newton update — and here's your right-hand side, the residual evaluated at the point you're linearizing about. OK, so now if we write out the more compact notation I had on that slide, you'll see a few things. This is just sticking grimly to the definitions.
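In symbols — my reconstruction of the slide's notation — the structure and the Newton system look like:

```latex
% Abstract variational structure (my reconstruction of the notation):
G(v, u) \;=\; \sum_{k} \int_{\Omega_k}
   g_k\!\left(x;\; v, \nabla v, \dots;\; u, \nabla u, \dots\right) dx,
\qquad G(v, u) = 0 \ \ \text{for all } v.
% Discretize u = \sum_j u_j \psi_j and test against a basis \{\phi_i\}:
% the nonlinear system is F_i(u) := G(\phi_i, u) = 0, and Newton's method is
\sum_{j} \frac{\partial^{2} G}{\partial v_i\,\partial u_j}\,\Delta u_j
   \;=\; -\,G(\phi_i, u).
```

Linearity in v is what lets "for all v" collapse to "on a basis," and formal differentiability is what makes the Newton system meaningful.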
So you get these sums over the alphas and betas — that's the chain rule. When you push the chain rule through, you differentiate g with respect to v and its derivatives, and then the derivatives of v with respect to the coefficients; differentiating with respect to the coefficients actually pops out basis functions. Something similar happens for the Jacobian — it's really just the chain rule again. But if you look at this, a few things stand out: you've got these linear combinations, but all you ever have are integrals over domains; the functional derivative of g is just some other function; so you've got some function times some derivatives of some basis functions, or here, a function times derivatives of two basis functions. This pattern is going to emerge for lots of different kinds of applications.

Similarly with sensitivities. If you take the equation that defines a solution and differentiate it through with respect to p — this is the chain rule for the p-derivative — then you get your algebraic system by differentiating with respect to the test functions. This here is going to be the right-hand side of an equation; here are the unknowns, the derivatives of u with respect to the parameter; and here's a bunch of stuff that is a sum of integrals of some function times some derivatives of basis functions. OK, so at a high level, this is exactly the same structure we had for Newton's method — a Jacobian acting on a vector equals some vector — and what goes into building all of these operators is the same in that case also (I'll write it out below).

Also, optimization. You have some cost function you're trying to minimize, and some constraint which is the weak form of a PDE; you've got a few more arguments in here, but that's OK. For the optimality conditions, you form the Lagrangian, take the derivative with respect to everything in sight, and set it equal to zero. You can push through these derivatives with respect to u, p, and lambda and get some operators I'll show in a minute. In the example I showed earlier, you're trying to minimize the mismatch plus a regularization, the PDE is the constraint, and the Lagrangian is just built around G: here is the mismatch, here the regularization, and here the weak form of the constraint equation. OK, and if you look at the optimality conditions, here's the matrix of all these derivatives, and each of the blocks has the form of a sum of integrals of a derivative of some function times products of derivatives of basis functions. OK, again: the same structure for all of these calculations.

Punchline — I've said it on every slide, but I'll say it again. Once you have all the basic building blocks — if you can represent your meshes, your basis functions, your integration rules, and so on — all you need is a tool that can represent these G's and Lagrangians: a language that represents these coefficient functions and enables you to take the derivatives of the functions. Then you have what you need for this code. Now, we use a technique, automatic differentiation, in order to compute these derivatives.
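For instance, the sensitivity system in the same notation — again my reconstruction — makes the "same structure" point explicit:

```latex
% Sensitivities: differentiate G(\phi_i, u(p), p) = 0 through by p
% (chain rule):
\sum_{j} \frac{\partial^{2} G}{\partial v_i\,\partial u_j}\,
   \frac{\partial u_j}{\partial p}
 \;=\; -\,\frac{\partial^{2} G}{\partial v_i\,\partial p}.
% Same Jacobian as in Newton's method; only the right-hand side differs.
```

So a tool that can assemble the Newton system gets the sensitivity system nearly for free.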
So we have within Sundance a library that, basically with operator overloading, builds an abstract syntax tree representing all of the kinds of operators we want, and then we use automatic differentiation on that tree. It's worth pointing out that we actually couldn't — don't and couldn't — use some of the off-the-shelf AD tools that are out there; we had to add a few extra wrinkles. The earliest automatic differentiation implementations were trying to patch up legacy code: you have a legacy code, say for residual evaluation, and you want to run Newton's method, so you have to write something that parses the legacy code, does AD on the intermediate representation, and then generates back out new Fortran that calculates the Jacobian or its action on a vector. These are very well-done tools; they worked great, from an engineering standpoint, for the problem people were after — ADIC and ADIFOR are examples. More recently, people said: if you're writing new code and don't already have a legacy code, you can make your code AD-aware by doing operator overloading. For example, the Sacado project in Trilinos does this: you write your code templated over the type of scalar you're computing with, and then you can use doubles, or some special type that keeps track of derivatives along with it. But you can still only get the derivatives of things you already have — you still have to write the low-level code in order to extract its derivatives.

So in Sundance: first of all, we don't have any legacy code, and we don't want to write low-level code for every operator we might want — that's exactly what we're trying to avoid. So we build a high-level abstract syntax tree for our new domain-specific embedded language, and then we derive our new operators through automatic differentiation. An interesting wrinkle of this approach is that differentiation is fully embedded in the system — it's not an afterthought. In fact, if you're solving a linear equation, like the Poisson equation, we even use automatic differentiation for that: at least psychologically, you take your weak form, you differentiate it with respect to test and trial functions, and out pops the stiffness matrix. OK, so the stiffness matrix pops out from differentiation even for linear operators.

We have a paper on this in the galley-proofs stage with Kevin Long, so it should appear next year, I hope, given how these things run; I can give you the preprint — it's on my web page. It's a high-level C++ library where we can represent variational forms; underneath the hood they get differentiated. So the user specifies the problem, and then at run time, when you say "go build my stiffness matrix," Sundance loops over this expression tree and builds the finite element operators — and it's actually very fast; we don't pay a big penalty for this. Sundance lives within the Trilinos project, so we have access to all of the scalable solvers — sparse direct, Krylov methods, algebraic multigrid, everything — so we don't have to develop our own solvers; we can just use somebody else's.
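To make the operator-overloading mechanism concrete, here is a toy, self-contained C++ sketch. This is emphatically not Sundance's implementation — Sundance also handles test and unknown functions, spatial derivatives, and careful common-subexpression reuse — but it shows the core trick: overloaded operators build an expression tree, and a recursive walk differentiates it.

```cpp
// Toy expression tree built by operator overloading, with symbolic
// differentiation -- an illustration of the idea, not Sundance's classes.
#include <iostream>
#include <memory>
#include <string>

struct Node {
    virtual ~Node() = default;
    virtual double eval(double x) const = 0;
    virtual std::shared_ptr<Node> diff() const = 0;   // d/dx
    virtual std::string str() const = 0;
};
using Expr = std::shared_ptr<Node>;

struct Const : Node {
    double c;
    explicit Const(double c) : c(c) {}
    double eval(double) const override { return c; }
    Expr diff() const override { return std::make_shared<Const>(0.0); }
    std::string str() const override { return std::to_string(c); }
};

struct Var : Node {
    double eval(double x) const override { return x; }
    Expr diff() const override { return std::make_shared<Const>(1.0); }
    std::string str() const override { return "x"; }
};

struct Add : Node {
    Expr a, b;
    Add(Expr a, Expr b) : a(a), b(b) {}
    double eval(double x) const override { return a->eval(x) + b->eval(x); }
    Expr diff() const override {
        return std::make_shared<Add>(a->diff(), b->diff());
    }
    std::string str() const override { return "(" + a->str() + " + " + b->str() + ")"; }
};

struct Mul : Node {
    Expr a, b;
    Mul(Expr a, Expr b) : a(a), b(b) {}
    double eval(double x) const override { return a->eval(x) * b->eval(x); }
    Expr diff() const override {                      // product rule
        return std::make_shared<Add>(std::make_shared<Mul>(a->diff(), b),
                                     std::make_shared<Mul>(a, b->diff()));
    }
    std::string str() const override { return "(" + a->str() + " * " + b->str() + ")"; }
};

Expr operator+(Expr a, Expr b) { return std::make_shared<Add>(a, b); }
Expr operator*(Expr a, Expr b) { return std::make_shared<Mul>(a, b); }

int main() {
    Expr x = std::make_shared<Var>();
    Expr f = x * x + std::make_shared<Const>(3.0) * x;  // f(x) = x^2 + 3x
    Expr df = f->diff();                                // differentiate the tree once
    std::cout << "f'(x) = " << df->str() << "\n";
    std::cout << "f'(2) = " << df->eval(2.0) << "\n";   // prints 7
}
```

In Sundance, the analogous tree represents a whole variational form, and differentiating with respect to test and trial functions is what produces the residual and stiffness operators.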
As I said, in addition to the practical details with the existing AD tools, we have a couple of extra wrinkles in Sundance: we have to differentiate with respect to arguments that are themselves functions, and I'm not aware of any AD tools that do that. So our data flow has to take that into account, and we have to keep track of whether a derivative was taken with respect to a function or a spatial argument. We've also done some optimizations to figure out when something doesn't vary spatially, and we have some specialized rules for that. There's a paper in progress on the data flow that we need to finish up.

What we have is an interpreter — you can think about it that way. However, a specialized interpreter is, in a sense, a compiler. There is string output: when you make an error, it prints out a string representation of the variational form, which is great for debugging, and once you've got that, you've almost got a compiler. There are some tradeoffs here. When Kevin first started developing this, he was constrained to run on some machines at Sandia that didn't allow dynamic linking — if at run time you generate code and try to dynamically link it back in, that wouldn't work on some of the machines he had to work on, and from a systems standpoint it's a lot more complicated. With the interpreter, you write one source code, link against the library, and it all just runs. So there's a lot of opportunity in code generation that has not been explored, but the interpreter approach is actually very practical without it.

So yes, it's a very basic interpreter, but the one thing it does very well is this: you walk over the tree once, and then you amortize that walk over hundreds of cells — you process a lot of cells in the mesh for that one read — and that sets up the chance to do most of the arithmetic with level-3 BLAS. Then the floating-point computation dominates anything we're doing with the data structures. In that sense: don't do more work than you have to — which is also why we haven't gotten around to exploring code generation.

So here's an example — I've got some other examples I can show people — this one is out of our paper, where we set up and form the Lagrangian for the PDE-constrained optimization problem I showed you. u has been defined to be an unknown function; uStar has been defined to be some discrete function holding tabulated data we know. So here's an L2 norm of the misfit; here's a regularization of the right-hand side; this one we define to use a little later on. Here's your constraint — this looks like a weak form of the PDE, right? — that defines our constraint; here are some boundary conditions; set up a functional; and then there are a few more lines that say: go build a nonlinear solver and feed this functional to it.
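Schematically, the setup I just walked through looks something like the following. This is a sketch in the spirit of the published Sundance examples — the class names and signatures here are from memory and may not match the actual library, and it assumes the mesh, cell filters (interior), quadrature rule (quad), data function (uStar), and regularization constant (gamma) were built earlier:

```cpp
// Schematic Sundance-style Lagrangian for the inverse problem
// (names from memory; treat as pseudocode, not the verbatim API).
Expr u      = new UnknownFunction(new Lagrange(1), "u");       // state
Expr lambda = new UnknownFunction(new Lagrange(1), "lambda");  // multiplier
Expr alpha  = new UnknownFunction(new Lagrange(1), "alpha");   // unknown source
Expr grad   = gradient(2);                                     // 2-D gradient

Expr fit = Integral(interior, 0.5*pow(u - uStar, 2.0), quad);  // L2 misfit
Expr reg = Integral(interior, 0.5*gamma*alpha*alpha, quad);    // regularization

// Weak form of the PDE constraint, tested against the multiplier:
Expr constraint = Integral(interior,
                           (grad*lambda)*(grad*u) + lambda*(u*u*u - alpha),
                           quad);

Expr lagrangian = fit + reg + constraint;
// ...a few more lines build a nonlinear solver and hand it `lagrangian`;
// Sundance differentiates the tree to produce the full KKT system.
```

The point is that the user writes only the functional; every operator the optimizer needs falls out of differentiation.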
Underneath the hood, besides this ability to interpret, there's one more thing that came up: we have a discrete function evaluator that has to know how to walk over the syntax tree and evaluate a coefficient and its derivatives at the quadrature points. Call that thing W. So we've talked to our symbolic engine and figured out what W is at all the quadrature points of our problem, and then we have things that look like integrals, over some domain, of some coefficient times derivatives of test functions. There are a few ways of handling this. We differentiate between the cases where W is constant and not constant, because there's a nice speedup we can get when W is constant. So we just have element integration routines that handle stuff of this form, with some extra wrinkles for boundary integrals as opposed to cell integrals. This is all fairly standard stuff that you'd have to do in any kind of finite element code — except that if you set up the data right and you're careful with it, you can do most of it with level-3 BLAS, and that is where our performance wins in matrix assembly come from (there's a sketch of this below). So the point is: if you can do a calculation, do it many times. Words to live by.

OK, so here are just a couple of slides. It's important, when you're thinking about these kinds of codes, that parallelism should not be an afterthought, because there are a lot of things that can mess you up if you write a good serial code and then try to parallelize it. So Sundance was designed from the beginning with parallelism in mind. This is just the Poisson equation with weak scaling — constant work per processor, a constant number of cells per processor — and this is exactly what you expect: flat assembly times. So this means that our degree-of-freedom orderings, our mesh, and our communication with Trilinos are not getting in the way of anything. And in fact, Poisson is exactly the right problem to run here, because there's almost no work to do on each cell to build the element matrices; what actually dominates is inserting the entries into the global stiffness matrix. So if we had a problem with insertion or communication, it would show up on Poisson before it would show up on Navier-Stokes.

Down here — the graph on the slide isn't really clear — I took the FEniCS tools' offline-generated, compiled code for Stokes using Taylor-Hood elements and just compared: push the button and use the already-generated, compiled FEniCS code, versus letting Sundance interpret the expression and build the matrix. In both cases everything gets written into an Epetra matrix, so that's normalized. And so the thing is, Sundance is running about an order of magnitude faster than FEniCS, even with the interpreter overhead built into the code. Sorry — OK, so down here, these are the numbers of cells — triangles in one direction, I think; it's been a while since I did the calculation — so these are hundreds of triangles in one direction, which puts the total cell counts up in the thousands and beyond, on one processor.
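Since those performance wins come from casting element integration as matrix-matrix products, here's a schematic of the idea; a plain triple loop stands in for the actual BLAS (dgemm) call, and the data layout is my own illustration rather than Sundance's internal format:

```cpp
// Element integrals as a matrix-matrix product: with nq quadrature points
// and nb basis functions per cell,
//   K[i][j] = sum_q  w[q] * B[q][i] * B[q][j],   i.e.  K = B^T diag(w) B,
// where w[q] bundles the quadrature weight, Jacobian determinant, and the
// evaluated coefficient W.  Batching many cells turns this into level-3 BLAS.
#include <vector>

void elementIntegrals(int nq, int nb,
                      const std::vector<double>& B,   // nq x nb, row-major
                      const std::vector<double>& w,   // nq combined weights
                      std::vector<double>& K)         // nb x nb, row-major
{
    std::vector<double> WB(nq * nb);
    for (int q = 0; q < nq; ++q)          // scale each quadrature row by w[q]
        for (int j = 0; j < nb; ++j)
            WB[q*nb + j] = w[q] * B[q*nb + j];

    for (int i = 0; i < nb; ++i)          // K = B^T * WB  (the dgemm step)
        for (int j = 0; j < nb; ++j) {
            double s = 0.0;
            for (int q = 0; q < nq; ++q)
                s += B[q*nb + i] * WB[q*nb + j];
            K[i*nb + j] = s;
        }
}
```

Amortizing one walk of the expression tree over many cells is what makes the arithmetic here dominate the interpreter overhead.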
One of the things that's hard about this kind of work, from a practical standpoint, is that you want to be able to take all of these different pieces — meshes, differentiation, solvers — and tie them together. On the other hand, once that framework is in place, you've got a lot of traffic to route — driving through some of the roundabouts around Emory yesterday, I got lost — so hopefully things will turn out better on this slide. You can think about a tool like Sundance this way: somebody may be coming in from a discretization direction wanting to make a contribution to people doing applications, so they come in around here. You have traffic coming in and out from lots of different directions, and you want it to flow smoothly: people doing solvers, people doing discretizations, people trying to do different applications — you'd like them all to be able to use the different pieces of this and help each other out.

So I've got about fifteen minutes left; let me talk some about some very, very recent things we have done, where we tried to use Sundance and the Trilinos library for doing some more advanced preconditioning and solver techniques. One of the things that is nice inside Sundance is a module called TSF, which stands for Trilinos Solver Framework. It's been in there a long time, and it's not been terribly widely adopted outside of Sundance, but it does a couple of things really well. One of them is a class called the linear solver builder, which looks at an XML file and then goes off and builds a linear solver. Now, this doesn't sound like much, but the great thing about Trilinos is that it's got all of these different packages for iterative methods, for preconditioners, for direct methods — and at the same time, that's also a drawback: if you want to be able to switch among all of these different codes, you don't want to have to learn a new API each time. So what we have is something that is aware of lots of different pieces within Trilinos and dispatches on them based on the XML: you can read in tolerances, the preconditioner, et cetera, feed it to a class, and it's a factory that goes off and instantiates the right piece of Trilinos.

On top of all of the iterative methods and linear solvers you have access to, we have an abstract operator class, and then an algebra defined on this operator class, where the algebra is done with deferred evaluation. That's really important in sparse parallel computing: unless you really have to, you don't want to do sparse matrix-matrix multiplication — that's complicated. Similarly, you don't ever want to actually form the inverse of a sparse matrix, in parallel or serial for that matter. But notationally, it's very convenient if you can think of the product of two operators as an operator, and if you can think of the inverse of an operator as an operator. OK, this is very natural for writing down a lot of systems. The other thing that's nice, when you're doing Navier-Stokes and other kinds of systems, is if you can think of your operators as consisting of several different blocks — like the viscous block or the pressure-gradient block. So we also have the ability to form new operators as blocks of operators, acting on partitioned vectors.

And as I said, all of this in here is done with deferred evaluation, so the sum of two operators does not go off and add the matrix entries together; it gives you an object that, when you multiply it onto a vector, does the two matrix-vector products and adds the results. OK, so that's what I mean by deferred evaluation. So you can see it's a very nice set of classes for being able — "compressible"? My goodness, no one caught that. So I compressed it, if you will.
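A toy version of the deferred-evaluation operator algebra — again, these are not the actual TSF classes, and the inverse here is a placeholder where a real implementation would call an inner Krylov solve:

```cpp
// Toy deferred-evaluation operator algebra (illustrative, not TSF's API).
// A+B and A*B build lightweight wrappers; all work happens in apply(),
// via matrix-vector products -- no matrix-matrix product, no explicit inverse.
#include <memory>
#include <vector>

using Vec = std::vector<double>;

struct LinearOp {
    virtual ~LinearOp() = default;
    virtual Vec apply(const Vec& x) const = 0;
};
using OpPtr = std::shared_ptr<const LinearOp>;

struct SumOp : LinearOp {          // (A + B) x  =  A x + B x
    OpPtr A, B;
    SumOp(OpPtr A, OpPtr B) : A(A), B(B) {}
    Vec apply(const Vec& x) const override {
        Vec y = A->apply(x), z = B->apply(x);
        for (std::size_t i = 0; i < y.size(); ++i) y[i] += z[i];
        return y;
    }
};

struct ProductOp : LinearOp {      // (A * B) x  =  A (B x)
    OpPtr A, B;
    ProductOp(OpPtr A, OpPtr B) : A(A), B(B) {}
    Vec apply(const Vec& x) const override { return A->apply(B->apply(x)); }
};

struct InverseOp : LinearOp {      // A^{-1} b: solve A y = b when applied
    OpPtr A;
    explicit InverseOp(OpPtr A) : A(A) {}
    Vec apply(const Vec& b) const override {
        // placeholder: a real version would hand (A, b) to a Krylov solver
        return b;
    }
};

OpPtr operator+(OpPtr A, OpPtr B) { return std::make_shared<SumOp>(A, B); }
OpPtr operator*(OpPtr A, OpPtr B) { return std::make_shared<ProductOp>(A, B); }
OpPtr inverse(OpPtr A)            { return std::make_shared<InverseOp>(A); }
```

Block operators work the same way: a 2x2 block operator stores four OpPtrs and applies them to the pieces of a partitioned vector — exactly the notation you want for writing down Schur complements.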
So anyway — I'm just turning into an engineer here — if we just look at the incompressible Navier-Stokes equations, you linearize, et cetera, and you wind up with a system of equations that has this kind of block structure; the blocks are going to vary depending on how you linearize and such. Then you maybe have some kind of continuation loop outside of some nonlinear solver loop, and inside that you've got GMRES or something. OK. And then we'd like to study different approaches to solving this.

Well, classically there are some things you could use. One is a block-diagonal preconditioner: you just take the velocity block here, and then you take a pressure mass matrix in that (2,2) block — I'll use the Fortran indexing. So this is nice: you can prove that it gives you mesh-independent bounds. The system itself is a bounded map from H1 cross L2 into the dual, and inverting this preconditioner actually takes you back from the dual into H1 cross L2, so it gets rid of the mesh dependence. On the other hand, the preconditioned system is indefinite — it has eigenvalues with positive and negative real parts — and that's a problem for some methods; like most of them. And it does have this bottleneck that you have to invert the velocity block; that's a common problem that a lot of these methods have, though in a preconditioner you can often get away with doing it somewhat inexactly.

Some other preconditioners people have looked at have a block-triangular structure: they have the same F block, X is now the Schur complement of the matrix system, and you have that B-transpose there. If you write out the inverse — what you actually have to multiply by to apply the preconditioner — it has this nice factored structure. Now you have to invert F, and you have to invert the Schur complement. That's not so bad, because we know from the theory that the Schur complement operator is a nice bounded linear map from L2 to L2; that means when you discretize, you get mesh-independent condition numbers. But to make this method even more practical, people embed an approximation to the Schur complement. The one that I have coded up is called PCD, for pressure convection-diffusion; it goes back to Kay and Loghin, and lots of people have also worked on it. It is based on a mass matrix for the pressure space, a convection-diffusion operator — but on the pressure rather than the velocity space — and then an inverse Laplacian, also only on the pressure space. So you swap out solving the Schur complement for doing one Laplacian solve, which is very easy, plus two matrix-vector products.

One interesting thing about the block-triangular preconditioners is that they don't drive the matrix to the identity; they drive it to block unit triangular form, where you have identities on the diagonal, and so the eigenvalues are perfectly clustered around one if you apply the Schur complement exactly. And so for these you get much more definite preconditioned systems, so GMRES is going to work much better on them.
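In standard notation — following the Kay-Loghin line of work; sign and ordering conventions vary across references — the linearized system and the PCD approximation are:

```latex
% Linearized Navier-Stokes block system and the PCD Schur-complement
% approximation (conventions vary between references):
\begin{pmatrix} F & B^{T} \\ B & 0 \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f \\ g \end{pmatrix},
\qquad
S = B\,F^{-1}B^{T},
\qquad
S^{-1} \;\approx\; M_p^{-1}\, F_p\, A_p^{-1},
% with M_p the pressure mass matrix, F_p a convection-diffusion operator
% on the pressure space, and A_p a pressure Laplacian: one Laplacian
% solve plus cheap applications replaces an exact Schur complement solve.
```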
Now, it may strike you as odd, but an alternative that has not been explored very much in the literature, for various reasons, is: what if you just bang on the Schur complement itself with GMRES? Now, what did I just say? The Schur complement is a bounded linear operator from L2 to L2; it has a mesh-independent condition number, so we should converge in a mesh-independent number of iterations. And what's more: to avoid solving the Schur complement problem inside a block preconditioner, I created a preconditioner for the Schur complement problem — to avoid solving the Schur complement problem. So we have preconditioners that actually work on the Schur complement. So we can bang on it with GMRES, and then we can precondition it with one of the Schur complement preconditioners we invented for the block solver. And what's more, we still have this bottleneck where we have to invert F, the convection-diffusion term for the velocity space; we're looking at throwing things like recycling Krylov methods at that, and we've started doing a little bit of work on inexact solves, but we're not ready to say much on that yet. We just wanted to see how far, with good software and some good higher-level mathematical theory, we can push on just doing the Schur complement.

Some other interesting things about the Schur complement approach. If you're doing GMRES on the block system, there are two things that make GMRES expensive: one is the matrix-vector product; the other is that you have to store a bunch of Krylov vectors. If you're doing the Schur complement, your vectors live in the pressure space only, whereas if you're doing GMRES on the fully coupled block system, your Krylov vectors each have a pressure and two velocities — or three velocities in 3D. So your memory, and your cost of updating the Hessenberg matrix in GMRES, are much lower if you're acting on the Schur complement, and there are actually also fewer matrix-vector products that you have to apply in the process of doing this. The mesh-independence theory is actually trivial — or at least it reduces to a known result in a trivial way, where the known result is nontrivial. The drawback is that you can't get away with as much slop in the F solves that you have to do, so there are some tradeoffs in here.

So, as an aside — I see the time, so let me go through this somewhat quickly — I looked at a simple reaction-diffusion system to study the mathematics of these Schur complements. OK, so this is a model reaction-diffusion problem; if you linearized and time-stepped something, it would be mathematically very similar. So let's look at the Schur complement here: you eliminate one of the unknowns in favor of the other, and you get a Laplacian plus an inverse Laplacian. OK, we know how to invert the Laplacian — that's a solved problem. So now, what if we were to precondition this with the Laplacian, since the Laplacian is something we know how to solve? You get an identity minus — well, at the PDE level — an inverse biharmonic operator. We're not actually going to run a biharmonic solver, but at the operator level this is very nice: if you're in a situation where you have regularity theory, this is a smoothing operator, a highly compact operator, so in fact your preconditioned system has the form of a symmetric compact perturbation of the identity. GMRES — even conjugate gradients — will work on this like gangbusters. In fact, I coded it up, and it took one or two outer GMRES iterations; if I make alpha and beta very large, I can push it up to three — where one iteration of GMRES requires two Poisson solves. So using this kind of high-level PDE structure, we were able to reason about the problem at the operator level.
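One plausible reconstruction of that calculation — my notation; the model on the slide may differ in signs and constants — is:

```latex
% Model reaction-diffusion system (my reconstruction):
-\Delta u + \alpha\, w = f, \qquad -\Delta w + \beta\, u = g.
% Eliminating w gives the Schur complement acting on u:
S \;=\; -\Delta \;-\; \alpha\beta\,(-\Delta)^{-1},
% and preconditioning with a Laplacian solve gives
(-\Delta)^{-1} S \;=\; I \;-\; \alpha\beta\,(-\Delta)^{-2},
% the identity minus an inverse-biharmonic: a compact, smoothing
% perturbation, so Krylov methods converge extremely fast.
```

Each application of the preconditioned operator costs two Poisson solves, which matches the per-iteration cost mentioned above.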
And we saw that the Schur complement for this problem was actually going to give us some very good results, and using TSF and our block operators, we were able to code it up pretty quickly.

Now, coming back to Navier-Stokes: there the Schur complement is not compact, but it is bounded, and bounded is good. So here I ran four sets of examples. Think of it as a relaxed Newton method if you don't know what pseudo-transient continuation is; otherwise, if you know what psi-TC is, it's psi-TC. So I coded up pseudo-transient continuation for doing the nonlinear solve, and then at each step I've got a Jacobian-like matrix I need to invert; I take it apart, form the Schur complement, and bang on it. So here is GMRES: thirty to forty iterations, and it does not change much if I go from a thirty-two by thirty-two mesh to sixty-four by sixty-four. It is as mesh-independent as you get in practice — it's not even slowly tailing upward. And I've got a reasonable Reynolds number on here — it's one hundred — so it's not trivial either. Sorry — this is the driven cavity; you've got to start somewhere.

So here I turned on recycling GMRES — a solver called GCRO-DR; I went and wrote a hookup in TSF so that I can use it. When you do GMRES, when you restart or when you solve a sequence of linear systems, you can reuse some of the Krylov vectors that you have formed to somehow accelerate or improve your convergence later on. So I am recycling ten Krylov vectors in here, and it's saving me a few iterations here and there — on some of these later steps I'm actually saving a fair bit — so there may be something to explore here.

And then I took GMRES and hooked up the pressure convection-diffusion preconditioner — the PCD preconditioner — here. So you want to compare this column to this column. The first PCD solve buys me nothing. Why? If you start from an initial guess of zero, your first solve is a Stokes solve, and for Stokes the PCD operator degenerates into a mass matrix; it just doesn't help. But then, after you do one iteration, you have a nontrivial velocity, and then PCD is actually helping: you're saving about a factor of two or so in the number of outer iterations. So essentially, for every step you save here, you've gotten rid of one of those nasty F solves, and that's the thing we're trying to minimize. And then, if you turn on recycling together with pressure convection-diffusion, you save a little more. It's not clear whether this is worth doing, but it was a change of one line in an XML file, so it's completely worth trying, and there may be some parameter tuning to do on this, to figure out the optimal way of using the recycling solver.

Now, this code is in progress — I was hoping to have it done before I got here, but it's not quite ready yet. If you take the Schur complement of the Bénard convection problem that I started with, you have a temperature convection-diffusion operator, minus some things that are mass matrices, composed with a Navier-Stokes solve. But we know how to solve a Navier-Stokes problem — all right, well, sort of. So what if you precondition this operator with your temperature convection-diffusion operator, which you also know how to do? It's just a scalar elliptic problem.
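Schematically — this is my reconstruction, with hypothetical names N for the linearized Navier-Stokes block, F_T for the temperature convection-diffusion block, and C_1, C_2 for the mass-matrix-like coupling blocks — the object in question is:

```latex
% Schematic Schur complement for Bénard convection (my reconstruction):
% eliminate the flow unknowns from the coupled linearized system
\begin{pmatrix} N & C_1 \\ C_2 & F_T \end{pmatrix}
\quad\Longrightarrow\quad
S_T \;=\; F_T \;-\; C_2\, N^{-1} C_1,
% a temperature convection-diffusion operator minus couplings composed
% with a Navier-Stokes solve; precondition with F_T^{-1}.
```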
That scalar problem is a little ugly, but you can feed it to algebraic multigrid or something — even better if you have something convection-aware for a multigrid method. And this, too, at least at the PDE level and with some regularity assumptions, has the structure of a compact perturbation of the identity. These coupling terms are at least bounded; they may be compact. For the linearized Navier-Stokes equations, you have elliptic regularity — where the linearized problem has an elliptic shift theorem, you get a shift theorem here — and then compactness of the embedding says this is a compact perturbation of the identity. So the question in practice, before you dive off and try to get some really delicate estimates on the smallest eigenvalues of that, is: do you just want to code it up and see whether it's worth doing the analysis? The code is in progress. If this converges quickly, it's a very good convection solver, because if you can do, say, five iterations on this, that says you can solve something coupled tightly and nonlinearly to Navier-Stokes at a small constant multiple of the cost of a Navier-Stokes solve. So mathematically, these are the kinds of results we're trying to go toward: how can we use all of these pieces and all of these solvers together, to tackle the problems we know how to solve, coupled to other problems we know how to solve?

It looks like I'm about out of time — right on the dot — so, conclusions. The theme in this kind of numerical computing is that we're trying to leverage high-level structure, high-level descriptions, and then reason about that structure to produce very efficient code that you don't have to program by hand. Notice I'm not saying "rapid prototyping": Sundance is high-performance code, but it's also very easy, once you've learned the system, to get stuff running. So I prefer the term "rapid deployment" to "rapid prototyping," because rapid prototyping suggests that the idea works and now I have to go spend months of my time writing the real code. You don't have to write the real code — it is the real code. I also hope I showed, in this last part of the talk, that we have something of a framework where we can take different ideas from numerical analysis and hopefully get them all to work together. So with that, I will stop here.