So he's doing some really interesting work, really pushing the boundaries of how we write down and solve these problems. So, without further ado.

Thanks for the invitation, Rich. Thank you. So this is a snapshot, a culmination of some work that I've been doing over the last many years, six or seven probably. I'm not the only one doing it, but let me give some acknowledgments, give you an outline, and then move on.

There are a lot of problems that motivate the work I'm doing, where you would like to write down a high-level description of a problem and then get extra information from the problem beyond just the forward solution. I'll talk about some motivating applications, and then some of the very basic framework for finite elements. The way we attack the problem is that we look for structure in the problem. There's a kind of structure here that is not the kind used to prove stability or error estimates, but it's there and it's worth exploiting: it's a kind of generality, a way of seeing how all of the different methods are alike. And once you see how they're all alike, you can write a computer code that takes care of all of them. I'll give some examples showing that, even though we're not actually generating code right now, our code is very fast just interpreting the description of the problem and running with it. I also work some with Vicki Howle, who is also at Texas Tech and has done a lot of work on block-structured preconditioning for Navier-Stokes. At the end of the day, you don't want just to build your big systems of algebraic equations; you want to do some extra calculations with them, maybe some block-structured preconditioning. So at the end I'll talk some, just as an example, about how we can use the Sundance project not only to write down our models and feed them to a black-box solver, but to do more advanced kinds of solver algorithms with it — and hopefully this will be an inspiration for people thinking about what they might be able to do with it.

This work has been supported over the years by DOE — we've got some current funding through SciDAC — and Kevin Long and I have also done some related work through Sandia National Labs. The people working most closely with me on this right now are Kevin Long and colleagues at Sandia; the preconditioning work is with Vicki Howle; Jeff Dillon is a Ph.D. student who is just starting to do research on some of the solver techniques I'm looking at; and we've also had some interactions with Mike Parks at Sandia on novel methods that I'll say a little about in the talk.

OK, so let me go through a few classical or important sorts of calculations you might want to do, all of which have a wrinkle that makes the problem interesting from a computational and software standpoint as well as from a scientific or engineering one. This is a problem from incompressible flow, a Bénard convection problem. You've got your incompressible Navier-Stokes equations here, and then a convection-diffusion equation that moves heat around the domain, so the velocity and the temperature are nonlinearly coupled. And the temperature itself creates a buoyant body force, where we've made the Boussinesq approximation here.
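For reference, a standard way to write the Bénard convection system under the Boussinesq approximation — my notation and nondimensionalization, which may differ from the slide's — is:

```latex
% Bénard convection under the Boussinesq approximation (one standard
% nondimensionalization; the constants on the slide may differ):
\begin{align*}
-\nu\,\Delta \mathbf{u} + (\mathbf{u}\cdot\nabla)\mathbf{u} + \nabla p
   &= \mathrm{Ra}\; T\,\mathbf{e}_{y}, \\
\nabla\cdot\mathbf{u} &= 0, \\
-\kappa\,\Delta T + \mathbf{u}\cdot\nabla T &= 0.
\end{align*}
```

The buoyancy term on the right of the momentum equation is where the temperature feeds back into the flow.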
So these are interesting to solve because it's a pretty tight nonlinear coupling, and it's relatively large: you've got four nonlinear PDEs that you want to discretize and solve. So this is an example of a problem where you want to write down your problem, you want to get your Jacobian, and you might need to do some block-structured work with the equations to get a good solver for them.

More generally, preconditioning. Your linear system is Ax = b, and a preconditioner is something that, psychologically at least, you pre-multiply the system through by, and that's supposed to make your method converge faster. The catch is that for a lot of problems, the good preconditioners require operators that you didn't think to program at the beginning. So if you coded up convection-diffusion, maybe with a variable diffusion coefficient, and the preconditioner wants a constant-coefficient diffusion operator, you've got to go write a little more code. Some of the block preconditioners we have for Navier-Stokes require extra operators that you didn't program, and if those operators take a couple of weeks or a couple of months to get running, that's a big bottleneck to using these good preconditioning methods. There are lots of methods with this flavor. I wrote a paper that appeared this year tying together, at a very abstract level, some of these preconditioners based on differential operators with Riesz representation, and it was a lot of fun — I got to review my functional analysis and learn a lot about why the methods work. So from a software standpoint, the important question is: what if we don't have the code for the preconditioners?

Another problem we might want to tackle is this — this is a one-dimensional problem, but problems in higher dimensions have a similar flavor. I've got some differential equation where maybe my forcing term depends on some parameter p. After you do the calculation, the question is: how important is that parameter p? How sensitive is the calculation I just did to this parameter? If I tweak it, is the answer robust, or am I in an interesting regime where I need to make sure I know exactly what that parameter is? So you want to do the calculation and then back out a derivative, and you might not natively have code that is capable of doing that sort of calculation.

OK, another problem. Suppose you've got a PDE solver for some fabricated academic nonlinear elliptic differential equation, and now suppose I want to let my right-hand side alpha be unknown, and I want to determine the right-hand side so that I can match some experimental or otherwise known values of the solution. OK, so this turns into a PDE-constrained optimization problem. And the problem is that the operators for the good kinds of optimization algorithms for solving it weren't things you thought to code when you were just writing the PDE solver for the simulation — being able to build your Lagrangian, take adjoints, and so on. This appears in a lot of important kinds of simulations, and there's always this tension between the people who come up with the good optimization algorithms saying "just use some of them" and the people saying "well, we never implemented that, so let's use some other optimization algorithm that works with the code we've got in hand."
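A generic inverse-source problem of this flavor — my notation; the specific nonlinearity and regularization here are illustrative stand-ins, not necessarily what was on the slide — looks like:

```latex
% Inverse-source model problem (illustrative notation):
\min_{u,\,\alpha}\;\;
  \tfrac12\,\|u - u^{\ast}\|_{L^2(\Omega)}^{2}
  \;+\; \tfrac{\gamma}{2}\,\|\alpha\|_{L^2(\Omega)}^{2}
\quad \text{subject to} \quad
-\Delta u + u^{3} = \alpha \ \ \text{in } \Omega,
\qquad u = 0 \ \ \text{on } \partial\Omega.
```

The good algorithms want derivatives of the Lagrangian of this system with respect to u, α, and the multiplier — exactly the operators a plain forward solver never needed.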
So these algorithms all go under the name of embedded, or intrusive, algorithms. You have an engineering analysis calculation you want to do — maybe you've got the form of the differential equations and they're coded up — but other things you sometimes need are these adjacent operators, as I'll call them. Take Newton's method: once upon a time, people just coded the residual evaluation and did successive approximations, and then people said "we can do Newton's method," and that required people to go do more code development to get the Jacobians. If you have these physics-based preconditioners, those need other operators that weren't in the original code; derivative information for sensitivity and optimization algorithms, likewise. These are all things you would like to do for your engineering analysis, but the simulator doesn't provide the code for them. — Oops. I'll just jump to the end. OK.

So, mathematically: we're going to do all of our calculations with finite elements, so we work with weak forms of PDEs. You do the standard thing — multiply by a test function, integrate by parts — and if you look at your PDE, there's a common flavor. This is Poisson's equation (written out below), which is useful in all areas and not terribly useful in any of them, because it's a model for everything, right? But even if you go to solid mechanics or fluid mechanics problems, there's a wide range of problems that all have the form: an integral over some pieces of the domain, where the piece of the domain may change, of some derivatives of some test and trial functions. So this structure is all over the place. For example, if you write down the Bénard convection operator, it's a little more complicated, but it has the same structure: integrals over pieces of the domain of products of functions and their derivatives.

Mathematically, one of the beautiful things about finite elements is how amenable they are to the kinds of stability and error estimates that prove the methods are going to work — if you can ever get them to run on the computer. Using these piecewise polynomial spaces, the differential equation becomes a nonlinear algebraic equation. The thing that set people up to do the kinds of estimates they liked is that you inherit all of the theory you develop in PDEs and functional analysis: with finite elements, you apply the real weak operator, but on the "wrong" space, meaning these piecewise polynomials; with finite differences, you take the right functions and do the wrong thing to them — you replace derivatives with difference quotients. It turns out you can think of the basic abstract error estimates — in fact, all error estimates for finite elements — as backward error estimates, but I digress. So it's very easy to use this functional-analytic structure, or at least it's very well understood how to use this top-level structure, to say: yes, if only we can compute the solution, it will be a good one.
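To pin down the prototype I keep pointing at, here is the weak form of Poisson's equation:

```latex
% Weak form of Poisson's equation: multiply -\Delta u = f by a test
% function v and integrate by parts.
\text{Find } u \in V \ \text{such that} \quad
\int_{\Omega} \nabla u \cdot \nabla v \; dx
  \;=\; \int_{\Omega} f\, v \; dx
\qquad \text{for all } v \in V.
```

Everything that follows — Stokes, convection-diffusion, Bénard convection — keeps this shape: integrals over pieces of the domain of products of coefficients with test and trial functions and their derivatives.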
But when you go to doing actual calculations that are going to be meaningful, that work in higher dimensions, there are all these things you have to tie together. You have to be able to represent your possibly unstructured tetrahedral meshes; you may have different kinds of basis functions for different approximation schemes; you have to deal with numerical integration rules; and all of these give you matrices, or maybe nonlinear operators that map your vectors to vectors. And then, to do a lot of meaningful calculations for some problem areas, you have to be able to do all of this on a great big distributed parallel computer. And there are just so many kinds of operators that you want to have. So what you don't want is to write a library that says "here's my Poisson operator, here's my Stokes operator," because operators are like integers: there's one after every one. You give people a set of operators and they'll think of some extra stabilization term or some extra physical process that maybe you didn't write down — and besides, you don't want to write all those things by hand anyway. To code one variational form, you have to tie together all of this information, so there's a lot of mnemonic overhead for the programmer, and coding all of them is very hard.

This was named Kirby's recursive conundrum by my colleague Kevin Long. Here's what we noticed. The early generation of numerical analysts in the forties and fifties were thinking: we want to do these important simulations — weather, nuclear reactions, whatever — and the amount of arithmetic to do all of this is huge (and huge meant a couple thousand operations or something), so let's build a computer so that we can automate all of these tedious and error-prone tasks. OK. Then we discovered the next fact: programming a computer is a tedious and error-prone task. So there's a recursive resolution to the problem: let's just get a computer to do that too. OK. So that's our goal.

So this brings me to the idea of tying together all of this domain-specific knowledge about finite elements and being able to deploy all of these different algorithms. What we'd like are computer programs that use the mathematical structure that's there to reason about, manipulate, and maybe even generate new numerical algorithms. I'm not the only one pursuing this idea, and Sundance is not the only project: I also collaborate on the FEniCS code, which is another high-level finite element code with some different implementation technologies. And outside the PDE world, the FLAME project is an example of using formal methods to derive all of the different loop variants and algorithmic variants in dense linear algebra, figure out whether they're correct and whether they'll terminate, and find a good way of mapping them onto the architecture — so they're using high-level structure to derive linear algebra. Markus Püschel has done similar things in signal processing: he can take an abstract representation of a signal transform, go through some very interesting mathematics where he derives low-complexity algorithms that look like FFTs for different signal transforms, and then generate very low-level code that is highly optimized for a particular architecture. So all the way from a problem description: derive an algorithm and map it onto hardware — and it's had a lot of success.
So there are two basic parts to this, which tie together two of my big interests: one of them is math and one is computer science. In terms of the computer science, there are some well-worn paths, such as domain-specific languages — these are well-known ideas that we need to push into new domains. However, I've found that a lot of the time the mathematical structure we need in order to automate the methods is different, in subtle and not-so-subtle ways, from the kind of structure it takes to analyze the methods — the kind you learn in a mathematical finite element class.

So here's the structure. This is going to look like abstract math, but let's see what emerges when we look at a bunch of different examples. (Is there a clock somewhere? OK, just tell me when I've got about ten minutes left, or five.) The thing is, after you go through the weak form of the PDE, you get a variational form. Here u and v may themselves be tuples of functions, and G(v, u) is a sum over subdomains — they could be subdomains, they could be boundaries, they could be the whole domain itself — of integrals of some function g of the test function and a set of its derivatives, and the unknown function and a set of its derivatives, possibly with some explicit spatial dependence.

In this structure we will assume a few things. One: any derivative we feel like taking of this is going to exist — so think formally for the moment. Two: G is linear in v and any of its derivatives, because that's what appears when you multiply by a test function and integrate by parts; the test functions only appear linearly, although G can be as nonlinear as you like in the unknown functions.

For the discretization, you express u as a linear combination of some basis functions in terms of some unknown coefficients, and anything in the test space can be written as a linear combination of basis functions, maybe in some slightly different space. So then the discrete method is the same as the continuous method restricted to the subspace. You have a solution to this method if and only if certain derivatives of this functional G happen to be zero: remember that G is linear in the test function, so to say G(v, u) = 0 for all v is the same as to say that it holds on a basis for the test space — that's what this is saying. So now you have a very compact representation of the nonlinear system of equations, and you can do what you like with it. What do you do when you're given nonlinear equations? Newton's method. If you write down Newton's method for this, what you have to do is compute the Jacobian matrix: you take each equation and differentiate it with respect to each of the variables. Here is the Jacobian acting on a vector — you get the second derivative of G with respect to the coefficients of the test and unknown functions, acting on your Newton update — and here's your right-hand side, the residual evaluated at the point you're linearizing about. OK, so now if we write out the more compact notation I had on that slide, you'll see a few things. This is just sticking grimly to the definitions.
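In symbols — my reconstruction of the slide's notation — the structure and the Newton system look like:

```latex
% Abstract variational structure (my reconstruction of the notation):
G(v, u) \;=\; \sum_{k} \int_{\Omega_k}
   g_k\!\left(x;\; v, \nabla v, \dots;\; u, \nabla u, \dots\right) dx,
\qquad G(v, u) = 0 \ \ \text{for all } v.
% Discretize u = \sum_j u_j \psi_j and test against a basis \{\phi_i\}:
% the nonlinear system is F_i(u) := G(\phi_i, u) = 0, and Newton's method is
\sum_{j} \frac{\partial^{2} G}{\partial v_i\,\partial u_j}\,\Delta u_j
   \;=\; -\,G(\phi_i, u).
```

Linearity in v is what lets "for all v" collapse to "on a basis," and formal differentiability is what makes the Newton system meaningful.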
So you get these sums over the alphas and betas — that's the chain rule. When you push the chain rule through, you differentiate g with respect to v and its derivatives, and then the derivatives of v with respect to the coefficients; differentiating with respect to the coefficients actually pops out basis functions. Something similar happens for the Jacobian — it's really just the chain rule again. But if you look at this, a few things stand out: you've got these linear combinations, but all you ever have are integrals over domains; the functional derivative of g is just some other function; so you've got some function times some derivatives of some basis functions, or here, a function times derivatives of two basis functions. This pattern is going to emerge for lots of different kinds of applications.

Similarly with sensitivities. If you take the equation that defines a solution and differentiate it through with respect to p — this is the chain rule for the p-derivative — then you get your algebraic system by differentiating with respect to the test functions. This here is going to be the right-hand side of an equation; here are the unknowns, the derivatives of u with respect to the parameter; and here's a bunch of stuff that is a sum of integrals of some function times some derivatives of basis functions. OK, so at a high level, this is exactly the same structure we had for Newton's method — a Jacobian acting on a vector equals some vector — and what goes into building all of these operators is the same in that case also (I'll write it out below).

Also, optimization. You have some cost function you're trying to minimize, and some constraint which is the weak form of a PDE; you've got a few more arguments in here, but that's OK. For the optimality conditions, you form the Lagrangian, take the derivative with respect to everything in sight, and set it equal to zero. You can push through these derivatives with respect to u, p, and lambda and get some operators I'll show in a minute. In the example I showed earlier, you're trying to minimize the mismatch plus a regularization, the PDE is the constraint, and the Lagrangian is just built around G: here is the mismatch, here the regularization, and here the weak form of the constraint equation. OK, and if you look at the optimality conditions, here's the matrix of all these derivatives, and each of the blocks has the form of a sum of integrals of a derivative of some function times products of derivatives of basis functions. OK, again: the same structure for all of these calculations.

Punchline — I've said it on every slide, but I'll say it again. Once you have all the basic building blocks — if you can represent your meshes, your basis functions, your integration rules, and so on — all you need is a tool that can represent these G's and Lagrangians: a language that represents these coefficient functions and enables you to take the derivatives of the functions. Then you have what you need for this code. Now, we use a technique, automatic differentiation, in order to compute these derivatives.
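For instance, the sensitivity system in the same notation — again my reconstruction — makes the "same structure" point explicit:

```latex
% Sensitivities: differentiate G(\phi_i, u(p), p) = 0 through by p
% (chain rule):
\sum_{j} \frac{\partial^{2} G}{\partial v_i\,\partial u_j}\,
   \frac{\partial u_j}{\partial p}
 \;=\; -\,\frac{\partial^{2} G}{\partial v_i\,\partial p}.
% Same Jacobian as in Newton's method; only the right-hand side differs.
```

So a tool that can assemble the Newton system gets the sensitivity system nearly for free.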
So we have within Sundance a library that, basically with operator overloading, builds an abstract syntax tree representing all of the kinds of operators we want, and then we use automatic differentiation on that tree. It's worth pointing out that we actually couldn't — don't and couldn't — use some of the off-the-shelf AD tools that are out there; we had to add a few extra wrinkles. The earliest automatic differentiation implementations were trying to patch up legacy code: you have a legacy code, say for residual evaluation, and you want to run Newton's method, so you have to write something that parses the legacy code, does AD on the intermediate representation, and then generates back out new Fortran that calculates the Jacobian or its action on a vector. These are very well-done tools; they worked great, from an engineering standpoint, for the problem people were after — ADIC and ADIFOR are examples. More recently, people said: if you're writing new code and don't already have a legacy code, you can make your code AD-aware by doing operator overloading. For example, the Sacado project in Trilinos does this: you write your code templated over the type of scalar you're computing with, and then you can use doubles, or some special type that keeps track of derivatives along with it. But you can still only get the derivatives of things you already have — you still have to write the low-level code in order to extract its derivatives.

So in Sundance: first of all, we don't have any legacy code, and we don't want to write low-level code for every operator we might want — that's exactly what we're trying to avoid. So we build a high-level abstract syntax tree for our new domain-specific embedded language, and then we derive our new operators through automatic differentiation. An interesting wrinkle of this approach is that differentiation is fully embedded in the system — it's not an afterthought. In fact, if you're solving a linear equation, like the Poisson equation, we even use automatic differentiation for that: at least psychologically, you take your weak form, you differentiate it with respect to test and trial functions, and out pops the stiffness matrix. OK, so the stiffness matrix pops out from differentiation even for linear operators.

We have a paper on this in the galley-proofs stage with Kevin Long, so it should appear next year, I hope, given how these things run; I can give you the preprint — it's on my web page. It's a high-level C++ library where we can represent variational forms; underneath the hood they get differentiated. So the user specifies the problem, and then at run time, when you say "go build my stiffness matrix," Sundance loops over this expression tree and builds the finite element operators — and it's actually very fast; we don't pay a big penalty for this. Sundance lives within the Trilinos project, so we have access to all of the scalable solvers — sparse direct, Krylov methods, algebraic multigrid, everything — so we don't have to develop our own solvers; we can just use somebody else's.
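To make the operator-overloading mechanism concrete, here is a toy, self-contained C++ sketch. This is emphatically not Sundance's implementation — Sundance also handles test and unknown functions, spatial derivatives, and careful common-subexpression reuse — but it shows the core trick: overloaded operators build an expression tree, and a recursive walk differentiates it.

```cpp
// Toy expression tree built by operator overloading, with symbolic
// differentiation -- an illustration of the idea, not Sundance's classes.
#include <iostream>
#include <memory>
#include <string>

struct Node {
    virtual ~Node() = default;
    virtual double eval(double x) const = 0;
    virtual std::shared_ptr<Node> diff() const = 0;   // d/dx
    virtual std::string str() const = 0;
};
using Expr = std::shared_ptr<Node>;

struct Const : Node {
    double c;
    explicit Const(double c) : c(c) {}
    double eval(double) const override { return c; }
    Expr diff() const override { return std::make_shared<Const>(0.0); }
    std::string str() const override { return std::to_string(c); }
};

struct Var : Node {
    double eval(double x) const override { return x; }
    Expr diff() const override { return std::make_shared<Const>(1.0); }
    std::string str() const override { return "x"; }
};

struct Add : Node {
    Expr a, b;
    Add(Expr a, Expr b) : a(a), b(b) {}
    double eval(double x) const override { return a->eval(x) + b->eval(x); }
    Expr diff() const override {
        return std::make_shared<Add>(a->diff(), b->diff());
    }
    std::string str() const override { return "(" + a->str() + " + " + b->str() + ")"; }
};

struct Mul : Node {
    Expr a, b;
    Mul(Expr a, Expr b) : a(a), b(b) {}
    double eval(double x) const override { return a->eval(x) * b->eval(x); }
    Expr diff() const override {                      // product rule
        return std::make_shared<Add>(std::make_shared<Mul>(a->diff(), b),
                                     std::make_shared<Mul>(a, b->diff()));
    }
    std::string str() const override { return "(" + a->str() + " * " + b->str() + ")"; }
};

Expr operator+(Expr a, Expr b) { return std::make_shared<Add>(a, b); }
Expr operator*(Expr a, Expr b) { return std::make_shared<Mul>(a, b); }

int main() {
    Expr x = std::make_shared<Var>();
    Expr f = x * x + std::make_shared<Const>(3.0) * x;  // f(x) = x^2 + 3x
    Expr df = f->diff();                                // differentiate the tree once
    std::cout << "f'(x) = " << df->str() << "\n";
    std::cout << "f'(2) = " << df->eval(2.0) << "\n";   // prints 7
}
```

In Sundance, the analogous tree represents a whole variational form, and differentiating with respect to test and trial functions is what produces the residual and stiffness operators.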
As I said, in addition to the practical details with the existing AD tools, we have a couple of extra wrinkles in Sundance: we have to differentiate with respect to arguments that are themselves functions, and I'm not aware of any AD tools that do that. So our data flow has to take that into account, and we have to keep track of whether a derivative was taken with respect to a function or a spatial argument. We've also done some optimizations to figure out when something doesn't vary spatially, and we have some specialized rules for that. There's a paper in progress on the data flow that we need to finish up.

What we have is an interpreter — you can think about it that way. However, a specialized interpreter is, in a sense, a compiler. There is string output: when you make an error, it prints out a string representation of the variational form, which is great for debugging, and once you've got that, you've almost got a compiler. There are some tradeoffs here. When Kevin first started developing this, he was constrained to run on some machines at Sandia that didn't allow dynamic linking — if at run time you generate code and try to dynamically link it back in, that wouldn't work on some of the machines he had to work on, and from a systems standpoint it's a lot more complicated. With the interpreter, you write one source code, link against the library, and it all just runs. So there's a lot of opportunity in code generation that has not been explored, but the interpreter approach is actually very practical without it.

So yes, it's a very basic interpreter, but the one thing it does very well is this: you walk over the tree once, and then you amortize that walk over hundreds of cells — you process a lot of cells in the mesh for that one read — and that sets up the chance to do most of the arithmetic with level-3 BLAS. Then the floating-point computation dominates anything we're doing with the data structures. In that sense: don't do more work than you have to — which is also why we haven't gotten around to exploring code generation.

So here's an example — I've got some other examples I can show people — this one is out of our paper, where we set up and form the Lagrangian for the PDE-constrained optimization problem I showed you. u has been defined to be an unknown function; uStar has been defined to be some discrete function holding tabulated data we know. So here's an L2 norm of the misfit; here's a regularization of the right-hand side; this one we define to use a little later on. Here's your constraint — this looks like a weak form of the PDE, right? — that defines our constraint; here are some boundary conditions; set up a functional; and then there are a few more lines that say: go build a nonlinear solver and feed this functional to it.
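Schematically, the setup I just walked through looks something like the following. This is a sketch in the spirit of the published Sundance examples — the class names and signatures here are from memory and may not match the actual library, and it assumes the mesh, cell filters (interior), quadrature rule (quad), data function (uStar), and regularization constant (gamma) were built earlier:

```cpp
// Schematic Sundance-style Lagrangian for the inverse problem
// (names from memory; treat as pseudocode, not the verbatim API).
Expr u      = new UnknownFunction(new Lagrange(1), "u");       // state
Expr lambda = new UnknownFunction(new Lagrange(1), "lambda");  // multiplier
Expr alpha  = new UnknownFunction(new Lagrange(1), "alpha");   // unknown source
Expr grad   = gradient(2);                                     // 2-D gradient

Expr fit = Integral(interior, 0.5*pow(u - uStar, 2.0), quad);  // L2 misfit
Expr reg = Integral(interior, 0.5*gamma*alpha*alpha, quad);    // regularization

// Weak form of the PDE constraint, tested against the multiplier:
Expr constraint = Integral(interior,
                           (grad*lambda)*(grad*u) + lambda*(u*u*u - alpha),
                           quad);

Expr lagrangian = fit + reg + constraint;
// ...a few more lines build a nonlinear solver and hand it `lagrangian`;
// Sundance differentiates the tree to produce the full KKT system.
```

The point is that the user writes only the functional; every operator the optimizer needs falls out of differentiation.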
Underneath the hood, besides this ability to interpret, there's one more thing that came up: we have a discrete function evaluator that has to know how to walk over the syntax tree and evaluate a coefficient and its derivatives at the quadrature points. Call that thing W. So we've talked to our symbolic engine and figured out what W is at all the quadrature points of our problem, and then we have things that look like integrals, over some domain, of some coefficient times derivatives of test functions. There are a few ways of handling this. We differentiate between the cases where W is constant and not constant, because there's a nice speedup we can get when W is constant. So we just have element integration routines that handle stuff of this form, with some extra wrinkles for boundary integrals as opposed to cell integrals. This is all fairly standard stuff that you'd have to do in any kind of finite element code — except that if you set up the data right and you're careful with it, you can do most of it with level-3 BLAS, and that is where our performance wins in matrix assembly come from (there's a sketch of this below). So the point is: if you can do a calculation, do it many times. Words to live by.

OK, so here are just a couple of slides. It's important, when you're thinking about these kinds of codes, that parallelism should not be an afterthought, because there are a lot of things that can mess you up if you write a good serial code and then try to parallelize it. So Sundance was designed from the beginning with parallelism in mind. This is just the Poisson equation with weak scaling — constant work per processor, a constant number of cells per processor — and this is exactly what you expect: flat assembly times. So this means that our degree-of-freedom orderings, our mesh, and our communication with Trilinos are not getting in the way of anything. And in fact, Poisson is exactly the right problem to run here, because there's almost no work to do on each cell to build the element matrices; what actually dominates is inserting the entries into the global stiffness matrix. So if we had a problem with insertion or communication, it would show up on Poisson before it would show up on Navier-Stokes.

Down here — the graph on the slide isn't really clear — I took the FEniCS tools' offline-generated, compiled code for Stokes using Taylor-Hood elements and just compared: push the button and use the already-generated, compiled FEniCS code, versus letting Sundance interpret the expression and build the matrix. In both cases everything gets written into an Epetra matrix, so that's normalized. And so the thing is, Sundance is running about an order of magnitude faster than FEniCS, even with the interpreter overhead built into the code. Sorry — OK, so down here, these are the numbers of cells — triangles in one direction, I think; it's been a while since I did the calculation — so these are hundreds of triangles in one direction, which puts the total cell counts up in the thousands and beyond, on one processor.
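Since those performance wins come from casting element integration as matrix-matrix products, here's a schematic of the idea; a plain triple loop stands in for the actual BLAS (dgemm) call, and the data layout is my own illustration rather than Sundance's internal format:

```cpp
// Element integrals as a matrix-matrix product: with nq quadrature points
// and nb basis functions per cell,
//   K[i][j] = sum_q  w[q] * B[q][i] * B[q][j],   i.e.  K = B^T diag(w) B,
// where w[q] bundles the quadrature weight, Jacobian determinant, and the
// evaluated coefficient W.  Batching many cells turns this into level-3 BLAS.
#include <vector>

void elementIntegrals(int nq, int nb,
                      const std::vector<double>& B,   // nq x nb, row-major
                      const std::vector<double>& w,   // nq combined weights
                      std::vector<double>& K)         // nb x nb, row-major
{
    std::vector<double> WB(nq * nb);
    for (int q = 0; q < nq; ++q)          // scale each quadrature row by w[q]
        for (int j = 0; j < nb; ++j)
            WB[q*nb + j] = w[q] * B[q*nb + j];

    for (int i = 0; i < nb; ++i)          // K = B^T * WB  (the dgemm step)
        for (int j = 0; j < nb; ++j) {
            double s = 0.0;
            for (int q = 0; q < nq; ++q)
                s += B[q*nb + i] * WB[q*nb + j];
            K[i*nb + j] = s;
        }
}
```

Amortizing one walk of the expression tree over many cells is what makes the arithmetic here dominate the interpreter overhead.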
One of the things that's hard about this kind of work, from a practical standpoint, is that you want to be able to take all of these different pieces — meshes, differentiation, solvers — and tie them together. On the other hand, once that framework is in place, you've got a lot of traffic to route — driving through some of the roundabouts around Emory yesterday, I got lost — so hopefully things will turn out better on this slide. You can think about a tool like Sundance this way: somebody may be coming in from a discretization direction wanting to make a contribution to people doing applications, so they come in around here. You have traffic coming in and out from lots of different directions, and you want it to flow smoothly: people doing solvers, people doing discretizations, people trying to do different applications — you'd like them all to be able to use the different pieces of this and help each other out.

So I've got about fifteen minutes left; let me talk some about some very, very recent things we have done, where we tried to use Sundance and the Trilinos library for doing some more advanced preconditioning and solver techniques. One of the things that is nice inside Sundance is a module called TSF, which stands for Trilinos Solver Framework. It's been in there a long time, and it's not been terribly widely adopted outside of Sundance, but it does a couple of things really well. One of them is a class called the linear solver builder, which looks at an XML file and then goes off and builds a linear solver. Now, this doesn't sound like much, but the great thing about Trilinos is that it's got all of these different packages for iterative methods, for preconditioners, for direct methods — and at the same time, that's also a drawback: if you want to be able to switch among all of these different codes, you don't want to have to learn a new API each time. So what we have is something that is aware of lots of different pieces within Trilinos and dispatches on them based on the XML: you can read in tolerances, the preconditioner, et cetera, feed it to a class, and it's a factory that goes off and instantiates the right piece of Trilinos.

On top of all of the iterative methods and linear solvers you have access to, we have an abstract operator class, and then an algebra defined on this operator class, where the algebra is done with deferred evaluation. That's really important in sparse parallel computing: unless you really have to, you don't want to do sparse matrix-matrix multiplication — that's complicated. Similarly, you don't ever want to actually form the inverse of a sparse matrix, in parallel or serial for that matter. But notationally, it's very convenient if you can think of the product of two operators as an operator, and if you can think of the inverse of an operator as an operator. OK, this is very natural for writing down a lot of systems. The other thing that's nice, when you're doing Navier-Stokes and other kinds of systems, is if you can think of your operators as consisting of several different blocks — like the viscous block or the pressure-gradient block. So we also have the ability to form new operators as blocks of operators, acting on partitioned vectors.

And as I said, all of this in here is done with deferred evaluation, so the sum of two operators does not go off and add the matrix entries together; it gives you an object that, when you multiply it onto a vector, does the two matrix-vector products and adds the results. OK, so that's what I mean by deferred evaluation. So you can see it's a very nice set of classes for being able — "compressible"? My goodness, no one caught that. So I compressed it, if you will.
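A toy version of the deferred-evaluation operator algebra — again, these are not the actual TSF classes, and the inverse here is a placeholder where a real implementation would call an inner Krylov solve:

```cpp
// Toy deferred-evaluation operator algebra (illustrative, not TSF's API).
// A+B and A*B build lightweight wrappers; all work happens in apply(),
// via matrix-vector products -- no matrix-matrix product, no explicit inverse.
#include <memory>
#include <vector>

using Vec = std::vector<double>;

struct LinearOp {
    virtual ~LinearOp() = default;
    virtual Vec apply(const Vec& x) const = 0;
};
using OpPtr = std::shared_ptr<const LinearOp>;

struct SumOp : LinearOp {          // (A + B) x  =  A x + B x
    OpPtr A, B;
    SumOp(OpPtr A, OpPtr B) : A(A), B(B) {}
    Vec apply(const Vec& x) const override {
        Vec y = A->apply(x), z = B->apply(x);
        for (std::size_t i = 0; i < y.size(); ++i) y[i] += z[i];
        return y;
    }
};

struct ProductOp : LinearOp {      // (A * B) x  =  A (B x)
    OpPtr A, B;
    ProductOp(OpPtr A, OpPtr B) : A(A), B(B) {}
    Vec apply(const Vec& x) const override { return A->apply(B->apply(x)); }
};

struct InverseOp : LinearOp {      // A^{-1} b: solve A y = b when applied
    OpPtr A;
    explicit InverseOp(OpPtr A) : A(A) {}
    Vec apply(const Vec& b) const override {
        // placeholder: a real version would hand (A, b) to a Krylov solver
        return b;
    }
};

OpPtr operator+(OpPtr A, OpPtr B) { return std::make_shared<SumOp>(A, B); }
OpPtr operator*(OpPtr A, OpPtr B) { return std::make_shared<ProductOp>(A, B); }
OpPtr inverse(OpPtr A)            { return std::make_shared<InverseOp>(A); }
```

Block operators work the same way: a 2x2 block operator stores four OpPtrs and applies them to the pieces of a partitioned vector — exactly the notation you want for writing down Schur complements.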
So anyway — I'm just turning into an engineer here — if we just look at the incompressible Navier-Stokes equations, you linearize, et cetera, and you wind up with a system of equations that has this kind of block structure; the blocks are going to vary depending on how you linearize and such. Then you maybe have some kind of continuation loop outside of some nonlinear solver loop, and inside that you've got GMRES or something. OK. And then we'd like to study different approaches to solving this.

Well, classically there are some things you could use. One is a block-diagonal preconditioner: you just take the velocity block here, and then you take a pressure mass matrix in that (2,2) block — I'll use the Fortran indexing. So this is nice: you can prove that it gives you mesh-independent bounds. The system itself is a bounded map from H1 cross L2 into the dual, and inverting this preconditioner actually takes you back from the dual into H1 cross L2, so it gets rid of the mesh dependence. On the other hand, the preconditioned system is indefinite — it has eigenvalues with positive and negative real parts — and that's a problem for some methods; like most of them. And it does have this bottleneck that you have to invert the velocity block; that's a common problem that a lot of these methods have, though in a preconditioner you can often get away with doing it somewhat inexactly.

Some other preconditioners people have looked at have a block-triangular structure: they have the same F block, X is now the Schur complement of the matrix system, and you have that B-transpose there. If you write out the inverse — what you actually have to multiply by to apply the preconditioner — it has this nice factored structure. Now you have to invert F, and you have to invert the Schur complement. That's not so bad, because we know from the theory that the Schur complement operator is a nice bounded linear map from L2 to L2; that means when you discretize, you get mesh-independent condition numbers. But to make this method even more practical, people embed an approximation to the Schur complement. The one that I have coded up is called PCD, for pressure convection-diffusion; it goes back to Kay and Loghin, and lots of people have also worked on it. It is based on a mass matrix for the pressure space, a convection-diffusion operator — but on the pressure rather than the velocity space — and then an inverse Laplacian, also only on the pressure space. So you swap out solving the Schur complement for doing one Laplacian solve, which is very easy, plus two matrix-vector products.

One interesting thing about the block-triangular preconditioners is that they don't drive the matrix to the identity; they drive it to block unit triangular form, where you have identities on the diagonal, and so the eigenvalues are perfectly clustered around one if you apply the Schur complement exactly. And so for these you get much more definite preconditioned systems, so GMRES is going to work much better on them.
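In standard notation — following the Kay-Loghin line of work; sign and ordering conventions vary across references — the linearized system and the PCD approximation are:

```latex
% Linearized Navier-Stokes block system and the PCD Schur-complement
% approximation (conventions vary between references):
\begin{pmatrix} F & B^{T} \\ B & 0 \end{pmatrix}
\begin{pmatrix} u \\ p \end{pmatrix}
=
\begin{pmatrix} f \\ g \end{pmatrix},
\qquad
S = B\,F^{-1}B^{T},
\qquad
S^{-1} \;\approx\; M_p^{-1}\, F_p\, A_p^{-1},
% with M_p the pressure mass matrix, F_p a convection-diffusion operator
% on the pressure space, and A_p a pressure Laplacian: one Laplacian
% solve plus cheap applications replaces an exact Schur complement solve.
```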
Now, it may strike you as odd, but an alternative that has not been explored very much in the literature, for various reasons, is: what if you just bang on the Schur complement itself with GMRES? Now, what did I just say? The Schur complement is a bounded linear operator from L2 to L2; it has a mesh-independent condition number, so we should converge in a mesh-independent number of iterations. And what's more: to avoid solving the Schur complement problem inside a block preconditioner, I created a preconditioner for the Schur complement problem — to avoid solving the Schur complement problem. So we have preconditioners that actually work on the Schur complement. So we can bang on it with GMRES, and then we can precondition it with one of the Schur complement preconditioners we invented for the block solver. And what's more, we still have this bottleneck where we have to invert F, the convection-diffusion term for the velocity space; we're looking at throwing things like recycling Krylov methods at that, and we've started doing a little bit of work on inexact solves, but we're not ready to say much on that yet. We just wanted to see how far, with good software and some good higher-level mathematical theory, we can push on just doing the Schur complement.

Some other interesting things about the Schur complement approach. If you're doing GMRES on the block system, there are two things that make GMRES expensive: one is the matrix-vector product; the other is that you have to store a bunch of Krylov vectors. If you're doing the Schur complement, your vectors live in the pressure space only, whereas if you're doing GMRES on the fully coupled block system, your Krylov vectors each have a pressure and two velocities — or three velocities in 3D. So your memory, and your cost of updating the Hessenberg matrix in GMRES, are much lower if you're acting on the Schur complement, and there are actually also fewer matrix-vector products that you have to apply in the process of doing this. The mesh-independence theory is actually trivial — or at least it reduces to a known result in a trivial way, where the known result is nontrivial. The drawback is that you can't get away with as much slop in the F solves that you have to do, so there are some tradeoffs in here.

So, as an aside — I see the time, so let me go through this somewhat quickly — I looked at a simple reaction-diffusion system to study the mathematics of these Schur complements. OK, so this is a model reaction-diffusion problem; if you linearized and time-stepped something, it would be mathematically very similar. So let's look at the Schur complement here: you eliminate one of the unknowns in favor of the other, and you get a Laplacian plus an inverse Laplacian. OK, we know how to invert the Laplacian — that's a solved problem. So now, what if we were to precondition this with the Laplacian, since the Laplacian is something we know how to solve? You get an identity minus — well, at the PDE level — an inverse biharmonic operator. We're not actually going to run a biharmonic solver, but at the operator level this is very nice: if you're in a situation where you have regularity theory, this is a smoothing operator, a highly compact operator, so in fact your preconditioned system has the form of a symmetric compact perturbation of the identity. GMRES — even conjugate gradients — will work on this like gangbusters. In fact, I coded it up, and it took one or two outer GMRES iterations; if I make alpha and beta very large, I can push it up to three — where one iteration of GMRES requires two Poisson solves. So using this kind of high-level PDE structure, we were able to reason about the problem at the operator level.
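One plausible reconstruction of that calculation — my notation; the model on the slide may differ in signs and constants — is:

```latex
% Model reaction-diffusion system (my reconstruction):
-\Delta u + \alpha\, w = f, \qquad -\Delta w + \beta\, u = g.
% Eliminating w gives the Schur complement acting on u:
S \;=\; -\Delta \;-\; \alpha\beta\,(-\Delta)^{-1},
% and preconditioning with a Laplacian solve gives
(-\Delta)^{-1} S \;=\; I \;-\; \alpha\beta\,(-\Delta)^{-2},
% the identity minus an inverse-biharmonic: a compact, smoothing
% perturbation, so Krylov methods converge extremely fast.
```

Each application of the preconditioned operator costs two Poisson solves, which matches the per-iteration cost mentioned above.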
And we saw that the Schur complement for this problem was actually going to give us some very good results, and using TSF and our block operators, we were able to code it up pretty quickly.

Now, coming back to Navier-Stokes: there the Schur complement is not compact, but it is bounded, and bounded is good. So here I ran four sets of examples. Think of it as a relaxed Newton method if you don't know what pseudo-transient continuation is; otherwise, if you know what psi-TC is, it's psi-TC. So I coded up pseudo-transient continuation for doing the nonlinear solve, and then at each step I've got a Jacobian-like matrix I need to invert; I take it apart, form the Schur complement, and bang on it. So here is GMRES: thirty to forty iterations, and it does not change much if I go from a thirty-two by thirty-two mesh to sixty-four by sixty-four. It is as mesh-independent as you get in practice — it's not even slowly tailing upward. And I've got a reasonable Reynolds number on here — it's one hundred — so it's not trivial either. Sorry — this is the driven cavity; you've got to start somewhere.

So here I turned on recycling GMRES — a solver called GCRO-DR; I went and wrote a hookup in TSF so that I can use it. When you do GMRES, when you restart or when you solve a sequence of linear systems, you can reuse some of the Krylov vectors that you have formed to somehow accelerate or improve your convergence later on. So I am recycling ten Krylov vectors in here, and it's saving me a few iterations here and there — on some of these later steps I'm actually saving a fair bit — so there may be something to explore here.

And then I took GMRES and hooked up the pressure convection-diffusion preconditioner — the PCD preconditioner — here. So you want to compare this column to this column. The first PCD solve buys me nothing. Why? If you start from an initial guess of zero, your first solve is a Stokes solve, and for Stokes the PCD operator degenerates into a mass matrix; it just doesn't help. But then, after you do one iteration, you have a nontrivial velocity, and then PCD is actually helping: you're saving about a factor of two or so in the number of outer iterations. So essentially, for every step you save here, you've gotten rid of one of those nasty F solves, and that's the thing we're trying to minimize. And then, if you turn on recycling together with pressure convection-diffusion, you save a little more. It's not clear whether this is worth doing, but it was a change of one line in an XML file, so it's completely worth trying, and there may be some parameter tuning to do on this, to figure out the optimal way of using the recycling solver.

Now, this code is in progress — I was hoping to have it done before I got here, but it's not quite ready yet. If you take the Schur complement of the Bénard convection problem that I started with, you have a temperature convection-diffusion operator, minus some things that are mass matrices, composed with a Navier-Stokes solve. But we know how to solve a Navier-Stokes problem — all right, well, sort of. So what if you precondition this operator with your temperature convection-diffusion operator, which you also know how to do? It's just a scalar elliptic problem.
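Schematically — this is my reconstruction, with hypothetical names N for the linearized Navier-Stokes block, F_T for the temperature convection-diffusion block, and C_1, C_2 for the mass-matrix-like coupling blocks — the object in question is:

```latex
% Schematic Schur complement for Bénard convection (my reconstruction):
% eliminate the flow unknowns from the coupled linearized system
\begin{pmatrix} N & C_1 \\ C_2 & F_T \end{pmatrix}
\quad\Longrightarrow\quad
S_T \;=\; F_T \;-\; C_2\, N^{-1} C_1,
% a temperature convection-diffusion operator minus couplings composed
% with a Navier-Stokes solve; precondition with F_T^{-1}.
```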
That scalar problem is a little ugly, but you can feed it to algebraic multigrid or something — even better if you have something convection-aware for a multigrid method. And this, too, at least at the PDE level and with some regularity assumptions, has the structure of a compact perturbation of the identity. These coupling terms are at least bounded; they may be compact. For the linearized Navier-Stokes equations, you have elliptic regularity — where the linearized problem has an elliptic shift theorem, you get a shift theorem here — and then compactness of the embedding says this is a compact perturbation of the identity. So the question in practice, before you dive off and try to get some really delicate estimates on the smallest eigenvalues of that, is: do you just want to code it up and see whether it's worth doing the analysis? The code is in progress. If this converges quickly, it's a very good convection solver, because if you can do, say, five iterations on this, that says you can solve something coupled tightly and nonlinearly to Navier-Stokes at a small constant multiple of the cost of a Navier-Stokes solve. So mathematically, these are the kinds of results we're trying to go toward: how can we use all of these pieces and all of these solvers together, to tackle the problems we know how to solve, coupled to other problems we know how to solve?

It looks like I'm about out of time — right on the dot — so, conclusions. The theme in this kind of numerical computing is that we're trying to leverage high-level structure, high-level descriptions, and then reason about that structure to produce very efficient code that you don't have to program by hand. Notice I'm not saying "rapid prototyping": Sundance is high-performance code, but it's also very easy, once you've learned the system, to get stuff running. So I prefer the term "rapid deployment" to "rapid prototyping," because rapid prototyping suggests that the idea works and now I have to go spend months of my time writing the real code. You don't have to write the real code — it is the real code. I also hope I showed, in this last part of the talk, that we have something of a framework where we can take different ideas from numerical analysis and hopefully get them all to work together. So with that, I will stop here.