[Host introduction, largely inaudible: the speaker recently joined Georgia Tech, with appointments spanning Earth and Planetary Sciences and Computational Science and Engineering, and has very diverse interests, working with lots of different technologies, machine learning among them.]

Thank you, thank you. Can everybody hear me? Always good to check. OK — well, thanks for giving me the opportunity to talk. This is not a very polished talk — there are results in here that are only a couple of weeks old — but I want to try to give you the gist of what we do. There is not an enormous amount of math, but there are certainly a lot of connections to applications, and I will once in a while make a remark to connect things. There's a lot of material, so we'll see how far I get. The key idea is that there have been — driven in part, I guess, by machine learning — a lot of recent developments in matrix factorizations that allow us to do things we could never hope to do before, in particular in the way we acquire data. In seismic exploration we use man-made sources to find out what the Earth looks like, and these techniques can have a major impact on how you conduct that business. Then, towards the end — if I get there — I will talk a little bit about a rather esoteric type of approach where we can now do things you could never hope to do on any computer on earth, because the problem is too big: it's an enormously lifted problem, but by being very clever with linear algebra you can actually do something. So hopefully by the end of this talk you will have an idea that randomized linear algebra, compressed sensing, and machine learning have a lot of tools to offer to make fundamental breakthroughs in applied fields such as seismic exploration. This work is not by myself; it's by a whole bunch of graduate students — some of them are in the audience — and some postdocs.

Just to give you an idea: we aspire to these scales, and certainly in the first part of the talk we will hit this order of scale; as I progress towards the end things are still at an exploratory stage, but we develop our algorithms with this sort of scale of problem in mind. In the field of seismic imaging, as I mentioned a little bit earlier, we have a source that sends energy into the earth, it reflects, and it's detected by a bunch of microphones, and from the echoes we get back from the earth we try to make an image of the subsurface. It's a very diverse field: it involves geophysics, but also signal processing and a lot of mathematics, and what makes it a particular challenge is that a typical image is a thousand cubed — a billion unknowns. That is a lot, so that already makes a big difference: it's not only big data, it's also big models. We typically collect two dimensions more data than the image — that's about ten to the power fifteen data points — and, for those of you who know about PDEs,
we propagate waves over long distances, which has major numerical issues of its own. So there are a lot of problems right there.

What we are really doing is using a wave equation. Think of m as the speed with which waves propagate in the earth; we want to obtain it as a gridded function of space from the data we collect at the surface, and that velocity determines what wavefields we get. We know the source, and we can relate the source to a wavefield through the wave equation: a discrete system of equations that discretizes the PDE describing how waves propagate. That system is determined by m — the sound speed, varying spatially, determines how the system behaves — and it's a big system: a billion by a billion for one frequency, if m has a billion grid points. A large system of equations right there. Then we observe the data at the surface, so we are basically trying to find an m that explains the observed data — because of course we do the experiments in the field to measure the response. To solve this sort of problem we solve a large minimization problem: we minimize an objective, parameterized by m, that measures the misfit between the observed data and the forward modelling F we do in a computer, which is parameterized by m. And to remind you: these things are expensive. Collecting a seismic survey sets you back somewhere between, say, thirty and two hundred million dollars; these are big exercises that take months and months with large crews — not something you do twice if you can avoid it.

Now, there are direct links: if you look at this objective, it looks very much like an objective you would minimize in machine learning — you just label things differently. The data d would be the training data, m the parameters of a network, F the network itself; and solving a PDE you can think of, conceptually, as a multilayer network — except we have ten thousand layers, and we don't do only local operations between the layers. That tells you how big these systems actually are: really large systems of equations we have to solve just to link the physical parameters of the earth to the observed data. And there are lots of other associations: the forward modelling you can think of as a generative convolutional network; we do something called the adjoint-state method — that's backpropagation; we do randomized source subsampling — that's sketching and stochastic optimization. So terms that are well known in the machine learning community also apply to our community; we just call them differently. The structure is the same, except that our variables are quite large — although you have to say that the hidden variables in a deep convolutional network are also large. But if you want to think of the scale of the problem: we're really trying to learn from 5K-plus video, so to speak — we're not working on small little images, we work on really, really large data volumes. So if you wanted to use deep learning in this field tomorrow, at scale, you would have a problem on your hands. That's the challenge, and that's why I'm thrilled to talk to this audience and to start working on this sort of problem.
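In symbols, the misfit objective described above can be written as follows. This is a reconstruction from the talk's description, not the speaker's slide; the symbols H, P, and q_i are assumed notation:

```latex
\min_{m}\ \sum_{i=1}^{n_s} \tfrac{1}{2}\,\bigl\| d_i - F(m;\,q_i) \bigr\|_2^2 ,
\qquad
F(m;\,q_i) = P\, H(m)^{-1} q_i ,
```

where H(m) is the discretized wave equation (on the order of 10^9 × 10^9 per frequency for a billion-point model m), q_i is the i-th source, P restricts the computed wavefield to the receiver locations, and d_i is the corresponding observed shot record. Reading d_i as training data, m as network weights, and F as a very deep network gives the machine-learning analogy drawn above.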
OK, so we run a large research program — I did a lot of work on optimization and compressed sensing. This slide is a bit outdated — it doesn't have machine learning on it yet, and it says Matlab; we moved to Julia in the meantime — but it gives you a little mind map of the different things we do in my group. We have to work with lots of different things to solve this problem; it's not a nicely isolated thing you can just work on with a small team — you need a large team of people touching all the boxes in that mind map. Just to give you an idea what the output looks like: this is a conceptual model of the earth, where the velocity changes as a function of space. This is just 2D, but of course we want to go to 3D, and what we typically produce are either somewhat blurred images or sharp, high-frequency images that basically delineate where the earth changes. The process that generates this from the data we call migration — I'll come back to that later. The idea is to get very high-resolution images, and for that you have to understand the physics of how waves interact with the earth, collect data, and try to figure out from the data what the earth looks like. So it's a large inverse problem, with a lot of HPC components, because these things are so big.

OK, so now let's go to the research topic for today. I will speak specifically about how we can exploit the low-rank structure that underlies the data. Keep in mind: we collect data in five dimensions — two source coordinates, two receiver coordinates, and time — whereas the image of the earth has three dimensions, so there must be some sort of inherent redundancy, and we're going to try to find ways to exploit it. And then, if I can, I want to talk about what we call extended image volumes, which are liftings of the problem — so if a billion variables are not enough for you: in that case the unknown is a billion-by-billion matrix, which you can never hope to form, but you can maybe work with actions on it. That's where I hope to get.

So why do we care? Well, because we would like to reduce acquisition time. Acquisition is expensive, and it has an environmental impact that you want to reduce — can we be more clever in how we acquire data and reduce cost? That is very important these days, because with the low oil price the industry is not in the best shape. But it holds for everything: data collection everywhere is expensive, at least if you do it with physical sensors. Then there's a massive amount of data, and that also makes it difficult to compute. So if you can find representations of your data that are enormously compressed and allow you to extract the portions you need without having to form the full volumes, that could have an enormous impact on how you process the data, because you take away a lot of the burdens related to I/O. And then: can you do things that you could never dream of doing unless you're clever with your linear algebra? That's where I'll get.
OK, so just to show you again a little bit of what's going on: this is a gridded velocity model of the subsurface — that's the property we're interested in — and, as I said, it typically has about a billion points. From that, with the wave equation, we can generate shot records, which are basically single-source experiments: this is the data collected by a two-dimensional array of receivers, so the vertical axis is time — you see all these wavefronts coming in — and these are the x and y coordinates of the receivers for one source. And then you do hundreds of thousands of these, so you collect a lot of data, and of course the question is: can you map from here back to here? So it's a large imaging problem — the earth you are after resides in the data you collect — and you have to do a mapping from a five-dimensional data volume to something with, let's say, a billion or so unknowns. To make things even worse, we are interested in doing a lifting where every grid point in the model generates a volume like this, so the unknown actually becomes a billion by a billion — that's getting towards the size of the big Google matrix — and it's dense. So we can never form that explicitly.

So let's start talking a little bit about acquisition, because again we need to collect data in order to create these images. As I mentioned, we are sampling a five-dimensional wavefield: one long axis for time, and then two receiver coordinates and two source coordinates, all at the surface — a five-dimensional data volume. We worked a lot on compressed sensing in the beginning, using transform-domain techniques, and these are great if you are in three dimensions, but in five dimensions the curse of dimensionality basically hits you, so those things don't scale. All the fancy stuff we knew — transform-domain sparsity with things like curvelets and wavelets — unfortunately doesn't really work: it blows up in your face because the data is so large. So that prohibits scaling up to 5D. Can we exploit another type of structure to handle this sort of problem?

To do that, we treat a 3D seismic survey as follows: we get rid of one coordinate by taking a Fourier transform along time, and then we look at monochromatic frequency slices — that is, you keep the four dimensions, two source and two receiver, at one frequency — and the different frequencies can be treated independently because the Fourier transform is orthogonal. So we call these things frequency slices. This is an example of a frequency slice — a small one, but still a dense matrix of twenty-seven thousand by twenty-seven thousand, and it represents a small patch; this is a postage-stamp 3D seismic example, but it's already pretty tough if you have to work with it stored explicitly. We would like to work with this full matrix, but that may not be possible, so one of the things we can do — and I'll talk more extensively about it as we go on — is to recognize that the data, organized in a particular way (which I will revisit in a minute), has redundancy we can exploit.
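As a concrete picture of the frequency-slice step, here is a minimal sketch with made-up toy dimensions. It is in Julia (which the group says it moved to) and uses the external FFTW.jl package; the array layout is my assumption for illustration:

```julia
using FFTW  # external package; the de-facto FFT library in Julia

# toy 5-D data volume: time × source-x × source-y × receiver-x × receiver-y
nt, nsx, nsy, nrx, nry = 256, 10, 10, 10, 10
D = randn(Float32, nt, nsx, nsy, nrx, nry)   # stand-in for recorded data

# Fourier transform along time only; the frequencies then decouple and can
# be treated independently
Df = fft(D, 1)

# one monochromatic frequency slice: a 4-D tensor over source/receiver coords
f = 20                      # an arbitrary frequency index for the toy
slice = Df[f, :, :, :, :]   # size nsx × nsy × nrx × nry
```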
We can actually work in factored form. Instead of working with this dense matrix, we recognize that if we matricize this four-dimensional tensor into a matrix, and we do it cleverly, there actually is redundancy — and it's not so crazy that there's redundancy, because these sources and receivers look at the same earth; there fundamentally is redundancy, certainly if you don't go to too-high temporal frequencies. So we can think in terms of factored form, work only with the factors, and avoid forming the full volume, extracting only the sub-volumes we need for the inversion — that is, the imaging.

So what we're doing is basically the Netflix matrix-completion problem on steroids — it's a very large matrix. There's a whole literature on this; in my group we worked with Ben Recht, who works a lot on machine-learning problems, and there's a whole bunch of papers you can find online that deal with it.

OK, so what is key if you do matrix completion? I will go through the ingredients and then make the connection to how you have to rethink your seismic data acquisition to make it fit this framework — because if you just do things naively, it won't work; you have to be clever about it. First of all, you have to find a representation of your data — a matricization of your 4-way tensor — such that the singular values decay fast. How can you do that? That's the first question.

So we have to think about how to unfold a 4-way tensor into a matrix, and there are different choices you can make. With four dimensions — two source, two receiver — you can either put together source-x and source-y (that would be the canonical representation, and the same for the receivers), or you can do a permutation, because there are many different ways to matricize your data. This is an example of a frequency slice where I put the source-x and source-y coordinates along the horizontal axis and the receiver-x and receiver-y coordinates on the vertical axis, and in here you see all these little sub-blocks — each is basically one small experiment of a three-dimensional survey. And this is what they look like: oscillatory functions that decay away from the diagonal, because waves propagate away from the diagonal. This organization does not have fast-decaying singular values. If you permute — putting source-x with receiver-x, and source-y with receiver-y — the singular values decay way faster. So how you organize your data, how you represent it as a matrix, makes all the difference for whether your matrix completion works or doesn't work, and this is where you, as a person in the applied field, have to understand why that's important; nobody told us to do this, it's something we came up with. So that's good: we have a representation where the singular values decay fast, and that's a reflection of the fact that seismic data is redundant — because, again, we collect five-dimensional data, and in the end we're interested only in a three-dimensional earth, and this data is generated by a three-dimensional earth, so the low-rank structure is not necessarily a big surprise.
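A minimal sketch of the two unfoldings (again with toy data; with real field data the permuted unfolding is the one whose singular values decay fast — random numbers will of course not show the difference):

```julia
using LinearAlgebra

# toy 4-D frequency slice over (source-x, source-y, receiver-x, receiver-y)
nsx, nsy, nrx, nry = 10, 10, 10, 10
slice = randn(ComplexF32, nsx, nsy, nrx, nry)   # stand-in for real data

# canonical unfolding: (source-x, source-y) × (receiver-x, receiver-y)
Xcanon = reshape(slice, nsx*nsy, nrx*nry)

# permuted unfolding: (source-x, receiver-x) × (source-y, receiver-y)
Xperm = reshape(permutedims(slice, (1, 3, 2, 4)), nsx*nrx, nsy*nry)

# compare the singular-value decay of the two organizations
@show svdvals(Xcanon)[1:3]
@show svdvals(Xperm)[1:3]
```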
OK, but that's not the whole story: we also need to think about sampling, and sampling we do not have a lot of control over, because we are dealing with a physical system. The simple kind of sampling I'll talk about for now is where we miss certain sources or receivers — we just didn't collect them: we didn't fire a source somewhere, or we didn't have a receiver somewhere, either because my boss told me I couldn't (it's expensive) or because there was a physical obstacle or something. What you want from the sampling — and this is an idea from compressed sensing — is for the sampling to break the low-rank structure; then a rank-minimization algorithm can potentially give you the full data back. That's the compressed-sensing idea we try to exploit. Now, in the canonical representation — source-x, source-y against receiver-x, receiver-y — if we miss a source, we miss a column; if we miss a receiver, we miss a row. And missing whole rows or columns of the matrix does the opposite of increasing the rank: it decreases the rank. So this is the worst possible sampling you could ever dream up if you are interested in a matrix-completion problem, and nobody ever tells you that, because the theoretician says "take uniform random samples" — but I can never do that in the field: in the field, if I miss a source, I miss it for all receivers. I can't control that; it's the physical system I'm working with. But if you go to the permuted organization, then it looks like we lose random blocks, and now the decay of the singular values slows down, and therefore rank minimization stands a chance of giving your full matrix back. So there's a simple trick where there actually is a free lunch: we have a representation in which the complete data has fast-decaying singular values, while the sampling is — in loose terms — incoherent: it slows down the decay of the singular values. That's a good setting to be in if you want to fill in the matrix as if you had taken samples everywhere.

OK, so let's use techniques from convex optimization to solve these problems. X in this case is the full data matrix — as if you had samples everywhere — and the operator A is a mask that takes out...

[Audience question — inaudible.] Yes — correct. Yes. What we will show is that in the recovery we actually want to use the information that X carries across the whole survey, so we consider the full data rather than local pieces. Industry tends to think in terms of very little windows — ten by ten by ten — whereas we think of the full thing, and then we exploit the fundamental redundancy that resides in the data, because we're looking at the same earth. So you may be able to fine-tune things, and I will give you a little bit more of a pointer to where we are theoretically.

[Further exchange — inaudible.] In the end we don't need to know that — it would be way too expensive. So anyway: A models how you acquire the data — it knocks out the entries where you don't have anything — b is the observed data, and you solve an optimization problem that finds, among all matrices X,
the one that has the smallest sum of singular values — the nuclear norm — subject to the constraint that applying the sampling operator to it gives you your data. OK, so now we make an additional step, and this turns a nice convex problem into a non-convex problem, but we are forced to do it, because we can never hope to work with X explicitly — we can only work in factored form. So we work with a left–right decomposition of the matrix, X = LRᵀ, where the factors have a rank much smaller than the size of X. We developed solvers for this — there are papers on it — but in the interest of time I will be quick: we never get to see the singular values; we work with a proxy in terms of the Frobenius norms of the factors, because we can't even afford to compute the singular values. (A minimal sketch of this factored approach follows below.)

OK, now that's nice, but who cares — you would also like to know beforehand what you should and should not do. What I showed so far is very qualitative; can we go to something slightly more quantitative? Having said that, it's very difficult to come up with strong mathematical conditions that tell you exactly what you need to do in the field — that's still an open problem. The same is true for compressed sensing, which has beautiful theoretical results, but almost nothing translates directly to practitioners: these settings are not idealized, and lots of things just don't necessarily map. But there is one thing that I think gives us a bit of a handle: the spectral gap. You look at the connectivity of your sampling in the matrix, and that is a predictor of how well you can expect to recover.

So remember, our goal is to find an approximation to X from noisily observed entries that live in a set Ω, and the set Ω is much smaller than the full set of entries of X. So we have a restriction operator — a mask, if you wish — that acts on X and gives us the samples, and we solve this optimization problem to get there. OK, so there are a couple of things you need. First of all, a lot of the mathematics makes assumptions on how Ω is distributed, and typically people assume it to be uniformly distributed — but, as I said already, that is not necessarily something we can do in practice. The other thing you need is that the singular vectors be incoherent: there are conditions on what the singular vectors look like — how the energy is spread; the more spread, the better — and that is something we can compute, at least for a large chunk of X. So if we have uniform random sampling, everything is fine: the entries we do have — shown in white — are uniformly distributed over the matrix, and that's good. The other thing we need is that the data looks spread out: it should not look spiky like this, it should look more like that — and in the seismic data I showed you already, things are enormously spread; these are waves, they go everywhere, so things are very incoherent, and you're in a good setting where this will work. I know it's a bit hand-waving, but I'm sure if you compute some of these things you'll see we're in good shape.
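Here is a minimal sketch of the factored formulation: plain gradient descent on the factors, with the Frobenius norms of L and R standing in as the proxy for the nuclear norm (½(‖L‖² + ‖R‖²) upper-bounds ‖LRᵀ‖_*). This is a toy implementation with untuned hyper-parameters, not the group's actual solver:

```julia
using LinearAlgebra, Random

# complete a matrix from entries observed on mask M (true = observed) by
# gradient descent on  ½‖M∘(L*R' − B)‖² + (λ/2)(‖L‖² + ‖R‖²)
function lr_complete(B, M; k=5, λ=1e-2, η=1e-3, iters=5000)
    m, n = size(B)
    L, R = 0.1 .* randn(m, k), 0.1 .* randn(n, k)
    for _ in 1:iters
        E  = M .* (L * R' .- B)    # residual on observed entries only
        gL = E * R .+ λ .* L
        gR = E' * L .+ λ .* R
        L .-= η .* gL
        R .-= η .* gR
    end
    return L, R                    # X ≈ L*R', never formed unless needed
end

# toy usage: rank-5 ground truth, 20% of entries observed
Random.seed!(0)
X = randn(200, 5) * randn(5, 200)
M = rand(200, 200) .< 0.2
L, R = lr_complete(M .* X, M)
@show norm(L * R' - X) / norm(X)   # relative recovery error
```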
The only thing we don't have is the uniform random sampling — we only have access to a non-uniform subset of entries of the matrix; we just don't have control over that in that way. So what determines a good mask? We want a metric that tells us beforehand how well matrix completion will perform given a certain sampling mask — that's the aim here. And that has to do with connectivity: you basically want your sampling to be connected in a graph sense, and that certainly happens when you sample uniformly. You can express this in terms of the spectral gap of the sampling mask: take the mask — zero where you have no data, one where you have data — consider it as a matrix, and look at its two largest singular values; graph theory tells you that their ratio measures how connected the sample points are, and it's easy to compute: you can get the two largest singular values of a very large matrix without solving the whole problem. (A small sketch of this diagnostic follows below.) So the ratio is either close to one or much smaller than one, and what the theory tells you is that it's good if it's much smaller than one — you want spectral-gap ratios much smaller than one. We're working on making this more robust — this is work with a student who is still at UBC — but I want to give you a flavor of how this works and what the effect is: if the spectral-gap ratio is much smaller than one, you are nicely sampled, and we can then expect better results from matrix completion.

OK, so let's start by looking at the ideal case. This is an example with everything in the right organization, uniformly sampled — which we can never hope to do in the field — but it has a very good spectral gap, very small, meaning a large separation between the first and second singular values, and then we can actually recover at remarkably low sampling rates: we miss ninety-five percent — we have only five percent of the samples — and we get a very good result back, something like twenty-four dB signal-to-noise. Well, that's idealized — we can never hope to do that in the field — but it's good, in a computer, to at least check what these methods can do. And where do you make errors? This is the residual — you always lose something. Just to give you an idea of what you're looking at: this is basically a cube unfolded; in front you see a time slice, the horizontal axis is receiver-x, the vertical axis is receiver-y, and from the two sides you look through the volume. You can see that we miss a massive amount of data, and this is what you get after the interpolation — doing the interpolation for all the different frequencies and then taking the inverse Fourier transform — and this is what the data looks like. It's not perfect — you can see there's noise — but we recover a lot of the spectrum of the data from a very, very low sampling rate, much lower than what people typically attain with compressed sensing when you work with transform-domain methods. And this is the true data — so yes, certainly not perfect, but at least we're getting there.
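The spectral-gap diagnostic itself is cheap; here is a sketch of the idea (dense `svdvals` for toy sizes — at field scale you would compute just the two leading singular values iteratively). The disconnected-blocks mask is my own illustration of a badly connected sampling graph:

```julia
using LinearAlgebra, Random

# spectral gap of a 0/1 sampling mask: the ratio of its two largest singular
# values; a ratio well below one signals a well-connected sampling graph
spectral_gap(M) = (σ = svdvals(float(M)); σ[2] / σ[1])

Random.seed!(1)
n = 200

# uniform random entries: well connected
M_uniform = rand(n, n) .< 0.2

# same budget, but two disconnected clusters: the sampling graph splits into
# two components and the two leading singular values nearly coincide
M_blocks = falses(n, n)
M_blocks[1:n÷2, 1:n÷2]     .= rand(n÷2, n÷2) .< 0.4
M_blocks[n÷2+1:n, n÷2+1:n] .= rand(n÷2, n÷2) .< 0.4

@show spectral_gap(M_uniform)   # well below one: favorable for completion
@show spectral_gap(M_blocks)    # close to one: unfavorable
```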
OK, so let's now look at what happens if we have a small versus a large spectral gap. This again is the original data, and then we do the recovery where we have a large gap — the ratio close to one — or a small gap, close to zero, and you can see there's a remarkable difference in the recovery. It's the same number of samples; it's just how you select the samples so that the spectral gap is favorable — the difference between being clever and not so clever in how you put your sources and your receivers out in the field. And we can make a lot of that: you can see that as the spectral-gap ratio goes up, the recovery SNRs go down, whatever the subsampling rate is, and that confirms the relationship. Of course we would like to have that made much more precise, in equations, and that is typically not so easy, but I think we have a chance here to do more than we have ever been able to do.

OK, so that's all still idealized; let's look at what the people in industry do and see how that works. The most expensive and sophisticated way people acquire data in the field, in marine settings, is with so-called coil acquisition — a very clever idea. What you normally have is a boat that tows an array twenty kilometers long, with a huge number of microphones on it — and it actually pulls something like twelve of those arrays — and then it traverses the acquisition area: traditionally it goes this way, then it has to turn — because the boat has to turn — and it comes back. So there is really only good sampling in this direction and poor sampling in that direction: it's poor cross-line sampling, because the boat has a preferred direction in how it goes over the area. So what people have proposed to do instead is random coil sampling: the boat goes in circles, and the centers of these coils are randomly distributed. So they are doing random sampling in a very clever way that you can actually do with a boat that pulls a twenty-kilometer-long array: it goes in very large circles whose centers get perturbed as it traverses the area. This is where the sources are — you can see, from simply plotting the trajectories, where the sources fire and where the receivers are. So let me show you a zoom: this is what these coils look like if you look at the sources, and it's not the nice periodic grid people would like to have. So the question is: can we go from a mask like this and in-fill, such that we have equally spaced sources? That is basically the goal of this whole matrix completion: can you fill in what you missed — because you missed stuff here; wherever it's white, you didn't put a source. And then of course you have to make choices: how dense a fictitious source grid do you interpolate to, how dense a fictitious receiver grid — and we can again use the spectral gap to play around with that: take the mask, look at how the choices translate into how you sample your sources and receivers, and how that translates into the mask you have to invert if you want to complete this data as if you had infinite resources and spent light-years collecting all of it.
So we played around with this — these are the grid sizes for the receivers and the sources — and we looked for the sweet spot, to see which one would lead to the best recovery. Now, unfortunately, I haven't verified whether these choices really lead to better or worse results, because the computational effort to recover a volume like this is massive — you don't do this overnight, and you can't run too many of them; you could do it on smaller problems if you wish, but in this case we did it at the full size.

OK, so this is what the mask looks like in the end if you organize the data in the non-canonical organization, and the task now is: can you interpolate wherever things are white? It misses a lot of data, and the spectral gap is not particularly good — much better than doing other things, but not particularly good — so we can't expect fantastic results, certainly not compared to uniform random sampling, where the spectral-gap ratio was significantly lower. This is what the data looks like: if you zoom in, this is the trajectory area of the boat, and you can see these stripey things — they are the arrays of receivers; every boat carries about twelve receiver arrays — and it traces these crazy trajectories. But mind you, this is in the weirdly organized data set: it's the permuted mask. So the task now is to fill this in, and this is what you get if you fill it in. What is remarkable about this is that we use information over the whole matrix to fill it in; that's different from what industry does — industry works on little cubes in parallel and has no idea about the correlations that exist over a whole survey area. And that, I think, is the reason we can do this: we were crazy enough to think of this problem at its full size — the full monty — rather than immediately chopping things up into little windows and playing with those. That, I think, is one of the key messages. Now, you can see the visual result — it's not perfect — but look at the data: this is the ideal data, this is the data we collected — you can see you miss whole parts of the data — and this is after the interpolation. It's less good than the previous, idealized one, but it's still remarkable that you can recover the data. And it actually got attention from industry — Schlumberger, the company that does this coil sampling — and we came to give a presentation on this, because they didn't know you could do this.

OK — how are we doing on time?
— You're a few minutes over. — OK. So what's cool about this is that, aside from the fact that you can recover these data volumes, you recover them in enormously compressed form, because you basically only work with the factors. At the low frequencies that can lead to enormous compression — we work with only 0.5 percent of this full data volume. So from a compression perspective, instead of carrying a truckload of hard drives you can put your thing on a thumb drive: it's a massive compression of what you need, and that may have a big impact on how we subsequently work with this data to create images. It will be a game-changer, because you can distribute the data over a cluster much more easily than what you have to do now, carrying along the full data. I will skip over this a little bit, but there is enormous compression; the only thing you need to remember is that these things work great at low frequencies — at high frequencies the data compresses less, and that is understood theoretically: there is just much more complexity at high frequencies. So anyway: we can form data for the inversion on the fly without ever forming the full data volume. That means I can give every node in the cluster access to the full data, whereas now they have to talk to a central, huge database and extract the data, which leads to an enormous amount of traffic. Having it all local will have an enormous impact on how seismic data is processed, and I think maybe we can learn from this how to scale to larger problems in machine learning.

OK, so that was the most down-to-earth topic. Now, in the next fifteen minutes, I'm going to take you to something that is maybe much more esoteric, but I'm going to try to give you the gist of why we care and why we can do certain things that you could never hope to do if you approach them brute force. Industry really does brute force — I believe Total has a six-hundred-thousand-core cluster to themselves; these guys like to do things brute force, they throw a lot of money at it — but some things you just cannot do even with the biggest computer on earth, and this is one of them. OK, so there is some lingo here that you may hate, but I will try to connect it to terminology you may be familiar with.

OK, so let's first look at the physics of imaging — what's really going on; I didn't talk a lot about that, and it would be a topic for another talk. What we really do to create an image is propagate a wavefield, in a computer, from a source into a velocity model — think of it as simulating waves. And then we also back-propagate — here you can sort of see the adjoint-state method, which you also know, as backpropagation, from neural networks: you go down and you go back up, the same structure; the back-propagation takes whatever it is that the receivers measured — and then we cross-correlate the two fields. So you have a wavefield in space and time for the forward and the adjoint, you cross-correlate them, and you look at the zero lag: that's your image. That's what this tries to represent: you have a source wavefield and a receiver wavefield, you correlate them, and taking the zero-lag term is basically looking at zero offset — you look at the same point. But you don't have to: you could correlate the two wavefields and look at two different points.
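Written out — a reconstruction with assumed symbols, where u_s is the forward wavefield for source s and v_s the back-propagated receiver wavefield — the conventional image is the zero-lag correlation, and keeping all pairs of points gives the extended image volume:

```latex
I(x) \;=\; \sum_{s}\sum_{\omega} \overline{u_s(x,\omega)}\; v_s(x,\omega)
\qquad\text{(conventional image: zero offset)},
```

```latex
E(x,x') \;=\; \sum_{s}\sum_{\omega} v_s(x,\omega)\; \overline{u_s(x',\omega)}
\qquad\text{(extended image: all offsets } x - x'\text{)},
```

so that in matrix notation E = VU^* with the per-source wavefields in the columns of U and V, and the conventional image is just the diagonal, I = diag(E).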
Now, that's a lifting: if you look at all pairs of points, the object you image becomes as big as the image by the image — quadratically large — but it has a lot of very interesting information that people want, and that's why they compute some of it, brute force; and we're basically saying that we can now use clever sketching techniques to do these things without the brute force. We call these extended images, because you map the data to a six-dimensional object consisting of the three space dimensions times the three offset dimensions. So it's a very large object, and you can think of it as a lifting — for those of you who have seen, say, phase retrieval problems, people sometimes lift the problem there; this is a lifting: you make the unknowns way larger, and that has particular mathematical advantages.

OK. So what people do today is compute this for a subset of offsets, brute force — massively expensive, because you have to store all these massive wavefields, which have a billion variables each, cross-correlate them, do the different cross-correlations, and store the result, which is too much. What we want to do instead is think in terms of these image volumes as a function of all possible subsurface offsets: every point in the subsurface generates a full three-dimensional volume in the vertical and horizontal offsets — the offset being the distance between two points in the subsurface — so it's a massive six-dimensional object we're dealing with. So what we do now is use clever techniques, and this is the observation: we never form this object, but we can create actions of this object on vectors, and we can do that cheaply. It's very much related — for those of you who were at Joel Tropp's talk a couple of weeks ago — to having access to randomized actions of this matrix, and then we can play games.

I'll speed up a little; there's going to be a bit more mathematics here, but just a bit. So what, mathematically, do we have here? We have a source and we have the received data; U is the forward wavefield — the wave equation inverted and applied to the source — and V is the adjoint wavefield, basically the receiver data propagated with the adjoint wave equation. So U and V are wavefields in space, as a function of frequency, and we work frequency by frequency. OK, so an image volume E is the outer product of V and U — so it's quadratic in the size: U has a billion variables, so E is a billion by a billion — and the different columns of V and U represent the different source experiments, because we have one for every source. So what conventional imaging does is say: you know what, we're not interested in this whole object, we're interested in the diagonal — and that just means you take the pointwise (Hadamard) product of these different wavefields U
and V and sum them — that's what you would normally do, and it gives you the zero-lag correlation. But you could do something much more: think of it as a cube; people only look at the diagonal to get an image, and I'm saying I want to look at off-diagonals — I want to look at the whole volume, whichever way I want — and I'm going to be able to do that because this whole volume permits a low-rank approximation. That's the trick we're going to play: we're going to work only with probings — we apply this volume to probing vectors and work from that.

OK, so let's look at what this looks like. Say this is a very small earth model — one hundred by one hundred — so the image volume is ten thousand by ten thousand, because it's quadratic in the size of the model. If you look at the singular values, though, they decay enormously fast, so there is an underlying low-rank structure here that we may be able to use. And we do that by sketching — and what do we sketch with? PDE solves. This is the trick here — if there's one equation in the second part you need to remember, it's this one — and if you just stare at the linear algebra, playing with the matrices, it's right there: to compute the action of E on a sketching vector, say a random vector, we invert the wave equation, restrict to where the sources are, convolve (or correlate) with the source, inject the result back at the receivers, and propagate again. All of these things you can do consecutively in the computer; you never have to form E. So that's cool, because now we can try to find the range space of E from probings: we probe it with a limited number of random vectors on the right-hand side, and then we do a QR factorization. (I have to apologize for the notation — we just ran out of symbols: Q means a source in one context, and in the other context it means the Q of the QR factorization; but changing the notation midway makes everybody mistake one thing for another, so it's hopeless — as I said, this is not a polished presentation yet.) Anyway, we can work with actions of E on probing vectors, and the resulting sketch contains all the information you have to have in order to work with a large object like this. (A minimal sketch of the recipe follows below.) And I should really speed up, so let me just show you quickly what we can do with this randomized SVD: when you have these actions, you can compute the QR factorization and then use this Q to form a low-rank, SVD-based approximation of that large matrix. And from that we can form left–right factorizations by simply absorbing the square roots of the singular values on the left and on the right. So now on the right there is the conventional formula for imaging, and on the left we have something that is based on this matrix factorization of this ridiculously large object, and you can ask yourself the question: which one is going to give us the better image if we only allow ourselves a very limited number of probings — basically, wave-equation solves? And I think I'm losing everybody, so let me show you an image and then we're done.
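A minimal sketch of the probing recipe — a standard randomized low-rank factorization (in the style of Halko, Martinsson, and Tropp), written matrix-free. Here `Emul` and `Etmul` are assumed handles for the actions of E and its adjoint; in the seismic setting each action costs a couple of wave-equation solves, while in this toy they just multiply an explicit matrix:

```julia
using LinearAlgebra, Random

# randomized low-rank factorization of an operator known only through its
# actions Emul(W) = E*W and Etmul(W) = E'*W
function probe_factorize(Emul, Etmul, n, k)
    W = randn(n, k)           # k Gaussian probing vectors
    Y = Emul(W)               # sketch of the range of E
    Q = Matrix(qr(Y).Q)       # orthonormal basis for that range, n × k
    B = Etmul(Q)'             # B = Q'*E, a small k × n matrix
    F = svd(B)                # so E ≈ (Q*F.U) * Diagonal(F.S) * F.Vt
    # left-right factors, absorbing square roots of the singular values
    L = (Q * F.U) .* sqrt.(F.S)'
    R = F.V .* sqrt.(F.S)'
    return L, R               # E ≈ L * R'
end

# toy usage: an explicit low-rank-plus-noise matrix stands in for E
Random.seed!(0)
n = 1000
E = randn(n, 20) * randn(20, n) .+ 0.01 .* randn(n, n)
L, R = probe_factorize(W -> E * W, W -> E' * W, n, 25)
@show norm(L * R' - E) / norm(E)   # small relative error
```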
So in a way we're comparing two things: either you apply a sketching to the data — which basically means you sandwich the matrix between W's as fence posts, and that introduces crosstalk, because these are Gaussian matrices — or you probe on the right and work with the factors. And with the same number of PDE solves we compare the two images: the amount of computational work is the same, but we are being clever with fancy linear-algebra tricks instead of doing things brute force. That's basically what we try to show, and what motivates it is the decay of the singular values: if you look at the decay for the data versus the image volume, the singular values of the image volume decay much faster than those of the data, so a low-rank approximation of the image volume is much better than a low-rank approximation of the data. That's the key idea here.

These are the algorithms — I will skip them — and let me show you some results, including the thing people actually care about. You may think this is all very subtle, but this is the difference between finding an oil reservoir or not. This is with only a small number of probings, done the conventional way versus the new way. You need to look carefully: you see reflectors popping up that do not exist in the conventional result — particularly the steep events here, which is typically where the oil reservoir is; that's what people spend a lot of money on — and you can do this with the same amount of work, the same number of PDE solves, just by being clever with the linear algebra. And this is when we increase the number of probings to one hundred: there is still an enormous difference — you can see certain reflectors coming up that didn't exist in the other approach, just from these smart randomized tricks.

OK, so I think that's it — I will leave it here. [Applause.]

[Audience question about the first line of the second part — partly inaudible.] The first...? Not the first one... OK, sure, this one. Well, these problems are all linear in the source: we write it deliberately like this because we are leaning on the linearity in the source, and since it's linear in the source, a whole class of inverse problems fits this form, not only waves. And this shape looks very much like a CNN: you have a parameterized network, you have training data — D is the training data — and the unknown m plays the role of the parameters of your network; people use stochastic optimization techniques to solve it, and they use backpropagation — we use that too, we call it the adjoint-state method. So there are a lot of connections, and we've looked a lot at solving this problem using stochastic optimization techniques, but that's a separate talk. Since this is a machine learning center, I thought: let's make the connections everywhere. Except our problems are so large that where you can do tens of thousands of iterations, we can maybe do a hundred.

[Audience question — inaudible.] Well, loosely, yes: in a way we are also in the prediction business, but we are fundamentally different from where you guys are. If I recover the data correctly, I can predict data for that area — so that's a predictor — but what we really care about is not predicting the data; we care about m.
So our parameters — say, the analogue of the parameters of your network — are the objects of interest, so we need at least part of our parameters to be interpretable, whereas in many machine-learning techniques there are hidden variables that nobody interprets. That's exactly what we're looking at in my group now: there, you don't care whether you interpret those variables, but in principle we really do care — m is the object of interest.

[Audience question — inaudible.] Sure. Basically — I can't just learn the data; that's not the problem I want. I want something that gives me an image; the image is what I'm interested in. So, in a way, you can think of it as: we parameterize the model by the wave equation, and the wave equation is what turns the data into the information you're after. You could ask a different question: I don't want an image of the physical parameters of the earth, I want to know whether at this depth there is a reservoir or not. Then you might do a black-box model, but then you have to train it with a heck of a lot of examples where you have data and you know there was a reservoir there — then maybe you can do that.

[Audience question — inaudible; about repeated experiments over time.] I think you mean time-lapse — the same survey at different times, different experiments. Yes — we call that 4D, or time-lapse, seismic, and we do look at that, also from a machine-learning perspective, but that is yet another scale of big. Where I see the opportunities for machine learning is this: everything here is based on the presumption of a physicist who thinks he owns the physics, understands the physics — who basically believes the wave equation is the truth. I don't believe that's entirely true. So where can you use machine learning to make up for the fact that the wave equation doesn't describe all the physics — where you have a bunch of parameters you don't necessarily care about, but some of them you do, because otherwise you can't interpret your results? And that, I think, is also the difference between machine learning, perhaps, and an inverse problem: in an inverse problem we really need something we can interpret. The learning will work, in a way, but sometimes... [inaudible].

[Audience question — inaudible.] Yes — well, I'm a big believer in physics-constrained networks, so you can think of networks that have both. I don't believe the whole black box — that CNNs are going to do everything — I don't think so; but I don't believe that the wave equation is going to do it all either. Somewhere in between is where the sweet spot sits.

— Very well. Thank you so much. [Applause.]