Thank you. So this is my first visit to Georgia and really excited to be here and give this story. OK so. Today we're talking world language it distills mean product and fiance language it bestows the same problem which is studied since nine hundred seventy two it's a generalization of string added distance where the world is more in the dark means less product. Meant less Mattocks multiplication this is a valley and if you can solve mean plus mathematics multiplication fast and even have a faster the Guardians for computing albeit shortest but even graphs So look at this fundamental questions and look at their connections and many implications but before I get into the main part of the talk let me do a little bit of backstory as sunfish mentioned divers are between two labs in Maryland and they're like many labs went through some transitions and similarly for the D.N.D. and I was made to do a lot of data crunching for to look at. And there were my Some of my colleagues from data was grooves they also mentioned this issue that they were facing So the issue was they had large X.M.L. files and for some strange reason those experimental phase will not come by. So we started looking at what is happening and this is what exactly we found so in X. Some of documents and also in H.T.M.L. documents are the home pages that you. Do so in H.T.M.L. documents. You get to see these open specs OK So this is an open barrack this is a closed. And if an exam and document has to be valid These are open and close tags they must match our we know we are formed to be. Their problem that we've been facing it in T. was that many of the six The MILF I use they had been gathered from different sources they have been integrated and some areas have happened that were bad were. Going to. Muting to mismatch in the open and close spec sequence for example this is a document which is well balanced with value is a valid X.M.L. file you open a title tag. You open artist that closes artist you open the publication. Back So this is well formed but this one this is not well formed so here it should have been an open tag whereas there is a spec and all external compilers there are very draconian in nature whenever there is this kind of error is happening they just don't process stuff I there is a has had the first time when they get good to see and something that is not matching So this was happening and it didn't even then we found out that this is nothing unique to a dainty this has been reported quite a beat there are lots of X.M.L. letters and even more H.T.M.L. for example Google reported that out of the exam of the file is there for the second largest cost. Because of X.M.L. it is about fifteen percent of the time is a mismatch between opening and ending X. so I wanted to formalize this problem so let's look at this and example so this is again the next summer. Which is not valid but how can you make it valid if you look at this auto start if you did this Arthur stack just making one added We can make this X.M.L. document valid so we suggested let's do this minimum added distance Mr Pierce So if you have an X.M.L. file that if you look at the stack sequence if it's if it is not well formed let's try to make it well formed from a making minimum number of edits on it what kind of edits we can insert tags we can delete so you can substitute X. We also did lots of experiments and we found out well this minimum I did this and this repeals this part from the SO in this plot if you go you have better relations after that. Correct it looks more similar to the golden standard. Five And what of this though these are some of the here are sticks in that different browsers like Internet Explorer Google Chrome they all have mechanisms to do this kind of editors because otherwise many of your H.T.M.L. files you go to the waves. Are not open. They contained Yes Yes And they do a lot of greedy had to stick so they had to combine the grid here to stakes but we found no minimum at a distant district is really to perform with. So formalizing a little bit more so example data formats or maybe you just want to correct the X.M.L. files and look at the content and the structure together the data format it can be present and by some context free in language so it's a formal language want it's free language and if it does want to look at the back cyclones that back sequence basically its presence and what is known as Dike language. So back language is nothing but the language of real balance parent this is a particular subclass of context. And when you're talking about X.M.L. leafleting we are trying to say well you were given an X.M.L. file which may not be valid member of the language the context free language of the day language so let's make minimum number of edits on aid so that after the edits added that file will be a member of the language OK six some later do all the many kind of data but if I have all this period academia I would probably have not encountered this kind of so I encounter this kind of it by peculiarly because I spend some time in industry so let me I would have thought about some computer biological actions and computation of a given someone in a sequences D.N.A. sequences often you want to find no underlying structure. As one search problem which is very basic to computer do you want to find the second to restructure it in Issaquah answers. Me We are all only very interested in graphical gardens when many of you are interested in gravel got of them's. Shortest but is one of the most fundamental question that we would like to answer and we'd like to have a very fast regarded him for it. Or maybe just metrics multiplication. So what I really want to propose is that all this problems the X.M.L. in a collection might look very different from the others but they're really not this connected they're very well connected OK so that's with that goal in mind let me define the. First problem language at a distance that appeared in my data. So what is language added distance language added distance it's a generalization of two basic problems in computer science one is going to extreme Grammont by saying there is string at a distance so going to three grammar passing nothing but you have a context of grammar so here is the graph this is the grammar for date language well balance parent is this and you are given a string and you just want to know whether the string can be positive guarding to the roots of the grammar this string can be parsed the string cannot be. And the string at a distance of the game here you are given two strings one reference being one input string you want to make minimum and it's on the impulse string insertion deletion and substitution making this minimum anything want to make the impulse string look like very different string So for example here if you delete these capital T. if you substitute C. and then you insert the here with three Idiots you can make the input string look like that if the string. String at a distance computation and has been studied a lot in theoretical computer science but also in Applied areas like in natural language processing and computation about the D.N.C. language it distances generalizes board this problem by seeing and string and in this sense how so far string at a distance we are just comparing with a single reference ring when input string comes in language at a distance we do have a lot of strings to compete against So women input string we say well. What is the minimum it is you want to do on your own input string so that it looks like at least one of the string in your sit so instead of one we have many differences for the so that's what we mean and generalize a string at a distance and how does it generalized by C. so in passing we are really trying to answer as you know one question of whether the string buses are not and here we are interested in finding the exact distance so how many of its early career in your string such that they did that distance and if the string with us. OK instead of zero zero and run when Sir you want to find the minimum distance from the strings that are generated by the language with string. So that with and without it distances generalize a string it spins. And this problem this language that it dispenses tries to answer questions of the following form given a moral model which is specified by a formal language does he or did does feed the Morton. And so the our model is the Lang the language stream so do you want to know whether your data fits the model or what changes are required in your data to make it fit the model and this type of question so we were interested to study it from the data quality point of view external correction but it originated from the study of it I got it in parsers from compiler optimization it has many applications and computational linguistics as well where you want to find and correct mistakes grammatical mistakes in computer Chanelle biology where we can find distance between a gene sequence and a mallet men disease which is modeled by a gram only an hour and if willing and so on. So has it done so this language your distance problem it has a very old history so it was studied as early as in one nine hundred seventy two by Peterson who studied it for the purpose of designing a parser and they gave a dynamic programming algorithm that ran scene in Q Where N. is the streamlet with a quadratic dependency on the grammar side. So this time there is dependency of quadratic dependency was later improved by Meyers in one thousand nine hundred five but the Q.B. dependency on the stream enjoyment since this problem has lots of applications there have been many fast you listings for it to those who listings did not come with any kind of performance count so what we are interested in we are interested in to design a faster faster than in Q It should return either an exact answer to the minimum distance on an approximation to lead for a lag with it distance problem so that's our goal. And this is the first set of results that I will briefly mention will show you the garden and how to analyze it very briefly for this set of results so the so we gave in the first sub cubicle got of the foreign language at a distance problem this is from two thousand and fifteen which fire which computes the distance approximately near optimum so we can give an M one plus minus abseiling approximation and the running time is indeed the omega are divided by their sum of silent dependency So what is Omega only guys the exponent of us metrics multiplication which is about two point three seven three also so there is another parallel result in seem to cause and fifteen which shows that there cannot be any non-trivial multiplicative approximation guardian for a language at a distance problem. In time lives than interview. So in that sense this interview my god the god of them that we have is up to. Yes So Herc So this is a million metrics multiplication so it says that if you can have faster and faster language at a distance problem with non-trivial metrics multiplication you will have a faster guarding four billion matrix multiplication and the same to me could give someone plus minus Absalom approximation algorithm for computing all shortest but density graphs and as a. Result of many other distant best optimization problem one graph so that with a connection from language there distance to grap problems also we get the fast software we go guardian for maximum likelihood pursing us took us to context free grammar so what is maximum likelihood parsing of the request of context to grammar will still gusty context free grammar it's a generalization of context where we each production rules so we have some rules for every grammar so here we do every every rule you have some conditional probability how lately you have to apply this rule and then given a string you want to know what is the most likely parts of article listing according to this given distribution so this can be solved in N.Q. dying by modifying the willow and see where you go going to them for parsing. And for our dignity see if the person but there has been had been no progress enough to make us up to legal guardian I don't exactly approximate for us because the context place. Struck us you're going to me it tries to answer a question of the falling foam here given a problem generating model is took us to conduct three Grammont can we view it as a generated model so given a probabilistic generative model what is the most likely explanation. Because we are trying to find a past that is maximum likelihood pursing So what is the most likely explanation for your observation. So these kind of lies of the foundation of natural language processing so struck us the context remarried generalize us here in my government to generalize us different types of adopting modeling Lake. Has been used a lot in biology and in other communities. So we give the first give we go to gardening for it giving me an optimal performance approximation OK so any question. So let's look at Q We go gardening for language a distance that was given by discern more than forty years back in nineteen seventy two. Service over there is really simple so you have an N. square sized to make programming table every instance in the dynamic programming the presence of particles so you have the string length N.. And every entry in the dynamic programming tell it basically the presence of particulates substring you want to know whether this stuff string from I DO J. Gagnon we parsed are what is the minimum distances minimum and if we want to need to do in that particular substring. So if this sale presents X. that means well OK so if this I do this the prison see Z. Z. That means starting from particular entries see who can parse the substring I do GA With an eighty deg C. with a minimum number of eight its of C.. So how can you compute that so you will look at all possible break points like I do K. minus one and K. two G. of the string A to G. and then you look at OK so I took a minus one I can see it starting from A with the editor of X. I can parse K. to j starting from B. with the editor of Y.. Then of course like I can parse I DO J. starting from C. because it says I took a minus one B. buses K. to J. so I can I do J. starting from C. with a deed of how much. Explicitly. And if there are many scores here then I will be taking the minimum of expressway because I want to find the minimum added distance. In that's a very here that's a very good question in any context for example you can convert it into the stream scheme normal farm and that is your right hand side we have to non-terminals and that is essential for the standing program. OK so far X.M.L. It's a special type of grammar move with this a diagram on the day. Showed a few slides before. The X.M.L. back sequence and when so this was the grammar so you can convert it into a Chomsky normal form and then you can do that and I think programming. Then we have to read that gives you the same that generates the same language. City directly yes. Yes So we deal with it directly on the X.M.L. Texas. I mean you know. What I mean holiday and it's a happening yeah. And. Yes I will look at all speeds and then you can then I can manage it yes. With us because. So. Some of the tags this is this father follows this grammar right so you have different open parent this is and OK So let's look let's look at this grammar I want is this grammar say. OK what is grammar saying so next time I fail you have opened and closed so each of these open back you can think about it as open parent this and close stacks are you spelling this is so this grammar is basically saying you have two types of X. two takes up and that this is this one. And this square one but you can have many others so what is the saying forty yard external file to be well balanced You must have one something well balanced then followed by two matching balances open and close. Here again something you need something to be well balanced here Richard we have an open open parent this is followed by a close Panther says well what you can have you can have something that is well balanced and something that is well balanced. OK so this can be shown that this grammar basically generates all stree all balances and strings that are well balanced So this is a very typical question in undergraduate I'll gardens where you are given a sequence of parent hisses and you ask can you tell me where the decisions of balance is as well balanced or not so as a very typical answer that you just use a stack you insert open parent this is in the stagger anywhere there is a close parent this is take the top of the stack match it if you can match it all the way down there in the strict sequences balanced and this is not in John skin on my farm because these are kind of the CMOS that is generating in the string and these are the entire mills but you can convert it into Chomsky form for the Johnson on the phone we learn to look so in duty which will be more complex but then you can use the dynamic programming to find the exact sequence of changes exact sequence of edits that I declared on the old X.M.L. tags. To remind you. This was one place. For this part so there are a lot of us would say is that if you want to use dynamic programming even for these grammar you cannot do anything better than in. A century. Well. Yeah so if I want to recognize computer aided distance based on this grammar welcome to Day to disinvest on the John Johnson form of the grammar I didn't read the same language. So there's a density question OK so this was the dynamic programming so for computing for every idea I'm looking at all break points I took a minus one and K. to J. just combining using the minimum scores you were just adding up this course X. and Y. and then here taking the minimum and you get a combining this non-terminals the envy because you have this production see you saying I took a minus one can be generated from E. and j can be emitted from B.. So why is it and kill because you have in square size table for computing every every entry you have to look at possibly order of and break points so the total complexities are in Q And here I am not showing you the dependency on the grammar where do you do have to go through all the grammar so there is a dependency on the grammar errors OK So that was a dynamic programming of guarding OK I forgot to mention one critical part of the standard programming. Do you see this pink I learned. Being colored productions So this being colored productions are some air producing rooms whatever they are producing rules this way or the normal rules for the grammar if you would have just used this rules that means you that means you haven't gone daunting to do any edits but maybe you have to do some edits so these rules are allows you to handle these edits for example in the original grammar this non-terminal Capitola that was generating the scene will be now I'm lifting this non same non-terminal Capitola to generate the scene will be as well what does that mean I mean I'm allowing substitution so I can substitute if need be but whenever I'm doing substitution I have to be added cost so that's why the school is one. OK so here this is a this is for substituting a with the similarly We'll have it was originally generating B. but now I allow will to generate is well and because they allow that I have to pay for the substitution cost so these are production rules for substitutions Similarly I can have admission reproductions for insertion and deletion. And then at the end we are really trying to find the finder we are seeing such that the total score is minimized. So I originally productions all have school zero and this admission and productions have the cost which reflects the added cause that you will be paying for substitution or insertion or deletion Well even even for normal parsing there will be dependency on the grammar Ses because you have to go through this productions but this admission rules were causing the dependency from Lynyrd required reading in the grammar so for that. Improved by my own. OK so now this is an every programming clear OK so now we'll view this dynamic programming in alt and it we still view those than I mean programming computational as a way of computing transitive closure or special paper metrics multiplication. So basically I'm going to say the same thing as a fit for the dynamic programming you'll have a metrics Initially you'll just have the diagonal entries they will keep multiplying and you will compute the transitive closure and at the end whatever your transitive closure metrics looks like that will be exactly the same as if you have computed the dynamic very program to the end and have looked at that at every programming table so what does those metrics multiplication so every entry is the spear one non-terminal and one score so saying this is X. means basically a front string substring from I took a minus one it can we part starting from here with the school of X. kid to this one is sinking into J. can we starting from B. with a score of Y. when you do metrics multiplication from N B U General C. because there is this production and from X. and Y. you generate expressway and instead of X. and if there are multiple search Lake Effects one excuse me while you were on the way to you will be taking these last weight. OK so it's all donated we're viewing the same dynamic programming a beautician So why does this alternate view is going to help solve goal is to be. Alone. Yes. Yeah. So I. Saw I had through some of the same or similar saying so this this is America are dismal numbers mean after Of the six months the annual compute the transitive closure so why does this viewpoint really helps in this viewpoint has to help us in designing faster algorithms then we have to answer two questions so fast we have to show you how to compute this matter explored at first and then I have to show you have to complete the transitive closure first OK So let's look at how to compute each metrics product first so far that first of all it can be shown to have already stressed weight of the that we are computing mean our class so this particular type of magics multiplication that we have it can be computed in same time as computing mean class metrics computations. So remember there was a mean plus in the title. So what is mean classes easier product metric so then go to compute digest entry of the product metrics you look at all the breakpoints and then look at. In your normal logical multiplication you would have a multiplication here and you will be summing up so now Sam is replaced by mean and multiplication replaced by less. I'll be assured this but it's basically you want to go from I to move to the node. You are adding up and you're taking minimum that's why the mean plus product is basically can valid if you know how to complement less product first you will have a faster. Yes but so what do you know about it we know that we can even can cause we can compute meaningless Magic's multiplication in time which is in Q.. Or to discredit or logon and they're missing results and we Liam's but this does not give us any sort of cubicle going at them so you get. Savings over and kill but this does not give you any truly self cubical going to them and if you're wondering why you cannot have Fosamax multiplication Because mean plus brought up it's not a world when it's over say meanings to all the techniques for Magic's multiplication does not apply here so means less productive basically you so far what we know is order of it and you will call it basically order in Cuba. And this one now standing up and question whether me call computing means less product suppose all the entries are indeed does and they are between one and in when is the dimension if it can be solved. Because if you can solve it you can solve all the shortest but in the love in Q. and A bunch of other problems OK So we haven't really improved turn our director and COO Well then maybe if we was approximation it can help and indeed there are some of the Guardians from one thousand nine hundred seven which gives a good approximation of gardening farming class product and that garden proxy missional garden ran scene into the Omega time when we made eyes exponent of metrics multiplication so maybe we can use approximation for computing each of the product first. OK So let's look at the second question computing transitive closure just generally if you can compute each metrics product for us computing transitive closure is not a big B. because it will just grab the metrics in your dick and then it's great again Square again logon times you repeated if you can compute each of the product for another factor of log and often you can even remove that you can compute the transitive closure. But that's not the case here very strange because here the metrics month that we have is not even associate the. So because of this non diagnosis so when you are multiplying these two matrices because you have this production C. goes to A and B. you can multiply these and we get C. and then maybe there is an other metrics with as D. and now C.N.D. can be combined and you get some E. But if you try to directly combine B. even that would be there may not be an introduction which combines B. and D.. Which means that this matter it's multiplication is not associated and that's a big problem because not associated with the means there is lack of flexibility you do not have the flexibility of just grabbing for example if you have a very long history that will imply you have to say you have to multiply the matrices in a very particular odd you might have to multiply the matrices in Bain's your simple logon time squaring is not going to work. So what does that mean that means transitive closure computations mean it's a long time. It's not easy to say that the transitive closure time is same as metrics multiplication time and the second thing is that. Even if there is more literally an approximation we are trying to use that proximately approximation algorithm for each product even if there is more there only an approximation that Iran will propagate maybe N. times and that will blow up OK So what do we do. So back in nine hundred seventy four and is the valiant in his Ph D. thesis develop these I missing algorithm for context free grammar parsing where the improved for POV scene and the running time from N.Q. to into the Omega and there he showed basically he considered this kind of non associate the products and he showed how to compound a computer transitive closure. For metrics multiplication and that this kind of non when you don't have the CD will be in time same as computing single metrics product so it's a very nice guy there are a simple very nice garden just based on divide and conquer but now in divide and conquer you generally divide the problem into these joint parts here you see divides into some overlapping birds and then overlapping mattresses is very important because all the savings are coming from this overlap so if you're interested I can talk about it later. He does that so that this one hole so here we have more complex operations because we are taking mean it's not simple passing but we can use a valiant garden we can extend it and we can show that time to completely compute transitive closure is basically same as dying for a single product. So that was good but then what about the error propagation of an error propagation is cannot be handled by balancing it so if you want to plug in some known approximation I'll go to the information class product and use violence I'll go out of their legs tensions of L.A. and sell garden you can construct examples and sure that world that approximation alien propagate maybe not in times of out into the like order into the silent times and so on so it's not going to give you in one place if silent approximation. So here are some of the key ideas so first of all you can. Compute single metrics multiplication of the type that we have fussed if the scores that the edits are the scores that we have and integers which is the case but the integers which are bounded by C. capital are in that case you can use these old guard of them by. That and also a subtle where you can compute the product in our brains into the Omega. OK but this does not help us because the number of it is the score is not bounded it could be as high as order of N. So this does not apply but then we also have served a single metric multiplication you can do it fast your interest does not need to be bounded but as long as you do not have too many distinct scores distinct Now if that made it a distinct schools if there are no not too many distinct course you can still do the products faster so far that if there are distinct scores then we can create. Numbers natural numbers in the range of one to Our Square which is known at noon a C. then sequences This is known sequence or has a property that if you take beers from that created numbers in that range one square they have distinct sound and that property that they have distinct some basically help us to get an Al Gore or them. That has a running to model our square into the Omega OK so we can still get faster algorithms when they're not when the entries are not bounded by there are not too many distinct entries. So you want to know where we so we're going to stick you use that property so that this will be our final garden so we've tried to follow valiance method for computing transitive closure not directly the valiance method but the extension of it right after doing each of the metrics multiplication that is suggested by Valiant. We do some depend on rounding on this course. So we'll do the dependent learning to make the number of distinct schools that apiary in any of the metrics to be not too many so that we can imply that in the construction of C. Don't sequences and we're going to multiply those matrices first and how do we keep these chords to be not too many distinct values so so if we have if they fail to mention B. is and why is so this a non-terminal in the school is why and the why is C. Suppose one plus Delta to the our class K. this is a score then I will round it I will now need either down to one president or to the are all I will rounded up one class builder to the last one and I will do this mounding by choosing appropriate probabilities so this probably destroys a problem it is will ensure. That even if you have a long bust like long stream long past his originally done one which is more difficult to handle we can sure the expected cost of the street is same as the actual cost because of the choice of the probability distribution but more importantly the variance lumens bounded. So once a variance remains one of then eventually you can employ a leg is fine and then we can employ want to show that this timid cost estimate cost is very close to the optimal cost and that also gives approximation a garden not really for language at a distance but we did all this but. Also similarly force took us to contacts with them and pricing and so on. Switches Yes the variance is one that lets them in you have no. Difficulty showing Putin you would smile but it just works. OK so now you can ask really not doing very well with it and so in other then ask why approximation Well there are some reasons for using approximations for example for strong gusty took us to context to grammar but I think we could show conditional Hardman's that if you can have a faster exact I'll go in first because we can't experiment but I think it will directly have a faster. Spot OK So that kind of gives you some hardness implications first across a context. But here comes an interesting part I was trying to prove simply lurk on dish in the lower bounds for a language at a distance computerization so I had a high hurdle about which shows that if you only allow insertion as you already did operation. Then yeah you can sure hardness you cannot compare language to dispense Exactly. But when I was trying to modify my proof using three all types of it is in session deletion and substitution that Rufus failing so obviously that raises the question is language and a distance Q.B. card when all insertion deletions and substitutions are allowed. What is happening and why is the lower bound failing. And then we are served basically it's not. So when you have all the street types of edits insertion deletion and substitution if you look at the metrics multiplication that we read to be. This quarter's have a nice property and the property is that it's has a property a bonded difference what does bounded difference means is basically say do if the elite number eight it's too far substring I DO J. is excellent for substring I DO J. plus one it's been known to be much different from X. it will be either explicit one or next leg the difference could be at most one or Similarly if you look at the column like I do G. I do G. and I plus one two G.B. Then again the scores will not change much. So this means less productive years we are reducing it to computing the spatial matrices by the special matrices if you look at the scores it has this bounded difference property if and you have is bounded difference property only if we allow all this free tips of it if you just follow insertion this matrix is will not have the bounded difference property and that's why we are getting this Kiwi curtness So if we are all the street a for it it's those metrics as we have bounded difference property and then that this was our main contribution we so we should that mean close metrics product with bounded difference it can be computed in less than cubic time soon going to have a truly stop here we go guardian for computing means last product when the matrices have bombed a difference. Into the two point two four four. It has where I and that was the complexity we reported in our paper but then there was also the. Why is a strange number with me because it's coming from you have to use rectangular metrics multiplication and it's coming from there but then since then there has been a recent paper which improves the rectangular metrics multiplication so that will have some implication for our complexity as well a bit from two point eight two four four it will go down to two point eight. So what do you have we have stopped you we've got a garden for boundary difference means less product it's under different properties a significant generalisation a bounded integer properties so you do not mean the are metrics to have bonded indeed there can be as high as N. you doesn't need to have bounded difference so if you move along there are also three if you move along the columns their values will not change much and how much it can do those values change basically we saw we show that that's the difference in any one dimension for any one metric so one of the metrics can be completely arbitrary as long as form and other metrics and it's also not required that it has mounted difference property in what the rows and columns does in one dimension. If it is that most end of the three minus will make on mine if it silently get us up cubicle garden so if you substitute or may not be true if you are optimistic that matter is multiplication going to solve the main square this means you can handle bounded difference of two into the one minus Epsilon and if you can get rid of the sign and then basically have solved the open question like getting a meaningless product at a garden for one to anyone to do about it and that would be fantastic so the same thing. Gotten in far out any folding because I don't if well you can view it as a spatial case of computing language I did distance and we also get soft quarter to go girl names for what is known as mean plus convolution monotone means less convolution this problem a study even start with thousand and fifteen by Jan and Lou and Stein and they gave the order to go to God the man using on Me Tell You can also get us a order to get them for it we don't do using so they use this nice here they mumble examine A.D.N. dollars from additive Combinator expert reviews we can give a very elementary proof for getting some idea what radical gardening for me Ask them Lucian I have ten minutes. So let me try to give you some glimpse of the proof OK so what is well no difference again if you look at consecutive elements in the rule for cons if you do elements in the column The difference should not be much we can handle much higher difference but just for two this sick cause either the difference is one in fact for language added distance or out and if folding these differences or not and if willing is differences to for language at a distance as differences one. Similarly let's assume that what the matrices have been a difference though we can handle just one metric to be bounded difference and just in one dimension but you may again for to this it lets consider the two matrices about a difference and just one of a difference. So here are the stakes for the past year in sub give it time we will compute a match Ricks see price that really be an added People approximation of garbage in the production metrics. You want to compute see the product metrics Instead I will compute see play what does seem from what property does the frame house if we look at any idea back to entry and the entry in the sea plane. They are bounded by their absolute differences bounded by some form into the upselling epsilon is apparent So you get an additive approximation fast so once you are labor approximation. We will call a triple I K J to be bad. If. So wonder why do you need trying to compute we are trying to compare C.H. And so which triples contribute to see ID all the CIA King and we get so I get it triples contribute to C.A.G. it so we'll call these I.Q. to triple to be bad if you look at I.Q. players be good if this is very close to see playing my age I wonder what does that mean it means the I.Q. players big it is also very close to see ID we can see play my just very close to CA. For that means maybe this year I could be good this is the lawn that is giving you the minimum value that is contributing to compute CIE so bad means we cannot ignore it we have to consider this I get a triple because it might be upon Tendo for computing CA did. I get is not good it will not be bad if I get is not bad then is good because we don't have to look at it it's far away from C. play my idea so far away from C. idea we can ignore it and idea is if the number of sides back triples us up to week and we know which are these bad triples then we can vary and will quickly go to only this back triples we look at it and will compute the ID first and if not if there are none if there are many better both then what we do into some non-trivial partition and we by doing this non-trivial partition will reduce is bounded difference mattresses to compute mean plus product on matrices with bounded entries we know matrices with bounded entries and can compare them fast. So maybe just one non-trivial partition will not help so we'll do it i United Nations some of the entries will become bounded and after a few idealizations we showed that the number of bad entries are really stopped Kilby we can find there is quickly and then we can just look at them and compute the age. That's the main stage. Let's look at it in some little more detail five minutes OK So the first of us to compute additive approximations since we are considering bounded difference computing additive approximation is not hide. So just look at these great points these big points are separated by into the absurdum. So if you just compute the multiplication based on this great performance all those entries here are very close to this great point and so just compute the metrics multiplication based on these great points you know and once you have those values this value of truly approximates all the others in this book so you do good by doing just this man's magic multiplication of the good points and get the property that we are looking for additive approximation. So we get a good approximation for C.A.J. might look strange that all we're using There's been a difference property for getting this additive approximation but let me tell you one thing that if you have an arbitrary matrices you can do it of approximation. It's not as simple as this but we do have ways to handle it. OK so once you have this by additive approximation that we don't know how to find back triples we have to find bad triples because if you can find bad triples we'll just look at them and we're going to consider this contribution. Yes. Yes. I wanted exact. Yes and that is additive approximation. Yeah so now I want to find the back triples How can you find a back triples so suppose I know OK when he is I can say is about triple is a plus because he is very close to see three major then it's a battle so if I know it's a bad trouble then says these are very close to I care this is very close to good did I basically know this entire part is bad is bad. And conversely if this AI can play as big a deal where not about triple then I know indeed this block is not bad. Some boundary difference property helps us to find this about troubles by just looking at discrete points and there are sub cubic number of good points. So if we have only our dive into the three minus a little stop giving number about troubles and great exhaustively a new method the back triples are numbered them in place product appropriately and additive approximation is of giving completing over the battle of the stuff you will get us up cubicle guarding Otherwise you there are many battles that Riemann and many battles means there are values very close to the mean. And that's a property we're going to use. OK so now what we're doing to do we're so far now let let's assume that all triples are bad OK It's all in good triples are bad so we'll do our non-trivial partition on the matter X. and our goal is to transform the matter eccentrics to small indigenous because if we can make it small and it does then we lose that Alan go in and model it'll cause a lot of them and we'll compute it fast so far that left speaker law X. on column wide randomly from the Great points you have bounded difference how you make some in interest to be small. For example if I subtract the consecutive rows it will become small. Because the difference is not that much let's try to do something similar so you have the idea let's take this go X. subtract Xscape. One seriously since your substracting you want to maintain the same E.I. cable as the gauge is so you add extra here Sam part I mean I'm just creating this new entries I'll be back a day and doing the partition. New mattresses and. Yes I picked X. and Y. randomly so I subtracted X. case I want to Minton Samal fi K. plus B. decade it was same as a I get as big a J So I've added excrete to it. OK. So now what is this a I came minus a skid Samus saying a cable as big a Y. A minus biggie right minus six K. but then what is a I get less became why I came out is about triple everything is bad so that means that I could lose became a is very close to C. C. prime I want which means it's very close to CIA because the prime I mean is ridiculous to see anyway here what is this this is a X. can we can write and why do we know X. K.Y. is about triple So that means this is very close to C X Y. C prime explosive close to see X. Y. So this is within this similarly here what is this extension as big a j we know Xscape j is about triple. So this is very close to see extra. So then now subtract CIA write subtract C. prime I write our science abstracts the premix way where do you get you get an entry of rallied in class minus order of into the silence because this is very close CIA and Z. frame I.E. are very close to each other see X.Y.Z. premix where you are very close to each other and compute them in class product and once you've completed the mean plus product How's your product looking like like I'll take a day is the I.K. plus begin to minus the prime I you write less replying X. Y. minus the pre-mixed So it just depends on the stakes and why that you have chosen an i and j so you can then convert it and get back to element list. So this is what happens when you have all triples are bad and obviously you may not have all triples to be a brat in that case will transform many of the metrics entry is too small and dangerous and we do we do night editions so for example with X. and Y. randomly we compute the decades it is not smaller in city to infinity because there might be some other fake it which is not small the decade here again if it's not small instead of doing then we'll compute the mean plus progress for the entries are smaller than in Sept give it time and after that we'll check if the number of bad triples that only meaning to be called Bodies have morphed into the three minus ten or. So after every step we are taking all are there somewhere in the back triples that remain to be called Bodies have to be if in fact given the new stuff we look at the material that's it and we can show that the number of these I did ations is really small and very interesting in that taking time is into the ticking time for about triples as I mentioned it's cubic so very interestingly. There's a number of repeat Asians suppose A N B R I return to matrices and you are following the steps you can show after this few weighted ations you are left with a bad number of bad triples that is sub Q. weak. Even funny I return the matrices and you do not need to have bounded difference property. So the boundary difference property does and then also to find those back triples fast and that's a part of and we're stuck now for our if we traded in matrices we just don't know how to find. These back triples though there are a few bad triples there among the same Q. triples we need a mechanism to find them even though there are a few battles left we just don't know how to find them far beyond a difference when it is exactly where you think about a difference Robert in North Wales adjective approximation you can do it and these I did ations everything works. OK So this is how close we are to us getting a truly Sub Q We go to go to the informant less so this kind of the looking and mostly in these days and we have some extensions of it where we can get mean we go gunning for our generalisations I mean less product using this framework and. We are hoping maybe this is not to really kick this will of this week so I haven't really talked about a lot of problems. For the. One Above four hours of so gusty context we got our buzzing language at a distance with the insertion these are all of a P.S.P. hard which means. This we get this breakthrough and last product they really meant to be card here X.M.L. repairing a distance out and if we can show it. Lower bounds of its shares and we'd say is there as hard as computing Woolley and metrics multiplication. For approximations well for language a distance we do not know is going to hit that hardness yet we can solve it is up to reclaim but is it into the two point five. One we don't know really don't know for approximation of Guardians here we have a fast metrics my using fundamentals multiplication we have one plus main is absurd an approximation for a bag at a distance of a B. Indeed they wouldn't have allowed me to use for as much as multiplication X.M.L. appearing so far that I do when we do have our linear payment darling giving close to optimal levels so that proclamation when in the early entry. So there was this question which. Obviously was in fast metrics multiplication is not the most practical thing to do if you are looking for practical applications and language at a distance does have a lot of practical applications and we also have a lot of bones would say is that you if you have to get any non-trivial math multiplicative approximation you have to use fast metrics multiplication so do we do any of what Magic's multiplication metrics multiplication and maybe get additive approximation since the multiplicative approximation is not working so we do have some recent progress there so we do get a quadratic time deterministic Gogarty them that is gone when it really means it does not do as far as metrics multiplication for language at a distance are in a folding approximately. The ship ticking and languages and so on. We'd gives you an additive approximation. And we also have some space subs Linnaeus Bizzle guardians for language at a distance but for other problems as well like long as increasing subsequence string at a distance and so on and their Vista and general technique which you call amnesty dynamic programming basically looks at a dynamic programming. Relies on again this bounded difference property in relying on this bounded difference property it gives you an additive approximation which directly improves the space and then complexity of Benami programming so what all of this works feasts into the area that I'm really excited to work on which we call fine grained approximation algorithms so fine then complexity is about understanding the complexity of polynomial game problems and here we are trying to understand their proximity how fast that understanding the tradeoff between quality and speed for volume problems and also understanding the hardness of approximation and that's a year that we are calling find an approximation and go in there as organizing a workshop in this area. So if you're excited if you are interested to know. If you're interested Well this is a good doctor me of life OK And there are many future directions so thanks.