Welcome, everyone. I'm Hong Hu, a postdoc here. My work basically focuses on system security. Specifically, I'm interested in finding program bugs, exploiting them as hackers would, and trying to develop defense mechanisms to prevent these attacks. Today I'm going to share our work on data-flow hacking with Turing-complete attacks.

So this is the roadmap of my talk today. First I'm going to give a brief background introduction to memory errors and existing attack techniques. Then I'm going to talk about how to construct data-oriented attacks systematically. Following that, I'm going to talk about the expressiveness of data-oriented attacks, and after that about an attack against a real-world program. Finally I will talk a little bit about defense mechanisms against this new type of attack.

So first I want to introduce the memory error and the existing control-flow attacks. Simply speaking, a memory error means the program makes an error when it accesses memory. This is unexpected, and it gives attackers the ability to change memory in a malicious way. The most famous one is the buffer overflow. This program has a buffer overflow vulnerability: it puts a buffer on the stack, and a string copy copies from the input to the output buffer. Normally it just fills the output buffer with the input. But if the input is longer than the buffer, the copy continues beyond the boundary of the output buffer and overwrites other things on the stack. So this is a memory error, and attackers have the ability to corrupt variables other than the output buffer. The question is: what can attackers do with this memory error?
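As a rough illustration of the overflow just described, here is a toy Python model of a stack frame. The 8-byte buffer, the neighbouring 4-byte variable, and the helper names are all assumptions made for this sketch; they are not taken from the slide.

```python
# Toy model of a stack frame: an 8-byte buffer followed by a 4-byte variable.
# Sizes, layout and names are hypothetical, chosen only for illustration.
BUF_SIZE = 8

def make_frame():
    frame = bytearray(BUF_SIZE + 4)
    frame[BUF_SIZE:] = (1000).to_bytes(4, "little")  # neighbouring variable
    return frame

def unchecked_copy(frame, data):
    """Mimics a string copy with no length check: every input byte is written."""
    for i, b in enumerate(data):
        frame[i] = b

def neighbour(frame):
    """Reads the variable that sits right after the buffer."""
    return int.from_bytes(frame[BUF_SIZE:BUF_SIZE + 4], "little")
```

With an 8-byte input the neighbouring variable keeps its value; with a 12-byte input the last four bytes land on top of it, which is the corruption the talk refers to.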
So to answer this question, we want to know what is in memory. First, in memory we have control data. By that I mean the program uses such data to determine what to execute next. For example, indirect call and indirect jump instructions use function pointers to decide where to go next, and return instructions use the return address on the stack to determine where to go back. So attackers can change the control data to hijack the control flow of the program.

I want to use this figure to illustrate two common attack techniques. The left side is the control-flow graph; it describes what the normal execution will be. You can imagine each node as a function, and the edges as the relationship of the jumps. The other side is the memory layout; it contains the data and the code to execute. The first common attack is called code injection. Attackers just place arbitrary malicious code in the data section and make the code jump to it. As we know, there is no difference between data and code in the current architecture, so attackers can execute it at run time.

Another technique is called the code-reuse attack. Attackers make the program jump to existing code to achieve malicious behavior. Both of these techniques change the control flow of the program, and they are known to the public.

But these control-flow attacks are getting more and more difficult to launch, because researchers have proposed so many defense mechanisms, and some of them have even been deployed in our systems. For example, nowadays we have data execution prevention. This one just makes sure that the data section is not executable anymore, and it is deployed on most modern systems. This way, even if attackers place malicious code in the data section and make the program jump to it, there is usually a crash. So the code injection attack is blocked.
For the second one, the code-reuse attack, researchers proposed control-flow integrity. The idea is that for each jump, for example from this node, we predefine some legitimate targets, and we check the run-time target to see whether it is one of the legitimate targets or not. If so, we allow this jump; if not, it is some malicious jump. So the code-reuse attack is also blocked, in theory.

So it seems all the known attacks are blocked. The question for us is: what can attackers do with a memory error, without control-flow attacks?

Let's check the memory again. We will find non-control data that is still security critical. For example, in this code snippet there is the user ID variable. This user ID determines the privilege of the process: which resources it can access and which it cannot. Attackers can corrupt this user ID to change the process privilege from a normal user to the root user. As we know, the root user has more privileges than normal users, so this can launch a very severe attack.

Another interesting example is this one flag from the old IE browser on Windows. If attackers change this one bit of the flag, they can execute any code on the machine without any difficulty. So these two simple attacks show that non-control-data attacks are also possible, and can even be severe.
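The two ideas above can be put side by side in a small Python sketch: a control-flow-integrity check that compares each indirect jump against a predefined target set, and a privilege decision that depends only on a data value. The site name, handler names, and dictionary layout are hypothetical, used only to illustrate the point.

```python
# Hypothetical table of legitimate targets for each indirect-jump site.
ALLOWED_TARGETS = {"dispatch": {"handle_get", "handle_post"}}

def cfi_check(site, target):
    """Returns True only if the run-time target is a predefined legitimate one."""
    return target in ALLOWED_TARGETS.get(site, set())

def privilege(memory):
    """The privilege decision depends on non-control data only."""
    return "root" if memory["uid"] == 0 else "user"

# Normal state: an ordinary user, jumping to a legitimate handler.
memory = {"uid": 1000, "handler": "handle_get"}

# A memory error that rewrites only the uid leaves every jump target intact,
# so the CFI check has nothing to object to.
memory["uid"] = 0
```

In this model `cfi_check("dispatch", memory["handler"])` still succeeds after the corruption, while `privilege(memory)` has changed, which is why a control-flow defense alone does not cover this case.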
But the problem is that we have very little knowledge about this kind of attack, which we call the data-oriented attack. My research is going to understand how to systematically build data-oriented attacks, and also how expressive data-oriented attacks can be compared to other attack techniques. My work gives answers to these questions. First, I'm going to present a technique called data-flow stitching that systematically builds data-oriented attacks, and I'm going to show that with this tool we can construct 19 attacks from eight vulnerable programs. Second, I'm going to show you that data-oriented attacks can be very expressive: you can even build a Turing-complete computer on the victim's machine. And I used this technique to build three end-to-end attacks.

So first let's see what this systematic way to build data-oriented attacks is. Before we go to the details, let's see a simple example. This is web server code, and I want to understand the data flows in this program. First, the top of the code loads the private key of the server into memory. As you can see, this is the key content, and this is a pointer pointing to the key content. Following that, the program parses the user input, processes the request and retrieves a file name, reads the content, and sends it back to the client side. As you can see, this is the data flow of the file name, and it is connected with the content. There are several pointers pointing to these variables in memory.

So what can attackers do with a memory error in this program? This one again is a stack-based buffer overflow.
As we can see, there is a pointer here, the file name pointer, pointing to the string of the real file name. If attackers use this vulnerability to corrupt the pointer and make it point to a different location, namely the key content, then in the following execution the program copies the content of the key as the file name and sends it back to the client side. This way attackers can leak the private key of the server without changing any control flow of the program.

So this is the way to build attacks with data-flow stitching. Simply speaking, data-flow stitching manipulates existing data flows in the program to launch attacks.

The benefit is that it gives a systematic way to build data-oriented attacks. Basically, attackers just search the program to find interesting data flows and try to connect them together to achieve some malicious behavior. For example, they may want to leak key information, or they can even get a higher privilege.

Of course, data-flow stitching has some constraints. First, we want to conform to CFI, to make sure the attack will not be blocked by existing CFI solutions. We need to make sure the program will not crash. And we want to bypass other existing defense mechanisms like address space layout randomization.

The main challenge of data-flow stitching is actually the search problem: the search will be time consuming. As we know, one program has a lot of variables and so a lot of data flows, and each data flow has a lot of candidates to connect. For example, there are two data flows shown here along this execution. If attackers want to connect these two data flows, the dotted line from v1 to v2 is one candidate. But you can see there are so many candidate attacks, and you have to decide which ones are feasible and can be used to build attacks.
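The private-key leak described above can be mimicked with a dictionary standing in for memory. The addresses, the file table, and the response format here are assumptions for illustration only; the real server's layout is not given in the talk.

```python
# Hypothetical file store served by the toy web server.
FILES = {b"index.html": b"<html>hello</html>"}

def build_memory():
    """Toy memory: two data cells and two pointers that refer to them."""
    return {
        0x100: b"PRIVATE-KEY-BYTES",   # key content (made-up value)
        0x200: b"index.html",          # file name parsed from the request
        "key_ptr": 0x100,              # pointer to the key content
        "name_ptr": 0x200,             # pointer to the file name
    }

def serve(mem):
    """Follows the file-name pointer and echoes the name with the file body."""
    name = mem[mem["name_ptr"]]
    body = FILES.get(name, b"not found")
    return name + b": " + body

normal = serve(build_memory())

# Stitching: the memory error redirects the file-name pointer to the key.
victim = build_memory()
victim["name_ptr"] = 0x100
leaked = serve(victim)
```

The code path in `serve` is identical in both runs; only the source of the file-name data flow has been swapped for the key's data flow.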
Our solution is also to understand the memory error in the first place, to understand which part of memory can be corrupted by attackers. We represent this memory-error influence as constraints and send them to a solver to quickly check whether they are solvable or not.

OK, let's go to the details to see what stitching is. Here is another program, and I want to use this one to demonstrate the idea.

We can draw the data-flow graph of this program in two dimensions: the x dimension is time, and the y dimension is the memory space. As we can see, at line three the program gets the user ID from a system call and saves it into memory on the stack; let's suppose the user ID is 100 here. At line four the program saves the user ID into another space, pointed to by pw. Let's suppose this address is A1 here. So this is the movement of the user ID in memory. Then at line nine the program puts the user ID on the stack for the function setuid. So this shows the data flow of the program for the user ID variable.

There is a vulnerability here at line five; this is a format string vulnerability. This vulnerability gives attackers the ability to change an arbitrary memory location to an arbitrary value.

So how do we build attacks with this vulnerability? The method is as follows. First we represent the memory-error influence on the same graph. This rectangle here represents the memory-error influence: it happens at line five, and once it happens it affects almost the whole memory space. This is different for this particular memory error; for others we would draw it differently.
We can decompose this data-flow edge into two dimensions. One is the time dimension, which means the program believes this variable at address A1 is unchanged from line four to line nine. The other dimension means the program copies the content from A1 onto the stack at line nine and uses it for the following execution. We can see this intersection between the memory influence and the data flow here; this is the location where we are going to launch an attack. Basically we just use the memory error to corrupt this value from a normal user ID to the root user ID, and the program will believe it is unchanged from line four to line nine and will use this value for the following execution.

So this is a simple way to corrupt the data flow to launch an attack. How can we merge two data flows? This is done by the advanced stitching technique. The intuition is that the pointer decides the data movement direction.

So let's include the data flow of the pointer here. This is the pointer of the data movement, and we just assume that it holds A1. We also consider another data flow in the program; its source code is not shown here, but it exists in the program. The question is how we can connect these two into one.

Similarly, we decompose the data flow into two dimensions and draw the memory-error influence again. This intersection we have already checked; we can use it to build attacks. But we can find another intersection here, which is an intersection between the memory influence and the data flow of the pointer. Here we can use the memory error to change the pointer value into another one. Then at line nine the program uses this value to find the source of the memory copy, copies it to the destination, and uses it for the following execution.

If we compose this data-flow graph, we get a new data flow.
As you can see, this is the original data flow. This edge has been broken, and a connection has been made with the other data flow. This way we can merge two different, independent data flows to launch an attack.

We can generalize this idea of stitching. Basically, previously we just corrupted the pointer. We can generalize that to corrupting the pointer of the pointer and attacking indirectly, and we can also generalize to corrupting a pointer of a pointer of a pointer, N times. Similarly, we can have multi-flow stitching, which means we can have a lot of intermediate data flows in the middle. Starting from the source flow, we can pass the data along some intermediate flows and finally reach the target flow, to leak it or use it maliciously.

Based on these techniques we developed a tool called FlowStitch. It requires the binary of the program and two inputs: the first one shows the normal behavior, and the other one shows the memory error. We just run this binary in our tool to get the execution traces. From those traces we run an analysis to understand the memory influence and to understand the security-critical data flows. Then we use this algorithm to select the stitching candidates, and finally send them to a solver to get working exploits.

To evaluate our system, we checked it with eight vulnerable programs, including web servers, FTP servers, and critical programs like sudo. From these programs we built 19 exploits, and 16 of them are new to the public.

These attacks include information leakage; for example, we can leak the private key of the server, or even the password or some configuration on the server side. They also include privilege escalation attacks; for example, attackers can change the privilege to the root user, or they can access resources that are not supposed to be accessible.
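The generalization from corrupting a pointer to corrupting a pointer of a pointer can be sketched with a chain of dictionary lookups. The addresses and cell contents below are invented for illustration, and `load_through` is a hypothetical helper rather than anything from the tool.

```python
def load_through(mem, root, levels):
    """Follows `levels` pointer hops starting at `root`, then reads the data."""
    addr = root
    for _ in range(levels):
        addr = mem[addr]
    return mem[addr]

def build_memory():
    # Hypothetical layout: 0x10 -> 0x20 -> 0x30 holds the benign data.
    # 0x50 is another cell that happens to hold a pointer to the secret.
    return {
        0x10: 0x20,
        0x20: 0x30,
        0x30: "uid=1000",
        0x40: "private key",
        0x50: 0x40,
    }

benign = load_through(build_memory(), 0x10, 2)

# Two-level stitching: the inner pointer at 0x20 is left alone; only the
# pointer to it (at 0x10) is redirected to a cell that points at the secret.
victim = build_memory()
victim[0x10] = 0x50
stitched = load_through(victim, 0x10, 2)
```

The same helper with a larger `levels` value models the N-level case: the corruption can happen at any hop, as long as the remaining hops end at the data the attacker wants.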
Seven of these attacks are built using the advanced techniques like two-level stitching, and ten of them work even with ASLR deployed on the system. If you are curious about how we bypass ASLR with this kind of technique, I'm not going to cover it here, but we can discuss it offline.

I want to show one case study of this advanced stitching. The ghttpd web server has a stack-based buffer overflow. The left side shows the source code; there is a stack-based buffer overflow inside the log function. This is the assembly code around log. Basically it first saves the context, then the buffer overflow happens, and then it pops the context and continues.

This esi here is actually a pointer to the command to execute. When the program returns from log, it uses esi as a pointer to call it.

The original attack corrupts esi, which is the pointer to the command, to build the attack. This is single-level pointer stitching. But we found that this does not work in some cases, because the compiler retrieves esi again, fresh from memory, and then this attack will not work. Our tool automatically finds another option, which is two-level stitching. It means we can view ebp here as the pointer of the pointer of the command. We corrupt that pointer to indirectly corrupt esi, and from there change the command to execute. So in this case only two-level stitching works to build the attack.

OK, so in summary, here we proposed data-flow stitching as a systematic way to build data-oriented attacks, and it demonstrates that automatic construction of data-oriented attacks is possible.

Now the question is: how expressive are these data-oriented attacks? Do they have to rely on particular data? Do they have to rely on particular functions?
So next I'm going to show the second part of the work, called data-oriented programming. We are going to show that data-oriented attacks can be built generally, and that they can still be very expressive, even Turing complete.

Similarly, we first check some simple code here. Here we have a structure and some local variables that the program uses to handle requests, and it has a stack-based buffer overflow again. So what can attackers do here? As we can see, there is no critical operation, no setuid, and no critical data here. So let's see how to build this malicious operation using this vulnerable code. This kind of operation is important in kernels, because linked lists are used all over the place inside kernels. And we want to build this attack so that it totally conforms to control-flow integrity.

To understand whether this is possible, we can see that there are some basic operations inside the malicious computation: there is a for loop, there is a loop condition, there is an assignment operation, and there is an addition operation. If we check the vulnerable code again, we can find similar operations: there is a loop here, there is this condition, and there are memory access operations. So can we use these to simulate this computation? The answer is yes. Let's see how this is possible.

We first show the memory layout of the original program, and we highlight the elements that we want to simulate.

First, let's go through this vulnerable program in the first round. This while loop is going to simulate the for loop.

Then we can use the memory error to corrupt the pointers, to corrupt the stack in this particular way.
This is fully under the control of attackers. Next, the program checks whether the content pointed to by type is null or not. Here the pointer has been corrupted to point to list, so this check simulates the check of whether list is null or not. If it is null, the program breaks out of the loop. So we can use this to simulate the loop condition.

Following that, we can make the program execution take this particular branch. Here srv has been corrupted to point to this location. srv has a particular structure layout; I just put the structure layout here. The type field is mapped to list itself, and the type pointer has been pointed to this list. So the result of this operation is to change list to point to the next node.

Finally, this operation is an addition operation. Again I put the expected memory layout here. The total field is actually mapped to the prop field, and size has already been corrupted to point to the addend. So this operation simulates adding this value into the prop field pointed to by list.

OK, so now we have simulated three operations of the target computation. It seems we have finished the interesting part of the loop, but let's run this code again for another round of execution. This time we just corrupt the memory in a different way, as you can see here.

This time we make sure the break condition is not satisfied, so execution continues, but we corrupt the type pointer to point to this stream value in memory.
This time the execution takes a different branch here, and this operation simulates the remaining operation of the malicious computation. So finally we can make this vulnerable program simulate the malicious computation. As you can see, these are very basic operations in the vulnerable program, but with them attackers can achieve very interesting operations.

Based on this observation we proposed the idea of data-oriented programming. This is a general way to build data-oriented attacks. It does not rely on any particular code or data to launch attacks, and it can build expressive attacks, even Turing-complete ones.

To launch data-oriented programming there are two basic elements: the first is data-oriented gadgets, and the other is gadget dispatchers. Next I'm going to show what they are.

First, a data-oriented gadget is just an x86 instruction sequence. This sequence shows up in normal execution, so that we can conform to CFI. The property of this sequence is that it loads operands from memory, does some operation, and saves the result into memory. Because data-oriented attacks do not have control of the control flow, they have to use memory as the way to save intermediate results.

So basically a gadget has this load operation to load operands from memory, it performs a particular function, for example addition here, and finally it saves the value into memory again. We can find another data-oriented gadget that, for example, reads from a different location and performs a different operation. The key part is that we just need to make sure these two are the same memory location, so that we can build a meaningful computation.

Once we have the data-oriented gadgets, the next part is how to connect them together to build meaningful attacks. This is the purpose of the dispatchers. We require the dispatcher to have a loop and also a selector.
The loop makes sure the program executes the gadgets again and again, and the selector selectively activates particular gadgets in each round. For example, here in the first round of execution the selector activates gadgets one and three, in the next round gadget two, then six and seven, and then one and four. This way it can activate gadgets in a particular order to achieve interesting attacks.

What does "activate" actually mean here? It means connecting the result from this gadget to that gadget. During the execution, the inactive gadgets still get executed, but they read operands from locations we don't care about and save results to locations we don't care about. Only the results of the active gadgets are passed on to the others to make meaningful calculations.

In the previous example, the while is the loop, and the read operation is the selector that enables particular operations.

So once we have the gadgets and dispatchers, how do we build a concrete attack? Sorry, before that I want to show that data-oriented programming can simulate a very minimal language we call MINDOP. This language provides the basic operations attackers can use to build Turing-complete attacks. You can find the proof in the paper; I will skip it here.

To build a concrete attack, we provide tools to help. In the first stage we provide a tool to find all the possible data-oriented gadgets; for example, it checks for a memory load, then an addition, then a store. We also provide a tool to identify dispatchers in the program; basically we want to find all the loops with gadgets inside. Attackers then just need to carefully pick the necessary gadgets and loops to build a meaningful attack.

To evaluate data-oriented programming, we applied the technique to nine real-world programs. For each of them we picked one concrete vulnerability.
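The interplay of gadgets, loop, and selector described above can be modelled in a few lines of Python. Here memory is a dictionary, pointers are cell names, and the attacker's per-round corruption is a dictionary of pointer values; the two gadgets, the cell names, and `run_dispatcher` are all assumptions made for this sketch, not the paper's code.

```python
def run_dispatcher(mem, payloads):
    """One loop round per attacker input; every gadget runs in every round."""
    for corrupt in payloads:
        mem.update(corrupt)  # memory error: attacker-chosen pointer values
        # Addition gadget: load two operands from memory, add, store back.
        mem[mem["add_dst"]] = mem[mem["add_dst"]] + mem[mem["add_src"]]
        # Move gadget: load from one location, store to another.
        mem[mem["mov_dst"]] = mem[mem["mov_src"]]
    return mem

def initial_memory():
    return {
        "x": 3, "y": 4, "out": 0,        # cells the attacker cares about
        "junk": 0, "junk2": 0,           # scratch cells for inactive gadgets
        "add_dst": "junk", "add_src": "junk",
        "mov_dst": "junk2", "mov_src": "junk",
    }

# Round 1 activates the addition gadget (x = x + y); round 2 activates the
# move gadget (out = x). The inactive gadget is pointed at scratch cells.
payloads = [
    {"add_dst": "x", "add_src": "y", "mov_dst": "junk2", "mov_src": "junk"},
    {"add_dst": "junk", "add_src": "junk", "mov_dst": "out", "mov_src": "x"},
]
result = run_dispatcher(initial_memory(), payloads)
```

The control flow of `run_dispatcher` never changes between rounds. The only thing the payloads choose is which gadget touches meaningful cells, and the result of round 1 reaches round 2 through the shared cell `x`.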
From these nine programs we identified more than seven thousand gadgets, and from the concrete memory errors we can reach more than one thousand two hundred gadgets, so these gadgets can be used by attackers to build attacks. And we manually verified that these programs can simulate all the basic MINDOP operations, which means that once the program has enough dispatchers, attackers can build Turing-complete attacks.

How about the dispatchers? From these nine programs we found more than one thousand dispatchers, and from the concrete memory errors attackers can use 110 dispatchers. With these gadgets and dispatchers, we confirmed that two programs can be used to build Turing-complete attacks.

And we built three end-to-end attacks, including bypassing ASLR, simulating a network bot, and even changing the code pages of the program. Next I'm going to show the details of the first one, the ASLR bypass.

This attack bypasses ASLR and leaks the private key of the server. Existing solutions use information leakage: they leak the addresses of the program one by one and finally get the interesting data. But the question is: can we defeat ASLR without any address leakage to the network? Here is what we do with data-oriented programming.

This is the attack against the vulnerable ProFTPD server. It uses OpenSSL for authentication, and inside the program there is a chain to reach the private key. The private key of the program is here. All the intermediate nodes are randomized, and this one is at a fixed location. One way to leak the content of the key is to leak the address of this variable, then the address of the next variable, and then do the dereference seven times, and finally you get the content of the key. But here we do not leak any address, and we still get the key.
The method is always to search inside the program to find meaningful gadgets. We found gadgets here. The first one is the move gadget; with it we can move any value from one location to another location. We have an addition gadget, and we have a load gadget. Besides these gadgets we found a dispatcher. This dispatcher just keeps accepting requests, handling them, and dispatching them to different functions, and these functions have the gadgets.

Let's see how we can perform this attack. Basically, this combination of gadgets starts from this fixed location, the SSL context; it dereferences the pointer, computes the address of the cert, and copies it back into this fixed location. This way we put this value into this location. Then we just repeat this seven times, and finally we can put the address of the private key into this fixed location. All of this happens on the victim's side. Finally we can overwrite the pointer here with the address of the private key, and this write system call leaks the private key to the attacker through the network. During this attack we do not get any address of the program, but we can still achieve this private key leakage to the network.

Let me compare some related work with data-oriented programming. Non-control-data attacks are known, but there was no understanding of their expressiveness, just some ad hoc attacks. Data-oriented programming is the first work to show that such attacks are Turing complete, that they conform to CFI, and that they do not need to depend on specific data or functions.

So in summary of this work, I showed that data-oriented attacks can be Turing complete, and I proposed a method called data-oriented programming that can be used to build expressive data-oriented attacks. And I built three end-to-end attacks to demonstrate the feasibility.
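The repeated-dereference trick just described can be simulated as follows. This is a simplified sketch: field offsets are ignored, the fixed address and the chain of eight randomized cells are made up, and `deref_round` stands in for whatever gadget combination performs one dereference inside the victim.

```python
import random

FIXED = 0x1000  # hypothetical cell at a known, non-randomized address

def build_victim(rng):
    """Builds a pointer chain at random addresses, ending at the key bytes."""
    addrs = rng.sample(range(0x10000, 0x20000), 8)
    mem = {FIXED: addrs[0]}
    for cur, nxt in zip(addrs, addrs[1:]):
        mem[cur] = nxt
    mem[addrs[-1]] = b"PRIVATE KEY"  # made-up key content
    return mem

def deref_round(mem):
    """One round: dereference the pointer held in FIXED, store it back there."""
    mem[FIXED] = mem[mem[FIXED]]

def leak(mem):
    """Attacker logic: it only ever names FIXED, never a randomized address."""
    for _ in range(7):
        deref_round(mem)
    return mem[mem[FIXED]]  # models the final send using the corrupted pointer
```

Note that `leak` contains no randomized address at all: every intermediate pointer stays inside the victim's memory, which is the property the talk highlights.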
Next I'm going to show you a data-oriented attack against the Chromium browser, which is the open-source version of Chrome.

In this attack I'm going to bypass the same-origin policy inside the browser. It is a fundamental policy inside browsers to isolate resources. Suppose this website, the domain evil.com, wants to access resources from Dropbox. This would not be allowed. The browser has this check: it checks the source origin and the target origin, and if they do not match, it disallows the access. The check code would look like this.

We checked the source code of Chromium and found several pieces of data in the browser that determine whether to allow this access or not. This is actually very interesting: if this bit is enabled, the browser just gives up the checks and allows the access, no matter whether the origins match or not. We found several such bits in the program that control access to things like geolocation or web content.

So the attack seems very straightforward: we just use the memory error to corrupt these bits to launch the attack. But in fact, in the browser there are a lot of defense mechanisms, like ASLR, DEP, CFI, memory partitioning, and internal randomization.

For DEP and CFI, we use the data-oriented attack to bypass these defenses. For memory partitioning, we use cross-partition references to bypass it. And for ASLR, we use fingerprinting to bypass it. The other details can be found in the paper.

But I want to show this fingerprinting technique. It is a way to find the critical data in the memory space. The observation is that this particular structure shows a particular data pattern in memory. This is the part I want to corrupt, but before it there are several pointers.
For any memory region that contains this structure, you are going to have a protocol pointer, a host pointer, a domain pointer, and one more pointer, and after that comes the part to be corrupted. So basically we just do a linear scan of the memory to find a memory location that has this pattern: you have these four pointers, two of them are mostly the same, and the last one is zero most of the time. Once we find this data pattern, we believe this is an instance of the structure, and we just corrupt this part here to launch the attack.

OK, so next I'm going to show you a simple demo of this attack. We are going to show that attackers can access Dropbox and upload a file into the victim's file system.

So first the victim accesses the attacker's website. The attack actually happens inside it. This output is printed from the attack: basically it bypasses ASLR and corrupts the pointer, so the attack has actually already happened.

Then the attacker opens another tab to load the Dropbox website. It makes a click using JavaScript to upload the malicious file into the victim's Dropbox. With Dropbox, you know, someone will have a client on the operating system, so the file uploaded into the victim's Dropbox will be synchronized into the victim's local file system. This way we can upload an arbitrary file into the victim's local file system. This one is a calculator, but it can be any malicious code.

OK, so just now we discussed the data-oriented attacks. Here we want to talk a bit about potential defense mechanisms.
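The linear fingerprint scan described above might look roughly like the following. The exact pattern is an assumption pieced together from the talk (four pointer-sized words, the second and third equal, the fourth zero), and the pointer range, word list, and function names are hypothetical.

```python
def looks_like_pointer(word, lo=0x10000000, hi=0x7FFFFFFF):
    """Crude plausibility test for a heap pointer; the range is an assumption."""
    return lo <= word <= hi

def find_candidates(words):
    """Scans a list of memory words for the assumed four-pointer pattern.

    Returns the index of the word right after each match, i.e. the place
    where the security-critical field would sit in this toy layout.
    """
    hits = []
    for i in range(len(words) - 4):
        protocol, host, domain, last = words[i:i + 4]
        if (looks_like_pointer(protocol) and looks_like_pointer(host)
                and domain == host and last == 0):
            hits.append(i + 4)
    return hits
```

A real scan would also have to cope with false positives, for example by checking several candidate hits before corrupting anything; that part is not modelled here.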
OK, so we know the first defense is memory safety. Memory safety blocks the memory error in the first place, so neither control-flow attacks nor data-oriented attacks are possible. But the problem is that memory safety solutions are quite slow. They introduce high overhead for the program, more than one hundred percent. Imagine your browser being slowed down that much; this is not acceptable.

Another possible defense is data-flow integrity. Basically it just identifies the relationship between the definition and the use of the variables, and checks whether the real use is correct or not. But similarly, the overhead is too high.

We can also have randomization-based techniques, like data space randomization. But nowadays there are just so many ways to bypass ASLR, using the memory error or using hardware side channels, so this is not a perfect solution yet.

We can also find solutions with hardware support, such as MPX, but it turns out that these still have high overhead, and as a result they are not good enough.

So in summary, nowadays we do not have a practical solution yet, and this is still a hot topic.

In conclusion, in this talk I proposed techniques to show that automatic construction of data-oriented attacks is possible, starting with the data-flow stitching technique. I also showed that data-oriented attacks are expressive and can be Turing complete, and I built several attacks on real-world programs.

OK, thank you for listening. This is my contact information, and the tools are available; you can try them if you're interested. Thank you.

[Audience question, largely inaudible.]
OK. So first, what the attacker has is a memory error, and this memory error can corrupt any location with any value. So attackers can connect data flows if they want, but the question is whether it makes sense or not. If they connect two simple data flows, the attack will not achieve anything. So basically they are going to find interesting data flows and connect them together, like the data flow of a private key, or the data flow of some critical data like the user ID. So the question is whether we can protect these particular flows. Yes, maybe that is a solution, but...

[Audience follow-up, largely inaudible.]

Right. OK, so I actually have two comments on this. The first one is that which data flow is important to attackers is very hard to define. Is it true that if this is just a counter, then this counter is not valuable to attackers? We do not know. Of course you can protect particular, very important data flows, like the private key or the user ID. But as I showed in the second work, data-oriented programming can build expressive attacks even with these simple operations. There is no critical data and no critical operation, just addition and memory load and store, but attackers can do a lot of interesting work. So maybe to protect some important data we can use this method, but it cannot prevent other means of building expressive data-oriented attacks.

Yeah, definitely. Thank you, that's a good question. Sure, sure, go ahead.

[Audience question, largely inaudible.]

Yeah, definitely, but there is a question here. If you just randomly corrupt the memory space, you will probably get different data flows. The question is:
how likely is it that this will trigger some particular important data flows that are useful to the attackers? For example, some parsers just evaluate file formats; they will break immediately, and the executions may be very short and not contain any meaningful data flows, like the data flow of a private key or something like this. So it's not very clear whether that makes sense or not. To me it seems that most of the time such executions do not contain very meaningful data flows. But symbolic execution makes sense, because you can say: I have this data flow, I want to explore other data flows and see whether we can connect them easily or not. But, you know, symbolic execution is slow. So maybe, but it is worth exploring, I think.

OK, great. If you have any questions, just let me know; my email is there. Thank you.