Good afternoon, everyone. My name is Yang Ji, and today I'm going to talk about our work called Refinable Attack Investigation.
We know that although there are many prevention approaches, there are still many data breaches happening every day. Many big companies, such as Yahoo, have been breached; their customers' data has been stolen, and some of their data has been tampered with. I think everyone here knows that last year's Equifax incident lost hundreds of millions of users' very sensitive data to the attackers. And just this Monday YouTube was hacked, and some of the video clips were found to have been tampered with and defaced.
So the focus of this project is more on the post-mortem side. We want to know what really happened during the attack, and we want to know it accurately. The current practice for investigating such data breach and tampering incidents largely relies on system-level logs. However, we find this is not accurate. For example, say an extension of the Firefox browser has been compromised. The attacker can scan the victim's file system by reading multiple files, and then selectively send some of the files out to an attacker-controlled site. If we rely only on system-level logs, we can see only a bunch of read activities and then a bunch of send activities done by Firefox, but we cannot know exactly which file has been breached. This reduces the accuracy of the attack investigation.
So we proposed a system called Refinable Attack Investigation. We use a technique called dynamic information flow tracking (DIFT), which uses dynamic binary instrumentation to instrument the execution of the program's instructions, so that we can find the most fine-grained level of data flow that happens during a program's execution. It is very accurate, but it incurs very high overhead on the execution of the program. So we use a record-and-replay technique: we record the execution of the program at run time with very low overhead, we replay the execution offline, and then we use DIFT, this heavy but very accurate analysis tool, to find out what exactly happened during the attack. We also use a set of graph-based pruning techniques to narrow down the scope of the analysis and focus on what exactly matters during the attack.
Our system achieves very high accuracy, approaching one hundred percent, and we have very good run-time efficiency, less than five percent overhead according to the benchmarks. We have also improved analysis efficiency over the previous system by more than eighty-five percent. In terms of storage, our system consumes two terabytes per year for typical desktop usage.
Here is a brief overview of our architecture. Our system resides in the operating system's kernel, and we also have a customized libc. We record the non-deterministic inputs to the program at run time, and then we ship these logs securely to the analysis host. There we build a provenance graph from these logs, and we apply triggering analysis and reachability analysis to narrow down the scope of the investigation. Finally, we apply DIFT only on the executions that are related to the attack.
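
The gap between system-level logs and fine-grained tracking described in the talk can be illustrated with a small sketch. This is not the speaker's implementation: real DIFT instruments machine instructions, whereas this toy tracks labels on Python values. The `Tainted` class, the file paths, the process ID, and the `attacker.example` destination are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Tainted:
    """Bytes plus the set of source labels they were derived from (toy model)."""
    data: bytes
    labels: frozenset = field(default_factory=frozenset)

    def __add__(self, other):
        # Concatenation propagates the union of both operands' labels.
        return Tainted(self.data + other.data, self.labels | other.labels)


def read_file(fs, path):
    # Taint source: data read from a file is labeled with that file's path.
    return Tainted(fs[path], frozenset({path}))


def coarse_sources(syscall_log, pid):
    """Syscall-level view: a send depends on every file the process read earlier."""
    seen, result = set(), {}
    for event_pid, op, target in syscall_log:
        if event_pid != pid:
            continue
        if op == "read":
            seen.add(target)
        elif op == "send":
            result[target] = set(seen)
    return result


# Hypothetical scenario: a compromised process scans three files, sends one.
fs = {"/home/u/a.txt": b"secret", "/home/u/b.txt": b"notes", "/home/u/c.txt": b"todo"}
syscall_log, buffers = [], {}
for path in fs:
    buffers[path] = read_file(fs, path)
    syscall_log.append((100, "read", path))

payload = Tainted(b"hdr:") + buffers["/home/u/a.txt"]
syscall_log.append((100, "send", "attacker.example:443"))

coarse = coarse_sources(syscall_log, 100)["attacker.example:443"]
fine = set(payload.labels)
print("syscall-level view:", sorted(coarse))
print("taint-tracking view:", sorted(fine))
```

The syscall-level view blames all three files for the send, while the label carried by the payload names only the file whose bytes actually left the machine.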
At first we have a very big coarse-level graph that represents everything that happened system-wide. We narrow down this scope with the reachability analysis, and finally we conduct DIFT analysis on part of the subgraph to find the most fine-grained and accurate information that we are interested in.
Here is an example. Several files on the victim's file system are all touched during the attack, while only one file has actually been exfiltrated. Using our system, RAIN, we can find that this file was exfiltrated: first it went through one program, then it was packaged, and then it was sent out by the Firefox browser, which is also compromised. So we know exactly how this happened.
We used the exercises conducted by DARPA to evaluate our system, and we were able to detect all the attacks accurately. Our performance is also good in terms of run-time overhead: less than five percent, and even with very intensive applications, like downloading a big file, it is less than forty percent. Here is a graph showing that we reduced the dependency confusion rate, the inaccuracy, to zero in most cases.
To summarize our work: RAIN advances the state of the art in attack investigation systems and takes attack investigation to a new level. Most importantly, it provides repeatable and refined investigation, so it is not only one round of analysis; you can run multiple rounds for different analysis purposes. As next steps, we plan to modularize the instrumentation for better portability. Currently our implementation is on Linux, and maybe we will try to move it to other operating systems. We will also try to commercialize and deploy the RAIN system in real enterprise environments. Thank you.
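
The narrowing step from a coarse system-wide graph to a small subgraph can be sketched as a reachability intersection. This is a minimal illustration under stated assumptions, not the system's actual algorithm: the node names (`proc:collector`, `net:attacker`, and so on) are hypothetical, edges mean "information may flow from source to destination", and event timestamps are ignored, which a real causality analysis would have to respect.

```python
from collections import defaultdict, deque


def reachable(edges, start, reverse=False):
    """Breadth-first search over the graph; reverse=True walks edges backward."""
    adj = defaultdict(list)
    for src, dst in edges:
        if reverse:
            adj[dst].append(src)
        else:
            adj[src].append(dst)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def prune(edges, source, sink):
    """Keep only edges on some path from source to sink."""
    keep = reachable(edges, source) & reachable(edges, sink, reverse=True)
    return [(s, d) for s, d in edges if s in keep and d in keep]


# Hypothetical coarse-level graph built from system-call logs.
edges = [
    ("file:A", "proc:collector"),
    ("file:C", "proc:collector"),
    ("proc:collector", "file:archive"),
    ("file:archive", "proc:firefox"),
    ("proc:firefox", "net:attacker"),
    ("proc:firefox", "file:cache"),
    ("file:B", "proc:editor"),
    ("proc:editor", "file:B.bak"),
]

subgraph = prune(edges, "file:A", "net:attacker")
# Only these processes would need the expensive replay with fine-grained tracking.
to_replay = {n for edge in subgraph for n in edge if n.startswith("proc:")}
print(subgraph)
print(sorted(to_replay))
```

In this toy graph the unrelated editor process and the browser cache write fall out of scope, so the heavy analysis would run on two processes instead of three.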
Yes. Right now our implementation focuses on Linux, and we are actually trying to modularize it into a kind of module. For Windows, we are also trying to do this. We know it is closed source, so I think it is still feasible, but it needs some engineering effort.
That was actually conducted by DARPA. They have arranged a set of testing environments and they have an independent red team of attackers. They conduct the attacks, and then our system is evaluated on how we detect and how we analyze these attacks.
We actually have a plan to open source our project.
Yes. Basically, we maintain data about what happened in the system's execution for a very long time. Then, with the triggering analysis and the reachability analysis, we can quickly point out a starting point for the analysis that you need, and we can extract the subgraph that you really need to work on. Then you use our dynamic analysis to extract the exact data causality.
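
As a rough illustration of how a starting point might be picked from long-term logs, the sketch below scans events for a simple rule: a network send by a process that earlier read a file under a sensitive directory. The rule itself, the `Event` fields, the paths, and the host names are assumptions made for this example; the talk does not specify what policies the triggering analysis uses.

```python
from typing import NamedTuple


class Event(NamedTuple):
    ts: int      # logical timestamp
    pid: int     # process that issued the system call
    op: str      # "read" or "send" in this toy log
    target: str  # file path or network destination


def find_triggers(events, sensitive_prefix):
    """Return sends issued by processes that previously read a sensitive file."""
    touched_sensitive = set()
    triggers = []
    for ev in sorted(events, key=lambda e: e.ts):
        if ev.op == "read" and ev.target.startswith(sensitive_prefix):
            touched_sensitive.add(ev.pid)
        elif ev.op == "send" and ev.pid in touched_sensitive:
            triggers.append(ev)
    return triggers


# Hypothetical log: only process 100 reads a sensitive file before sending.
events = [
    Event(1, 100, "read", "/home/u/secret/a.txt"),
    Event(2, 200, "read", "/usr/share/doc/readme"),
    Event(3, 200, "send", "update.example:443"),
    Event(4, 100, "send", "attacker.example:443"),
    Event(5, 300, "send", "cdn.example:443"),
]

triggers = find_triggers(events, "/home/u/secret/")
print(triggers)
```

Each flagged event would then serve as the sink or source for a reachability query over the coarse graph.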