Hey everyone, I'm Kennon, and I work in the Software Assurance branch. I want to talk a little bit about program analysis, and in particular vulnerability analysis. Just really quickly, to outline what I'm going to be talking about: we're going to start by motivating this, describing what I mean when I talk about analyzing software and what we're looking for. We're going to look at some of the older, traditional techniques that people used to find bugs in programs, move into some more modern techniques, and then finish up with a look at what I think is going to be the next field in this area.
So, we want to analyze programs to find bugs, right? Everybody knows these days that software is everywhere, software is buggy, and buggy software sometimes means bad guys can do bad things. If you follow the news at all, or really if you're just a person living right now, you're aware of some of these computer bugs. They get very high profile, they get all over social media, and the news is reporting them left and right. So you've probably seen some of these lately; this is a nice collage of some of the beautiful social media graphics these bugs get these days. These are some of the big ones lately: things like Heartbleed, Shellshock, and BlueBorne are all very high-profile vulnerabilities that basically led to panic as people rushed to patch their systems and get everything fixed. And of course in the past month we've had Spectre and Meltdown. These are huge, major processor vulnerabilities, and no one is even sure yet what the impact is going to be in terms of the cost to patch things and the cost for Intel to fix all this in their hardware. So these are huge issues that people are trying to deal with, and they're getting mainstream public attention now.
And everybody knows that we want to find these and patch them, because if we don't, it allows bad guys to do things like steal personal information or infect your computers with botnet software, spyware, or ransomware, which is becoming popular. So what we want to do, of course, is find all these bugs before the bad guys do. We want to be able to find them and catch them ourselves before malicious actors get hold of them and start targeting corporations or individuals.
So what are we talking about exactly when we say we're looking for these vulnerabilities, these security bugs? For the purpose of this presentation we're mostly going to be focusing on memory corruption bugs. This is a particular subset of the security issues you can have in a program, and it lends itself to a certain type of analysis. In particular, memory corruption usually means that a user of a program can either read or write parts of the program's memory that they should not have access to. Common examples are modifying metadata of the program, inserting instructions to execute into the instruction stream, or even modifying application logic by changing values in the program. So maybe there is a boolean admin flag, and if you can get access to that you can change it, and then hey, suddenly you're an admin. All these things are in the vein of memory corruption bugs.
And of course the classic example is a stack buffer overflow; this is kind of the simplest, easiest-to-understand bug. If you have a C program and you have a buffer, as you can see here, a ten-byte array, and you try to copy something larger than ten bytes into the buffer, we have a problem. C is not going to do any runtime checks to make sure that the sizes are correct, and it's going to overwrite whatever happens to be after this array on the stack, which usually means you can crash the program or make it run arbitrary code, which is clearly bad. This is exactly the kind of flaw we want to find.
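To make the overflow and the admin-flag idea concrete, here is a minimal sketch in Python that only simulates the memory layout; it is not the C program from the slide. The layout (a ten-byte buffer immediately followed by a one-byte flag) and the names `unchecked_copy` and `is_admin` are assumptions made up for illustration.

```python
# Toy model of a stack frame: a 10-byte buffer followed directly by a
# 1-byte "is_admin" flag. This layout is an assumption for illustration.
BUF_SIZE = 10
ADMIN_OFFSET = BUF_SIZE  # the flag sits right after the buffer


def unchecked_copy(memory: bytearray, data: bytes) -> None:
    """Copy data into the buffer with no bounds check, as unchecked C code would."""
    for i, byte in enumerate(data):
        memory[i] = byte  # nothing stops i from running past BUF_SIZE


def is_admin(memory: bytearray) -> bool:
    """The application logic trusts whatever is stored in the flag byte."""
    return memory[ADMIN_OFFSET] != 0


frame = bytearray(BUF_SIZE + 1)       # buffer + flag, all zero
unchecked_copy(frame, b"hello")       # fits in the buffer: flag untouched
safe_result = is_admin(frame)

frame = bytearray(BUF_SIZE + 1)
unchecked_copy(frame, b"A" * 11)      # one byte too many: flag overwritten
overflow_result = is_admin(frame)
```

In a real C program the bytes after the buffer could just as well be a saved return address, which is how an overflow turns into running arbitrary code.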
And there's all sorts of things like this. Other common memory corruption bugs include heap overflows, double frees, invalid frees, use-after-frees, type confusion bugs, format string vulnerabilities, buffer over-reads, all sorts of things like that. They're all usually high-impact bugs, and they all belong to the same general class that can often be detected with the same methods.
So how do we do this? How do we actually say, I want to find these bugs and fix them? For the purpose of this presentation we're mostly going to be focusing on binary analysis. This is the case where we do not have access to the source code. Maybe it's an application that runs on your Windows computer, it's not open source, and you can't look at the source code; you just have an executable. We want to know how to take this executable that we don't really know much about and find bugs in it. I want to give a kind of quasi-historical overview of how some of these tools developed, hopefully to give you an idea of why people iterated on techniques the way they did, why they chose different techniques, and how those have changed. I'm going to try to avoid going into arcane technical detail on some of these, because a lot of these techniques are rather complex, especially at the end. But I want to give you enough of an idea to understand the tradeoffs: why certain things are effective, why they work for different types of problems and situations, and why we've been changing and evolving our techniques.
So the most straightforward way to find a bug is through manual analysis. This is pretty obvious: if you want to find a vulnerability in a piece of software, you tell a human to sit in front of that piece of software and look for it. Usually, if you're working with an executable you don't have the source code for, this means using tools like disassemblers, decompilers, or debuggers. Here's a screenshot of IDA Pro. This is the most popular commercial disassembler and decompiler tool, and I've just loaded it up with a sample program. This is an example of what a typical manual analyst would be looking at: you've got a program in assembly, they're trying to reconstruct what the original source code of the program looked like, and as they reconstruct it they're keeping an eye out for any security problems. So they can look at this and find flaws in the original source code by analyzing things like this.
And this can be extremely effective when skilled reverse engineers are doing it. If you don't already, you should be following Google Project Zero. They frequently publish blog posts with really detailed explanations of different vulnerabilities they've found, and they find these a lot. They're very good at finding these high-profile bugs, and they often do it through pure manual analysis; they're just looking. So this is a really great technique, but it has a really serious and obvious problem, which is that to do this you need a very skilled, usually highly paid reverse engineer. And I don't need any fancy graphs to show that software is getting more common and more complicated, and there are a lot more lines of code every year. So we have this huge base of software that we want to analyze, and not nearly enough manual analysts who are actually capable and willing to do it.
It's completely impossible for a manual analyst to keep up with current software. As a little example, on my computer at home the other day I opened up a game I had installed, because I wanted to see how complicated a modern application is. This is a game from about 2012, and this is a screenshot of it opened up in a debugger. I found that the game had almost 150 executable modules, each of which is basically a shared library, and almost 1,000 memory regions. These memory regions are things like the code and data of the different shared libraries and the main executable.
So I took the largest of those 150 executable modules and opened it up in the disassembler, which detected that this DLL had 43,000 functions in it. That's a lot, even if you have source code available. Even if you're the one writing the software, 43,000 functions is a pretty big piece of software to be looking at. And then imagine you're not the one writing the software: you don't know what it is doing, you have to figure out what all of these functions do, and as you do that you're trying to figure out whether there are any security bugs in them. It's a pretty hard thing to do for software on this scale. Doing this kind of thing requires really detailed knowledge of the processor architecture you're looking at, so you have to be really good at x86 or whatever. It also requires really detailed knowledge of the operating system you're working under. This is a Windows program, so you have to understand all about the Windows API and how this application is going to interact with it, and you also have to understand how it can interact with the other 147 shared libraries that are being loaded into memory. So this is an enormous task for a modern application of this size.
And it's just getting harder as time goes on. To a certain extent, you can have these manual analysts targeted at really high-value things that you really want to make sure are safe. But if you want to ensure that even just the most commonly used software is secure and bug-free, this isn't really practical. In order to keep up with the demands of modern software, we absolutely have to automate some of this task; there are just not enough skilled humans to do it.
So, enter fuzz testing. This is a pretty old technique, and it's pretty simple. The main idea is that you have a program and you ask: what happens if I just run my application with random input over and over again? Can we get anything interesting to happen? You're looking for things like crashes, hangs, anything like that. Typically, you're generating random inputs, you run the program with them, and hey, this one caused a segmentation fault. You punt this off to a user who can look at the crash and determine either, OK, this is kind of a benign crash, maybe it's a bug that should be fixed but it's probably not going to cause any security problems, or, hey, this is a huge deal and we need to patch this immediately.
So this works really well. It allows the computer to do almost the entirety of the effort instead of the human. After some initial setup the computer can just keep running indefinitely, and this parallelizes very well, so you can run it on as much hardware as you want. This can scale very well, even with software complexity. But there are a number of drawbacks, and one of the biggest is that fuzz testing in this way only finds what people call shallow bugs. So here's an example C
program, and you can see that it's basically reading an eight-byte number from the user. If it matches this particular magic value, it will call this real_main function and actually start executing the bulk of the program; otherwise it's going to return negative one and exit immediately. Clearly, if you're trying to find vulnerabilities in this software, the vulnerabilities aren't in the main function; they're going to be in whatever the rest of the program is. But if you're running this against random inputs, imagine the odds of guessing this particular 64-bit number on a random run: one in 2^64, which is astronomically low. So you're going to be wasting all your time fuzzing this five-line piece of code instead of actually fuzzing what you want to test.
The crux of this is that the set of valid or near-valid inputs that you want to fuzz with is much, much smaller than the set of all possible inputs for a lot of realistic programs, especially the ones that are interesting to fuzz. Imagine you're testing a C++ compiler like g++ and you're just trying to compile randomly generated files. The lexer of the compiler is going to give up almost immediately every time. It's never going to get to the actual interesting part of the program; it's never going to exercise the parser or any of the semantic stages of the compiler. So we need to increase the percentage of code that gets tested, right? We need to somehow
make sure that our inputs are actually valid, or at least mostly valid, so that we test most of the program. A really simple way to do this is to start with input seeds. Instead of just randomly generating inputs, why don't we take a valid input and start doing small mutations on it and see what happens? You can go through and repeatedly do things like add a byte to the end, remove a random byte, or flip a bit somewhere in there, and you run with this instead. Now your fuzzing inputs are going to be similar to what the program is actually expecting, and you're more likely to be going through the bulk of the program. This is actually a really important technique even for the more advanced fuzzers that I'll be talking about later: having a good corpus of input seeds can greatly increase the effectiveness of fuzzing.
But the problem, especially with this naive approach, is that having a specific set of seed inputs will restrict what the fuzzer is going to do. Say you have a program with two distinct forms of input: it can take either a file that looks like A or a file that looks like B. If you seed it with a file that looks like A, it's never going to test the parts of the program that are designed to handle B. So if you're fuzzing a compiler with a C++ program as the seed, and that program doesn't have any template code, then as you mutate it you're never going to test the compiler's template code, and you're never going to look for bugs there. For this to be effective, you really need a set of seeds that encompasses the entirety of what the program can do, which in some cases is easy to do, but in some cases is basically impossible.
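The three mutations just described (append a byte, remove a byte, flip a bit) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `toy_target`, its `HDR` header check, and the simulated crash are hypothetical stand-ins for a real program under test, and a real fuzzer would run an external binary and watch for signals instead of catching an exception.

```python
import random


def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Apply one small random mutation: append a byte, delete a byte, or flip a bit."""
    data = bytearray(seed)
    choice = rng.choice(["append", "delete", "flip"])
    if choice == "append" or not data:
        data.append(rng.randrange(256))
    elif choice == "delete":
        del data[rng.randrange(len(data))]
    else:
        pos = rng.randrange(len(data))
        data[pos] ^= 1 << rng.randrange(8)
    return bytes(data)


def toy_target(data: bytes) -> None:
    """Hypothetical program under test: a header check guarding a simulated bug."""
    if not data.startswith(b"HDR"):
        return  # rejected early, the interesting code never runs
    if len(data) > 3 and data[3] > 0x7F:
        raise RuntimeError("simulated crash")


def fuzz(seed: bytes, runs: int, rng: random.Random) -> list[bytes]:
    """Mutate the seed repeatedly and collect every input that crashes the target."""
    crashes = []
    for _ in range(runs):
        candidate = mutate(seed, rng)
        try:
            toy_target(candidate)
        except RuntimeError:
            crashes.append(candidate)
    return crashes


rng = random.Random(0)
crashes = fuzz(b"HDR\x7f", runs=2000, rng=rng)
```

Because every candidate is one small step away from a valid seed, most of them get past the header check, which purely random inputs would almost never do. The limitation is also visible: this loop only ever explores inputs that look like its one seed.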
So, to summarize the older techniques that I've been talking about: manual testing doesn't scale. Random and seeded fuzz testing do scale, but they have severe drawbacks as well: random fuzz testing only finds shallow bugs, and seeded fuzz testing is limited by its seeds.
So where do we go from here? This is where more modern techniques pick up, and I'll talk about two in particular. These are all from the past, say, five or ten years, and are pretty much what people are doing these days. The first is to use an automatic feedback loop to direct fuzzing towards interesting inputs, and I'll define what interesting means in a minute. The basic idea is that you now have a queue of seeds that you want to mutate, and you have some kind of mutator engine with a finite number of mutations to apply. You're going to iterate: pull a seed off the queue, go through your whole list of mutations, and see what happens. You're still running the program with these random inputs and checking for crashes and things like that, just like before. But additionally you have some sort of evaluator mechanism that judges the interestingness of an input, and if a particular input is deemed to be interesting, it goes back onto the seed queue and becomes a new source for mutations in the future. In this way the fuzzer stays directed towards interesting things and continually marches towards these interesting inputs.
So what does interesting mean, right? We need a specific metric to use. This whole technique was popularized by an open-source fuzzer called AFL, American Fuzzy Lop; you can go download it and try it out yourself. AFL uses program coverage; in particular, it's looking at the basic block coverage of a piece of software.
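The queue, mutator, and evaluator described above can be sketched as follows. This is a toy under stated assumptions, not AFL's implementation: the `target` function with its four nested byte checks is hypothetical, its coverage reporting is done by hand rather than by instrumentation, and the mutator deterministically tries every byte value at every position.

```python
from collections import deque


def target(data: bytes, hit: set) -> None:
    """Hypothetical instrumented program: each block it enters reports its id."""
    hit.add(0)
    if data[0:1] == b"A":
        hit.add(1)
        if data[1:2] == b"B":
            hit.add(2)
            if data[2:3] == b"C":
                hit.add(3)
                if data[3:4] == b"D":
                    hit.add(4)
                    raise RuntimeError("simulated crash")


def run(data: bytes):
    """Execute the target once and return (blocks hit, whether it crashed)."""
    hit = set()
    try:
        target(data, hit)
    except RuntimeError:
        return hit, True
    return hit, False


def mutations(seed: bytes):
    """Finite mutation list: overwrite each position with each possible byte."""
    for pos in range(len(seed)):
        for value in range(256):
            yield seed[:pos] + bytes([value]) + seed[pos + 1:]


def fuzz(initial_seed: bytes, max_runs: int = 100_000):
    seen, _ = run(initial_seed)           # coverage observed so far
    queue = deque([initial_seed])         # seed queue
    runs = 1
    while queue and runs < max_runs:
        seed = queue.popleft()
        for candidate in mutations(seed):
            hit, crashed = run(candidate)
            runs += 1
            if crashed:
                return candidate, runs
            if not hit <= seen:           # evaluator: a new block was reached
                seen |= hit
                queue.append(candidate)   # interesting: becomes a new seed
    return None, runs


crash_input, total_runs = fuzz(b"\x00\x00\x00\x00")
```

Starting from four zero bytes, each newly covered block puts a slightly better input back on the queue, so the search advances one check at a time and needs only a few thousand executions to reach the simulated crash.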
If you're not familiar, I have a little diagram to illustrate what a basic block is. Basically, it's a linear sequence of code that will always execute together. Here's a little example Python program, and in this graph the rectangles each represent a particular basic block. The first one is x = int(input) and x *= 10, and if you think about it, those two instructions will always execute in sequence. There are no branches, and there's no way to jump into the middle of that, so they will always be executed as kind of an atomic block. So we say that is a basic block. When you get to this conditional statement here, we fork to the left or right depending on whether it's true or not. On the left-hand side we have the largest set of contiguous instructions, which is only a single instruction in this case, and on the right we have the negative ninety-nine and a = 0 assignments. You can see there's a loop in here too: the graph shows that the loop can either go back into itself or exit, and then it all joins back up at the final print of y.
So what AFL is doing is basically keeping track of which of these basic blocks we have visited, and it's using this as a proxy for total program coverage. It looks at your program and identifies every basic block in your binary file, so it can actually disassemble it and look at it. In every basic block it inserts a piece of code that basically just says, hey, I'm here. So maybe you run a program and you get reports from it that say basic block 237 was executed and basic block 304 was executed. Your whole fuzzing system keeps track of everything it has ever seen, and when you get a new one you say, hey, that's interesting, we found something new in this program, and we want to focus on that and continue testing it because it's different from everything else we've seen before. So as an example, here's a little C
program. You can see it's a function that takes in just a character pointer called magic_buf, and it goes through the first four bytes and checks that they're equal to A, B, C, D. Say that, for whatever reason, we want to make sure this function returns zero; or maybe it's not a return zero, maybe it's another function call, or maybe it's a vulnerability, or whatever. If you think about the case where you're just randomly generating an input, you're only going to reach this innermost condition if this value, with basically 2^32 possibilities, is equal to one specific value.
But with AFL this gets broken up into more individual tasks. Each of these if statements is its own basic block. The fuzzer is going to start fuzzing, and say it happens to guess A as the first byte. That's only one in 256, or 2^8, which is really not that hard to do. It picks A for the first byte, and as it runs we suddenly get to this new if statement that we've never seen before. So the fuzzer says, hey look, if the first byte is A, something different happens, and I want to do more of those. It's going to start doing more fuzzing with A as the first byte. We do a couple of runs, and then pretty soon, hopefully, we have B as the second byte, and the fuzzer says, hey look, when A and B are the first two bytes we do something else different, so I'm going to keep doing that. In this way it iteratively breaks through each layer of this conditional and will eventually get the correct ABCD string. In comparison with a purely random one, this is basically 4 times 2^8, which of course is 2^10, about a thousand, and it's way more tractable to try every combination of that than of 2^32, where you're going to be spending all your time, basically,
just trying to get through this little conditional statement. So this is great, right? This solves a lot of our problems and makes fuzzing much more effective, but it's certainly not perfect yet. It works really well in cases like this, but if we go back to the original example I had, where we're checking whether the input is equal to the 64-bit number, this AFL approach is still going to fail. The reason the left-hand example works is that AFL can break the check up into individual byte checks, and the odds of successfully guessing a single byte are pretty high. The odds of guessing a 64-bit number correctly are much lower. That task doesn't break up into separate operations; it's a single atomic comparison. So AFL is never even going to realize that there's a new branch to look at, because it's never going to guess this number. It's going to get stuck here just like simple mutation fuzzing would.
OK, so to summarize a little bit: with this branch-directed fuzzing you get most of the scalability of dumb fuzzing, which is great, and you do get huge improvements in terms of performance, because it directs the fuzzer really well. There's actually lots of empirical evidence: AFL's website has a lot of examples of it finding bugs that traditional dumb fuzzers were not able to find. So this does help a lot. But it's certainly not a foolproof system. There are still lots of issues it can have, and you can still have areas where it gets stuck. OK, so that was about the state of things.
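The numbers behind that comparison are easy to work out. The sketch below counts worst-case guesses for the three situations in the talk; the execution rate of one million runs per second is an assumption picked only to give a sense of scale.

```python
BYTE_VALUES = 2 ** 8

# Four one-byte checks solved one layer at a time with coverage feedback.
layered_guesses = 4 * BYTE_VALUES

# The same four bytes guessed blindly, all at once.
blind_guesses = 2 ** 32

# A single 64-bit comparison: there is no intermediate coverage to reward.
atomic_guesses = 2 ** 64

EXECS_PER_SECOND = 1_000_000          # assumed fuzzing speed, for scale only
SECONDS_PER_YEAR = 365 * 24 * 3600

years_for_atomic = atomic_guesses / EXECS_PER_SECOND / SECONDS_PER_YEAR
speedup = blind_guesses // layered_guesses
```

So the layered search needs about a thousand tries, the blind four-byte search about four billion, and exhausting the 64-bit comparison at the assumed rate would take hundreds of thousands of years.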
About five or ten years ago, there was a bit of academic research that took this technique called symbolic execution and applied it in kind of a novel way to fuzz testing to help solve this problem. I'll start by briefly going over what normal symbolic execution is. It's a very broad topic and a much more general technique, and then I'll show how to apply it specifically to this problem.
Symbolic execution is a technique in program analysis with really broad applications. The idea is that you take a program, and certain inputs, usually user inputs, instead of being treated as concrete, literal numerical values, are treated almost as algebraic unknowns. You run a special program that emulates the software under test with these special algebraic values, and it's able to compute different things by reasoning about these unknowns. I'll clarify this with an example. Here's a little Python program, and first I'm going to show how this would run if you're just running it as normal.
Say you have this program with particular function inputs, here on the right. It's going to execute, and it reaches this first instruction, which asks: does a equal this value? If we look over at our inputs, yes, it does, so we go to the true path. Then it checks b against this particular value, and it says no, it's not equal, so we go to the false path, print "fail condition 2", and the program exits. That's how a normal run works. So what happens if instead we ran the same function, but instead of having concrete inputs for a, b, and c, these were unknowns, so they could be any value?
So we reach this first conditional statement: if a equals this value. As I mentioned, a is an unknown, so we don't actually know what it is, and the system can't tell which of these two paths to take. So it does the only thing it can do, which is to try both. It forks the system: on the left-hand side it assumes that this conditional was true, and on the right it assumes this conditional was false. You can see that on the left we've now kept track of what had to have been true to reach this particular state, which is that input a had to be equal to this value. On the right we keep track of the conditions that had to be true to reach that particular path, which in this case is that a could not equal this value. I've denoted with the red outline that this is the end of the function, it returns here, so the emulator will actually stop because there's nothing more to do on that path.
But the left-hand side is still active, so we continue following that and we reach this statement here: if b equals all B's. Again, we don't know what b is, right? This is an unknown, so we have to assume it could be either and fork the system. On the left-hand side we say, OK, we'll assume that b was equal to this particular value, and on the right-hand side we assume b was not equal to it. This continues similarly for the last conditional. We now know, OK, we'll reach the goal if all of these are true, or reach fail condition 3 if all of those are true. So we've now basically explored the entire state space of this program. We've taken all possible ways the user input could affect how this executed, and we have a listing of the different paths we could have taken and what had to have been true about the user input to reach each path.
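The forking walk-through above can be mimicked with a very small sketch. It hard-codes the shape of the example (three nested equality checks where the false side returns immediately), so it is a toy listing of paths, not a general symbolic executor. The constant for c is an assumption, since the talk only names the values for a and b as "all A's" and "all B's".

```python
# Nested checks from the example program. The value for c is assumed.
CHECKS = [
    ("a", 0x41414141),  # "all A's"
    ("b", 0x42424242),  # "all B's"
    ("c", 0x43434343),  # hypothetical third magic value
]


def explore(checks):
    """List every path as (outcome, constraints that must hold to reach it)."""
    paths = []
    taken = []  # constraints accumulated along the still-active true side
    for number, (var, value) in enumerate(checks, start=1):
        # Fork: on the false side the function returns, so that path ends here.
        paths.append((f"fail condition {number}", taken + [(var, "!=", value)]))
        # The true side stays active and carries its constraint forward.
        taken = taken + [(var, "==", value)]
    paths.append(("goal", taken))
    return paths


all_paths = explore(CHECKS)
```

Each entry pairs an outcome with the conditions on the unknown inputs that lead to it, which is the listing of paths and path conditions the emulator ends up with.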
This previous example is what's called static symbolic execution. This is the original way it was formulated and the traditional way to do it. It has a lot of really cool applications, both for computer security and for a lot of other problems in general. But there's a twist on it that people applied to fuzzing that's kind of interesting and a little bit different from this. The twist is called dynamic symbolic execution, or you can call it concolic execution as well. This was first done in the academic literature on source code, then maybe five or ten years ago Microsoft applied it to binary programs in a system called SAGE, and it has now kind of become the de facto way fuzz testing is done.
The way this works is that you combine the static symbolic execution I talked about before with a concrete input. This time, when you reach a conditional branch that depends on an unknown value, instead of forking you look at your concrete input and use that to tell you which path to take. So the concrete input constrains how the engine is going to go. It actually executes just like your processor would: it has these inputs and it takes the correct path depending on all the user inputs. The difference, though, is that at the same time it is also collecting these path constraints. I'll show an example of this, but it's really useful for directing fuzz testing.
So we're going to take the same program again, and this time we're going to run it under dynamic symbolic execution. We start here and reach this conditional check: does a equal all A's? We look at our inputs and see that yes, it does. Since we're doing concolic, dynamic execution, instead of forking we follow this particular path, but at the same time we're gathering this constraint. So we're saying, I know that we're following this path, but I'm also going to
keep the constraint, basically, that led to this particular path: what had to have been true in order for me to reach here. Similarly for this conditional: it's going to be true, so we follow it, and this last one is going to be false, so you jump to fail condition 3, and your path constraint is listed here.
What you can then do with this is iterate through the constraints, applying the logical conjunction operation and inverting the final constraint. I'll explain what I mean by this. You start by taking just the first constraint and inverting it, so we say, I want a to not equal all A's. If you look back here, you can imagine that if you have some user input for which a does not equal all A's, instead of following the path of this execution you're going to fail that first conditional and print fail condition 1. Similarly, you take the first two constraints, AND them together, and invert the final constraint. This time you follow the first hop that was taken by the concrete run, but you then do the opposite for the second conditional and print fail condition 2. For the third one you take all three constraints, AND them together, and invert the final one. This time we follow the first two hops the way the original program ran and then do the opposite on the last one. So we now have three sets of constraints that detail how to take each of the paths that we did not take in the original run. If there is a particular program path we can't get through, we can use these to generate a new input that satisfies the opposite constraint and allows us to explore more of the program.
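A minimal sketch of that procedure, assuming the same three nested equality checks as the example (with a made-up constant for c): run once on a concrete input while recording the constraint at each branch, then build one new constraint set per branch by keeping the earlier constraints and inverting the last one. The function names are hypothetical.

```python
# Nested checks from the example program. The value for c is assumed.
CHECKS = [("a", 0x41414141), ("b", 0x42424242), ("c", 0x43434343)]


def concolic_run(inputs):
    """Follow the path the concrete inputs dictate, recording each branch constraint."""
    constraints = []
    for number, (var, value) in enumerate(CHECKS, start=1):
        if inputs[var] == value:
            constraints.append((var, "==", value))
        else:
            constraints.append((var, "!=", value))
            return f"fail condition {number}", constraints
    return "goal", constraints


def negate(constraint):
    """Invert a single equality or inequality constraint."""
    var, op, value = constraint
    return (var, "!=" if op == "==" else "==", value)


def flipped_prefixes(constraints):
    """For each branch i: AND together constraints[:i] and the inverse of constraint i."""
    return [constraints[:i] + [negate(constraints[i])] for i in range(len(constraints))]


outcome, path = concolic_run({"a": 0x41414141, "b": 0x42424242, "c": 0})
alternatives = flipped_prefixes(path)
```

With this input the run ends at fail condition 3, and the three alternative constraint sets describe the inputs that would reach fail condition 1, fail condition 2, and the goal. Each set is then handed to a solver to produce a concrete input.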
So this now will work for both of these programs, unlike the branch-based fuzzing. Imagine we run this right-hand application with some user input, and say we do a quick test where we just make the user input zero. Obviously it's going to fail this check and return negative one, but we now have a constraint created in the emulator that says the input does not equal this magic value. We then invert this to get "input does equal this value" and generate a new input based on that, and hey, now we have something that will execute real_main. So in just a single iteration we've analytically solved how to explore more of the program.
I'll mention here, because I kind of glossed over it, that the way you actually convert these constraints to values is through what's called an SMT solver. These are obviously very simple constraints, right? They're just "variable equals thing" or "variable does not equal thing", and you could easily solve these yourself. But you can imagine that this could be much more complicated. Your processor could be doing math on all these different inputs, so maybe instead your constraint is a times b minus c equals d, or whatever. These constraints can get very complicated.
And these solvers, which can solve these systems of equations, are pretty good, but they're not perfect; they can't solve every possible system of equations. Imagine you have a program that takes user input, decrypts it, and then jumps based on the first byte, for example. If you try to model the decryption process as a series of mathematical operations, it's going to be incredibly complicated, and there's no way these solvers are going to be able to do anything with it. They're basically going to give up and say this is too complicated.
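To show what "converting constraints to values" means, here is a deliberately naive stand-in for an SMT solver: it tries every assignment in a small domain and returns the first one that satisfies all constraints. This is exhaustive search, not how a real solver works, and the constraint a * b - c == 10 (with 10 standing in for d), the variable names, and the domain are assumptions for illustration.

```python
from itertools import product


def solve(constraints, variables, domain):
    """Exhaustive stand-in for a solver: first assignment satisfying every constraint."""
    for values in product(domain, repeat=len(variables)):
        env = dict(zip(variables, values))
        if all(check(env) for check in constraints):
            return env
    return None  # no assignment in the domain works


# A path constraint that involves arithmetic on the inputs.
constraints = [
    lambda env: env["a"] * env["b"] - env["c"] == 10,
    lambda env: env["a"] != env["b"],
]
model = solve(constraints, ["a", "b", "c"], range(16))

# A contradictory constraint has no solution at all.
unsat = solve([lambda env: env["a"] != env["a"]], ["a"], range(4))
```

The search space here is 16^3 assignments, which is why brute force is fine. For 64-bit inputs and long chains of operations the space explodes, which is the situation where even real solvers give up, as with the decryption example.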
So there are still a lot of situations more complicated than this right-hand example where you cannot find an analytical solution to explore more of the program. Another major problem, and again I glossed over this, is that just the act of making these symbolic engines is complicated. You're trying to represent, in an abstract manner, all the semantics of a processor. If you're dealing with add and subtract, that's not terribly hard to do, but if you have to model, say, the more esoteric instructions of the x86 processor, it's a lot harder, and much more challenging to write the actual symbolic engine for.
The other main problem, and this is true of both dynamic symbolic execution and directed fuzzing in general, is that these are all concerned with expanding program coverage. That's a really good goal, and it's very effective in practice, but the problem is that not all program paths are created equal; some things are much more interesting to explore than others. These systems can't distinguish between the meat of a program that contains vulnerabilities and the boilerplate stuff that's not really worth testing. With these kinds of systems, the computer has no knowledge of the semantics of what it's looking at, so it's not going to distinguish between the two.
To illustrate some of these problems, let's look at the Cyber Grand Challenge. If you're not aware, this was a DARPA-funded Grand Challenge exercise where they challenged teams to make a program that can automatically find and patch these vulnerabilities, and the programs would then compete against all the others to see which was the best. A lot of the best security researchers in the world competed in this competition.
At least as far as I'm aware, most of the publicly available designs used what I've been talking about in this presentation, which is a combination of dynamic symbolic execution and this branch-discovery fuzzing. By all accounts the challenge was a great success, and the state of the art was advanced a lot. The basic idea was what I presented here, but the individual teams came up with a lot of really cool optimizations that helped all the systems along.
But the challenge also revealed that there's a lot of room for these systems to grow. Each team only successfully crashed a fraction of the available challenges, and there were many challenges in this competition, each one an individual program, that went unsolved by anybody; no one was able to automatically crash them. And the winning team from the Cyber Grand Challenge, which was Mayhem, competed in the DEF CON CTF. If you're not aware, that's a computer security competition, probably the most prestigious and difficult in the world, so it was competing against some of the best manual reverse engineers in the world.
It got thirteenth of fifteen, which in one way is extremely impressive, because it's a piece of software competing against some of the best experts in the world. But it also shows that the best experts are still substantially better than the software; they're still able to find many more of these bugs. So despite the huge advantages that the system has in some respects, in terms of ability to scale and such, targeted manual reverse engineers are still more effective than the current best automatic systems.
So the question then is, where do we go from here? How do we improve these systems to make them better? This is a really up-and-coming area. It's just becoming a hot topic and a lot of people are interested in it, but the idea is to take the best of the two worlds, which are the manual analysts and these automatic systems. Manual analysis is very effective, but it doesn't scale. The automatic techniques scale pretty well, but they're less effective and there are certain classes of bugs they can't find as well.
So we want to combine the two. Ideally you want humans to be able to provide a small amount of knowledge, just enough to give a kick to the system and allow it to be much more effective. The hope is to get the best of both worlds: the scalability of the automatic analysis techniques with the efficiency and effectiveness of manual analysts. In order to do this, of course, we can't have the human doing most of the work, otherwise they might as well just be analyzing it themselves. So the systems have to have just a small cost for a human to be involved. I think the ideal would be that you have a bunch of these automatic systems testing a whole bunch of different things at the same time, and a single human can interact with ten or a hundred or a thousand of them and just occasionally give a little bit of input to each one.
The first paper that I know of that did this kind of thing is called "Rise of the HaCRS: Augmenting Autonomous Cyber Reasoning Systems with Human Assistance." What these researchers did was take the Cyber Grand Challenge contestant Mechanical Phish and add a human component to it. I'll explain what exactly they did and why it's so interesting, but the core result was that the humans were able to substantially increase the effectiveness of this tool. I think they brought the total number of crashed binaries up from something like forty-eight to sixty-five or something like that, so it was a reasonably substantial leap in the effectiveness of the tool.
What they did in particular was kind of interesting. These researchers used a little bit of basic static analysis: they gathered all of the ASCII strings from a program and partitioned them into an input set and an output set. They basically did this by saying, if we're printing to the console, say we're calling printf, well, it's an output string, and if we're doing a string comparison it's an input string, because the intuition is that maybe we're comparing against the user input.
They use the output set as a proxy for program coverage. They basically told users, hey, here's a list of all the things this program could potentially print. We have already seen A, B, C, and E, but we haven't seen D, and we really want to see D. We need a way to get to this missing string. In order to do this, the user is actually constructing new seeds to give to the fuzzer, and they do this by looking at the input strings and trying to combine them.
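
The bookkeeping behind this can be sketched as follows. This is a rough illustration under assumptions of its own, not the paper's implementation: the string list, the call-site labels, and the `classify` and `unseen_outputs` helpers are all hypothetical, and a real system would recover the strings and their uses from the binary with static analysis.

```python
# Hypothetical (string, call site) pairs that static analysis might recover.
FOUND_STRINGS = [
    ("access granted", "printf"),
    ("access denied", "printf"),
    ("unknown command", "printf"),
    ("login", "strcmp"),
    ("admin", "strcmp"),
]


def classify(found):
    """Split strings into an input set and an output set by how they are used."""
    inputs, outputs = set(), set()
    for text, call in found:
        if call == "printf":      # printed to the console: output string
            outputs.add(text)
        elif call == "strcmp":    # compared against something: likely input
            inputs.add(text)
    return inputs, outputs


def unseen_outputs(outputs, observed):
    """Output strings the fuzzer has not triggered yet (the coverage proxy)."""
    return sorted(outputs - set(observed))


inputs, outputs = classify(FOUND_STRINGS)

# Pretend the fuzzer has only ever produced these two outputs so far.
observed = ["access denied", "unknown command"]
todo = unseen_outputs(outputs, observed)

print("not yet seen:", todo)
print("input strings a human could combine into a seed:", sorted(inputs))
```

The human's job is then the part the machine cannot do: reading the unseen output and the candidate inputs and guessing which combination is likely to get there.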
In order to provide a new seed for the fuzzer, what they're actually doing is looking at the semantic English meaning of these strings and using that to target the fuzzer. You can imagine an example: maybe you're fuzzing airplane control software or something, and you haven't seen the output string "cabin door opened." So how do you get there? Well, maybe there's an input string that says "command open cabin door." Seems like a reasonable place to start. The human can just look at the actual meaning of these strings and use that to seed the fuzzer. This is a particularly interesting one because it does not even require that the human be knowledgeable in reverse engineering. This is just knowledge of language, so they don't even have to be a skilled programmer to do this.
So beyond the fact that this scales better because the human isn't doing most of the work, it can also scale beyond the traditional domain of this work, which is very highly skilled experts who understand how to do manual reverse engineering. This is a really cool area, and there are a lot of different ways people could take it. Especially if you open the door for more skilled humans in the loop, you can do things like directing the fuzzer down certain paths, telling it to ignore certain parts of the program, changing how mutations work, changing how seed selection works, all sorts of things like that. I think that, at least for now, until the automatic techniques improve drastically, this is going to be a really cool area that we're going to see more research on, leading to much more effective vulnerability analysis tools. Hopefully we'll be seeing more software get patched before it gets shipped, or at least before a malicious actor finds bugs in it, and less scrambling to fix things that are already broken. So thanks for your time, everybody. Are there any questions?
Yeah. So the dynamic symbolic execution basically knows that we followed a particular set of paths through the program, and it knows, hey, we missed this one, this one, and this one; how do we get there? It finds a single path, and then you can use that, basically inverting the last part of the path, to find an input that would do the opposite of what happened in the previous run and allow it to explore more. In this case it's only a single level deep, but the constraint it would gather in a concrete run with zero as an input is input does not equal 1234567890, because that's the condition that had to be true to reach the return negative one. Then if you invert that and solve it, you get a new input that will do the opposite, which is to say it will call real main instead.
Well, it does kind of depend on the system. Usually with a binary system, though, you're just dealing with raw bytes without types. There are different techniques that apply better to source. Obviously you have type information and things like that when you're dealing with source, which can give you better ideas of what to target.
Yeah, definitely. So the advantage of using the dynamic system: for one, and I didn't mention this, when you're doing static symbolic execution there's what's called the state explosion problem. You can probably imagine that if you're forking every time you reach a conditional, that's a lot of forks, and you're going to be hard pressed to execute all of those, especially in a loop. The other thing is, if you have one of these complicated constraints that you can't solve very well, one thing you can do is use partial information from your concrete run to simplify part of it, and try to use that simplification to get a simpler constraint that can be solved by the SMT solvers. That is an optimization that people do in practice, and it does make this better than the naive case.
So one of the things with this paper is that it's very cool because it doesn't require particularly expert use of the human to be useful. But as you mentioned, when we're using just output strings as a proxy, this does definitely limit what we're doing. You can imagine that if you have a more knowledgeable human who is able to reason about the control flow of a program instead, they can probably get more meaningful jumps here. The output string is kind of a coarse metric. I don't think the paper published enough numbers to be sure, but my intuition is that a lot of these output strings are going to be at the final part of a program path, so it's not going to be a case where you open up a whole new section you've never seen before. But if you do alternative human interaction techniques, where the human can look at the control flow graph and say, hey, we've never visited this place before and this is most of the program, that is what we should be doing, I think you could see a further, more effective increase.
This is the only one that I know of. There may be more, but this is relatively recent; I think it was only officially published a month or two ago, though it's been in draft form for a while.
To an extent, a lot of these systems do assume they're working with a well-formed binary, the kind of thing a compiler would output. Say you're writing a piece of malware and you don't want it to be fuzz tested: there are lots of tricks you can do, like you mentioned, that would make it very, very difficult. There's actually, I think, a French group called Quarkslab that experimented with this.
What they did was look at situations where you'd have an equation that, no matter what the input, always results in the same value. They'd basically do: if equation of user input, run the program, otherwise don't. The idea is that if you're running this with a real input it's always going to be true, because the equation always results in the same value, but the symbolic engine and the SMT solver, the actual solver for these constraints, doesn't understand that this equation is semantically just, like, the unity function. So it's much harder for it; it says, OK, this is a ridiculously complicated equation, and spends all its time trying to solve something that's completely useless. They actually did a cool thing, which was to use an alternative symbolic representation. I don't know what they call it; they did some kind of normalized form of bit-vector math. It's pretty cool work, and they were able to do a much better job than a traditional SMT solver at taking these constraints and semantically simplifying them to a reduced form which could then be solved. So there's a little bit of work in that area, but I would say the majority assumes that we don't have things like deliberately obfuscated programs.
Yes, that's a good point, and the researchers actually talked about this in their paper. They explicitly selected only a subset of the CGC challenges that had this property, where there was plain English they could reason about. So I think this particular example of human intervention is somewhat limited in that respect. Although the researchers did propose an interesting thing, which was trying to map program values to a kind of arbitrary real-world equivalent and see if, even with a random mapping, correlations within the program would still be present in some sort of semantic way that a human could find and make inferences from. Even though it's random, a human could say, well, just by happenstance these are always linked. So they proposed that perhaps something like that would work, and I think that's at least worth a try. Basically you could map binary data to a random ASCII string and see if humans are still helpful. The work is a little bit more complicated than I presented, in that the machine is more intelligent about guiding the user on how to combine these different things to form a seed. But I think the real value of this kind of human-computer analysis in the future might be where you have an expert who's more capable of looking at, say, a call graph in IDA or something and directing the fuzzer, or doing more things like that, where the expert human who can reason about the structure of a PDF or whatever can bridge that semantic gap for the computer and help it understand better.
Are you talking about the dynamic symbolic execution that I talked about, or traditional?
Yeah, I mean, the problem is just scaling it. When you try to load a big program with a symbolic engine, the state explosion problem is such that you can't just exhaust the states; the state space is just way too big. There's a lot of research still going into figuring out how to combine paths together or merge them to make it more scalable, how to prioritize things that look interesting, and things like that, but this is still very much an open problem. In fact, if you look at just pure effectiveness right now and had to pick one or the other, dynamic symbolic execution or fuzz testing, the branch-coverage, AFL-like fuzzing
is actually usually more effective, and together they're more effective than either on their own. But in terms of just how software actually works and the current state of how static symbolic execution works, I think people are finding that fuzz testing is just more bang for your buck right now. I hope that will change, because doing this in a more analytical, formal way is very appealing, right? Fuzz testing is non-deterministic, somewhat random, and suffers from a lot of problems. So I hope this is where it's moving in the future, but it's going to depend on computers getting faster and people getting more intelligent about how to handle this path explosion problem. Anything else? Well, thank you, everybody.
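
As a rough footnote to the state (or path) explosion problem that came up in the Q&A: if a program has n independent two-way branches in sequence, an engine that forks at every conditional has up to 2^n paths to track. The sketch below just counts them by brute force; `count_paths` is a made-up helper for illustration, and real programs have correlated branches and loops, so actual counts will differ.

```python
from itertools import product


def count_paths(num_branches):
    """Enumerate every taken / not-taken combination for independent branches."""
    return sum(1 for _ in product((True, False), repeat=num_branches))


for n in (1, 4, 8, 16):
    print(n, "branches ->", count_paths(n), "paths")
```

A loop whose bound depends on input adds another branch on every iteration, which is why loops make this so much worse, and why merging and prioritizing states is still an open research problem.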