[00:00:10] >> Thanks, everybody, for having me to speak today. This talk is on improving security through software debloating, so we're going to talk about the software bloat problem in general and how software debloating, which is a relatively new technique to solve this problem, is trying to make strides in improving security by eliminating software bloat. Specifically, we're going to look at how software debloating affects the security posture of a debloated program variant, and we're primarily going to look at how this is related to code reuse attacks. Then finally, we're going to take a look at some of the measures that are currently being used to measure how much a debloating operation has improved the security of a program, whether or not these are the right measures, and whether they have shortcomings. So, as a bit of an introduction, it should come as no surprise to anyone here that software security is a bit of an arms race. Every time a new attack technique is introduced in academia or in the wild, researchers start developing defenses against it, which then prompts new research into ways to get around those defenses. Because software security is ultimately an undecidable problem, this really isn't going anywhere. So in this talk we're going to look at a method that is potentially proactive for improving security, which is software debloating. One of the things that's particularly interesting about it is that it's not necessarily involved in this arms race: it's something that we do ahead of time, as opposed to responding to a vulnerability that's been disclosed and then trying to patch it. So let's talk about the problem, and that is software bloat.
[00:01:51] So modern software engineering practices favor software and systems that are modular, reusable, and very feature rich, which helps engineers rapidly develop complex and widely deployed software. But it comes at a cost: when software is deployed and ultimately executed by the end user, it contains large portions of code that that end user will never use, and we call that portion of the software software bloat. Software bloat exists at several different levels. It primarily occurs vertically through the software stack, at layers of abstraction. What we're referring to here is the user-level program at the top of the stack and how it interfaces with its dependent libraries and utilities, and ultimately with the operating system. An example of this is a commonly used library for writing C programs, libc. Application programmers access libc through an API that contains a large number of common functions, such as performing system calls, [00:02:56] input/output operations, et cetera. Now, when application programmers use libc, they typically use only a very small number of the functions and functionality present within libc. However, the entire library gets loaded into the addressable memory space of the program when it's executed. Ultimately, this means there is a huge chunk of memory being allocated for code that will never be executed by the program running on top of libc. [00:03:22] Additionally, software bloat occurs laterally within the software stack as a result of feature creep, which is something you should be familiar with if you've written virtually any code ever.
Different features always seem to find their way into a program that are certainly not used by all, or even some, of your end users. A good example of this is Microsoft Word, displayed here on the left side of the slide. This is what happens if you enable every toolbar and every help window that Microsoft Word makes available to you: you see you actually have very little screen space dedicated to actually writing, saving, and composing a document. Another good example of this is a common utility called curl. You're probably familiar with it; essentially it's a command-line utility for retrieving data via a variety of different network protocols. Most people, when they install curl on their system, just use a package manager to retrieve it, and it's on their system without them knowing that it actually supports 23 different protocols, a good number of which are obscure and not likely to be used by your common end user. Your common end user might want to fetch data over HTTP or HTTPS, or even FTP and some of its variants, but it's unlikely that this machine is hooked up to a serial network and is pulling something down over Telnet or using a streaming protocol to transfer data. So ultimately, when the user executes these programs, the entire program is loaded into memory along with its dependencies, and all of these unused features contribute code to that memory footprint that is considered bloat. [00:05:04] So just how prevalent is software bloat? Well, last year some researchers conducted a study of the bloat present both in user-level programs and in their dependent libraries, as well as vertically through the stack, by looking at managed execution engines such as the Java Runtime Environment and also the APIs made available by different operating systems. Some of the high points, whose consequences we'll discuss in this talk, are as follows. [00:05:35] This study looked at 12 commonly used programs such as Firefox, Chrome, VLC,
and Sublime, and found that approximately 65 percent of the instructions and 32 percent of the functions found in dependent libraries are actually required. This generally means that, on average, when you include a library in your software development process, you're only going to use about two-thirds of that code, so you're committed to bloating your program by one-third per library on average. Looking at the feature set across different deployments, it found that relatively common functions that we use these programs for require a relatively small amount of the actual code present in that program. For example, if you just want to play an audio file using VLC, you only need 12 percent of the overall instructions to accomplish that. [00:06:29] Another good example: if we want to open a file, compose it, and then save it using Sublime, we only need about 27 percent of the overall instructions. The other 73 percent of instructions correspond to features that the user may or may not use. So what are some of the negative effects that software bloat carries with it? Well, they're usually divided into two categories: performance and security. [00:06:57] Performance is relatively straightforward. The more our code is bloated, the larger it is on disk and the more complex it is for new engineers to come on board and start working on that project. When we load it into memory, it requires more memory, or a larger addressable memory space, to accommodate all this bloated code, and, of particular importance for embedded devices, that increases the power consumption of the device, which can be fairly critical. One negative impact that's not always present is execution time. Especially when we're talking about feature bloat, features that are never invoked typically don't impart a large amount of additional execution time for a program. However,
[00:07:44] supporting these additional features, especially when a program is loading for the first time, may incur some one-time overhead, so it is possible that code bloat increases the execution time of the program, but that's not always the case. Of greater concern for me, in the work and research I do, is the effect that software bloat has on security in particular. [00:08:06] Bloated code is harder to analyze with different static and dynamic analysis tools. For example, if you're using Coverity to scan your code base for suspicious usage patterns, the more bloat you have in that code base, the harder it is to actually identify them, the longer it takes, and the more budget you need to track down all the alerts you're going to get associated with that bloated code. Further, it also makes it more difficult to use fuzzing tools such as AFL to fuzz that code, because there is now a larger diversity of paths through the program that the fuzzer will try to exercise to find bugs. [00:08:47] Bloated code also, and this is potentially the most concerning of these different effects, might actually contain reachable attack vectors or vulnerabilities that we don't know about, which we refer to as zero-days. So if I've deployed a particular software package on my system and I have features included that I'm never going to use, I carry the risk that a malicious actor might try to exploit that program by triggering some of this unused code, which is ultimately useless to me. That is a very bad condition. [00:09:18] Additionally, software security defenses against some of these various attack patterns incur a significant amount of one-time and also per-execution overhead, and bloated code needs to be protected just as much as the regular code. So this could potentially make applying a software defense intractable for your program if it's too bloated.
[00:09:44] And finally, bloated code can potentially be used in a code reuse attack. Just like with the vulnerabilities and malicious actors we discussed earlier, this bloated code is never going to be used by the end user, yet it might be used by an attacker if they try to mount an exploit. This is going to be the main topic of the remainder of this talk: how software debloating [00:10:06] affects the security posture with respect to gadget-based code reuse attacks. So let's take a little bit of time to provide some background on gadget-based code reuse attacks. You don't have to be an expert in these particular attacks to understand the rest of this talk, but it's not exactly something that everyone is familiar with, so I just want to spend a little bit of time on it. Code reuse attacks are a class of exploits that circumvent some primitive stack defenses, such as write-exclusive-or-execute, and they do this by using previously existing pieces of the program or its dependent libraries to accomplish their malicious intent. [00:10:48] An early example of this type of attack is known as return-to-libc. Just by show of hands, who's heard of the return-to-libc attack? OK, so it looks like about half of us have. Very primitive stack-smashing attacks would try to exploit an unprotected buffer, inject a malicious payload into that buffer, and then branch execution to that malicious payload to essentially [00:11:19] execute an arbitrary malicious code segment. Defenses like write-exclusive-or-execute prevent that by only allowing memory to be either writable or executable, never both. So if you can write into the memory, you can put your malicious payload there, but you can't execute it. Code reuse attacks get around that by using code that's already marked executable.
[00:11:42] So in a code reuse attack, an attacker will take that same unprotected buffer and attempt to overwrite the return address of the function, and the memory region corresponding to the function's arguments, to invoke a sensitive function within a dependent library, and that's usually libc. Essentially, this allows the attacker to hijack the control flow of a legitimately executing program, redirect control flow to a sensitive function such as the system call in libc, and then pass it a parameter like /bin/sh, which has the effect of opening a shell in a program that was never intended to do that. So this is definitely an example where software bloat carries some significant risks. If your program never actually intends to allow users to open shells, the fact that this code exists and is loaded into the memory space for [00:12:40] your program means that there's now something an attacker can trigger that you don't actually need.
So the primary drawback to these early code reuse attacks is that they're limited in expressivity. I can't write any code that I want; I can only use functions that are available to me within the memory space of the program. That's not necessarily a great thing for an attacker, but it's certainly usable, and functional exploits were built based on that tactic. This all changed in 2007, when the first gadget-based code reuse attack was proposed, called return-oriented programming. Return-oriented programming overcomes these expressivity limits by relying on small snippets of code called gadgets, as opposed to full functions, that are present within the code of the compromised program. What the attacker does is use the stack and the semantics of the return instructions that end these gadgets to chain short segments of code together, displayed here, which has the effect of executing those instructions to accomplish their result. The semantics of the return instruction and the operation of the stack ensure that the malicious control flow is maintained throughout the attack, as opposed to the intended and legitimate control flow. So ultimately, a good way to think about this is that return-oriented programming turns the stack into a kind of virtual machine, where we can use existing pieces of the code as complex instructions that are part of an instruction set architecture, which we can then use to basically write a program as if we were writing it in source code. [00:14:29] As I mentioned before, security is an arms race, and as soon as return-oriented programming was proposed, defenses for return-oriented programming started following soon after.
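As a rough illustration of the "stack as a virtual machine" idea described above, here is a toy Python model. It is not an exploit and does not touch real memory. The gadget addresses, register names, and helper functions are all made up for illustration: each function stands in for a short instruction sequence that ends in a return, and the list plays the role of the attacker-controlled stack.

```python
# Toy model of a return-oriented chain. Everything here is hypothetical:
# the addresses, the two registers, and the three "gadgets".

def pop_rax(regs, stack):
    # Stands in for "pop rax; ret": takes the next stack word as data.
    regs["rax"] = stack.pop(0)

def pop_rbx(regs, stack):
    # Stands in for "pop rbx; ret".
    regs["rbx"] = stack.pop(0)

def add_rax_rbx(regs, stack):
    # Stands in for "add rax, rbx; ret".
    regs["rax"] += regs["rbx"]

# Hypothetical addresses at which the gadgets "live" in the program image.
GADGETS = {0x1000: pop_rax, 0x2000: pop_rbx, 0x3000: add_rax_rbx}

def run_chain(gadgets, chain):
    regs = {"rax": 0, "rbx": 0}
    stack = list(chain)          # index 0 is the top of the stack
    while stack:
        addr = stack.pop(0)      # the trailing "ret" pops the next address
        gadget = gadgets.get(addr)
        if gadget is None:       # unknown address: the chain ends here
            break
        gadget(regs, stack)
    return regs

# The "program" is just data on the stack: gadget addresses mixed with operands.
chain = [0x1000, 40, 0x2000, 2, 0x3000]
result = run_chain(GADGETS, chain)
print(result)
```

The point of the sketch is that the attacker never supplies new instructions, only a sequence of addresses and operands; the set of gadgets available in the image determines which "instructions" the chain can use.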
Most of these defenses focused on protecting the integrity of the stack, ensuring that you can't use it to conduct a malicious attack by hijacking its control flow. In response, in 2011 a new variant of return-oriented programming called jump-oriented programming was developed that eliminates the use of the stack for control flow. Rather than use gadgets that end in a return, jump-oriented programming gadgets end in an indirect branch or an indirect jump instruction, and the attacker maintains control flow using a special gadget called a dispatcher. The dispatcher essentially sequences which gadgets are to be executed in what order, and each gadget, when executed, uses the jump instruction that terminates it to return control flow back to the dispatcher. So you can see we don't need return instructions, and we don't need the operation of the stack to load up the next gadget for us; we've done that entirely using gadgets. The dispatcher is one of several of what we call special-purpose gadgets. In jump-oriented programming and its other variants, since we aren't using the stack, we need special gadgets that don't actually express the malicious payload but instead are used to construct the exploit and provide the scaffolding to make it work. That's one of the key differences between return-oriented programming and jump-oriented programming. There has been some additional research into other variants and flavors of jump-oriented programming, such as COP, which is call-oriented programming, which uses gadgets that end in call instructions. However, because those call instructions have additional constraints and additional operating semantics, they are more difficult to use and require a larger number of these special-purpose gadgets to operate.
[00:16:33] So now that we have some background on code reuse attacks, let's tie this back into software bloat. Bloated code, whether it's from an unnecessary library function or an extraneous program feature, gets loaded into memory, and that memory is accessible to an attacker. That excess code potentially contributes gadgets to the overall set, to their instruction set architecture; it adds to this catalog of different instructions they can potentially chain together to make an exploit. Because the attacker can use these and we can't, it follows naturally that if we can eliminate this excess code, we'll reduce the number of gadgets and potentially make life harder on attackers. So in response to the software bloat problem, there's been a significant amount of research, especially in the last year, into software debloating. Software debloating is essentially generalized with this single equation: we perform a debloating operation on the original program, using some additional context as input, and the output is a variant of the original program that satisfies just enough functionality for this particular usage context. Software debloating is kind of an umbrella term. It can apply to a variety of different techniques that can be applied at a number of different points in the software lifecycle. We can deal with the source code before it ever reaches the compiler, in a preprocessor pass, to get rid of extraneous features. We can wait until the code has been converted to an intermediate representation by the compiler and try to debloat features there. We can wait until the program has been converted into a binary and try to debloat the features there. Or we can wait until that program is loaded into memory and then eliminate the memory regions that are associated with that bloated code dynamically at runtime.
[00:18:32] So as some additional background, we're going to go over three methods for debloating software that have been proposed in the last year. All of these were proposed and presented at conferences in 2018, so these are relatively new. These aren't necessarily debloating methods that are available for off-the-shelf usage, but they will likely inspire generations of debloaters that will be used commercially. The first one to talk about is Chisel. Chisel is an automated tool that targets unnecessary features and code. It takes as input a test script that is used in a feedback-directed program reduction algorithm to successively, with each iteration, remove a little bit of the program such that we don't cause our specification, our user test, to fail. If that smaller program works with respect to our context, which is this test script, we try to cut a little bit more out, and we keep doing this over and over again until we get down to the smallest possible program that still satisfies the original test script and also, obviously, still compiles. If we remove anything else from it, we cause the test script to fail. So ultimately this method targets source code and tries to cut out as much source code as possible before it reaches the compiler and is then converted to a binary.
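The reduction loop described above can be sketched in a few lines of Python. This is a deliberately simplified greedy stand-in, not Chisel's actual algorithm: it removes one element at a time and keeps the removal only if the test oracle still passes. The "program" here is just a list of made-up feature names, and `passes_test` is a hypothetical oracle standing in for "still compiles and the user's test script succeeds".

```python
# Simplified greedy reduction, assuming a program can be modeled as a list of
# removable elements and that passes_test(candidate) is a reliable oracle.

def reduce_program(elements, passes_test):
    current = list(elements)
    changed = True
    while changed:
        changed = False
        for i in range(len(current)):
            candidate = current[:i] + current[i + 1:]   # drop one element
            if passes_test(candidate):
                current = candidate                     # keep the smaller variant
                changed = True
                break                                   # restart the scan
    return current

# Hypothetical example: a transfer tool where the usage context only needs HTTP.
original = ["parse_args", "http_get", "telnet_get", "gopher_get", "write_output"]
required = {"parse_args", "http_get", "write_output"}

def passes_test(candidate):
    # Stand-in for running the user's test script against the candidate.
    return required <= set(candidate)

debloated = reduce_program(original, passes_test)
print(debloated)
```

The result is minimal only in the sense that removing any single remaining element makes the test fail; the quality of the output depends entirely on how well the test script captures the intended usage.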
[00:19:55] A technique that targets code when it's in its intermediate form is called Trimmer. Trimmer is another automated tool targeting primarily features, but instead the user defines a configuration context as a static configuration file, essentially a set of constants that can replace individual variables in the program at compile time. An example of this might be: let's say you have a data transfer application that can work over Ethernet and can also work over serial communications. Normally, when you invoke the program, you would pass a command-line parameter that says work over Ethernet or work over serial. In this particular case, when we compile the program, we tell it at compile time whether this is going to be deployed on an Ethernet network or over a serial network, and we cut out, using aggressive compiler optimizations, all of the code that's associated with the unnecessary interface that we no longer need. [00:20:54] So this is a diagram of how Trimmer works. It takes in a manifest file that essentially provides compile-time constants, and it takes in an intermediate representation of the program. It implements constant propagation and loop unrolling, which are common compiler optimizations, but it does these very aggressively because it can actually leverage this greater context for the deployment of that individual program, which ultimately results in a binary that satisfies that original context but is smaller than the original. Both of those techniques were primarily focused on eliminating features, that lateral bloat. This next one is a technique that targets vertical bloat, that is, bloat that arises from library functions that we don't need. This is called piecewise compilation and loading, which, for the sake of time, I'll abbreviate as PCL. [00:21:49] So PCL is another automated technique that exists in two parts. The first part is a piecewise compiler that, when
the user compiles their code, performs dependency graph generation on all of the external dependencies that code relies on, to find only the dependent library functions that it actually needs. It keeps track of them and embeds this data in the binary representation. When the program is loaded using a custom loader and all of that library code is put into memory, the loader goes back and finds all of the memory pages and memory segments associated with code that we aren't going to use, such as unused library functions, and marks them as non-executable. This doesn't actually remove the code, but it has the effect of denying the attacker the ability to use it in a code reuse attack. [00:22:39] So all of these different debloating techniques attack the problem of software bloat at different places in the software lifecycle, but they all make similar claims about how they improve security. The primary one is that they reduce the number of code reuse gadgets, which means there are fewer gadgets available to the attacker, and hence it's now harder for them to stitch together an exploit using these gadgets. These papers also claim some additional benefits in the form of software diversity, meaning if an attacker has figured out how to exploit one version of the program, and then we debloat it and put it on another system, they can't necessarily use all the information they gleaned from the first program to exploit the next one, because it's fundamentally different: it's had pieces of it removed. And also, getting back to our discussion on zero-days, it's possible that when we remove code we actually remove vulnerabilities that were present in that code, and if those vulnerabilities were originally reachable, then we've essentially eliminated that vulnerability in the debloated variant. [00:23:48] So it's easy to make these claims, or say they happen,
or to say they exist, but it turns out that in practice they're all incredibly difficult to measure, and it's difficult to prove that we've actually made these security improvements. When it comes to software diversity, when we look at debloating as a software transformation, we're intentionally leaving the pieces of code that we want to keep alone, or trying not to touch them, so that we don't potentially break the program or break one of its functions. This means any data the attacker has about the operation of those kept features actually does still work in the debloated variant. Additionally, it doesn't stop the attacker from reversing this new variant. One of the first steps in actually conducting a ROP or JOP exploit is scanning the original binary to find out where these gadgets exist in memory, and if we can do that for one program, software debloating isn't stopping us from doing that on another program. So ultimately, when we talk about software diversity, we're just kicking the can down the road. We're saying the attacker is going to need more time, but we haven't actually stopped them; we've just made their job slightly harder. Now, vulnerability elimination is even more difficult to measure. The authors of the papers presenting these different techniques will typically show that there's some CVE
or some known vulnerability in a program that's been eliminated by their debloating technique. But who cares? We know about these vulnerabilities, and generally patches exist for a lot of them. It's much easier to patch a package than to debloat it to get rid of the vulnerability, so debloating is not the answer for known vulnerabilities. And when we talk about zero-days, how do we know if we actually got rid of one? We have no way, because we didn't know about the vulnerability in the first place. So we can say debloating improved security, but we'll never actually know until a new vulnerability gets disclosed and we can go back and look to see whether that vulnerability existed in features that were debloated in certain variants of the program. [00:25:47] Now, measuring how debloating affects gadgets is actually relatively straightforward, or at least it's very straightforward on its face, at a superficial level. Existing static analysis tools like ROPgadget will scan a binary for you, find all the gadgets, and tell you how many of them there are. They'll tell you how many of them work for ROP exploits, how many of them work for JOP exploits, and even find one kind of special-purpose gadget called a syscall gadget. [00:26:18] So it kind of follows naively that we can just count the gadgets in the original program, count the gadgets in the debloated variant, and see how many we got rid of. That's actually what these recently published papers do. Chisel reported that they were able to reduce the number of gadgets in the program by 54 percent on average, with 20 percent for Trimmer and 71 percent for PCL.
Now, these numbers aren't directly comparable, because they target different kinds of bloat and they have different measures of what the total value is, but either way these are all very good numbers and would lead us to think that security is improved. So this is good, right? Well, it turns out this is actually much more complicated than we originally thought. Some of the research I've been doing over the last few months has been focused on the relationship between software debloating and how it actually improves security, and to give you the good stuff up front: [00:27:13] we found that gadget count reduction is actually too superficial to be an accurate metric for determining how debloating has improved security. We did this by exploring the relationship between software debloating and security from the attacker's perspective, asking: how does debloating actually make it more difficult or challenging for me to construct an exploit? In our experiments, we found that it was actually fairly common for software debloating to achieve really good gadget reduction numbers but not improve security in any way that we could measure. [00:27:49] Worse yet, we actually found a few instances where software was debloated, we achieved good gadget reductions, and the security posture of the software was measurably worse. So let's talk a little bit about the setup of our experiments. We chose three different software packages that vary in structure and operational complexity. We chose libmodbus, which is an industrial network protocol library. We chose bftpd, which is a small FTP
server program that is designed for embedded systems, and we also chose libcurl, which is a very commonly and widely used data transfer library. For each of these software packages, we debloated them at a variety of levels of intensity. In conservative scenarios, we got rid of some peripheral features that are unlikely to be used. In moderate scenarios, we got rid of the same peripheral features, but we also eliminated some core features. So for example, for libcurl, conservative would be that we got rid of some of these obscure and esoteric data transfer protocols, and in the moderate scenario we got rid of everything but HTTP and FTP, only the most commonly used features. Finally, for aggressive scenarios, we debloated the software down to a single core function. We debloated libcurl to the point where it only supported TFTP interactions and nothing else. This is representative of a scenario in which you deploy a piece of software for one purpose and one purpose only, and you know that ahead of time. [00:29:26] We used our own debloater that operates in a manner comparable to Chisel and Trimmer. It cuts the program code out before it goes to the compiler, before it gets turned into a binary, and we found that ours operated roughly on par with these different
[00:29:42] methods. For aggressive scenarios we reduced the gadget count by 30 percent on average, 50 percent on average for moderate scenarios, and 8 percent on average for conservative scenarios. Then we focused our analysis on deeper measures of software security relevant to gadgets. Specifically, we looked at three areas. First, what's the composition of all the gadgets that are left in the program? Which gadgets were actually removed, what did we actually get rid of, and were they useful? And also, were there any side effects of getting rid of these gadgets that we didn't intend or anticipate? Second, we looked at the availability of special-purpose gadgets. As I mentioned earlier, these special-purpose gadgets are very important; they're required to scaffold [00:30:23] JOP and COP attacks, and while they don't actually express the attacker's exploit, they're critical nonetheless. And then finally, we looked at the expressivity of the gadgets in the package, meaning what's the computational power we have with those gadgets. If you remember from our earlier discussion, the attacker uses these individual gadgets as complex instructions that do things like add two values together, load a value from memory, store a value to memory, or conduct a conditional branch. So the gadgets you have available correspond to individual operations. We looked at the gadget set in the program overall to ask, essentially, what operations does it support, and are these operations enough to accomplish an exploit? So initially our results were actually [00:31:15] pretty surprising. We took a look at the gadget composition and discovered that when we debloated software, we actually had some gadget counts that went up.
Which was very unexpected. From a naive perspective, when we cut source code out, we anticipate there will be less code in the binary, and hence the number of gadgets will always go down. Well, that was actually the case when we looked at the sum of gadgets using the existing metric that's commonly used in current publications, which is gadget count reduction. So for example, in all three of these [00:31:48] different packages, you'll note that the total number of gadgets always strictly decreases when we debloat. But if you break these down by type, you'll notice that for libmodbus we originally had only 99 JOP gadgets, but as we continued to aggressively debloat it, we actually introduced more and increased the availability of JOP gadgets. So we dug into this a little bit more and found out that debloating that removes code from software actually introduces gadgets and makes new ones. There are a couple of mechanisms by which this happens. The first is compiler code generation and compiler optimization. When we debloat code, we trigger different optimizations and different code generation choices within the underlying compiler. As an example, let's take a look at this function here, which is present in libcurl. [00:32:38] This code right here corresponds to a family of protocols that we said we don't need, so when we debloated the software we cut this piece out. The binary code that corresponds to roughly the bottom half of the source code is listed here on the left. All the instructions that you see there in bold are common to both versions, and you can see that by debloating this code we actually changed the control flow of this function significantly, and this entire block of code is removed [00:33:10] because the majority of it was debloated. However, this one particular instruction, this load effective address instruction, was retained.
In the debloated version you can see that the ordering of these instructions on the left is not the strict ordering on the right. The compiler made different decisions about when to execute instructions that are unrelated to each other in order to optimize performance, and ultimately this means that the gadgets aren't the same. If we take a look at the gadgets that the original version contributes, they're all limited to these last 4 instructions. Specifically, it contributes 3 gadgets: we can start at instruction 24 and go to the return, start at 25 and go to the return, and so on. Now, by eliminating this unconditional jump instruction, which normally breaks the chain of gadgets upward, we actually have more gadgets even though we have less code. If we take a look at the debloated version, we have to go all the way up to this function call [00:34:09] to get a curl version string, and from that instruction downward to the return we're generating gadgets. It's slightly difficult to read, but we can jump to this instruction and execute all of them as one gadget, and so on and so forth, so we actually have 5. So even though we got rid of a significant portion of code, we end up with more gadgets. Further, comparing this block with that block, only 2 of the gadgets are the same, the ones with the pop instructions before the return; everything else has a different ordering of instructions, which may or may not be useful to an attacker. So this partially explains why our gadget counts go up, but the more insidious cause is the nature of x86 and x86-64 itself. [00:34:59] Because x86 and
x86-64 have variable instruction lengths, when we're finding these instructions to execute we aren't bound to the original instruction boundaries. We can jump to any byte address and interpret those bytes as instructions until we reach one of these key gadget-terminating instructions, like return, and we can find new gadgets that are embedded within the gadgets that the compiler placed. We call these unintended gadgets. Let's take a look at that final block of code from our original example. On the left are the gadgets that are available when the bytes are interpreted at the original instruction boundaries. If we just drop the single byte value 0x48 and reinterpret the instructions starting at 0x83, we get a different instruction and hence a different gadget. So not only are all these gadgets contributed by the sequence of instructions, all these other gadgets are [00:36:00] made available as well. You can be more aggressive with this and try every individual byte location prior to the byte that corresponds to the return instruction, which yields some fairly different instructions from what we originally saw. The original instruction sequence added an immediate value to a register, popped a couple of registers, and returned, and just by having that instruction we've contributed a gadget that performs subtraction and then returns. Because we've changed which intended, or compiler-placed, gadgets are available, and by lengthening that sequence in our previous example, we have drastically changed the number of unintended gadgets that get created. So this turns out to be a very significant source of gadget introduction in debloated programs.
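The byte-offset idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the byte string is a hypothetical epilogue (`add rsp, 8; pop rbx; pop rbp; ret`), not the block shown on the slide, and the function name `candidate_gadget_starts` and the `max_len` window are made up for this sketch. It only enumerates candidate start offsets; a real gadget scanner would disassemble each candidate and discard byte sequences that do not decode to valid instructions.

```python
RET = 0xC3  # opcode byte of the x86 near return instruction

def candidate_gadget_starts(code: bytes, max_len: int = 8):
    """Yield (start, end) byte ranges that end in a ret byte.

    Every offset within max_len bytes before a ret is a candidate,
    whether or not it falls on a compiler-intended instruction boundary.
    """
    for ret_index, byte in enumerate(code):
        if byte != RET:
            continue
        first = max(0, ret_index - max_len + 1)
        for start in range(first, ret_index + 1):
            yield start, ret_index + 1

# Hypothetical epilogue: 48 83 c4 08 = add rsp, 8 ; 5b = pop rbx ; 5d = pop rbp ; c3 = ret
block = bytes.fromhex("4883c4085b5dc3")
spans = list(candidate_gadget_starts(block))

# Offsets where the compiler-placed instructions begin (known here by construction).
intended_starts = {0, 4, 5, 6}
unintended_starts = [start for start, _ in spans if start not in intended_starts]

for start, end in spans:
    kind = "intended" if start in intended_starts else "unintended candidate"
    print(f"offset {start}: {block[start:end].hex()} ({kind})")
```

Starting at offset 1 skips the 0x48 prefix byte, so the same bytes would decode as an operation on a 32-bit register instead of the 64-bit one, which is the kind of reinterpretation the talk describes.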
[00:36:53] So we took a look at each original package and the 3 variants we made of it, and we compared the gadgets in the original package with those in the debloated package to count the ones that were brand new and had been introduced. It turns out that gadget introduction is not rare, and it's also not limited to a small amount. We found that overall, across all gadgets, 34.5 percent to 58 percent of the gadgets in the debloated variants are brand new. This means that if we had 1,000 gadgets and we got rid of 500 of them, of the 500 that remain 250 are brand new, on average, for these particular variants. So we've already said that gadget count reduction doesn't tell us the whole picture, and introduction doesn't tell us the whole picture either. We still have to get to the matter of whether or not these gadgets are actually useful. Now unfortunately this is where things get tricky, and we get back to the situation where it's very difficult to measure, because we don't know the attacker's aims before they set out to exploit our program. We don't know which gadgets are actually useful, we don't know which ones they need, we don't know if we got rid of those, and we also don't know, when we introduce gadgets, whether what we made is something the attacker actually needs. So to further study this effect, we took a look at the population of special purpose gadgets and the expressivity of these gadget sets. Instead of looking at individual gadgets and trying to compare them, we took a look at the sets as a whole to figure out how they would affect the attacker's ability to construct an exploit. [00:38:29] And we found that the results were just about as cloudy as they were when we just used gadget reduction in the first place. OK, so let's take a look at special purpose gadgets first.
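Before moving on, the 1,000-gadget arithmetic from a moment ago can be worked through as a small Python sketch. The function name `gadget_metrics` and the string labels standing in for gadgets are assumptions made for illustration; a real comparison would use gadget sets extracted from the two binaries. The sketch shows why the two numbers differ: count reduction compares set sizes, while the introduction rate needs the set difference.

```python
def gadget_metrics(original: set, debloated: set) -> dict:
    """Compare two gadget sets by count reduction and by introduction rate."""
    if not original or not debloated:
        raise ValueError("both gadget sets must be non-empty")
    introduced = debloated - original   # gadgets that exist only after debloating
    removed = original - debloated      # gadgets that debloating eliminated
    return {
        "count_reduction": 1 - len(debloated) / len(original),
        "introduction_rate": len(introduced) / len(debloated),
        "removed": len(removed),
        "introduced": len(introduced),
    }

# Hypothetical data mirroring the example: 1,000 gadgets before, 500 after,
# and half of the 500 that remain are brand new.
original = {f"g{i}" for i in range(1000)}
debloated = {f"g{i}" for i in range(250)} | {f"new{i}" for i in range(250)}

metrics = gadget_metrics(original, debloated)
print(metrics)
```

In this made-up case the count drops by half, yet 750 of the original gadgets had to disappear and 250 new ones had to appear to get there, which a count-only metric cannot show.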
If we take a look at each particular debloated variant, we'll notice that in general when we debloat programs we're reducing the number of these special purpose gadgets that are available, and in some cases we found that these gadgets were introduced by debloating, meaning one of the newly created gadgets caused by debloating fell into one of these categories of critical gadgets. More interestingly, though, we found that in no case where a special purpose gadget was available did debloating eliminate all of them. So ultimately this means that we didn't improve security by debloating, because we did not deprive the attacker of a particular type of special purpose gadget that they would need. [00:39:33] And in one case, when we moderately debloated libcurl, we actually introduced a special purpose gadget that wasn't there. Now, we only introduced one, but the attacker only needs one, so in this case it's a significant security detriment, and in our other 8 scenarios we didn't find a compelling security improvement. So now let's talk about the expressivity of a gadget set. Expressivity measures the computational power of the gadget set, and typically the high bar for expressivity is Turing completeness, which means we have enough gadgets to support enough computational instructions that we can write any program we want [00:40:17] and can simulate a Turing machine.
However, practical ROP exploits have been demonstrated that don't require full expressivity, so we took a look at this at a couple of different levels. This is a very difficult thing to measure; there aren't a lot of automated tools that will tell you, for a given set of gadgets, how expressive it is. We found one research tool that will scan a set of gadgets, classify them by the type of operations they support, and then tell you whether or not that set of gadgets achieves a level of expressivity. [00:40:52] So we used this gadget scanner, and the caveat here is that it really only works for short gadgets and it only works for ROP exploits, so this is not a universal set of findings, but it's compelling nonetheless. These are the results of our analysis with respect to expressivity, which shows 3 expressivity levels: the first is the minimum expressivity needed to conduct a practical ROP exploit, the second is the minimum expressivity needed to perform that same exploit in the presence of address space layout randomization, and the third is [00:41:30] whether or not the gadget set supports Turing completeness. For each one of these individual data points we have expressed how close the gadget set comes to achieving that level of expressivity in terms of a proportion. So in the case of libmodbus, it has gadgets that satisfy 6 of the 11 necessary classes for [00:41:52] achieving the level of expressivity required for a practical ROP exploit. I will note here that none of these programs on their own have enough gadgets to achieve these levels of expressivity, but these gadgets are combined with all the gadgets present in all of their libraries, and in order to focus just on the effect that debloating had we took a look at this in terms of partial proportion, and we found that in 5 and 6 cases.
[00:42:19] Debloating actually added gadgets to classes that were previously unsatisfied, meaning that the debloated program is now more expressive than it was originally. We also had plenty of instances where the expressivity of the gadget set decreased, but it turns out not to be in a very large proportion of the scenarios. Ultimately, in 3 of our 9 scenarios debloating worked to reduce the expressivity. That's only one third of our examples, whereas, keep in mind, our original metric, which was just looking at how many gadgets we got rid of, was positive in all 9. [00:42:54] In 3 of the 9 scenarios we had some increases and some decreases in expressivity, 2 had no effect, and finally we had one scenario where debloating made the set of gadgets more expressive. So let's take a look at our analysis at a high level, combining all these different measures we've talked about. In the right-hand column we assess the overall impact debloating had on security, with respect primarily to these 2 measures that address whether or not an attacker can construct a useful exploit from this particular set of gadgets. We found only one scenario that was clearly positive, which is a far cry from 9 out of 9 scenarios achieving a positive gadget count reduction. In fact, we found one where debloating made the debloated variant less secure in a measurable way and had no appreciable benefits. The rest were either mixed, meaning some things were worse and some things were better, or neutral, where debloating ultimately had no effect.
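The per-scenario expressivity comparison described above can be sketched as follows. This is a minimal illustration, not the research tool used in the study: the class identifiers 1 through 11 are placeholders for the 11 necessary gadget classes of one expressivity level, and the function names `coverage` and `expressivity_change` are hypothetical.

```python
def coverage(satisfied: set, required: set) -> float:
    """Proportion of the required gadget classes that the gadget set satisfies."""
    return len(satisfied & required) / len(required)

def expressivity_change(before: set, after: set, required: set) -> str:
    """Label the effect of debloating on one expressivity level."""
    gained = (after - before) & required   # classes newly satisfied after debloating
    lost = (before - after) & required     # classes no longer satisfied
    if gained and lost:
        return "mixed"
    if gained:
        return "more expressive"
    if lost:
        return "less expressive"
    return "no effect"

# Placeholder class IDs; the original variant satisfies 6 of the 11 classes.
required = set(range(1, 12))
before = {1, 2, 3, 4, 5, 6}
after = {1, 2, 3, 4, 5, 7, 8}   # hypothetical: class 6 lost, classes 7 and 8 gained

print(f"{coverage(before, required):.2f} -> {coverage(after, required):.2f}")
print(expressivity_change(before, after, required))
```

Tracking which classes changed, rather than only the proportion, is what separates a mixed scenario from one that is simply more or less expressive.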
[00:44:02] So ultimately what this says to us is that we need an alternative to gadget count reduction. We can't just count the gadgets in the debloated variant and say, hey, we got rid of this many, we got rid of 20 percent of the gadgets, so my program is 20 percent more secure. That value has no bearing on it whatsoever. In fact, our data shows, as proof by counterexample, that debloating can make security worse, and a lot of the time it can fail to improve it. So what we need to do when we debloat programs is use a more sophisticated set of measures. We've identified these 4 measures primarily, which include gadget count reduction in combination with the other measures we've discussed, as well as an analysis of external dependencies to determine the actual security posture before and after. [00:44:49] One fact that makes the external dependency contribution particularly interesting is that the original paper on return-oriented programming showed that libc contains all the gadgets necessary for Turing completeness.
It is complete on its own, so any program, regardless of how well you debloat it, if it links libc it's got a complete set of gadgets. In that case it didn't matter that you debloated the original program at all; you haven't reduced the expressivity of the overall program once it's loaded into memory. A key thing here: we don't claim this is an exhaustive list. We say that there's significant research needed in this area to figure out which measures help us identify how debloating impacts security. So we put our money where our mouth is and tried to improve upon the current method for debloating, which is largely automated: the user provides a specification, we produce a binary, we confirm that we have fewer gadgets, and we say we've improved security. Based on our analysis, we propose another approach, where a human analyst with a set of tools conducts an impact assessment on whether or not debloating actually improved security, and if it didn't, tries again. [00:46:07] So we conducted a case study on our one negative debloating scenario, and we found that if we went back and selected just a few extra features, just 3 out of the total set that we debloated, and told the debloater to leave those in, we actually turned this negative into a positive. We eliminated the increase in the number of system call gadgets that were available, and we also [00:46:36] reduced the expressivity of the gadget set available after debloating, as opposed to increasing it. So the things I want you to take away from this talk before we get to question and answer are, primarily, that the relationship between software debloating and software security is very complicated, and superficial measures can give us at worst a wrong sense of security and in many cases a false sense of security.
[00:47:07] Debloating for security is really not like debloating for performance. Performance is very straightforward: if we get rid of code there's just less code, so that's a good performance benefit. Getting rid of more code doesn't necessarily mean better security. In fact, in our one case study we found that by debloating less we achieved an improvement in security and eliminated the instance where we made it worse. So the key takeaway is also not that debloating is bad or doesn't work to improve security, because we found instances in our scenarios where it actually did. The problem is that you need to look at this and potentially be willing to iterate on your debloating specification many times in order to find a specification that improves security as opposed to making it worse. OK, at this time I'd like to open up the floor for questions for the last 30 minutes or so, if anybody has any. Yes? [00:48:13] Well, right now the answer is maybe. JOP and COP attacks are very complicated and difficult to pull off. Well, not right now, but there's significant research being done to use automated tools like SMT solvers and machine learning to reduce the effort that you have to put into constructing these more complicated exploits, so it's easier for a person to actually find them. In fact, I would not be surprised if in the next 5 to 10 years we have tools that just automatically scan a binary and come up with a sequence of gadgets that works and give you a piece of shellcode that you have to inject into that program. It's certainly not outside of the realm of possibility. [00:49:01] It was just presented in 2016, so it's unlikely, and I am not aware of any COP exploits in the wild.
Yes, that's actually a very good point, and it's one of the factors that led us to develop a human-in-the-loop, iterative model, so you can take those kinds of considerations into account. If you're looking at this from a very practical standpoint and saying, OK, I really don't think that JOP or COP attacks are actually useful, then when I do the security impact assessment I can essentially drop all of the analysis of those gadgets and focus just on ROP gadgets. That means that when I make the decision of whether or not to accept this debloated variant, I can say, OK, this made security worse for JOP and COP gadgets, but I don't care because it's not realistic; I'm going to take it anyway. Or, if you decide for your particular application that you are really worried about those, because, let's say, you implement a significant number of ROP defenses but you have no JOP defenses, you might be able to optimize your debloating for that particular scenario. So you bring up an interesting point, and it's one that is in some ways lost by the automated model, because it's looking strictly for what reduces the number of gadgets to the lowest number. Yes? [00:50:43] It's possible. These techniques are all relatively recent and there are a lot of unexplored areas. In particular, there are a lot of concerns with Chisel producing unsound programs, because the user uses a test script to identify code that they don't want to keep, as opposed to identifying it manually in source code. It's possible that if you write a test script that doesn't exercise a key piece of your security posture, like checking a password for example, that code could be completely removed. So for example, if your test script examines 15 different functions of your program but it doesn't actually
[00:51:25] exercise any of the logon code, the iterative, feedback-directed framework could remove all the code associated with the logon page entirely, allowing you to access potentially privileged instructions. So there are a lot of concerns about security from the negative point of view with debloating as well, and that remains open work for the future. [00:51:50] Any more questions? Sorry? That's an excellent question and I would really like to know the answer to it. One of the challenges with working in the area of software debloating is that it's relatively new. Of the 3 techniques that we talked about, none were available when I first started this research, which is why we had to build our own debloater and use that one. [00:52:25] Since I started this research Chisel has been made open source, but only in the most recent few months, and the other methods exist only on paper at this point; the authors haven't put their code out in public for other researchers to replicate. So all I can really speak to is the instances that we debloated and what we propose. We found that in the general case debloating most of the time doesn't give you any major improvement in security, and in rare cases it doesn't improve security and actually makes it worse, but there are some cases, I think 3 out of 9 scenarios, where we did improve security to a small degree, which is, like I said, a far cry from what we might imply if we only look at gadget reduction and say, hey look, the gadgets went down so it's more secure. Well, it doesn't really have to do with inefficiency on the developer's part, because they aren't involved in the actual debloating process; we select the features to debloat. The real core of the problem is that we can't predict what the compiler is going to do with that code before it makes a binary, and because we care about gadgets at the binary level
and a lot of debloating focuses on levels above the binary level, we have this entire segment of the software development life cycle where we just can't predict what's going to happen, and the ultimate result is that the results aren't what we expect. So future debloating efforts might focus on doing this directly in the binary in order to reduce the number of unexpected gadget introductions, or they might involve using the existing techniques that we have in an iterative fashion to ensure that we're actually getting good results, or it might be something that hasn't been proposed yet. [00:54:24] That remains an open area. Most compiler operations on source code are aimed at performance, so there are not very many compiler optimizations that I'm aware of that take security into account at all. That's another open area of work that's being opened up by research into software debloating. Yes, I do think it makes a difference, but for our particular experiments we kept the build configuration the same between variants to eliminate that as a confounding variable, and we used the default build configuration that's provided with these packages to simulate a realistic scenario. So what we didn't do is take the code and put it through some obscure compiler optimizations to see how that affects things. We just used the exact makefile or CMake build configuration that was provided with these packages, to simulate what an average end user would get if they debloated their code. So that also remains open work for the future: exploring whether these compiler optimizations, when they optimize for performance, are de-optimizing for security, and how we can potentially strike a balance between the 2. Any other questions? [00:56:00] Well. Now, that's entirely possible. So yes, we are getting into a bit of a
bit of a divergent path. The answer at the end of this research line might just be that when we debloat, the security benefits are limited to a particular scope, and if we want to achieve real software security through software transformation then perhaps it needs to take on a different form. As opposed to trying to remove code we don't need, perhaps we address the problem in a different way. But that is all [00:57:00] research that exists in different veins, or that we leave for future work. All right, any additional questions? OK, thanks everybody for coming, appreciate it. Thanks.