Today is a very interesting day. We're actually going to have three presenters previewing the work that they have published at the ACM CCS conference. Anyone who works in system security, or cybersecurity in general, knows there are four top conferences in cybersecurity, ACM CCS being one of the four flagship conferences in our field. There are additional ones for cryptography and other disciplines within cybersecurity, but today is all about system security. I'm going to give the floor first to a PhD student here at Georgia Tech, Jonathan Fuller. He's going to talk to us about his work on malware analysis and botnet takedowns.

Can you guys hear me? My name is Jonathan Fuller. I'm a PhD candidate in the School of Electrical and Computer Engineering, and I'm a part of the Cyber Forensics Innovation Lab. The title of this talk is "C3PO: Large-Scale Study of Covert Monitoring of C&C Servers via Over-Permissioned Protocol Infiltration."

Botnet disruptions and takedowns are driven by covert monitoring: before any action is taken, to profile the botnet, and after, to validate the success of counteraction attempts. There are four main milestones to any botnet disruption or takedown. The first is discovery of a botnet that is sufficiently large to warrant a response. Next is monitoring, or profiling, the botnet to collect information about it, warrant a legal response from authorities, and gather details to inform counteraction attempts. Next is the actual counteraction of the botnet, where the data gathered through profiling is used to come up with techniques to counteract, disrupt, or take down the botnet. Then, finally, monitoring continues to validate the efficacy of counteraction attempts and to observe the botnet for any resurgence.

For example, after discovering TrickBot in 2016 and tracking its progression to one of the world's most prolific malware infections, Microsoft sought to counteract TrickBot to protect the 2020 US elections. To profile TrickBot, Microsoft collected a staggering 125,000 samples. They analyzed these samples for evidence, identified mappings between communication routines, and located 128 C&C servers worldwide. Using this information, they were granted legal permission to counteract TrickBot, and they were also able to collaborate with international organizations worldwide to disrupt the TrickBot botnet. After the counteraction attempts, they continued to monitor and observed that 120 of the 128 servers were successfully disabled. The remaining eight they were able to keep monitoring, to identify any other avenues for disruption of the botnet.

There are two approaches, in the literature and in practice, to monitoring botnets. The first is passive monitoring, which is done by injecting a sensor node into the botnet to observe C&C-to-client transactions. This approach has been used since it's not easily detected. However, it requires a full reverse engineering effort to build and maintain sensor nodes for continual acceptance into the botnet. It also requires lots of time on target, or time as a sensor, to gather accurate information about the botnet, and there are several botnet resilience methods, like reputation schemes, that have been developed specifically to thwart sensors. Still, this low-and-slow approach is widely used.
Conversely, there is active monitoring, where we have to reverse engineer the malware to gain a deep understanding of it. Recall that Microsoft had to analyze 125,000 samples to identify the C&C servers. That's one approach. Another is to locate a C&C server vulnerability and exploit it to achieve access and monitor that server. These approaches usually result in more accurate information gathered about the botnet. However, not only is vulnerability identification tedious, active monitoring is easily detected, as aggressive techniques often prompt bot orchestrators to use some type of defensive evasion capability.

So both approaches have their benefits, like we mentioned: stealth and accurate information. But there are downsides that make them prohibitively tedious, easily detected, and sometimes inaccurate, ultimately reducing the chances of a successful botnet disruption or takedown.

So what if we could achieve direct C&C access under the guise of a real bot? We have an alternative approach that seeks to do exactly that. Bots are trusted agents of the C&C server. In fact, attackers are entirely dependent upon information exfiltrated by bots to gain situational awareness into the victim's network. To enable command and control, standardized protocols are increasingly being used by bots for file transfer, data storage, and message-based communication. However, we found that a lot of these standardized protocols are over-permissioned. That means they provide feature-rich and unfettered access to the C&C server, beyond the subset of features implemented by a given client.

This prompted two key insights. The first is that over-permissioned protocols, combined with the trust C&C servers place in their bots, expose a scalable opportunity for covert monitoring of C&C servers through protocol infiltration. But even though this allows us to infiltrate the C&C server, do we just want to access it and fumble around, or do we want insights into the C&C server so we can perform a more targeted search? This prompted our second key insight, which is that we can figure out what to expect when accessing a C&C server by studying the malware alone: we can infer the C&C server's composition and content. These key insights, if achieved, allow us to covertly monitor the C&C server, or, simply put, adopt all the benefits of passive and active monitoring while discarding the cons.

But can authorities really rely on this as a scalable approach to covertly monitor C&C servers? Yes, they can. We collaborated with Netskope, a leading Secure Access Service Edge provider, built a measurement pipeline, and studied 200,000 malware samples collected over the last 15 years. Or I should say, they were collected for this study, but they span the last 15 years. We found that roughly one in three of these malware use one or more over-permissioned protocols. This proves really promising, especially when considering combating modern malware. Better still, we found a steady increase in over-permissioned protocol use throughout the last 15 years. Pay special attention to 2015 to 2020, where we found the majority of over-permissioned protocol use, accounting for over 80 percent of it. The trend shows that our techniques are not only reliable now, but will be well into the future. But how exactly did we come to measure these numbers?
We developed C3PO, a malware study pipeline that takes a malware binary as input, instruments the malware, and extracts memory images for later analysis. C3PO identifies whether it is an over-permissioned bot, identifies the protocols used and the infiltration vectors, based on key insight one, which allows us to spoof bot-to-C&C communication, and then identifies the C&C monitoring capabilities, based on key insight two, which infer the C&C server's composition and contents.

The first step is dynamic memory image extraction. We know that malware often use obfuscation and packing techniques, which constrain analysis and inhibit large-scale studies. C3PO instruments the malware to bypass obfuscation and packing, and then it hooks APIs of interest for dumping memory images. So instead of analyzing the packed malware, we analyze its payload captured from memory. This is based on two observations. First, irrespective of the packing scheme, when the malware is unpacked, it connects to its C&C server via network-related APIs. Second, modern malware have multiple unpacking layers, so capturing multiple memory images at various layers allows us to reconstruct the malware payload. This approach, as you can see, gives us multiple memory images per malware.

Then, taking these memory images, we can identify whether this malware is an over-permissioned bot or not. For each memory image, C3PO constructs a CFG starting at the point the memory image was taken, out to all reachable code. It then combines the CFGs into a single CFG representation of the entire malware. C3PO identifies overlapping blocks and merges them, ensuring no duplication. C3PO then traverses the CFG to identify invocations of protocol use. To ensure we account for nuances in protocol implementations, we split how they can be implemented into two ways: high-level implementations and low-level implementations. In this case, the example is the over-permissioned Open Database Connectivity protocol via its high-level implementation, using protocol-specific APIs. Conversely, a low-level implementation uses network-related APIs in conjunction with a protocol-specific keyword or token that prefixes the message sent. In this example, we see the IRC protocol used in conjunction with the send API, sending NICK and a message, and then passing the message to the C&C server. This is considered a low-level protocol implementation. Based on our own analysis, combined with reports from industry experts, we identified eight over-permissioned protocols that we consider in the study. But C3PO is an extensible framework, so more protocols can be added as needed.

After we identify the over-permissioned bots, we target our key insight one, which is to identify the infiltration vectors via iterative selective symbolic execution. Of note, during this phase C3PO only targets those APIs that take the bot credentials used to connect to the C&C server. In this illustration, it's the SQLConnect API. We want to use symbolic execution to extract the arguments of SQLConnect, but we would likely encounter path explosion. So how are we going to do this? We decided to use backward slicing from the SQLConnect API to identify where the parameters are concretized in the malware. However, there is still the possibility of encountering symbolic loops if we execute forward along this backward slice.
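Stepping back to the protocol identification step for a moment, here is a minimal sketch, not the authors' implementation, of the low-level detection idea: flag a protocol when a network-send API is invoked with a message that begins with a protocol-specific keyword, such as IRC's NICK or PASS. The keyword lists, API names, and the (api, resolved string argument) input format are illustrative assumptions.

```python
# Minimal sketch, not the authors' code: flag a low-level protocol
# implementation when a network-send API is called with a message that
# begins with a protocol-specific keyword. Keyword lists, API names, and
# the (api, resolved string argument) input format are illustrative.

PROTOCOL_KEYWORDS = {
    "IRC": ("NICK ", "USER ", "PASS ", "JOIN ", "PRIVMSG "),
    "FTP": ("USER ", "PASS ", "STOR ", "RETR "),
    "SMTP": ("HELO ", "EHLO ", "MAIL FROM:", "RCPT TO:"),
}

NETWORK_SEND_APIS = {"send", "WSASend", "sendto"}

def low_level_protocols(call_sites):
    """call_sites: iterable of (api_name, resolved_string_argument) pairs
    recovered from the merged CFG and memory images."""
    found = set()
    for api, arg in call_sites:
        if api not in NETWORK_SEND_APIS or not arg:
            continue
        for proto, keywords in PROTOCOL_KEYWORDS.items():
            if arg.upper().startswith(keywords):
                found.add(proto)
    return found

# Example: a bot that builds "NICK <victim-id>" and hands it to send().
print(low_level_protocols([("send", "NICK infected-host-42\r\n"),
                           ("connect", None)]))   # prints {'IRC'}
```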
So if we symbolically execute constrained by the slice, we may still encounter symbolic loops or computationally complex functions. We want to further constrain our analysis to ensure we have a better chance of success while avoiding path explosion. So instead of starting at the program entry, C3PO begins at the nearest function and explores the malware code up to the SQLConnect API. If the arguments are concrete, analysis ends. In this case, we can see that two of the three were captured, but the third wasn't, so it's not concrete. So what does C3PO do? It iteratively expands along the CFG, still constrained by the backward slice, and then resumes execution. If the arguments are concrete at this phase, execution ends. Else, it just keeps iteratively expanding along the backward slice through the control flow graph. Once they're concrete, we have what we need to infiltrate the C&C server.

So now that we can access it, we could poke around and start gathering information. But let's remember, we want to remain covert and go undetected. So we want to leverage our key insight two, the C&C monitoring capabilities. Remember, the monitoring capabilities reveal the contents and composition of the C&C server, and this is derived from malware analysis alone. We can think of the C&C server as a dark house, the infiltration vectors as the key, and the C&C monitoring capabilities as a map. When we infiltrate the C&C server, we don't have to fumble around or hunt for information, because we have that map. This map is, in essence, six categories of 16 capabilities. Taking the sequence of APIs found in the malware, C3PO maps that sequence to one of our capability models. Like I said, this infers the data exfiltrated to the C&C server. Here it reveals a screen capture capability, which falls under the live monitoring category.

So now that we're armed with our infiltration vectors and C&C monitoring capabilities, C3PO can proceed to covert monitoring of C&C servers via over-permissioned protocol infiltration. As an example, we present a case study of one malware family from our dataset. C3PO analyzes the malware and finds the use of the over-permissioned File Transfer Protocol. C3PO then begins iterative selective symbolic execution to extract the infiltration vectors. In this instance, that's the username, password, hostname, and port number. Before infiltration, we need the map of the C&C server: the C&C monitoring capabilities. C3PO conducts that analysis and finds the categories of victim profiling, live monitoring, and file exfiltration. Of note, we expect to see screenshots on the C&C server, because the live monitoring capability corresponds to a specific screen capture capability in the malware. Using the infiltration vectors, we achieve covert access, and with the monitoring capabilities we can conduct a targeted search of the C&C server. We find that the C&C server contains 47 directories and around 2,500 files. Of those files, roughly 44 percent are PNG files, which makes sense, because that's what was inferred via the C&C monitoring capabilities we identified from the malware.
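Returning to the iterative selective symbolic execution step for a moment, here is a rough, hedged sketch of the "start near the credential API and expand outward" idea using angr. The binary name, the addresses, the outward caller list, and the register holding the credential argument are all placeholders, and the backward-slice constraints that C3PO actually applies are omitted.

```python
# Rough sketch of iterative expansion toward a credential API using angr.
# BINARY, the addresses, the outward caller list, and the argument register
# are placeholders; C3PO's backward-slice constraints are not modeled here.
import angr

BINARY = "memory_image.bin"                       # hypothetical dumped payload
SQL_CONNECT_ADDR = 0x401230                        # hypothetical SQLConnect call site
CALLERS_OUTWARD = [0x401100, 0x400F80, 0x400C00]   # nearest enclosing function first

proj = angr.Project(BINARY, auto_load_libs=False)

def try_from(start_addr):
    """Explore from start_addr to the credential API; return the first
    argument if it is concrete there, else None (meaning: expand further)."""
    state = proj.factory.blank_state(addr=start_addr)
    simgr = proj.factory.simulation_manager(state)
    simgr.explore(find=SQL_CONNECT_ADDR)
    if not simgr.found:
        return None
    found = simgr.found[0]
    arg = found.regs.rdi        # placeholder for the first credential argument
    if found.solver.symbolic(arg):
        return None             # not concretized yet at this starting point
    return found.solver.eval(arg)

for caller in CALLERS_OUTWARD:  # iteratively expand along the CFG if needed
    value = try_from(caller)
    if value is not None:
        print(f"credential argument concretized: {value:#x}")
        break
```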
While additional malicious binaries are not something we can infer from the data extracted from the malware, given our covert access we were able to explore the C&C server further and found an additional nine malicious binaries. This shows us that the C&C server not only hosts stolen victim information, but also malicious binaries intended to perpetuate additional cyber crimes. So covert monitoring successfully revealed the composition and contents of this C&C server. We can use this information for disclosure and victim data recovery, we can identify the geographic scale of the botnet from the information stored on the C&C server, and continued access allows us to gather more information that might be useful toward disruption or takedown later on down the road.

We then deployed C3PO on our 200,000 malware samples, and we first discuss the over-permissioned protocol landscape. We identified roughly 62,000 over-permissioned bots. We also took a more granular approach to count the protocol identifiers, because we wanted to know how the protocols were implemented across the board: there are 65,000-plus instances of protocol invocations within the 62,000 malware we identified. We also identified 8,500 malware families that use over-permissioned protocols. Of those, FTP and TFTP are the most popular, with over 55,000 instances of over-permissioned protocol use across over 8,000 families. And we noticed that there is an average of four protocol identifiers per malware. This tells us about the number of vantage points we have: if the malware is using FTP put-file and get-file, we know that we will have at least those capabilities when infiltrating as bots.

It's not surprising that we see FTP in our earliest samples; it's been around for a long time. What is surprising is that it is still being used, even though FTP has known insecurities. Similarly, IRC, with 10,000-plus uses across over 400 families, is also observed in our earliest samples, and it is similarly surprising because we know IRC is a frail, centralized architecture. But malware authors continue to prefer ease of use and efficiency over security, and that's why we believe these over-permissioned protocols are still used today. Interestingly, I want to draw your attention to one odd spike in MongoDB use. We found this interesting, so we did some more digging. It matches a Chrome password stealer that was found and quickly reported upon, and then it disappeared; we believe it disappeared because it was quickly discovered. But MongoDB, like FTP, IRC, and the other over-permissioned protocols, also provides that ease of use, so we believe we will see MongoDB resurface over the next years.

Now we delve into the monitoring capabilities. Remember, these allow us to infer the composition and contents of the C&C server. Across the 62,000 over-permissioned bots, there is an average of seven capabilities per bot. We find that bot capabilities, as expected, are more generic in nature: file exfiltration, victim profiling, and live monitoring. FTP and IRC bots also account for the most capabilities, since those protocols rank the highest in our dataset. And targeting capabilities are limited: capabilities that are specific to particular software systems are not as prevalently used, which is expected, because they don't have the greatest effect across the greatest amount of malware. The last thing we evaluate is the top-ranking families.
These are the families that show up most in our dataset and use over-permissioned protocols. The top family ranks the highest with over 9,000 samples. It remains generally consistent in its use of protocols and capabilities, and it is one of the most recently reported with respect to the other families. Next, we see that FTP is the most used protocol in these top families, which is as expected from our dataset, with the exception of Wabot, which is an IRC bot. What is not expected, though, is browser password stealing being used in nine of the ten families; this deviates from the norm based on what we previously saw.

Our findings show that over-permissioned protocol use is increasing, making our techniques, as implemented in C3PO, scalable and long-lasting by enabling bot infiltration. And given the prevalence of the monitoring capabilities identified, we can perform targeted searching of infiltrated C&Cs, ensuring our approach remains covert but also accurate. There is much more in the paper: another case study, an adversarial robustness evaluation, packed malware, ethical considerations as we seek to access the C&C as a bot, and plenty more highlights. I want to thank our collaborators at Netskope and our team at the Cyber Forensics Innovation Lab. Go read the paper when it's out, and let me know if you have any questions.

All right, let's thank our speaker. While our next speaker is setting up, I think we have time for maybe one quick question. Anything from the room for John?

I'm Joelle, and I was curious: in the Microsoft example you gave at the start, you said they had 125,000 samples. Why did it require so many?

Yeah, so why did it require that many samples to extract the C&C information? Microsoft didn't go into the specifics of how they extracted the necessary information. They just noted that they collected 125,000 TrickBot samples and, in collaboration with other industry organizations, analyzed this malware to try to locate mappings to communication routines to identify the C&C servers. So I'm not exactly sure why they needed that many, but that's the number they reported using to identify the servers and take down the bot. Good question.

Maybe time for one more. Anything else? Any other thoughts?

This is unfortunately a legal question. Why is it legal for you to be able to hack into the botnet? What's the basis for you getting the data, and why isn't that a Computer Fraud and Abuse Act violation?

Yeah, that's a great question. We take a lot of precedent from Bernstein and other research in this area, which argues that using the channel provided by the C&C orchestrator, and riding that channel back, does not infringe on the Computer Fraud and Abuse Act. Furthermore, we performed what we call metadata analysis. We never opened the PNGs to see what was in them. For the nine malicious binaries, we just computed hashes and then scanned those hashes on VirusTotal to determine whether they were malicious. We did not take anything from the server, and we did not influence any of the C&C server's operations. We just used the channel provided by the C&C orchestrator to view the information on the C&C server in our case study.
The results may seem limited, but that's the information we were able to extract doing only metadata analysis, versus, say, risking tortious interference with the C&C. Good question, though. That's a backup slide, actually. Yeah, good question.

Right, thank you so much, John. Okay, coming up next we have Carter Yagemann. He's a PhD candidate as well here at Georgia Tech, in the School of Cybersecurity and Privacy. I'll let you take it away.

All right. Okay, thanks for the introduction. Good afternoon, everyone. My name is Carter, and today I'd like to share with you the work that my colleagues and I have been doing on automated bug hunting with data-driven symbolic root cause analysis.

To kick this talk off, I want to start by pointing out that, due to the rising cost of successful cyber attacks, companies no longer rely on purely reactive defenses, like network firewalls and host-based IDS, to prevent attackers from exploiting vulnerabilities in the software that they depend on. Instead, what we're now seeing is that, in addition to these defenses, companies now rely on dedicated security analysts who engage in proactive bug hunting, using techniques like binary fuzzing and binary symbolic execution, to proactively discover bugs that are lurking in the software that they rely on. They then report these to the developers in the form of bug reports, so that patches can get issued and deployed before an attacker even has a chance to attempt exploitation in the wild.

The problem, though, from the security analyst's perspective, is that to perform bug hunting on a piece of software they have to overcome several challenging technical problems. They have to figure out what inputs are able to drive the program, how to reach deep behaviors in these programs, how to recognize when they've triggered a bug, and then they have to report this bug back to the developer. And if we look at the dominant techniques for doing this kind of analysis, like fuzzing, we see that fuzzing leans heavily on the analyst already having familiarity with how the target program works. These tools make assumptions about how the bug will manifest itself, typically in the form of a program crash, and what you get from this process is an example crashing input that can be shared with the developer. And while these kinds of techniques have certainly been effective at raising our awareness of vulnerabilities out in the wild, there's a miscommunication that happens when it then comes time to explain this to the developers.

For example, this is a public email thread for the Linux kernel project. We see here an analyst pointing out that they found and reported a bug using a fuzzer, and the developer responded: great, can you also submit a fix as well? We are drowning in fuzzing reports, and just throwing them at us doesn't really help anyone here. Thanks, Greg. And for those of you who don't know who Greg is, Greg is one of the lead developers of the Linux kernel project. So how can we do better? How can we get Greg the information that he needs so that he can fix this bug, as opposed to throwing it on the back burner?

What we realized in this work is that, to address Greg's difficulties, we need to stop relying on crash reports to inform developers of problems, because crash reports tell them how to crash their programs but leave the burden on them to then figure out how to actually fix it.
Instead, what we need to be transitioning towards is root cause analysis, doing some preliminary analysis for them to ease their burden. The problem, though, is that to do this analysis we need more data about what's going on inside the program when a bug manifests itself. So where can we get this data?

To start this work, what we realized is that most modern-day commodity processors have a feature that a lot of people aren't familiar with, called processor tracing. With processor tracing, you can configure a CPU to output a log of all the control flow events that happened during runtime. And once you know all the control flow events, you can actually recover every instruction the CPU executed for a given user program. Since this is a hardware-backed feature, we can do this recovery process with only about a 7% runtime overhead. What's interesting about an overhead that low is what it means in terms of where your processor traces can come from. They could come from a fuzzer, if you already have one set up. But let's assume the analyst doesn't already know how to drive the program. In that case, you can also get these traces straight off of production servers or even end-user devices. This allows the security analyst to do an analysis without actually needing to know how to drive the program themselves; they're essentially piggybacking off of the end users. And if you think I'm speculating here about the feasibility of doing something like this, grab a copy of Windows 10 right now that's up to date and look at the installed drivers. You'll see one of them is called ipt.sys. This is a driver Microsoft implemented and deployed, right now, on your devices, for collecting PT data. So this is actually quite viable.

Once we have these real executions, we can then use them for binary symbolic execution. The idea here is that we can take a concrete memory snapshot at the beginning of the execution of our program. We can symbolize the inputs to this program, as I'm showing here for one of the command line arguments. And then we can take one of our traces and follow that same path in our symbolic analysis. This gets us deep into the program's logic, because this is a real execution. We can then look at the instructions that were executed to reach this point, and we can start asking ourselves: were any of these instructions interesting from a security perspective? For example, maybe we saw a rep movs instruction. What's interesting about rep movs is that it's a complex instruction for copying data from one buffer into another. So a natural question you might have is: can this copy operation overflow? Well, now that we have this path, we can go off of it and explore an alternate path, namely one where we copy as much data as possible given our symbolic constraints. And we might end up with a state like this, where now our program counter has become symbolic. Because at the beginning we only symbolized inputs to this program, what's happening here is that, by choosing the values of this command line argument, the attacker gets to directly decide which virtual address the CPU will execute next. This is indicative of a control flow hijack, a serious vulnerability that should be fixed.

So to go into a bit more detail, I now want to switch to a real-world example. This is some real code that I've taken from a program called ntpq.
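Before going into the ntpq example, here is a small hedged illustration of the hijack check just described: a control-flow hijack shows up as a symbolic program counter that symbolic program inputs can steer to more than one address. The state object and register name are assumptions about an angr-style setup, not the authors' exact implementation.

```python
# Illustrative check only, not the tool's implementation: after exploring the
# "copy as much as possible" path, a hijack appears as a symbolic program
# counter that symbolic program inputs can steer to different addresses.
# `state` stands for an angr SimState produced by that exploration.

def looks_like_cf_hijack(state) -> bool:
    """Return True if the instruction pointer is attacker-influenced."""
    ip = state.regs.rip                  # x86-64 program counter
    if not state.solver.symbolic(ip):
        return False                     # concrete jump target: nothing to flag
    # If the solver can produce two or more distinct targets, the symbolic
    # inputs (here, a command line argument) decide where execution goes next.
    return len(state.solver.eval_upto(ip, 2)) > 1
```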
As it turns out, there was a vulnerability in 2018 that could cause a buffer overflow and a control flow hijack, like I alluded to on the prior slide. In this case, the way we address this is: suppose that we have a trace that goes through this function, and we notice this for-loop with this copy instruction inside of it. So we decide to stress it by iterating as many times as we can, and we end up with a state like this, where, if you look at the return pointer, which I'm representing on the last line of the state, you see that it's equal to a symbolic variable, in this case s_258. And s_258 came from one of the command line arguments.

Once we've identified this memory corruption, in this case a corrupted return pointer, we can go further to help Greg and start doing a root cause analysis. First, we can find the blame state, which is the state that actually corrupted memory. Not surprisingly, in this case the blame state maps to this copy instruction. We can then ask ourselves: what is the nearest control dependency dictating the execution of this copy instruction? As it turns out, it's the conditionals inside this for-loop. We can then ask ourselves: what would it take to not corrupt this return pointer? Well, to not corrupt the return pointer, we have to go back to our last memory-corruption-free state, and we're presented with two possible paths through this function. One path goes around the for-loop again, but that's what corrupted the return pointer. So let's take a look at the other path, which is to exit at this point. When we exit the loop and analyze the end of the function, we end up with a state like this, where now the return pointer is no longer corrupted: it maintains its original value, which I'm representing here as the constant C1.

As a result of our analysis, we now have these two states, one that triggered the bug and one that didn't, and we can ask ourselves what's different between them. As it turns out, the most concise difference between these two states is that in the buggy state we don't see a particular ASCII character at a particular offset of one of the buffers, hname, whereas in the bug-free state we do. And what you'll notice is that these two constraints directly contradict each other. So from this contradiction, we can propose a preliminary patch, namely: as long as we ensure that this particular character occurs within the first 257 characters of hname, this overflow won't occur.

In terms of what this looks like in practice, here's the raw output from our system. I've underlined a couple of key lines here, namely the detection point, finding the blame state, and then proposing the preliminary patch. Here it's expressed at the binary level, because we implemented our prototype to work directly on program binaries. But then, using debug symbols, we can propagate this up into source code and make it pretty, to generate a bug report like this, where now the developer has a clearly labeled root cause and a preliminary patch to take a look at. Now our developer, Greg, is happy, because he has a better idea of what's actually wrong with the program, as opposed to just how to crash it.

So to quickly recap: we're using production processor traces as our inputs. We're using symbolic execution to achieve our depth. We're then leveraging those constraints to detect our memory corruption.
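To illustrate the "contradicting constraints" comparison from the hname example with concrete, made-up details, here is a toy z3 sketch: hname is modeled as 300 symbolic bytes, and 0x3D stands in for the particular character the talk mentions. The contradiction between the buggy and bug-free constraints is what points at the preliminary patch condition; this is not the tool's actual output.

```python
# Toy illustration of comparing the buggy and bug-free states with z3.
# hname is modeled as 300 symbolic bytes; DELIM is a stand-in for the
# particular ASCII character referred to in the talk.
import z3

hname = [z3.BitVec(f"hname_{i}", 8) for i in range(300)]
DELIM = 0x3D

# Bug-free path: the character appears within the first 257 bytes of hname.
bug_free = z3.Or([b == DELIM for b in hname[:257]])
# Buggy path: it does not, so the copy loop keeps running past the buffer.
buggy = z3.Not(bug_free)

solver = z3.Solver()
solver.add(buggy, bug_free)
assert solver.check() == z3.unsat   # the two states directly contradict

# The contradiction suggests the preliminary patch condition: require the
# character within the first 257 characters of hname before copying.
print("candidate patch: ensure DELIM occurs in hname[:257]")
```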
And then we're also using those constraints to get our root cause report and our preliminary patch. So we've eliminated the need for the analyst to be familiar with the target program, we've eliminated the assumption that the bug will manifest itself in the form of a crash, and we've given Greg more information to help him out.

In terms of a prototype, we implemented this for 64-bit Linux. We use angr as our symbolic execution engine, for those of you who are interested. And we created symbolic models for a number of prevalent types of vulnerabilities, namely control flow hijacking stemming from stack and heap overflows, and temporal memory safety violations like use-after-free, and a couple of others. We took 15 real-world Linux programs that had been evaluated in prior work, and we collected traces of these until we saw convergence in terms of basic block code coverage. That took anywhere between 2 and 300 traces per program, depending on the size of the program. Each of our processor traces encodes anywhere between 1 second and 30 minutes of wall-clock execution time, depending on the type of program: web servers run longer, things like compression utilities run shorter. We then fed these traces into our analysis, which took anywhere between 1 second and 4 hours to complete. What we ended up with was a total of 39 vulnerabilities detected in this dataset. Since these are programs that had been previously evaluated in research papers, 31 of these were vulnerabilities people were already familiar with. However, in eight cases we actually found novel zero-day vulnerabilities, despite these being programs that had been previously analyzed. And if we look at these in more detail, several of them became new CVE advisories. In terms of CVSS score, we can see that almost all of them fall into the high and critical categories. However, you'll notice that we did have one vulnerability detected that got a low score of 3.3. You might be tempted to think this is probably a vulnerability that's less important to the developers. As it turns out, it's not, and let me show you why this one still mattered.

In this case, we're dealing with a library called Babel. Babel has a custom memory allocator, call it Babel malloc. Babel malloc works by allocating a buffer; it places some metadata at the beginning of this buffer, and then it returns an offset pointer, so the caller can use it to store their data. What's interesting about this design is the first field, signature. Signature is a string pointer that, at runtime, can either point to the string "Babel memory" or point to the string "so long and thanks for all the fish". The reason it does this is that the companion function, Babel free, has a double-free check implemented inside of it. The idea here is that when it frees memory, it sets the signature to "so long and thanks for all the fish", and if it ever sees that again during a free, it knows it has already freed this chunk, there's a double free, and it should abort. The problem is that there's a flaw in the implementation: before performing this check, it calls the underlying system free function, which for Linux is typically libc's free. libc free likes to place metadata at the beginning of any chunk it frees, and only then is the double-free check performed. So we've got some problems here.
The first is that we're triggering a use-after-free from the library's perspective. For all it knows, libc has already released this memory back to the OS, in which case we have a crash. Or even worse, this memory could be reallocated to another thread, and now we're accessing someone else's data. The other thing is that, because libc always places its own metadata when it frees, it always overwrites Babel's metadata. So this double-free check will literally never work. And as you can imagine, developers like Greg are interested in knowing when they have a security mechanism in their code that doesn't actually work.

Since the acceptance of this paper for publication, we have continued to run our system, going through the most popular packages on the Debian package repository, and to this day we're finding new zero-days, most of them critical and high. The system is also open source, so if you're someone who's interested in bug hunting, definitely check it out; I'd love to hear your feedback.

So to quickly recap: defenders are now relying on proactive bug hunting in addition to reactive defenses, because they want developers to patch their software and make it more secure. The problem is that when we throw crash reports at the developers, they get overwhelmed, because with techniques like fuzzing we can generate lots of these crashes very quickly, but we're leaving it on them to figure out what's actually wrong and how to fix it. So what I showed you is a hardware-assisted, trace-based analysis for automatically performing some root cause analysis and getting a preliminary patch to the developer, so they can see where the root cause of the bug is, they have an idea of a possible way to fix it, and that gets them close to refining it into a working official patch. So let me thank you for your time. The code and data for this system are available at the provided URL, and I'd like to thank my coauthors for their contributions. With that, I'm happy to take any questions. Thank you.

Okay, I think we have time for a quick question from the room. Anybody? Questions for Carter? All right, thank you for bearing with me. We have one more presenter. Just to give everyone a heads up, we are going to run over a bit, since all of our presenters have this great work. If you have to leave early, that's perfectly fine; feel free to duck out the back if you need to. But we are going to move on. So, Sena, you're ready to take it away?

Hello, my name is Sena Sahin and I am a PhD student at the Georgia Institute of Technology, advised by Frank Li. Today I'm going to talk about the security impact of allowing typos in login passwords. Although passwords have lots of weaknesses, such as being prone to phishing attacks and requiring memorization, with users needing passwords for lots of different websites, they are still the dominant method of online authentication. Authentication systems allow login only if the user correctly types their password. However, people often make typos while typing their passwords, which results in failed login attempts. To solve this usability issue, prior work introduced typo-tolerant password authentication schemes, which allow certain typographical errors in the submitted password while still authenticating the user. Prior work looked at assessing the security and usability benefits of typo tolerance.
Right now, let's talk about the usability benefits. You can see the five most common typos made by people on the left side of the slide. For example, leaving caps lock on is one of the most common typos, which switches the case of all the letters of the password. On the right side, you can see the benefit of adopting typo tolerance: based on the prior work's Dropbox telemetry analysis, because of the top three typos, 3% of all users failed to log in, and 20 percent of users who made those top three typos were delayed in logging in.

Typo-tolerant password authentication schemes fix those typos using corrector functions. Here you can see the corrector functions, which correct the five most common typos that I mentioned on the previous slide. For example, the switch-case-all corrector corrects the most common typo by swapping the case of all the letters in the password. Typo-tolerant schemes then use typo-tolerance policies, which are created by combining the corrector functions. For example, the Top-3 policy contains the top three corrector functions on the left side of the slide. In practice, major web services such as Facebook have deployed typo-tolerant password authentication schemes.

Prior work tested the security of typo-tolerance schemes by performing a password spraying attack, which attacks victim services using the most popular passwords from previous data leaks. They evaluated an attacker who can make up to 1,000 guess attempts, which is a strong attacker, and even with this practical attack, typo tolerance increased the attack success by only three percent. So prior work concluded that typo-tolerant password authentication improves usability with negligible degradation in security. This is a pretty interesting result, because it concludes that we can gain benefits in usability without trading off security.

Now, prior work's conclusion was based on the password spraying attack. Is this the only threat model we should consider? What about other password attacks? From a website's perspective, there are additional types of attacks that can happen because of existing data breaches. Many websites have suffered from credential stuffing and credential tweaking attacks, the latter being the most sophisticated version of the credential stuffing attack. Let me shortly explain what credential stuffing and credential tweaking attacks are. There are many existing data breaches out there. Attackers collect leaked credentials from these old breaches and use them to compromise victim services. If the password in the leaked credential for an account matches the password for that same account on the victim service, the attack is successful. The main difference between credential stuffing and credential tweaking is that in the credential tweaking attack, the attacker also uses modified versions of the leaked password to compromise the victim service, since users often use slightly different versions of their password across different websites. This makes credential tweaking attacks even more successful.

So our motivation is: if we consider more comprehensive threat models, the prior results may not hold. We want to reassess the security impact of typo tolerance while considering these more comprehensive threat models. Our approach to this analysis is empirical.
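Before moving on to the datasets, here is a small hedged sketch of what a typo-tolerant check with the three most common correctors might look like. Real deployments operate on salted hashes rather than plaintext; plaintext comparison is used only to keep the example short, and the corrector set is the commonly described top three, not necessarily the exact policy on the slide.

```python
# Small sketch of typo-tolerant checking with three common corrector
# functions (caps-lock flip, first-character case flip, dropped last
# character). Real systems compare salted hashes, not plaintext; plaintext
# is used here only to keep the example short.

def swc_all(pw):   return pw.swapcase()                # caps lock left on
def swc_first(pw): return pw[:1].swapcase() + pw[1:]   # accidental shift
def rm_last(pw):   return pw[:-1]                      # extra trailing character

TOP3_POLICY = [swc_all, swc_first, rm_last]

def typo_tolerant_match(submitted, stored, policy=TOP3_POLICY):
    """Accept the login if the submission matches exactly or after applying
    any single corrector in the policy."""
    if submitted == stored:
        return True
    return any(corrector(submitted) == stored for corrector in policy)

# Example: caps lock was on while typing.
print(typo_tolerant_match("pASSWORD123", "Password123"))  # True
```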
To analyze the security of typo-tolerant password authentication schemes, we use two recent massive data breaches, the Breach Compilation and Collection #1. You can see some of their characteristics on the slide. These datasets compile together many different breaches, without indicating which credentials came from which breach. For each email, we have the list of passwords that the user uses across different websites. Because we have multiple passwords for each email, when evaluating an attack there is a question of which passwords are involved in the attack. So we need to consider different possible password pairs, which result in different attack outcomes.

Next, let's discuss the attack metrics that we use in our work. These metrics are specific to credential stuffing and credential tweaking attacks, the two types of attacks we consider. These attacks involve two passwords: one is the leaked password, which simulates the leaked credential, and the other is the target password, which simulates the actual password on the attacked account. An attack succeeds if the leaked credential matches the target password, or is corrected to the target password once typo tolerance is enabled. In practice, we have different ordered password pairs for each user, and attack success depends on which pair is selected. To solve this, we define three different attack metrics. Upper is whether the attack succeeds for at least one password pair, which is the best-case scenario for the attacker. Lower is whether the attack succeeds for every password pair, which is the worst-case scenario for the attacker. Random is whether the attack succeeds for a randomly selected password pair, which should represent a more realistic attack outcome.

For this specific example, you can see the typo-corrected versions of the leaked password under five different typo-tolerance policies, and the corrector functions responsible for those corrections. As you can see, under the other typo-tolerance policies the leaked and target passwords never match. But for this specific password pair, under the Top-3, Top-4, and Top-5 typo-tolerance policies, the leaked password and the target password match, because these policies contain the remove-last-character corrector, which removes the last character of the leaked password, and this causes a match with the target password. So under the Top-3 through Top-5 policies, the upper-bound metric will increase, because at least one password pair matched. The lower bound will not increase under any typo-tolerance policy, because the other password pair never matches, so success does not hold for all pairs. And if this specific password pair is the randomly selected password pair, then the random metric will increase; if not, it won't.

With these attack metrics, we can now analyze our leaked datasets to understand how enabling typo tolerance impacts attack success. Today I will talk about the Breach Compilation dataset and the Top-5 typo-tolerance policy, but you can find more details about the other datasets and typo-tolerance policies in the paper. First, we replicate the security analysis from prior work on our leaked datasets. The graph shows the percentage-point increase in attack success once typo tolerance is enabled, which is the magnitude of the difference between the two percentages.
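Going back to the Upper, Lower, and Random metrics for a moment, here is a short sketch for a single account. The `corrected_forms` function is a hypothetical stand-in for whatever set of strings a typo-tolerance policy would accept for a submitted password, for instance the Top-3 correctors sketched earlier.

```python
# Sketch of the Upper / Lower / Random attack metrics for one account.
# `corrected_forms` stands in for the set of strings a typo-tolerant policy
# would accept for a given submitted (leaked) password.
import itertools, random

def accepts(leaked, target, corrected_forms):
    """Does submitting `leaked` log in to an account whose password is `target`?"""
    return target == leaked or target in corrected_forms(leaked)

def account_metrics(passwords, corrected_forms, rng=random):
    """passwords: the passwords one email uses across different sites.
    Returns (upper, lower, rand) over all ordered (leaked, target) pairs."""
    pairs = list(itertools.permutations(passwords, 2))
    outcomes = [accepts(a, b, corrected_forms) for a, b in pairs]
    upper = any(outcomes)            # best case for the attacker
    lower = all(outcomes)            # worst case for the attacker
    rand = rng.choice(outcomes)      # one realistic draw
    return upper, lower, rand

# Example with the Top-3 correctors sketched earlier (rm-last fixes the typo).
corr = lambda pw: {pw.swapcase(), pw[:1].swapcase() + pw[1:], pw[:-1]}
print(account_metrics(["Winter2020!", "Winter2020"], corr))
```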
Of course, even considering the largest attack size evaluated by prior work, for the password spraying attack, enabling typo tolerance does not significantly increase the attack's success. This result matches the prior analysis.

When we take a look at the credential stuffing attack, though, the story is very different. Even though enabling typo tolerance does not increase the success of the most powerful password spraying attack, as you can see, it increases the effect of credential stuffing significantly. If we check the worst-case scenario for security, which is the upper bound, you can see that enabling typo tolerance allows credential stuffing attacks to successfully attack an additional 9% of all users. So we can conclude that typo tolerance makes credential stuffing much more successful. Now let's take a look at the credential tweaking attack. As you can see, enabling typo tolerance makes credential tweaking much more powerful. Credential tweaking is more powerful than the credential stuffing attack to begin with, and enabling typo tolerance further increases its success, making it successfully compromise 15 percent more users from the total population. So we conclude that typo tolerance increases password usability, but it significantly degrades security.

So should we give up on typo tolerance altogether? That would be a shame, because we would lose all the usability benefits. There might be other ways that we can mitigate the security loss while maintaining the functionality of typo tolerance. We aim to predict whether a given password is susceptible to credential stuffing attacks once typo tolerance is enabled. By identifying such affected passwords, online services can selectively disable typo tolerance for the users who are negatively affected by typo-tolerant authentication schemes, while preserving its functionality for the rest of the users.

For this goal, we explored a machine learning approach, a binary decision tree classification model using six features in total. These features are categorical, Boolean, and numeric, and we can group them into password structure, specific password characteristics, and password popularity. Labeling the passwords in our machine learning problem is an important and actually tricky task, because a password may be used by different users, and its use may be susceptible for some users and not others. As a consequence, training the machine learning model on all instances would result in a noisy signal, since the same password would have different labels depending on its instances. To avoid unclear decision boundaries, we train our machine learning model on distinct passwords with a consistent label for each password input. The next question is: what should the password label be? We explored password labels depending on whether the proportion of the password's uses that are susceptible exceeds a threshold; we call this the label threshold. A 0% label threshold prioritizes security and labels a password susceptible if any of its instances are susceptible.
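Here is a hedged sketch of that classification setup with scikit-learn: a decision tree over a handful of illustrative features standing in for the structure, characteristics, and popularity groups, trained on distinct passwords labeled by whether the share of susceptible uses exceeds a label threshold. The feature definitions and the tiny toy dataset are assumptions for the demo, not the paper's exact feature set.

```python
# Hedged sketch of the classifier setup: a decision tree over simple,
# illustrative password features, trained on distinct passwords labeled by
# whether the fraction of susceptible uses exceeds a label threshold.
from sklearn.tree import DecisionTreeClassifier

def features(pw, popularity_rank):
    return [
        len(pw),                           # structure
        sum(c.isdigit() for c in pw),      # structure
        pw[:1].isupper(),                  # characteristic (Boolean)
        pw[-1:].isdigit(),                 # characteristic (Boolean)
        pw.isalpha(),                      # characteristic (Boolean)
        popularity_rank,                   # popularity
    ]

def label(susceptible_uses, total_uses, threshold=0.10):
    """Label a distinct password 1 (susceptible) if the fraction of its uses
    that are susceptible exceeds the label threshold."""
    return int(susceptible_uses / total_uses > threshold)

# distinct_passwords: (password, popularity_rank, susceptible_uses, total_uses)
distinct_passwords = [
    ("password1", 3, 40, 50),
    ("Tr0ub4dor&3", 90_000, 0, 4),
]
X = [features(pw, rank) for pw, rank, _, _ in distinct_passwords]
y = [label(s, t) for _, _, s, t in distinct_passwords]

clf = DecisionTreeClassifier(max_depth=4).fit(X, y)
print(clf.predict([features("summer2021", 120)]))
```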
Higher label thresholds trade off security for functionality. We fit our model on data labeled with the 0%, 10 percent, and 25 percent label thresholds. We used a binary decision tree classification model and the Breach Compilation dataset. We divided the emails into two groups: one consists of 90% of all the emails, which is our training data, and the other consists of the remaining 10 percent of the emails, which is the holdout test data that we use to evaluate the trained model's performance on password instances. We trained our model on distinct passwords for the different label thresholds and the different typo-tolerance policies; you can find more details about our training process in the paper.

Today, I will present only one evaluation result, but many more are in the paper. For each email, we randomly select a password pair: one is the leaked password and one is the target password. We identify whether typo tolerance corrects the leaked password to match the randomly selected target password. If our model predicts susceptibility, but the email's password is not susceptible in reality, we call such emails false positives. In practice, our machine learning model would disable typo tolerance for those emails unnecessarily. If the model does not predict susceptibility, but the email's password is actually susceptible, we call such emails false negatives.

In this table, you can see the prediction susceptible rates, the prediction false positive rates, and the prediction false negative rates for the best-performing operating point, with the model trained on the 10 percent label threshold, under five different typo-tolerance policies. As you can see, our model predicts 25 percent to 41% of the users as susceptible, with false positive rates ranging between 23% and 39%, and false negative rates ranging between 23% and 35 percent. In practice, this means that our model disables typo tolerance for roughly a quarter to two-fifths of the users, while eliminating a large share of the security degradation for susceptible users. Let's take a closer look at the Top-5 typo-tolerance policy. Here, our model disables typo tolerance for 39% of the users while eliminating the security degradation for most of the susceptible users. So in practice, deploying such models results in disabling typo tolerance for only a portion of the users while eliminating most of the security degradation.

In conclusion, using machine learning models, the security loss can be reduced while preserving functionality for most users. From our work, we learned that typo-tolerant authentication improves usability, but the security degradation is more severe than previously understood. However, instead of wholesale abandoning typo tolerance, machine learning models can be used to trade off between security and functionality. Future work can explore other ways to harden typo tolerance, as well as personalized versions of it. These directions will help us better understand the security impact of typo tolerance and its feasibility for aiding users in password authentication. So thank you for your attention.

Okay, wonderful. Any questions from the room? Oh, nice. Thanks. Go ahead.
On this idea of using the machine learning model to decide whether or not to use the typo tolerance: in what workflow would you imagine services doing that? Would they only do it when you sign up with a new password, or are they somehow going to apply it to old passwords through the password reset and change process?

Yeah. Any other thoughts or questions?

My question is: is there any way for the attackers to leverage the rate of false negatives and target those specific false negatives, the passwords that are susceptible but aren't flagged as such?

For that, the attacker needs to know how our model classified the passwords, and I don't think the attacker would have that. For example, if Facebook used this and trained the model on its users' passwords, I don't think attackers would know all the Facebook users' passwords. If they knew those, they could just compromise the accounts directly.

Right. Okay, thank you so much. Let's thank our speakers one more time. All right, thank you. Thank you, everyone. See you back here, same time, same place.