[Introduction inaudible]

Yes, thank you for your kind introduction. Indeed, what we do is very practical and pragmatic. This is the logo of our lab. It's called PragSec, which is a portmanteau of the words "pragmatic" and "security". Some people ask me why we have an open lock icon there: it's because the closed lock icon doesn't look like a G as much, right?

So the title is "Security and privacy issues of modern web browsers". Who of you recognizes either of these two programs here? A couple of people, not that many. These are two of the first web browsers that ever existed. The one on the left is called WorldWideWeb; it was created by Sir Tim Berners-Lee, who is also the creator of the web. On the right we have Mosaic, which was the first popular browser able to show graphics. You can clearly tell that we've come a long way since these first browsers.

What we have today are browsers that are constantly expanding in terms of their ability to do things. Browsers have streamlined extension frameworks, where you can extend the browser in ways the browser vendors never thought possible or interesting. You have push notifications, where websites can send messages to you even when your browser is closed. You have custom web components, where developers can create their own HTML tags. You have end-to-end video and audio chats. And now you even have the upcoming Payment Request API, where your browser handles your payment data and you stop giving it to each individual website as HTML
data. If you think about it, browsers are expanding all the time, and of course an expansion of features means an expansion of code. If you look at Google Chrome today, it's approximately sixteen million lines of code. Mozilla Firefox is at eighteen million lines of code, and the Linux kernel is at 16.8 million. So Firefox is bigger than the Linux kernel. That's where we're at.

Those of you who are security inclined, hopefully most of you, understand that adding features does not come for free in terms of security. As you add features, you expand the attack surface of the browser: there are more chances for things to go wrong, and more chances for unpredictable interactions between the components of your browser.

As I say in my class, where do security flaws arise? They can arise at the design level, where you are thinking about the program and not thinking about it correctly, so you code the logic of your program in a wrong way that can be abused. They can arise during implementation: hopefully all of you know about buffer overflows, dangling pointers, cross-site scripting, and all sorts of interesting attacks there. And there are configuration flaws, like taking a piece of secure, safe software and installing it, for example, with bad credentials.

Today I'm going to focus on design flaws: bad things that can happen in browsers not because we used strcpy instead of strncpy, but because of our inability to reason about these large pieces of software.

Today's work is based on three papers which we published this year. The first one is "XHound: Quantifying the Fingerprintability of Browser Extensions", at Oakland. Then we had "Extended Tracking Powers: Measuring the Privacy Diffusion Enabled by Browser Extensions".
And "Hindsight: Understanding the Evolution of UI Vulnerabilities in Mobile Browsers", published at CCS. If you have any questions that I'm not able to cover today, please feel free to refer to these papers for all the technical details.

With that, I will start telling you about XHound. Who of you here uses a browser extension? That's most people. Most people want to extend their browser in a way that's not available out of the box, and most people would probably use an ad blocker, which is probably one of the most popular extensions. But you have all sorts of extensions: ones that manage your passwords using a cloud password manager, ones that let you read things later when you have more time, or ones that automatically find discount coupons for the online shops you're browsing. In general, a lot of extensions are very popular. You can see here that the most popular extensions have tens of millions of users. So extensions are clearly very popular, and from a privacy perspective they are also supposed to be a little bit more private.

Who has heard of the term browser fingerprinting before?
A couple of people. Right, it's the idea that you can extract attributes from a browser that are seemingly benign, like the make of your browser, the dimensions of your screen, your list of fonts, your list of plugins. When you combine them, you can arrive at a fairly unique fingerprint of your browsing environment, and that fingerprint can be used as a tracking identifier. You browse to a website and it creates a fingerprint. You go back to that website tomorrow; your underlying attributes have not changed, so the website can generate the same fingerprint and tell that it's still you, whether or not you have deleted your cookies or were in private mode.

Plugins, like Adobe Flash, Java, and some web conference call plugins, were the most potent fingerprinting vectors out there; they had the most discriminatory power. But recent research shows that plugins are fading away. This is expected, because browsers are moving away from proprietary code and towards open standards and things that you can implement using HTML, JavaScript, and Cascading Style Sheets.

An important thing to remember is that a website could ask for your plugins, like "do you have Flash or Java?", but it cannot ask for your extensions. There is no official API that a website can use to find out whether you're running an ad blocker, whether you are automatically looking for discounts, or whether you have an extension that lets you download YouTube videos. So, since there is no API to retrieve the list of extensions, could we conclude that extensions are undetectable, that they're more private than plugins used to be?
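To make the fingerprinting idea above concrete, here is a minimal sketch in Python. It is not how any particular tracker works; the attribute names and values are invented for illustration, and hashing a canonical JSON serialization is just one simple way to combine attributes into an identifier.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Hash a set of browser attributes into one stable identifier."""
    # sort_keys gives a canonical serialization, so key order does not matter.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values, for illustration only.
first_visit = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": [1920, 1080],
    "fonts": ["Arial", "Courier New"],
    "plugins": ["Flash", "Java"],
}
# Next day: cookies deleted, but the underlying attributes are unchanged.
second_visit = dict(first_visit)
# A different user who differs only in screen size.
other_user = {**first_visit, "screen": [1366, 768]}

same_user = fingerprint(first_visit) == fingerprint(second_visit)
different_user = fingerprint(first_visit) != fingerprint(other_user)
```

The point of the sketch is that nothing in it depends on cookies: as long as the attributes stay the same, so does the identifier.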
I want to tell you that they're not, and I'll do that with an example. You can see here the trailer for the movie Hackers, one of the best movies out there. Here we have no extensions whatsoever while we're on youtube.com. And here we have an extension called Magic Actions for YouTube. Magic Actions for YouTube adds buttons and elements to the page so that the user can do things with a video, like download it or try to find subtitles. You have this little light switch that wasn't there before, which you can flip to make the rest of the site dark so that you can enjoy the movie better.

The point is that these extra controls are added to the DOM of the page; that's how the user can interact with them. And because they are added to the DOM of the page, the website can find out that there are elements there that it didn't place. It can essentially say: wait a minute, these buttons here and this thing there, I did not add them, therefore they were added by an extension.
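The check described above, a page noticing elements it did not place, can be sketched as follows. A real tracking script would run as JavaScript inside the page; this Python version only illustrates the comparison, and the element ids (`player`, `light-switch`) are hypothetical, not taken from any real extension.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect the id attribute of every element in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def element_ids(html: str) -> set:
    collector = IdCollector()
    collector.feed(html)
    return collector.ids


def injected_ids(served_html: str, observed_html: str) -> set:
    """Ids present in the live DOM that the site never served."""
    return element_ids(observed_html) - element_ids(served_html)


# What the site sent, versus what the DOM looks like after an extension ran.
served = '<div id="player"></div>'
observed = '<div id="player"></div><button id="light-switch"></button>'
unexpected = injected_ids(served, observed)
```

Anything left in `unexpected` is a side effect the site can attribute to something other than itself.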
And you can actually do attribution and say: there's only one extension that does this, it's Magic Actions for YouTube. So by using side effects, we're able to discover which extensions you have installed. All of these popular extensions have side effects that manifest in the page. Some of them you may be able to see, like this light switch; in other cases they may be more hidden from you, but they're still changing the DOM. So a website that knows where to look can discover whether any extensions manifested themselves in the page that you're looking at.

Who of you has ever gotten this warning that says "please whitelist us in your ad blocker before you can proceed"? That's how it works: the website checks whether the DOM elements related to the ads are there, and if they're not there, it infers that you're running an ad blocker.

Being able to tell whether you have certain extensions present has privacy and security implications. In terms of security, if I know what extensions you have, I can try to find out whether some of these extensions have vulnerabilities, and I can then abuse them to mount some attack in your browser. In terms of privacy, I can learn things about you because of your choice of extensions. For example, if you have Honey, which automatically finds coupons for discounts, I can infer that you're price sensitive. Depending on whether you're using the Trump filter or the Hillary filter (and I'm not making this up, these are real extensions), I can infer how you are politically inclined. And if you're using Mailvelope or some sort of VPN
extension, I can infer that you're technically savvy. So I am inferring things about you from your mere selection of extensions. We can go one step further and use this as a fingerprinting vector: assuming that different users have different choices of extensions, we can use these extension sets to recognize you again and again and again, using extensions as a fingerprinting vector.

So the objectives of our work on XHound were these. We wanted to understand how many of the popular extensions introduce on-page, detectable side effects. We wanted to do this automatically and at a large scale; we didn't want to sit and manually inspect every single extension. What kinds of changes do extensions introduce? How fingerprintable are the extension profiles of real users, that is, is this happening out there in the wild? And how can an actual tracking script check for the presence of extensions? I will skip that last one in the interest of time today.

At a high level (again, there's just not enough time to go into all the technical details) this is the architecture of XHound, which is short for "extension hound". We take popular extensions from the browser's extension market and we patch their JavaScript so that we know what kind of DOM APIs they are calling.
Then we have a filtering phase, where we work out whether an extension only wants to run on youtube.com, in which case we only need to take it to youtube.com to find out its side effects, or whether it wants to run on the entire web, in which case we have to resort to other techniques. Then we run a scheduler which does the dynamic analysis: we take these patched extensions and we navigate them around the web. We use certain tricks to make this easy for us, and we have these so-called honey pages, which are pages with content that would trigger extensions. We have HTML forms and phone numbers, because some of you may have the Skype extension that automatically turns every phone number into a clickable link. So we have phone numbers, images, forms, web advertising: anything that would trigger a generic extension to do something.

We then compare the before and the after, and any change that we find while testing an extension we can attribute back to the extension being tested. So we can find out that extension so-and-so, when you had a form in a page, added a button; when you had an ad, it removed it; it changed DOM element so-and-so and added this attribute.
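The before-and-after comparison just described can be sketched like this. It is a simplified illustration, not XHound's implementation: a DOM snapshot is modelled as a plain dictionary from a made-up element key to that element's attributes, and all keys and values below are hypothetical.

```python
def diff_snapshots(before: dict, after: dict) -> dict:
    """Compare two DOM snapshots, each mapping an element key to its attributes."""
    shared = set(before) & set(after)
    return {
        "added": sorted(set(after) - set(before)),
        "removed": sorted(set(before) - set(after)),
        "changed": sorted(key for key in shared if before[key] != after[key]),
    }


# Hypothetical honey page: a form, an ad, and a phone number.
honey_page = {
    "form#login": {"action": "/login"},
    "div#ad-banner": {"class": "ad"},
    "span#phone": {"text": "555-0100"},
}
# The same page after a single extension under test has run on it.
after_extension = {
    "form#login": {"action": "/login", "data-ext": "1"},
    "span#phone": {"text": "555-0100"},
    "button#autofill": {"type": "button"},
}

changes = diff_snapshots(honey_page, after_extension)
```

Because only one extension is installed per run, every entry in `changes` can be attributed to that extension.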
So we can get really fine-grained modifications for what each extension does.

Now the results. We applied this to the top ten thousand Chrome extensions. We found that 9.2% introduce detectable changes on any arbitrary URL. So almost ten percent of extensions are fingerprintable on any website of the Internet, as long as it has the right content to trigger the extension. 16.6% introduce detectable changes on popular domains. Magic Actions for YouTube, for example, only reveals itself when you're on YouTube, and there are others that only reveal themselves on Gmail or on Twitter. So these websites have the ability to fingerprint all the generic extensions, in addition to the extra ones that are meant to provide better or extended UIs for them.

At the bottom here you can see the relationship between the popularity of an extension and the fraction of fingerprintable extensions, and you can see a slightly negative relationship: as we go towards the lower bins of popularity, we get fewer fingerprintable extensions. Our understanding is that less popular extensions offer less value to users; therefore they do fewer things that would produce changes in your DOM, and therefore they are less fingerprintable. You can also see the difference between being fingerprintable on any arbitrary URL and only on popular URLs.

These are the modifications that the extensions make, for the 1,656 detectable extensions out of the top ten thousand.
Seventy-eight percent of them add a new DOM element to a page that wasn't there before, forty-one percent change attributes, fifteen percent remove existing DOM elements, like an ad blocker does, and 4.7% actually change the text of the page they're running in. What we found, as reported, is that eighty-six percent of these sixteen hundred extensions have at least one change that is unique to them. By looking at that single change, you can attribute it back to that one specific extension. So there's not a lot of aliasing in terms of the types of modifications.

Here you can see the breakdown by extension category, such as productivity. In this breakdown you can see that shopping extensions are more fingerprintable than the rest of the extensions. Again, this has to do with the changes that a shopping extension needs to make: fetch content in the background, create new buttons, show that a coupon was applied or a discount was achieved, and so on. So shopping is one of the major culprits, with a third of them being fingerprintable on some URL on the Internet.

What I want you to understand is that this is not a Chrome-specific attack. As long as you have a browser that is extensible, and the changes are applied in the same DOM that the page has access to, you have the same vulnerability. So we performed the same experiment for the top 1,500 Firefox extensions, with Firefox switching over to the WebExtensions API that Chrome introduced, and we found again that sixteen percent of them were fingerprintable on at least one URL
of a popular site, and 7.3% on any arbitrary domain. That is a very close match to the percentages that we had for Chrome extensions. So this is an inherent problem of extensible architectures; it's not something that is Chrome specific or Firefox specific.

Then we wanted to understand whether different users use different sets of extensions, like I said earlier. So we coded our own extension that retrieves the list of installed extensions and sends it to our servers, and we found 854 participants for the survey, from Mechanical Turk, from students, and from colleagues. Overall we discovered 941 unique extensions; 174 of them were fingerprintable, and 93 of them were fingerprintable on any arbitrary URL. So users definitely are using extensions that are fingerprintable.

These are the anonymity sets when you're looking at different extensions, and the way to read this is, for example, to look here at the students' row.
This first black segment is about, let's say, twenty percent: twenty percent of students are part of an anonymity set of one. This means that twenty percent of students were uniquely identifiable; each of them had a unique set of extensions that no other student in this bin had. As we make the anonymity sets larger, such as aliasing with two to twenty users, we get a larger fraction of students. And if you look at the colors for a second, you will see that the friends-and-colleagues segments are actually bigger, especially in the two-to-twenty category. This has to do with our occupation: I have more extensions installed than your average Internet user, because I use them to secure myself and because I use them for research. So I am more fingerprintable, and I'm more likely to have a unique signature of installed extensions than an average user online.

We also wanted to compare the entropy that you could achieve using browser extensions, as a discriminating attribute, with existing attributes from well-known fingerprinting projects. What you can see here is that, for all users, we always had more discriminatory power than the time zone, the screen resolution, and the list of fonts. And specifically for the friends-and-colleagues group, who are people with more extensions than average and more custom installs than average, the extensions were actually higher, in normalized terms, than the list of plugins, which used to be the gold standard of a powerful discriminatory
property of your browser.

Before moving on, I want to convince you that this is not a problem that you can easily fix. We're not finding some artifact and using that artifact to fingerprint extensions; we're fingerprinting the very same functionality that makes them useful. An extension would not be useful if you couldn't click on its buttons, if ads were not hidden, and if it did not remove the things that you expect it to remove. So in order to fix this, you would need cross-cutting changes either in the browser or in the extension system. In the paper we propose two ideas. We haven't implemented them; they are just conceptual. One of them is the encapsulation of side effects in shadow DOMs, and the second one is essentially pollution of the namespace: if I can't hide my extensions, maybe I can pepper in fake side effects of extensions, so that when you're fingerprinting me you cannot immediately tell whether what you're finding is really there or is just pretending to be there. We're exploring this right now as follow-up work.

Now I want to move on to the second paper, which is about leakage and the privacy diffusion that is enabled by browser extensions. Are there any questions that you would like me to answer before I move on? OK, I guess I was very clear.

Again, depending on your security background and how interested you are in these things, you may be well aware that there are a lot of malicious extensions. There is the malicious kind that steals your credit card, and there is the other kind, the adware kind, that just steals your interests. For example, there was the Web of Trust extension, which was very popular. It was supposed to tell you the security status of the website that you're visiting, and it was discovered that it was selling user data.
There are actually a lot of cases online where someone approaches the owner of a simple extension with a user base of, say, ten thousand users, and says "can I buy it from you for five hundred dollars?" He says "sure, I'm not making any money off of this", they buy it, and immediately they add adware to it. So now they have inherited this user base and can immediately start stealing data from its users.

Those of you who have made extensions or played with extensions may already understand that extensions have a privileged position in your browser. They have access to APIs that normal websites do not. Browser extensions can, for example, make cross-origin requests and read the responses, so they're not restricted by the same-origin policy. They can read the cookie jar of your browser, so they can read the cookies of any domain. They can access your history and your bookmarks. So it's a good place to be for an attacker. In addition, they have the ability to include content scripts in web pages, which means that they can inject JavaScript into a page that will change the page. This ability to inject content scripts is the very same thing that we use to fingerprint them in XHound.

So think about this: what would an extension be stealing? This is not the Android environment, where the API
is one where I can get your phone number, your contact list, or your name. This is just a generic browser, and an extension can steal whatever the browser has access to. It can steal your browsing history and interests, that is, which websites you're visiting. It can steal your search queries. It can steal any data that you enter into web forms. And it may also want to learn information about other extensions, which again could be used to infer things about you that you may not be interested in sharing.

In our paper, we define this idea of privacy diffusion, which, to my knowledge, we actually reuse from an older paper. The idea is this: if your extension is telling third parties about your visits to sites, that is privacy diffusion. Say you're visiting example.com and your browser extension is leaking things to log.tracker.com; it's leaking the fact that you just went to example.com. We also look at whether it's collecting user data without mentioning a privacy policy in its description on the Chrome store. If you're collecting user data, I should be able to find the words "privacy policy" in the description on the Chrome store. That's a shorthand way of finding out that you at least thought about it, at a very shallow level.

Our objective was again to perform a large-scale dynamic study, where we install extensions and find out whether they're leaking PII to third parties. We trigger visits to 780 of the most popular URLs of popular domains on the web. We automatically do Google searches and search for leakage of these queries to third parties. We submit forms automatically and search again for leakage. And we also identify whether someone is stealing your list of extensions.
We're trying to understand which extensions are doing this and what they have in common. This top part here is really our earlier framework: we took the framework that could automatically install an extension, go browse around, and detect what's happening, and we retrofitted it to be able to do this analysis of leakage. The only difference is that we're now collecting all the traffic going to the outside world. We can do this for both HTTP and HTTPS, and decrypt the latter because we have the keys.

We search that traffic for information being leaked to third parties. We're not doing something very, very smart; we're not doing flow analysis. We're essentially looking for either cleartext leakage or obfuscations, trying to support as many obfuscations as we're aware of. For example, if you're using multiple rounds of Base64 encoding, we can detect that: we keep on decoding until it matches or until we hit some upper limit. We can inflate compressed data, we can reverse JSON stringification, and we can undo character encodings in order to find the original information that the extension is trying to hide. The whole idea is that you're encoding it in some way, you're obfuscating it, to make it hard for someone to simply find out that you're leaking.

We do this iteratively: at every step of the process, we look at all the requests going out
towards third parties, we isolate each parameter that's being sent, and if it's not cleartext, we try to recursively decode it. You can see here that we start from this jumbled mess. We automatically decode it to a JSON object, in there we find a Base64-encoded property, so we automatically go one level deeper and decode it to "red bicycle". So in this case we can find that someone is leaking the fact that you made a search, and that you searched for the phrase "red bicycle".

In terms of results, first of all we found that thirty-eight percent of extensions call third-party domains. You may expect this, and it doesn't mean that something bad is happening: most extensions reach out to Google Analytics to report something, or to Google APIs, Facebook, Twitter, and so on. So this is happening, and it's not a problem just yet. What we find then is that 6.3% of extensions leak browsing and search history.

The next part is something that we didn't expect. Most of the blog posts that motivated our work were talking about malicious extensions that purposefully steal your data in order to monetize it in some way. What we found is that seventy percent of the leakage we discovered was actually what we call accidental.

Who knows of this thing called the HTTP Referer? A couple of people. The HTTP Referer is a header, probably the single worst design decision of the HTTP protocol, where you're telling the site you're visiting now where you were before. If you were on Facebook and you click on a link to CNN, there is a Referer header going out to CNN
that says "hey, I was on Facebook when I clicked on this link". What your browser does is automatically embed this Referer header in all outgoing requests towards third-party sites.

What happened is that Google Chrome was the first browser that decided to stop supporting toolbars. So you had all these people with toolbars who now could not build their toolbar using the normal chrome of the browser. What they did is create these toolbars using HTML, CSS, and JavaScript, and inject the toolbar into every page that you're visiting, at the very top. These toolbars typically refer to third-party resources, like icons. As your browser goes out to fetch the icon of Alexa or the Bing logo, it's leaking, through the Referer header, where you are right now. And because the toolbar is injected into every single page, your browsing history leaks piecemeal to all sorts of third parties.

We have these three extensions here, including Avast SafePrice and MySmartPrice. You can see they are installed by a large set of users, and they're accidentally leaking your current site all the time to various third parties. Again, seventy percent of the leakage we found is actually accidental.

On the left here you can see the domains that are receiving your traffic accidentally: Google Analytics, Google APIs, Amazon, DoubleClick, Facebook. They're all receiving traffic because of this accidental leakage through the Referer header. And of course, just because you're receiving it accidentally doesn't mean that you can't use it intentionally. That's a separate discussion.
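The accidental leak through the Referer header can be illustrated with a small simulation. It is a sketch under stated assumptions: the browser is assumed to send the full page URL as the Referer for cross-site requests, and every URL and host name below is made up.

```python
from urllib.parse import urlsplit


def third_party_requests(page_url: str, injected_resources: list) -> list:
    """Requests the browser makes for resources a toolbar injected into a page."""
    page_host = urlsplit(page_url).hostname
    requests = []
    for resource_url in injected_resources:
        # Only requests to a different host than the page are third-party.
        if urlsplit(resource_url).hostname != page_host:
            # Assumption: the full page URL travels in the Referer header.
            requests.append({"url": resource_url, "referer": page_url})
    return requests


# Hypothetical icon that an injected toolbar loads on every page.
toolbar_resources = ["https://icons.third-party.test/logo.png"]
visited = [
    "https://example.com/health/symptoms",
    "https://shop.example.org/cart",
]

# What the icon host learns, one page at a time, without ever asking for it.
seen_by_third_party = [
    request["referer"]
    for page in visited
    for request in third_party_requests(page, toolbar_resources)
]
```

After the two visits, `seen_by_third_party` contains both page URLs, which is the piecemeal history leak described above.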
On the right we have the extensions. You can see, for example, that Avast SafePrice, used by more than ten million users, is accidentally leaking your browsing history to one third party. If you look five places down, you find MySmartPrice, installed by almost a million users, leaking accidentally to eleven third parties. Again, those third parties can receive the data and use it in whatever way they want, even though they did not ask for it.

The remaining part of what we found is what we call intentional leakage. I want to make sure you understand that by intentional I don't mean malicious. I don't mean that this thing is leaking your traffic maliciously; I mean that it is leaking it intentionally. For example, these coupon extensions that automatically find discounts need to send the current URL, the site you're currently at, to some backend to see whether there are coupons for the site you're on. So the extension is intentionally sending your current URL to a third party, and whether this leakage is malicious or benign you have to decide based on what you think the functionality of the extension is. We found 373 extensions that do this. More than ninety-three percent of them do not use complicated obfuscation, which suggests that in the majority of cases they're not trying to hide what they're doing.
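For the minority that do obfuscate, the recursive decoding mentioned earlier in the talk can be sketched as below. This is only an illustration of the idea, not the authors' tool: it handles just JSON, Base64, and URL percent-encoding, the function names are invented, and the depth limit of 5 is an arbitrary assumption.

```python
import base64
import json
from urllib.parse import unquote


def decode_one_layer(value: str):
    """Yield every string obtained by undoing one layer of encoding."""
    try:
        parsed = json.loads(value)
    except ValueError:
        parsed = None
    if isinstance(parsed, str):
        yield parsed
    elif isinstance(parsed, dict):
        # Only top-level string properties are followed in this sketch.
        for item in parsed.values():
            if isinstance(item, str):
                yield item
    try:
        # validate=True rejects strings that are not well-formed Base64.
        yield base64.b64decode(value, validate=True).decode("utf-8")
    except ValueError:
        pass  # covers binascii.Error and UnicodeDecodeError
    unquoted = unquote(value)
    if unquoted != value:
        yield unquoted


def leaks(value: str, needle: str, max_depth: int = 5) -> bool:
    """True if needle appears in value directly or under nested encodings."""
    if needle in value:
        return True
    if max_depth == 0:
        return False
    return any(leaks(inner, needle, max_depth - 1) for inner in decode_one_layer(value))


# Build an obfuscated parameter: Base64 inside JSON inside Base64.
inner = base64.b64encode(b"red bicycle").decode("ascii")
wrapped = json.dumps({"d": inner})
parameter = base64.b64encode(wrapped.encode("utf-8")).decode("ascii")

found = leaks(parameter, "red bicycle")
```

The depth limit plays the role of the upper limit mentioned in the talk: decoding stops either when the searched value appears or when the limit is reached.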
It also turns out that they're not all leaking to the same place. We found that the domains that were intentionally receiving traffic resolved to three hundred and fifty distinct IP addresses, which we could cluster down to two hundred IP addresses. So there's still no malicious mastermind receiving everyone's traffic. On the left here you can see the domains that receive intentional leakage in cleartext, like Google Analytics, Google, Facebook, and so on. On the right you see the popular recipients of leakage sent in an obfuscated fashion, such as Mixpanel. Again, the point here is that if you're leaking it, it's intentional, and you're trying to obfuscate it, what exactly are you trying to achieve? Maybe we should investigate this a little bit more.

Here is again the graph that shows popularity against the number of leaking extensions, and we see the same idea again: the more popular extensions leak more. This stems from the same underlying principle as before. More popular extensions are more functional; they do more things in the user's browser, so there are just more chances to accidentally or intentionally leak browsing history and PII to a remote site. Here you can see the overlap between intentional and accidental leakage. Again, this shows you that just because you are intentionally sending
data to your own back end doesn't mean that you will not accidentally leak data to other third parties. That's the idea here. And again, for extension categories, our usual shopping category is the most popular leaking category. So if you're using a shopping extension, you may or may not want to continue using it, or you may just use it when you're buying something and then disable it for the rest of your session. No. So these are the categories from the Play Store. Any questions about this? OK. What we propose in the paper is this: if you are accidentally leaking, it is because you are trying to embed remote content. So what we're proposing is a kind of client-side proxy, where the browser would reach out and get these resources for you, and it would do so in a secure and private manner so that you're not leaking the Referer header. And we have this very simple extension, which I believe is online, so you can play with it and use it if you want, where we're combating intentional leakage by poisoning. If I can't help that my data is being leaked online, I might as well poison it, so that I make it harder for a third party to find out what I really care about. So we have this extension called BrowsingFog that sits in your browser and detects what sites you're visiting, and then tries to balance them with the ones that you are not visiting. If you're visiting only sports sites, then the extension will automatically visit non-sports sites, so that it brings your interests to a roughly uniform distribution. That way a third party would not be able to understand that this person cares more about this type of stuff versus that type of stuff. It's very simple. This idea of poisoning data already exists, and you can go in and play with it if you care about that.
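To make the balancing idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: it presumes every visit has already been labeled with a category, and the function name decoy_visits and the category names are invented for this example and are not taken from the actual extension.

```python
from collections import Counter


def decoy_visits(history, categories):
    """Return how many extra (decoy) visits each category needs so that
    all categories end up with the same number of visits.

    history    -- list of category labels, one per real visit (assumed input)
    categories -- every category we want to appear equally interesting
    """
    counts = Counter(history)
    # Flatten the profile upward: match the most-visited category.
    target = max((counts[c] for c in categories), default=0)
    return {c: target - counts[c] for c in categories if counts[c] < target}


# Hypothetical profile: a user who mostly reads sports sites.
history = ["sports"] * 5 + ["news"] * 2
plan = decoy_visits(history, ["sports", "news", "cooking"])
print(plan)
```

After the decoy visits in the plan are made, every category has the same count, so an observer who only sees the visited categories can no longer tell which one the user actually prefers.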
Are there any questions before I proceed to the third and final part? Yes? So these are all homegrown setups. What we're doing is essentially doing everything automatically: we're installing the extension in a headless browser. I mean, it's a normal browser that we make headless by using a virtual frame buffer. Then we are navigating the extension around, and depending on whether we're doing XHound or the extended tracking one, we're doing different tricks. One of the things that we are proud of is that in both cases we're not actually visiting the real popular sites like Google, Facebook and so on. We're pretending to visit them by using a stub resolver, so that the extension thinks that it is at Facebook and it actually goes ahead and does what it does. And our honey pages are actually creating, on the fly, the DOM that the extension expects. So if the extension says document.getElementById with foo, we actually create that element and give it back to it. So we have this dynamic analysis going on. Any other questions? Yes? Great. You would have to analyze the specific extension, but it's really the same idea: if you want to hide your data and you can't stop the user from making search queries, you just add search queries to the mix in order to try to hide your real interests. Yes? I think so, yes, and I have it on good authority that some labs are doing this already. We are trying to account for as many of these as we can, but if, for example, an extension is using encryption, we can't decode that in any way. So maybe there is more malicious stuff that we just can't unpack. We don't know what we don't know, and I think that would be a good use case for that.
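The on-demand behavior of the honey pages can be pictured with a small Python analogy. This is a sketch only: the real honey pages live inside a browser and deal with actual DOM calls, while the class and method names below (HoneyDocument, get_element_by_id) are hypothetical stand-ins chosen for this example.

```python
class HoneyDocument:
    """Toy stand-in for a page that fabricates whatever element is asked for."""

    def __init__(self):
        self.elements = {}           # id -> fake element
        self.created_on_demand = []  # log of ids the "extension" probed for

    def get_element_by_id(self, element_id):
        # A normal page would return nothing for an unknown id.
        # The honey page instead creates the element so the caller keeps going.
        if element_id not in self.elements:
            self.elements[element_id] = {"id": element_id, "tag": "div", "children": []}
            self.created_on_demand.append(element_id)
        return self.elements[element_id]


doc = HoneyDocument()
element = doc.get_element_by_id("foo")  # the probe an extension might make
print(element["id"], doc.created_on_demand)
```

The log of created ids is the useful by-product here: it records which page structure the analyzed code was looking for, without ever visiting the real site.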
All right, so with that I will move to the third and final part, the security part. This will be presented at CCS. It has never been presented before, so you really get your money's worth. I hope you didn't pay anything. I hope that I don't have to convince you that smartphones are overtaking the world. There's clearly this increase in devices and an increase in how much time we spend on our mobile devices. There's this comScore study from this year that counted, of the total digital minutes of each user, how many were spent on a mobile device. They found, for example, that in Indonesia ninety-one percent of the average user's total digital minutes were on a mobile device, and the U.S. and China, for example, are at seventy-one percent. So clearly the average user is spending more time on mobile devices than on traditional desktop and laptop platforms. Now, if you think about the security of the mobile platform, most of the research and the discussion has been dominated by malicious apps: what do they do, how do they infiltrate the Play Store, how do we detect them, how do we contain them? Or we have these many stakeholders that are part of the same APK:
how do we isolate the different privileges of different stakeholders? But there is this very powerful app in there, the mobile web browser, that people haven't really looked at all that much. The mobile web browser is, first of all, as vulnerable as the desktop browser. You can still have cross-site scripting attacks and cross-site request forgeries and the browser participating in malvertising activities. And in addition to the standard attacks that you already have, you have this additional class of attacks that only apply to mobile devices. I'll go through an example on the next slide, but you have these idiosyncrasies of the mobile platform. First of all, you have limited screen real estate. Your phone, regardless of how much money you gave for it, is not as big as your laptop, and that's for good reason, because you wouldn't be able to hold it well. This will constrain the browser in how it tries to show the website that you're visiting. This is combined with a desire of mobile web browsers to maximize the real estate that they give to sites. If I have a limited screen, I need to maximize the fraction of it that I'm giving to the site so that you can enjoy browsing that site. And you have limited computing power, which means that in terms of defenses you can't run anything heavy on the user's device, because it will be running out of battery all the time. All right, so let me give an example. Let's see this domain here. Who can tell me what is the real domain of this long one? Yes? Yes, so this is the real domain. I own this for experimental purposes.
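The question on the slide, which part of a long hostname is the real domain, and why a narrow address bar makes it hard to answer, can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the hostname uses a placeholder domain (example.us), the registrable domain is approximated as the last two labels (a real implementation would consult the Public Suffix List), and the address bar is modeled as showing a fixed number of characters from the left.

```python
def registrable_domain(hostname, suffix_labels=1):
    """Naive 'real domain': the public suffix plus one label.
    Assumes a single-label suffix such as 'us' or 'com'."""
    labels = hostname.lower().rstrip(".").split(".")
    return ".".join(labels[-(suffix_labels + 1):])


def left_truncated(hostname, visible_chars):
    """What a narrow address bar shows if it starts from the left."""
    if len(hostname) <= visible_chars:
        return hostname
    return hostname[:visible_chars] + "..."


def hides_real_domain(hostname, visible_chars):
    """True if the visible text no longer ends with the real domain."""
    shown = left_truncated(hostname, visible_chars)
    return not shown.endswith(registrable_domain(hostname))


# Hypothetical phishing-style hostname on a placeholder domain.
host = "secure-login.portal.paypal.com.example.us"
print(registrable_domain(host))
print(left_truncated(host, 26))
print(hides_real_domain(host, 26), hides_real_domain(host, 100))
```

With room for only 26 characters, the visible text stops right after the brand name and the part that actually identifies the site owner is cut off; with enough room, the full hostname is shown and the check reports nothing hidden.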
So, robotic dude dot us. But any domain owner can create an arbitrary number of subdomains next to it. So this says secure login dot portal dot paypal dot com dot robotic dude dot us. If I show this to you in your desktop browser, you'll be OK; you'll know, yes, this is robotic dude. But what happens when you put it on mobile? You see here Chrome, which according to the Play Store is installed by between one and five billion users. Today's version of Chrome actually does this correctly: they're showing you the part that they should be showing you. They didn't do it correctly six months ago. Firefox, which is installed by between one hundred and five hundred million users, actually starts from the leftmost part. I hope you can see that this is a little bit gray, but here it says secure login dot portal dot paypal dot com, and it's just a little bit gray here. It's telling you that there is something there, but which user is trained to understand what this change of color means? And Dolphin, which is a popular voice-controlled browser installed by between fifty million and one hundred million users, again starts from the left. So it's secure login dot portal dot pay, that's all that fits, and then three dots to try to tell you there's actually more here than I can show you. So clearly, two of these three browsers, and these are the latest versions that you could download today from the Play Store,
are vulnerable to this attack. I can create these domains, I can put phishing pages on them, and most users will actually see this part; they will not be able to see the real domain. We thought at the time that we can't possibly be the first people to think about this, and unfortunately we're not. We read up and found that there was this research in two thousand eight, two thousand ten and two thousand eleven, some of it actually coming from Georgia Tech, where people investigated mobile devices and said: look, you have this limited screen real estate, things won't fit well, and we could have attacks. And then that stopped. That was the research. We read all of this work and we thought that there is a common limitation across all of it, and this common limitation is that the experiments were done once and they were done manually. Every researcher pretty much took the pool of browsers that were available to them at the time, performed a bunch of experiments, and wrote the paper. This means that we have problems today, because once and manually does not cut it. You have the number of browsers in the market, which is your first complication. There is a very large number of mobile browsers: you have more than one hundred mobile browsers in the Android app store, and I'm happy to tell you why offline. You have the speed of new releases: there is a release every other week or every third week, so you run your experiment today and three weeks later you're not sure if what you found still holds. And you have the number of attacks, which can be very large. If you start multiplying these things, you can see that this is going away from what we can do manually. So with once and manually, we cannot answer: what is the most secure browser
today, what is the least secure browser today, which browser family has experienced a regression, which one started fixing bugs. We cannot answer any of that, and so we decided that we want to be able to answer all of that. So we built the first browser-agnostic vulnerability testing framework for mobile browsers, and we called it Hindsight. The idea is that now that we know all of these attacks, we can actually go back in time, expose all the browsers to these attacks, and track the evolution of vulnerability over time. The high-level idea, and you will see that this appears deceptively simple, and it actually wasn't, and I will explain why, is that we collect as many mobile-browser-specific attacks as possible, both existing ones from related work as well as novel ones that we came up with. Then we collect as many mobile browser versions as we can from as many families as we can. We install each browser version on a test device, we expose it to the collected attacks, and we analyze the collected data. Simple. Then a year and a half later you realize it wasn't so simple. I will only briefly talk about this. I know it's a lot of text; you can try to read it, but you don't have to. We collected attacks and we called them attack building blocks, because you can actually take each attack and combine it with other attacks in order to make a more potent attack. We have six, excuse me, five types of building blocks. We have attacks that are affecting the event routing of your browser. Some of this work came out of Georgia Tech, where they found that in mobile browsers you could have overlapping elements from different origins, and you could have the second or third domain receiving the click event rather than the topmost one, and you can abuse this to do weird attacks. The URL
category of building blocks again has to do with how your browser shows long domains. What if it's a long subdomain? What if it's a long file path? What do we do when it's an internationalized domain name versus an ASCII one? Then there is a lot about the address bar: can we actually try to hide the address bar of your browser? Because if we can, then we can confuse the user about which site they're visiting. Then we have security indicators: where is the favicon placed, is it placed next to where the SSL icon is, so could the user be confused about whether there's SSL or not, and do we show whether we have mixed content inclusion? And then the content category: can an iframe expand its dimensions past what the parent page wants it to be, will mixed content inclusion work, will self-signed certificates work, and so on. OK. And so I want to show you my favorite attack, which is a combination of four ABBs. Here you can see we are on this site, I hope you can see it, phantom menace dot com, and there's this interesting article with a lot of text. So this is where we are, and I'll pause to show you what's happening. OK. You see here the browser bar went away. In Chrome, when you scroll enough in the page, it hides the bar to maximize the real estate that it gives the page. So we have achieved number one, and we arrive at this Like button. You really like this article and you want to like it, so we click Like, and we have this navigation to facebook.com. And you will see here the user who tries to log into facebook.com; you will see here we're trying to type in. And I'll show you this now. OK.
So we're actually still on phantom menace. What happened is that when the user clicked on Like, we imitated a redirection: we showed an address bar that's loading, but we were still on the same site. Nothing has changed. Now we have this separate problem. Chrome, and not every browser does this, which is the whole point of the building blocks and the automated testing, when you have an HTML input element and you tap on it, pops up the keyboard and actually shows you the URL bar again, in order to tell you this is where you're typing right now. So what we're doing is that these are not input elements; these are canvas elements, and this is a fake software keyboard that we're making with JavaScript, which is reacting in exactly the same way as you would expect your normal keyboard to react. But as far as the browser is concerned, we're just tapping on the screen; there is no keyboard present. And so this is the attack, and there are four building blocks going on here. All right, so, important stuff. Let me just briefly tell you about our data. We have thirteen ranks of browser families, starting from the popular Chrome, installed by billions of users, to Firefox, to Dolphin, all the way to Shark Browser, installed by between five hundred and one thousand users. We're collecting all our data from various third-party sources, so about forty-six hundred browser APKs. We remove duplicates, non-modern browsers, and things that crash, and we reach these final twenty-three hundred APKs that we automatically test in Hindsight.
Just to give you a flavor of the kind of data: for Chrome, our oldest version was from twenty thirteen and we finished our data collection process in twenty sixteen, so we had a total of forty-one unique versions, with an average of ten versions per year. For some of them, like Firefox or Dolphin, we have more versions per year, depending on what's available online. What we also did is we had to identify the year that every browser was released, and we tried to find a reasonable SDK to assign it to, because you can't take a browser from twenty eleven, try to install it on a recent twenty seventeen Android, and expect that everything will work well. We just didn't want that. And as I said, what is important for us is that we want to do this analysis in a completely browser-agnostic fashion. We don't want to say: this is Chrome, therefore this is what's happening; this is Firefox, therefore this is what's happening. We want to say: this is a browser, figure it out. So we do this application-level UI analysis, because we want to automatically find where the address bar is. We don't know that. In some browsers the address bar is on the bottom, in others it's at the very top, in others it's at the top but after some space, because there is some space to show you the title of the web page. Where is the favicon placed, again in relation to other elements? So we do a lot of tricks. We use UI Automator to extract the XML layout, we identify what's what, and we use a lot of optical character recognition and image comparison, because we really care about what the user sees. So if I access this property here, the URL,
in some programmatic fashion, I will get the entire string. But I don't care about the entire string; I care about what the user sees, and the user here sees only part of this string, and this is how the attack would work in real life. So we're using a lot of tips, tricks and techniques in order to automatically, in a best-effort fashion, analyze any and all browsers that you throw at Hindsight. OK. And there was this problem that we really didn't expect, but it is obvious in hindsight, to pun on that word: browsers have splash screens. You install a browser for the first time, say Chrome, and it will ask you to accept the terms and conditions, and will then ask you to link it to a Gmail account or continue without it. If you don't do these two steps, you will not be able to expose this browser to any attacks, and so Hindsight must be able to automatically bypass splash screens. And this browser has four pages where it extols its own greatness, so you have to swipe left four times before you have the ability to get in. Five minutes? OK. Also, there's the fact that we started late. OK, thank you. Ten minutes, I'm all right.
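The browser-agnostic search for the address bar mentioned above, using the XML layout extracted with UI Automator, could look roughly like the following Python sketch. It is an assumption-laden illustration, not the framework's code: the sample dump is hand-written in the style of a UI Automator hierarchy dump (node elements with text, class and bounds attributes), the hostname is a placeholder, and the matching rule (a node whose visible text overlaps the hostname we navigated to) is a simplification that ignores the OCR and image comparison steps.

```python
import re
import xml.etree.ElementTree as ET

# Bounds are written as "[left,top][right,bottom]" in the dump.
BOUNDS = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")

# Hand-written sample in the style of a UI hierarchy dump (not real output).
SAMPLE_DUMP = """<hierarchy rotation="0">
  <node class="android.widget.FrameLayout" text="" bounds="[0,0][1080,1920]">
    <node class="android.widget.TextView" text="Interesting article" bounds="[0,210][1080,300]" />
    <node class="android.widget.EditText" text="secure-login.portal.paypal" bounds="[0,63][1080,210]" />
  </node>
</hierarchy>"""


def find_address_bar(dump_xml, visited_host):
    """Return the first node whose visible text overlaps the visited hostname,
    without relying on any browser-specific resource id."""
    root = ET.fromstring(dump_xml)
    for node in root.iter("node"):
        text = node.get("text", "")
        match = BOUNDS.fullmatch(node.get("bounds", ""))
        if not text or not match:
            continue
        if text in visited_host or visited_host in text:
            return {"text": text, "bounds": tuple(int(v) for v in match.groups())}
    return None


bar = find_address_bar(SAMPLE_DUMP, "secure-login.portal.paypal.com.example.us")
print(bar)
```

Note that the text found in the layout is only the visible prefix of the hostname, which is exactly the difference between what a programmatic lookup returns and what the user actually sees.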
Now, OK, so just making sure. OK. So you have to do all of this, and we're doing all of this, again, as best as we can. Actually, let me tell you about the architecture first, and then I'll show you. This is the high-level overview of Hindsight. We have the SDK assignment component, which identifies the right SDK on which it should install the currently analyzed APK. We install it, then we try to bypass the splash screens as best as we can, and then we have a series of these attack building blocks that have both a testing logic and an evaluation logic. The testing logic performs the attack, like automatically scrolling, and then the evaluation logic asks: is the address bar still there? All of these are installed on real Android devices using ADB, and there is a server side that collects server logs and AJAX in order to help us understand whether an attack failed or not. And so I want to show it to you in action. Again, it's a highly visual project, so I want to make sure you appreciate it.
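The split between testing logic and evaluation logic can be sketched as follows. This is a toy model under stated assumptions: FakeBrowser stands in for a real device driven over ADB, and the class names, field names and the single hide-address-bar block are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class FakeBrowser:
    """Stand-in for a browser on a test device (hypothetical model)."""
    name: str
    hides_bar_on_scroll: bool
    address_bar_visible: bool = True

    def scroll(self) -> None:
        if self.hides_bar_on_scroll:
            self.address_bar_visible = False


@dataclass
class AttackBuildingBlock:
    name: str
    test: Callable[[FakeBrowser], None]           # testing logic: perform the attack
    is_vulnerable: Callable[[FakeBrowser], bool]  # evaluation logic: did it work?


def run_abbs(browsers: List[FakeBrowser],
             abbs: List[AttackBuildingBlock]) -> Dict[Tuple[str, str], bool]:
    """Expose every browser to every building block and record the verdicts."""
    report = {}
    for browser in browsers:
        for abb in abbs:
            abb.test(browser)
            report[(browser.name, abb.name)] = abb.is_vulnerable(browser)
    return report


hide_bar = AttackBuildingBlock(
    name="hide-address-bar",
    test=lambda b: b.scroll(),
    is_vulnerable=lambda b: not b.address_bar_visible,
)

report = run_abbs(
    [FakeBrowser("A", hides_bar_on_scroll=True), FakeBrowser("B", hides_bar_on_scroll=False)],
    [hide_bar],
)
print(report)
```

Keeping the two pieces of logic separate means the same evaluation question ("is the address bar still there?") can be reused after different test actions, such as scrolling or rotating to landscape.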
So this is Hindsight, and we have four devices here, so it's older to newer, different SDK versions installed on each one. Now we have Hindsight running, and it's running its assignment algorithm, and you will see here we have the first browser automatically installed. This is Dolphin, and Dolphin has splash screens, so we need to bypass them. Again, this is happening in a completely agnostic fashion: we're trying to detect in real time how we should bypass the splash screen and proceed with bypassing it. We're not hard-coding any logic. Here again it's Chrome. We need to go through device checks here; we need to allow UC Browser to make phone calls and manage our data. Why, you would ask? We don't know, but we just say allow in all cases. Some of these are slower than others, because we need to do OCR, and if that fails we need to do other things. But again, we do this in a best-effort fashion, and you will see that once this kicks in, here we have bypassed the splash screens of all three browsers and we're now exposing each browser to each and every ABB, one at a time. I'm waiting for this before I pause the video. Come on. It will happen, I promise. Here you go. OK. So we allow it automatically and we proceed, and you can see here that these are not helping us: we have various pop-ups that explain to us that this is not the most recent version and that we should update, and we need to handle all of these in order to keep on evaluating the version that we want to evaluate. All right. So overall, what we did is we exposed twenty-three hundred APKs, each one to twenty-seven attack building blocks, for a total of about sixty-two thousand vulnerability assessment reports. So clearly far, far from what you can do manually.
Hindsight creates self-contained files that include screenshots and tell you whether Hindsight thinks that the attack worked or not, and an analyst can go in there and actually inspect them. We did that; my student put a lot of long hours into this, and we found that the error rate of Hindsight is three point six percent, including false positives and false negatives. This includes one point five percent where the errors are detected by human experts; for the rest, Hindsight actually knows that there was an error, for example the browser crashed, and it's telling you that something is an error. It's not giving you a yes or no in terms of the attack having succeeded. Because of this error rate, all our relevant results are reported with upper and lower bounds: the lower bound assumes all the errors would have been secure cases, and the upper bound assumes all the errors are actually vulnerabilities. So these are the results. Here's the CDF of the fraction of APKs, and what you can see is that ninety-eight point six percent of all our APKs are vulnerable to at least one ABB. So being vulnerable is not some exception; it's the rule. In fact, if you drag this to fifty percent, you will see that fifty percent of our browsers are actually vulnerable to more than twelve. So there are a lot of vulnerabilities out there. Different classes of attacks have different potency: URL attacks, security indicator attacks and content attacks are working more often than event routing attacks and address bar attacks. I'm going a little faster since I'm running out of time; I'm happy to take questions or discuss offline. This is an annual analysis of the extent of the vulnerabilities. Here we have the average number of vulnerabilities over the years, and you see that we're actually going up: we have more vulnerabilities in twenty sixteen than in twenty twelve.
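The way the upper and lower bounds described above follow from the error reports can be written down in a few lines. This is a minimal sketch with made-up counts (1,000 reports, of which 36 are errors, matching a 3.6 percent error rate); the function name is an assumption for the example.

```python
def vulnerability_bounds(vulnerable, secure, errors):
    """Return (lower, upper) bounds on the vulnerable fraction.

    lower: every error report is counted as a secure case
    upper: every error report is counted as a vulnerability
    """
    total = vulnerable + secure + errors
    if total == 0:
        raise ValueError("no reports to analyze")
    lower = vulnerable / total
    upper = (vulnerable + errors) / total
    return lower, upper


# Hypothetical counts: 600 vulnerable, 364 secure, 36 errors.
lower, upper = vulnerability_bounds(600, 364, 36)
print(f"{lower:.3f} to {upper:.3f}")
```

The gap between the two bounds is exactly the error rate, so a low error rate keeps the reported range tight.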
Here there's a big gap because there weren't all that many APKs that year. And here, in this jumbled mess, you can see different browser families. For example, you can see that Firefox is actually becoming better over the years, whereas Opera and Chrome are going up: there are more ABBs working in more recent versions of Chrome than used to work in older versions of Chrome. OK. This is a very rich graph, and that's just inherent in this kind of work. All I want you to see is that we have the rank of installations: these are the very popular browsers, these are the least popular browsers, and these are the vulnerabilities of these browsers. What you can see here is that, for example, for the most popular one, Chrome, for these four categories, URL, address bar, security indicators and content, we were able to have at least one successful attack per category for every APK that we tested. If you go all the way to the end here, to Shark Browser, we have zero, one, zero here. This is because of the WebView that it's using: Shark Browser does not try to do anything smart with the address bar. It just puts its address bar there, and it will stay there forever. It will not be hidden if you swipe, if you go to landscape, or if you do other things, and because of that it's actually behaving much more securely. So the most popular browsers are not necessarily the most secure, with the caveat that security here is only with respect to our ABBs; again, we don't know what we don't know about other attacks. These are the HTTPS-related ABBs. You can see that we have mixed content attacks that work the most, we have favicon placement that works very little, but we actually still have, and again I don't know offhand which versions these are, self-signed websites that load without warnings. This may not be, of course, the
most recent browsers, but this has happened with browsers that were out there in the wild. OK. The final thing that I want to show you is that we want to understand the patterns of vulnerability. How are vulnerabilities introduced, and once they're introduced, do they stay there, or are they fixed and then the browsers become vulnerable again? What we found is a fair number of examples of the yes-no-yes pattern, which means the browser was vulnerable, it became non-vulnerable, and then it went back to being vulnerable. Here's the Dolphin browser. In version eight point five point one, this is again this long URL, you may be able to see that it's actually starting from the left, so this is login dot paypal dot com, whereas in fact we are on that robotic dude test website. In version eleven point four point nine, we actually start from the TLD plus one part, so we're showing the right domain. But a couple of minor versions later, we're back to the original behavior. So we're temporarily fixing it and then we're going back to being vulnerable again. This is where Hindsight shines, because it gives you these insights into the evolution of vulnerabilities. To summarize: as the functionality of your browser expands, so does its attack surface, and whenever we have new features, we must be able to reason about the interaction of those features with the existing features that we have, and whether we have some unwanted cases in terms of our security and privacy. I hope also that you've picked up on this theme of our work, which is that automation is your friend in terms of security situational awareness. You can automatically detect which extensions are fingerprintable using XHound, so you can pressure the extension authors to make them less fingerprintable by reducing the surface. You can automatically detect which extensions leak PII, and again you can push extensions to fix it when it's accidental, or you can identify evildoers when it's intentional. And you can automatically detect which mobile browser is vulnerable to what attack, so you can prioritize patches and you can revisit UI design: do we really need to be hiding this URL bar, or should we maybe stop doing that? So thanks for your attention. This is my team at Stony Brook; none of this would be possible without them, and I'm happy to take any questions that you may have. Thank you.
Right, so yeah, depending on who's reading this work, you would do different things. If you're a browser vendor, you would want to understand how you're vulnerable and maybe prioritize fixes. If you're a user, you must develop this healthy skepticism about the website that you're on, so always double-check that you are where you think you are, by scrolling, by tapping. So the answer would vary depending on who I'm talking to. What we are hoping is that this work will shine a light, so the browser vendors can pick this up and say: OK, we really need to fix our problems in this, this and that category, and then the end users would automatically benefit from patches from these vendors. Right, so we did not look into iOS, solely because we think it's a more closed platform. There may very well be the equivalent of ADB on iOS, but there is this walled garden idea, and sideloading of apps may be harder. It is something that we're thinking about, but when we were coming up with the project, we thought that Android is the way to go. Also, in terms of market penetration, we think it's more relevant. Yes? Correct.
Right, I think the only advice that I could give as an outsider is for these teams to chat more with each other. You can have your security team and the UI team, and they never have lunch together. You have to seat them both together, so that the security team can understand how the UI decisions are affecting security, and then they can jointly decide how they go about having these very beautiful, fluid UIs without sacrificing security. That's my high-level advice. Yeah. Right. Thank you.