[00:00:10.02] Welcome to another talk in the cybersecurity [00:00:14.06] [00:00:14.06] lecture series. My name is Frank Li, and I'm an assistant professor here at Georgia Tech. [00:00:18.21] [00:00:19.22] And it's my pleasure to be here to introduce Danny Huang, who is [00:00:23.17] [00:00:25.05] a colleague and friend of mine. I've actually known Danny since the summer of [00:00:30.01] [00:00:30.01] 2015, when we were both research interns on the same team. We literally [00:00:35.05] [00:00:35.05] sat right next to each other, which is when I got to know Danny, hear about his research, and [00:00:40.06] [00:00:40.06] also meet his adorable Momo, who may make a guest appearance at some point [00:00:45.08] [00:00:45.08] in this presentation. So Danny is just starting off [00:00:49.07] [00:00:50.21] at New York University, where he is an assistant professor. [00:00:54.12] [00:00:55.19] Before that he was a postdoc at Princeton, which is where I believe he [00:00:58.19] [00:00:58.19] physically is right now, working with Nick Feamster. He did his PhD [00:01:03.15] [00:01:03.15] at UC San Diego, working with Alex Snoeren and Kirill Levchenko, and [00:01:08.21] [00:01:08.21] in general Danny's research is about the security and privacy of [00:01:12.13] [00:01:12.13] consumer technologies. He's done a lot of really awesome work [00:01:17.01] [00:01:17.01] looking at online abuse and cybercrime, and [00:01:21.23] [00:01:21.23] more recently, particularly during his postdoc, he has turned his [00:01:25.20] [00:01:25.20] attention to IoT, the Internet of Things, looking at the security and [00:01:29.21] [00:01:29.21] privacy of that ecosystem, and I think that's what his talk will focus on. [00:01:33.05] [00:01:34.11] So with that, I'll let Danny take it away, and if folks have questions, please chime in [unclear]. [00:01:40.07] [00:01:41.22] [unclear] OK, thank you, Frank. [00:01:45.07] [00:01:47.20] If there's a dog barking, just excuse us. 
[00:01:50.08] [00:01:51.16] Because she may or [00:01:53.05] [00:01:53.05] may not be in the room, and I don't know [unclear]. But [00:01:57.23] [00:01:57.23] anyway, yeah, I'm Danny. I've known Frank since [unclear], and Paul Pearce as well. [00:02:03.10] [00:02:05.07] Anyway, yeah, IoT. So these days we're just surrounded [00:02:10.11] [00:02:10.11] by IoT things. What is IoT? There are many definitions, but I think of [00:02:14.19] [00:02:14.19] them as smart appliances that are not your phone or your computer or [00:02:20.11] [00:02:20.11] your tablet, you know, things that sit in a room and function, sometimes [00:02:25.11] [00:02:25.11] without you noticing them, like this Alexa, Amazon Echo thing. And [00:02:30.23] [00:02:30.23] they are around us. There are so many news reports about them watching us or [00:02:34.12] [00:02:34.12] listening to us or spying on us, but today I'm going to talk about a way for [00:02:39.07] [00:02:39.07] us to watch these IoT devices instead. [00:02:43.07] [00:02:44.11] To give an example of the kind of smart things [00:02:47.21] [00:02:47.21] that are watching us, here's a TV in my office. [00:02:52.22] [00:02:53.23] Here's a Roku TV, which has one of the largest market shares among [00:02:58.12] [00:02:58.12] smart TVs in the United States, and here I was in my office watching CBS [00:03:03.09] [00:03:03.09] News on the CBS News app on the Roku TV. [00:03:06.11] [00:03:07.20] So I was just passively watching TV. [00:03:10.03] [00:03:11.13] And then I have another app, which I'll talk about in a bit, in the background [00:03:16.22] [00:03:16.22] to monitor my TV, and this live graph basically shows this: [00:03:21.23] [00:03:21.23] on the vertical axis is the number of bits sent by the TV, and [00:03:27.10] [00:03:27.10] the horizontal axis is the time, [unclear] in sync with the TV 
[00:03:34.03] [00:03:34.03] footage. Every colored bar in this chart denotes a third-party advertising and [00:03:41.13] [00:03:41.13] tracking service that the TV is talking to, and the biggest one is actually [00:03:46.12] [00:03:46.12] in pink, the Adobe Marketing Cloud [unclear]. [00:03:50.23] [00:03:52.00] I was sitting there watching the CBS News app passively, without touching [00:03:56.17] [00:03:56.17] anything, and it was just talking to [unclear] three different advertising services. [00:04:03.07] [00:04:03.07] What's going on here? So it turns out that this is just the tip of the iceberg for [00:04:07.19] [00:04:07.19] the kind of security and privacy problems that we face on a daily basis, and [00:04:13.17] [00:04:13.17] this is just one TV. What about other TVs? What about other smart devices? [00:04:17.11] [00:04:19.03] In the first part of the talk I'm going to talk about a way for us to [unclear], and [00:04:23.21] [00:04:23.21] here it's crowdsourcing, you know, to help gather smart TV data or [00:04:29.04] [00:04:29.04] smart device data from around the world. So [00:04:33.23] [00:04:33.23] to explain what this means, here's a typical way for [00:04:38.00] [00:04:38.00] researchers to study the security and privacy of smart devices in the lab. So [00:04:41.23] [00:04:41.23] here at the top we have a router, and in the middle we have a computer that runs a pcap 
[00:04:46.22] [00:04:46.22] [unclear] or any software like Wireshark that analyzes network traffic. [00:04:51.16] [00:04:51.16] The computer broadcasts a Wi-Fi hotspot, and let's say in this case we're interested in [00:04:56.02] [00:04:56.02] studying a camera. So we plug in a camera in the lab and connect [00:05:02.05] [00:05:02.05] the camera to the computer's Wi-Fi hotspot. And while we're using the camera, [00:05:07.05] [00:05:07.05] while we're interacting with the camera, we look at its network traffic. You know, [00:05:11.02] [00:05:11.02] we look at this operational traffic of the smart device as it's being used. [00:05:15.16] [00:05:15.16] With this traffic you can ask questions like: are connections encrypted, or [00:05:20.12] [00:05:20.12] which third-party services is the device talking to, or [00:05:23.22] [00:05:23.22] what data is being sent by the device? [00:05:26.13] [00:05:28.08] So the operational traffic of this device may tell us the [00:05:32.12] [00:05:32.12] answers to these questions. And this is just for one camera, but [00:05:37.11] [00:05:37.11] there are many other cameras, made by, say, Google or Amazon or [00:05:42.03] [00:05:42.03] [unclear]. These are just big names you probably have heard of that make smart cameras. But [00:05:46.14] [00:05:46.14] there are other device makers that you may be less familiar with, like Xiaomi or [00:05:51.09] [00:05:51.09] Dahua or Hikvision. These are Chinese companies, and [00:05:54.21] [00:05:54.21] it turns out that these are also huge camera makers. In fact, Hikvision is one of [00:05:58.20] [00:05:58.20] the largest manufacturers of smart cameras, and [unclear] here [00:06:04.19] [00:06:04.19] on the right, many of them were involved in the Mirai botnet attack that took [00:06:09.19] [00:06:09.19] down major Internet services. But anyway, these are just examples of smart [00:06:15.01] [00:06:15.01] camera vendors you probably have heard of or 
never heard of. But [00:06:19.07] [00:06:19.07] this is an example: if you want to study, you know, all cameras, or want to [00:06:23.09] [00:06:23.09] make some arguments about different kinds of cameras, you have to buy different [00:06:28.06] [00:06:28.06] kinds of cameras for the lab, and depending on how much money you have or how much [00:06:32.11] [00:06:32.11] lab space you have, you can study some number of cameras, but [00:06:37.07] [00:06:37.07] you're still limited by scale. [00:06:39.21] [00:06:39.21] So, [00:06:41.08] [00:06:41.08] in general, this is just for cameras. There are many other kinds of smart IoT [00:06:45.17] [00:06:45.17] devices. How do we study them at scale, across different vendors, different models? [00:06:49.15] [00:06:51.03] [unclear] [00:06:52.21] [00:06:53.22] So to explain this I'm going to contrast a number of existing methods. [00:06:57.22] [00:06:57.22] Here at the top there are lab studies, Internet scanning, and crowdsourcing, [00:07:01.12] [00:07:01.12] just three examples, and here on the vertical axis I'll explain, in terms [00:07:05.14] [00:07:05.14] of what we can achieve out of these methods: scale, [00:07:11.14] [00:07:11.14] device labels, and operational traffic, which I'll explain in a little bit. So [00:07:15.06] [00:07:15.06] lab studies are something that I just mentioned: you buy devices for [00:07:19.10] [00:07:19.10] the lab and study them. But the problem is that it's hard to achieve scale. [00:07:24.04] [00:07:24.04] One of the biggest studies I have seen is from IMC last year, from folks at [00:07:27.23] [00:07:27.23] Northeastern University, where they studied about 80-something devices. 
[00:07:31.16] [00:07:33.04] So it's hard to [unclear], and that's the biggest I've seen, one of the biggest. [00:07:37.13] [00:07:37.13] So the lack of scale is one problem, but [00:07:41.22] [00:07:41.22] a good thing is that researchers know the device labels, meaning [00:07:47.02] [00:07:47.02] the researchers know exactly what devices they're talking about, because [00:07:52.23] [00:07:52.23] they have them in the lab. So, you know, here I know exactly that I have [00:07:57.23] [00:07:57.23] an Amazon Echo Dot Kids Edition in my office, here in my lab, so I know what [00:08:02.20] [00:08:02.20] devices I'm talking about, what I'm dealing with: exact model, exact manufacturer. [00:08:06.13] [00:08:08.05] Another benefit is operational traffic, meaning network traffic that I can [00:08:14.01] [00:08:14.01] capture, basically, [unclear] as the devices are being operated. That [00:08:18.06] [00:08:18.06] helps us answer questions like: are connections encrypted, or [00:08:23.08] [00:08:23.08] what companies are devices talking to? [00:08:27.02] [00:08:28.15] So these are the pros and cons of lab studies. And one way to make up for [00:08:33.14] [00:08:33.14] the lack of scale is to scan the Internet, for example scan the entire IPv4 [00:08:37.13] [00:08:37.13] space, send out [unclear] packets and, based on the packets you [00:08:42.00] [00:08:42.00] get back, analyze potential security and privacy issues. So one benefit is [00:08:46.23] [00:08:46.23] you get some scale. For example, in [unclear] studies from last year, [00:08:51.14] [00:08:52.17] some researchers studied close to 7,000,000 devices. But [00:08:57.23] [00:08:57.23] one problem is that researchers just don't know exactly what kind of devices they're [00:09:02.12] [00:09:02.12] talking about. Some devices may say, you know, "I am device X, Y, [00:09:07.12] [00:09:07.12] Z," 
based on, say, for instance, their, I don't know, Telnet banners, or [00:09:11.18] [00:09:11.18] based on their UPnP messages. But many devices just don't have [00:09:17.00] [00:09:17.00] banners on port 22 or port [unclear], you know, UPnP ports, so it's very hard for [00:09:23.07] [00:09:23.07] us to learn the exact identities of these devices. So [00:09:28.18] [00:09:28.18] device labels is kind of a question mark there. [00:09:30.23] [00:09:32.02] And thirdly, operational traffic would be very, very difficult to obtain. [00:09:35.07] [00:09:35.07] Basically it can't be done, because [00:09:39.20] [00:09:39.20] the researchers are not on the same network as these devices. These devices are [00:09:43.05] [00:09:43.05] somewhere on the Internet, and, yeah, basically there's no way to [unclear]. [00:09:47.17] [00:09:49.00] So to achieve scale and all the other properties, another [00:09:53.21] [00:09:53.21] approach is to go out and crowdsource. One way is, for [00:09:58.23] [00:09:58.23] example, [unclear], people have built custom routers and [00:10:04.21] [00:10:04.21] deployed these routers to volunteers, you know, have the routers sitting in volunteers' [00:10:08.19] [00:10:08.19] homes, capture traffic from volunteers, and analyze the traffic. So [00:10:12.18] [00:10:12.18] the benefits are that the volunteers can tell the researchers, "OK, I have devices [00:10:17.14] [00:10:17.14] X, Y, and Z," so researchers know exactly what devices they're dealing with, and [00:10:21.20] [00:10:21.20] the routers have the traffic, so basically the researchers can analyze the [00:10:26.10] [00:10:26.10] pcaps 
and find out what's going on with security and privacy issues. [00:10:31.08] [00:10:31.08] The scale is a little bit of a problem, because the scale depends [00:10:36.01] [00:10:36.01] on how much money you have, how much you're willing to, you know, pay [00:10:40.04] [00:10:40.04] volunteers, or how much money you have for building hardware. So [00:10:44.19] [00:10:44.19] in [unclear] last year, a study [unclear] homes, and they looked at [00:10:50.09] [00:10:50.09] a scale of about [unclear] homes, close to 1,000 I think. But [00:10:54.23] [00:10:54.23] my goal is to actually get even bigger. I want to crowdsource from thousands of homes, [00:10:59.15] [00:10:59.15] even hundreds of thousands of homes, in a way that allows me to know what devices [00:11:04.19] [00:11:04.19] I'm dealing with, so the device labels, and also in a way that allows me to look at [00:11:09.15] [00:11:09.15] the operational traffic, basically running packet captures on smart devices. So [00:11:14.15] [00:11:14.15] to achieve this goal, here's my plan: I built a software tool that provides [00:11:19.23] [00:11:19.23] volunteers with usable insights into IoT security and privacy with one click. Here [00:11:24.23] [00:11:24.23] there are two colors, orange and green. Orange: software and one click. Why [00:11:29.10] [00:11:29.10] software? Well, we don't want to build hardware; we don't want to ship [00:11:32.17] [00:11:32.17] the hardware to users. It's cumbersome. So hopefully it's something that users can just [00:11:37.04] [00:11:37.04] download and use on their computers [unclear], and it should run and [00:11:41.18] [00:11:41.18] help us collect traffic from these volunteers. [00:11:44.03] [00:11:45.13] Then the next question is: why would volunteers even bother to collect [00:11:48.16] [00:11:48.16] traffic for us? You know, we can pay them, but that's a limited scale. So 
[00:11:52.18] [00:11:55.08] therefore we have the wording in green: usable insights. This software will [00:12:00.04] [00:12:00.04] not only help us, but it'll also help the volunteers learn something [00:12:04.06] [00:12:04.06] new about their, say, smart home security and privacy, you know, whether their [00:12:09.01] [00:12:09.01] smart home devices are talking or sending data to some third-party companies. [00:12:13.15] [00:12:13.15] So usable insights will hopefully provide incentives for [00:12:17.14] [00:12:17.14] volunteers to help us collect the traffic. So this is the [00:12:22.02] [00:12:22.02] tool that we call IoT Inspector. You can actually download it right now by [00:12:26.08] [00:12:26.08] going to [unclear]. Basically, volunteers get this software; [00:12:29.19] [00:12:29.19] you know, you can go on this website and download a Windows or macOS [unclear]. They [00:12:34.06] [00:12:34.06] will click on it, and then they will see a dashboard of their smart devices, and [00:12:39.09] [00:12:39.09] from this dashboard they can actually analyze network traffic, like in this case, [00:12:43.10] [00:12:43.10] where I was using the software, IoT Inspector, to inspect [00:12:47.06] [00:12:47.06] my own Roku TV 
and find out that, just by streaming [00:12:52.11] [00:12:52.11] apps, passively streaming the news, I could find out that my TV [00:12:57.11] [00:12:57.11] was actually talking to a number of advertising and tracking services, [00:13:00.19] [00:13:00.19] shown here in different colored bars. So in general there are two components [00:13:06.21] [00:13:06.21] of IoT Inspector: one is a client, one is a server. The client part is open source. [00:13:12.04] [00:13:12.04] It's written entirely in Python. We have the Windows and macOS builds, but [00:13:16.19] [00:13:16.19] you can actually download it yourself [unclear] Python 3 code. It allows users to [00:13:20.21] [00:13:20.21] view network activity, see if there are any security and privacy issues, and [00:13:25.20] [00:13:25.20] label devices, basically telling us, the researchers, what vendor a device [00:13:30.20] [00:13:30.20] is, what model a device is, and all this information is sent back to us. [00:13:36.09] [00:13:37.15] And [unclear] allows us to do [00:13:40.09] [00:13:40.09] a number of research [unclear], which I'll be talking about in a little bit. [00:13:43.13] [00:13:44.17] So the question is, Gregory asked if this can [unclear], or can the Windows version [00:13:49.12] [00:13:49.12] run in Wine. So we don't have a Linux executable, but it's all Python 3. [00:13:54.07] [00:13:55.21] [unclear] the packages and just run it; it's as simple as that. But [00:14:01.20] [00:14:01.20] we do have a number of students at NYU 
right now working with me to [00:14:06.02] [00:14:06.02] [unclear] user experience; hopefully that makes the process a little easier [00:14:10.05] [00:14:10.05] [unclear]. So that's IoT Inspector's [00:14:15.09] [00:14:15.09] client and server. So with IoT Inspector we make a number of [00:14:20.16] [00:14:20.16] contributions. One is the tool. In fact, we deployed it in April 2019, and [00:14:25.23] [00:14:25.23] since then we have had more than 5,500 anonymous users as of today, and we're [00:14:31.18] [00:14:31.18] still gaining users. And [unclear] [00:14:34.16] [00:14:34.16] the Windows version is very stable, the macOS version [00:14:38.22] [00:14:38.22] [unclear], [00:14:43.10] [00:14:43.10] and if [unclear] the source code [unclear] [00:14:48.08] [00:14:48.08] Python and run it. So our users [unclear], but a number of users have come out [00:14:53.02] [00:14:53.02] and told us they're using the software. Examples include journalists from NPR [00:14:56.23] [00:14:56.23] and the Washington Post. They independently [unclear] [00:15:00.07] [00:15:00.07] used this tool and wrote their articles about smart TV [00:15:04.13] [00:15:04.13] privacy. Also, folks from Consumer Reports told us they are using this tool [00:15:08.22] [00:15:08.22] to benchmark some devices, and folks from the NYC [00:15:12.12] [00:15:12.12] [unclear] told us they're using this tool for some reason. [00:15:15.03] [00:15:18.23] That's the tool. In terms of data, we have created the largest [00:15:23.21] [00:15:23.21] known dataset of Internet-connected device traffic collected by [00:15:28.16] [00:15:28.16] academic researchers. So far we have collected traffic [00:15:33.20] [00:15:33.20] from more than 55,000 Internet-connected devices, some of which are IoT. 
[00:15:38.20] [00:15:39.23] And then out of these devices we have about 12,000 devices [00:15:43.23] [00:15:43.23] that are labeled by users; basically, we know what vendor, what [00:15:48.23] [00:15:48.23] manufacturer, what model, what type of device each is. [00:15:53.17] [00:15:54.19] Because this is actually, you know, the largest known dataset of [00:15:59.18] [00:15:59.18] IoT traffic, we have a number of organizations requesting data access to conduct research. [00:16:04.13] [00:16:04.13] That includes universities as well as industry partners like IBM [00:16:08.23] [00:16:08.23] and Microsoft Research [unclear]. And [00:16:13.11] [00:16:13.11] because we have such a unique dataset, we have generated some interesting [00:16:19.04] [00:16:19.04] insights. I'll talk about a few here: one is encryption, one is privacy, and a few others. [00:16:23.23] [00:16:25.11] But first, how does IoT Inspector work? So I'm going to explain [00:16:31.15] [00:16:32.23] how IoT Inspector works in terms of a strawman model. A naive way for [00:16:38.16] [00:16:38.16] IoT Inspector to work is for users to download the tool and connect [00:16:43.11] [00:16:43.11] to their router through Ethernet, and have the tool automatically create a Wi-Fi [00:16:49.07] [00:16:49.07] hotspot, and have smart devices connect to this new hotspot. So [00:16:54.22] [00:16:54.22] that's one way. But there are two problems. Number one, ask yourselves: [00:16:59.16] [00:16:59.16] how many of our laptops actually have an Ethernet cable, an Ethernet port? [00:17:04.21] [00:17:04.21] Maybe you're computer scientists, you know, it's a biased sample, but for [00:17:07.12] [00:17:07.12] many consumers my guess is that their computers don't have Ethernet ports. 
[00:17:12.14] [00:17:14.09] But the bigger issue is actually the smart devices. [00:17:18.22] [00:17:18.22] If you have ever owned one of these smart devices, whether it's an Alexa or a [00:17:22.08] [00:17:22.08] light bulb, and you want to reconnect it to a different wireless network, [00:17:26.15] [00:17:26.15] well, if you want to have it talk to a different Wi-Fi network, with [00:17:31.10] [00:17:31.10] a device like this you have to, like, hold down several buttons at the same time for, [00:17:34.23] [00:17:34.23] I don't know, 10 seconds until [unclear] flashes, and then use the app to [00:17:39.20] [00:17:39.20] connect it to a different Wi-Fi SSID. And this is just for this device. And [00:17:43.17] [00:17:43.17] if you ever own light bulbs, like light bulbs from Philips, you have to, like, [00:17:46.20] [00:17:46.20] [unclear] switch for like 10 seconds and, you know, reconnect it. [00:17:51.03] [00:17:52.08] It's quite a pain to reconfigure a smart device to a different wireless network [00:17:56.12] [00:17:56.12] just so that you can inspect it, and then when you're done, you have to tear down [00:17:59.20] [00:17:59.20] the network and reconfigure the devices back to use the old network. It's a pain. [00:18:05.12] [00:18:06.22] To counter that issue, somehow we have to put IoT [00:18:10.14] [00:18:10.14] Inspector on the path of communication. In particular, let's say [00:18:15.23] [00:18:15.23] a smart camera is talking to a router; the traffic goes between [unclear]. 
[00:18:21.19] [00:18:23.00] IoT Inspector really has to somehow reroute the traffic, so that the traffic between the camera and the router [00:18:27.07] [00:18:27.07] has to go through the IoT Inspector computer. [00:18:30.23] [00:18:32.01] So the way it works is that we use a technique called ARP spoofing. The way it [00:18:35.21] [00:18:35.21] works is that every few seconds, the computer running IoT Inspector sends out two [00:18:39.23] [00:18:39.23] gratuitous ARP packets. For example, the computer [00:18:44.20] [00:18:44.20] sends a packet to the camera that says, "Hey, camera, I'm the router." The computer [00:18:50.08] [00:18:50.08] also sends another packet to the router that says, "Hey, router, I'm the camera." [00:18:53.22] [00:18:54.22] In doing so, IoT Inspector convinces the camera that it is the router and [00:18:59.02] [00:18:59.02] the router that it is the camera, so the traffic between the smart device and the router actually goes [00:19:03.03] [00:19:03.03] through IoT Inspector. And there are cases where some routers or [00:19:07.11] [00:19:07.11] some smart devices have some kind of ARP spoofing detection, or [00:19:11.23] [00:19:11.23] there might be drops in the [00:19:13.10] [00:19:14.17] packets, and in [00:19:15.11] [00:19:15.11] that case we use TCP sequence numbers to infer any missing packets. [00:19:19.22] [00:19:21.19] I'm going to take a moment to sip some water and am happy to take any questions. [00:19:25.14] [00:19:34.06] I can have a little bit longer water break if you, you know, [00:19:37.01] [00:19:37.01] type your questions or ask any questions. [00:19:39.02] [00:19:44.11] [unclear] [00:19:45.02] [00:19:49.13] All right, no questions. So this is ARP spoofing, and [00:19:54.11] [00:19:54.11] because we're able to reroute traffic without users having to do anything, [00:20:00.02] [00:20:00.02] we're able to achieve some scale. In particular, [00:20:04.18] [00:20:04.18] we deployed in April 2019, and in the first two months of deployment 
[00:20:09.00] [00:20:10.15] we had 180 active users. [00:20:12.23] [00:20:14.02] And as of today, as of January, [00:20:17.21] [00:20:17.21] we have about 4,400, I'm sorry, more than 5,000 users and [00:20:22.20] [00:20:22.20] more than 50,000 devices. We don't collect IP addresses or [00:20:27.12] [00:20:27.12] emails or names from users. We only collect their time zone information. We [00:20:31.16] [00:20:31.16] know that about 63 percent of users are in time zones between San Francisco and [00:20:36.15] [00:20:36.15] New York, about 20 percent of users are in time zones between London and Moscow, and [00:20:41.10] [00:20:41.10] the rest are in Asia. And here's the chart that shows the number of active users per [00:20:46.07] [00:20:46.07] day since [00:20:47.13] [00:20:47.13] deployment. It is a stacked area chart, split into new users and existing users. [00:20:53.06] [00:20:55.20] I have a question. Gregory asked: would Pi-hole interfere with the app collecting data? The [00:21:01.19] [00:21:01.19] answer is no. For those who are not familiar, Pi-hole is a [00:21:08.04] [00:21:08.04] Raspberry Pi app that blocks DNS requests if the DNS [00:21:13.13] [00:21:13.13] request is going to domains for advertising and tracking companies. [00:21:17.15] [00:21:18.21] Basically a DNS [unclear]. [00:21:19.16] [00:21:20.20] We don't interfere with the DNS service. All we're doing is that we're just [00:21:24.08] [00:21:24.08] relaying traffic packets between the router and the smart devices, [00:21:29.23] [00:21:31.16] and that allows us to actually see the traffic between the router and the smart devices. [00:21:35.05] [00:21:36.08] Pi-hole won't interfere, and we won't interfere with Pi-hole as well. [00:21:40.19] [00:21:42.03] So back to this slide. 
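As a rough illustration of the ARP spoofing step described in the talk (one packet telling the camera "I'm the router," one telling the router "I'm the camera"), the sketch below only builds the two frames as bytes and does not send anything. It is not IoT Inspector's actual code: the function names and all addresses are hypothetical, and it uses unicast ARP replies as one common way to make such announcements. Anything like this should only ever be run on a network you own or are authorized to test.

```python
import socket
import struct

ETHERTYPE_ARP = 0x0806
ARP_OP_REPLY = 2


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_reply(claimed_ip: str, sender_mac: str,
                    target_ip: str, target_mac: str) -> bytes:
    """Build an Ethernet frame carrying an ARP reply that tells the target
    'claimed_ip is at sender_mac'."""
    ethernet = (mac_to_bytes(target_mac) + mac_to_bytes(sender_mac)
                + struct.pack("!H", ETHERTYPE_ARP))
    # Hardware type 1 (Ethernet), protocol 0x0800 (IPv4), address lengths 6 and 4.
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, ARP_OP_REPLY)
    arp += mac_to_bytes(sender_mac) + socket.inet_aton(claimed_ip)
    arp += mac_to_bytes(target_mac) + socket.inet_aton(target_ip)
    return ethernet + arp


# Hypothetical addresses, for illustration only.
INSPECTOR_MAC = "02:00:00:00:00:01"
CAMERA = ("192.168.1.50", "02:00:00:00:00:02")
ROUTER = ("192.168.1.1", "02:00:00:00:00:03")

# "Hey camera, I'm the router": the router's IP paired with the inspector's MAC.
to_camera = build_arp_reply(ROUTER[0], INSPECTOR_MAC, CAMERA[0], CAMERA[1])
# "Hey router, I'm the camera": the camera's IP paired with the inspector's MAC.
to_router = build_arp_reply(CAMERA[0], INSPECTOR_MAC, ROUTER[0], ROUTER[1])
```

In a real deployment the inspecting machine would also have to forward the packets it receives on to their true destination, which is the relaying the speaker describes in his answer about Pi-hole.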
[00:21:43.05] [00:21:45.01] And because we have such a large scale, in fact we have [00:21:49.07] [00:21:49.07] more than 54,000 devices, but not all devices are labeled. So [00:21:53.21] [00:21:53.21] we have an interface for users to optionally label their devices, you know, tell us [00:21:58.18] [00:21:58.18] what vendor, what category a device is. Out of these 54,000 [00:22:03.15] [00:22:03.15] devices for which we have traffic, only about 12,000 [00:22:08.16] [00:22:08.16] devices are labeled by users with their vendor, type, and model. [00:22:13.13] [00:22:14.23] And the problem is that even for [00:22:16.08] [00:22:16.08] these 12,000 devices there are inconsistencies. Say, for [00:22:19.03] [00:22:19.03] instance, we have the Amazon Echo Dot here. Some people label it as [00:22:24.17] [00:22:24.17] Amazon Echo or Amazon Alexa or Echo Alexa, in different kinds of variations, so [00:22:30.01] [00:22:30.01] we have to create manual rules to collapse these labels together. And [00:22:34.07] [00:22:34.07] there are some misspellings as well, you know, [unclear] spelled with one L or [00:22:38.11] [00:22:38.11] two L's, Xiaomi spelled incorrectly. So we have to use rules [00:22:44.17] [00:22:44.17] and things like the edit distance to sort of collapse different labels together. 
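The label clean-up the speaker mentions (collapsing misspelled vendor names using the edit distance) could be sketched roughly as follows. This is an illustration only, not the project's actual rules: the canonical vendor list, the `normalize_vendor` helper, and the distance threshold of 2 are all assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


# Hypothetical canonical vendor names; the real list would come from the dataset.
CANONICAL_VENDORS = ["amazon", "google", "xiaomi", "philips"]


def normalize_vendor(label: str, max_dist: int = 2) -> str:
    """Map a user-entered vendor label to the closest canonical name,
    or return the cleaned label unchanged if nothing is close enough."""
    cleaned = label.strip().lower()
    best = min(CANONICAL_VENDORS, key=lambda v: edit_distance(cleaned, v))
    return best if edit_distance(cleaned, best) <= max_dist else cleaned
```

With this sketch, `normalize_vendor("Xaiomi")` and `normalize_vendor("Phillips")` collapse to `xiaomi` and `philips`, while an unrelated label such as `TP-Link` is left as `tp-link`. A small threshold keeps short, genuinely different vendor names from being merged by accident.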
[00:22:50.15] [00:22:50.15] And after doing that, we're down to about 7,000 devices. Even within the 7,000 [00:22:56.02] [00:22:56.02] devices there could be errors in user labels. We have seen cases of a smart [00:23:01.21] [00:23:01.21] fan, you know, talking to what appears to be Google, Facebook, Amazon, lots [00:23:07.00] [00:23:07.00] of different kinds of domains. It turns out that it's likely the user actually [00:23:12.02] [00:23:12.02] inspected the phone instead, not the smart fan. So there are errors in labels, so [00:23:17.01] [00:23:17.01] we have a number of validation methods to check against labels, [00:23:22.19] [00:23:22.19] looking at the DNS requests the device makes, or the UPnP messages, or [00:23:27.18] [00:23:27.18] the MAC address, to make sure that at least one of these validations [00:23:32.20] [00:23:32.20] is met, that it is consistent with the user labels. And if so, [00:23:37.15] [00:23:37.15] we're down to about 6,000 devices across 81 vendors. [00:23:43.07] [00:23:43.07] There's a lot more work needed to do device validation, but [00:23:46.05] [00:23:46.05] here's what we have here. And it's these 6,000 devices [00:23:51.21] [00:23:51.21] that we're going to use as the dataset for this section of the talk. And [00:23:56.04] [00:23:56.04] these devices span from popular names like Google, Amazon, [00:24:00.01] [00:24:00.01] Sonos, to lesser-known names like, you know, those in [00:24:03.07] [00:24:03.07] Asia as well, and this dataset [unclear] [00:24:06.15] [00:24:07.17] devices, like various kinds of devices, like Google Home, Echo, smart plugs, cameras, [00:24:13.14] [00:24:13.14] light bulbs; we even have [unclear] 
[00:24:17.16] [00:24:17.16] [unclear] based on user labels. Do note that the 6,000 [00:24:23.02] [00:24:23.02] devices are just a tiny fraction of the entirety of the dataset. [00:24:27.21] [00:24:27.21] We still have 41,000 devices we haven't looked at, and [00:24:32.07] [00:24:32.07] this is future work, which I'll talk about. So for now I'm going to focus on the 6,000 [00:24:38.07] [00:24:38.07] devices across 81 vendors. And because we have [00:24:43.19] [00:24:45.05] this unique perspective from within real user homes, we're able to [00:24:49.23] [00:24:49.23] generate a number of unique insights. I'll talk about encryption, privacy, and [00:24:54.06] [00:24:54.06] local ports. The first question is: [00:24:57.04] [00:24:58.09] is the traffic encrypted? We don't have the payload, so we don't know exactly [00:25:02.20] [00:25:02.20] whether a device's traffic is encrypted; we don't look at entropy. But [00:25:07.05] [00:25:07.05] one proxy for encryption is to look at [unclear] traffic, [00:25:10.07] [00:25:11.18] in this case plaintext HTTP. So about [unclear] percent of the devices [00:25:15.19] [00:25:15.19] communicate over port 80, plaintext HTTP, so the traffic is unencrypted, and [00:25:22.04] [00:25:22.04] these [unclear] percent of devices cover [unclear] different [00:25:26.10] [00:25:26.10] vendors. That includes [unclear] maker as well as makers [00:25:31.10] [00:25:31.10] like Amazon; in both of those cases these are smart TVs, the cameras, and so on. [00:25:36.01] [00:25:37.12] So, you know, there are other devices that do encrypt their traffic, but [00:25:43.03] [00:25:43.03] not all devices do encryption properly, and [00:25:46.10] [00:25:46.10] here's what I mean: a device could be using SSL 
or using TLS, but they could [00:25:51.20] [00:25:52.20] be using some outdated libraries, or the developers may not have configured their [00:25:57.20] [00:25:57.20] libraries correctly. One example is that we see devices that use outdated versions [00:26:02.22] [00:26:02.22] of TLS. The current, most up-to-date and most popular versions are 1.2 and 1.3, but [00:26:08.16] [00:26:08.16] we have seen devices that still use TLS 1.0 or even SSL [00:26:13.09] [00:26:13.09] 3.0. [unclear] 10 percent of devices still use outdated versions, and [00:26:18.14] [00:26:18.14] that covers about 26 vendors, including Amazon, [unclear]. And [00:26:24.02] [00:26:24.02] when I looked at the traffic for [00:26:26.00] [00:26:26.00] the devices that are responsible, it turns out that most of these are smart [unclear]. [00:26:30.06] [00:26:31.11] The implications are that, because of a lack of encryption or [00:26:36.00] [00:26:36.00] because of the lack of proper encryption, [00:26:38.08] [00:26:39.17] [unclear] an attacker could get at your traffic, and [00:26:44.00] [00:26:44.00] [unclear] [00:26:49.16] [00:26:49.16] [unclear]. 
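The two encryption checks described above (port 80 as a proxy for plaintext HTTP, and flagging TLS versions older than the 1.2 and 1.3 the speaker calls current) can be sketched as a small classifier. This is a minimal sketch under stated assumptions, not the study's method: it assumes a flow's remote port and, where present, its protocol version as a (major, minor) pair have already been extracted by some other tool, and the function and category names are made up for the example.

```python
from typing import Optional, Tuple

# Wire encodings of SSL/TLS protocol versions as (major, minor) pairs.
VERSION_NAMES = {
    (3, 0): "SSL 3.0",
    (3, 1): "TLS 1.0",
    (3, 2): "TLS 1.1",
    (3, 3): "TLS 1.2",
    (3, 4): "TLS 1.3",
}

# Assumption: anything below TLS 1.2 counts as outdated, following the talk's
# remark that 1.2 and 1.3 are the current versions.
MODERN = {"TLS 1.2", "TLS 1.3"}


def classify_flow(remote_port: int,
                  tls_version: Optional[Tuple[int, int]] = None) -> str:
    """Roughly label a flow's encryption status from metadata only."""
    if tls_version is not None:
        name = VERSION_NAMES.get(tls_version)
        if name is None:
            return "unknown"
        return "modern-tls" if name in MODERN else "outdated-tls"
    if remote_port == 80:
        # Port 80 is used as a proxy for plaintext HTTP, as in the talk.
        return "plaintext-http"
    return "unknown"
```

For example, `classify_flow(443, (3, 1))` reports `outdated-tls` and `classify_flow(80)` reports `plaintext-http`. Port numbers are only a proxy: a device can run an encrypted protocol on port 80 or plaintext on another port, which is why the speaker calls this an approximation.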
[00:26:51.04] [00:26:54.17] In terms of privacy, there are other issues as well. The way [00:26:59.18] [00:26:59.18] we define a privacy issue is the outflow of data from [00:27:04.21] [00:27:04.21] devices to some third-party advertising and tracking companies. This is one threat model [00:27:10.13] [00:27:10.13] we look at when it comes to privacy; of course there are many others, but [00:27:13.10] [00:27:13.10] this is one way we define privacy in this particular section of the talk. So [00:27:19.06] [00:27:19.06] for devices that send out data to some third-party advertising and tracking services, [00:27:23.10] [00:27:23.10] what's the most popular such device that talks to third-party [00:27:28.20] [00:27:28.20] advertising and tracking companies? It turns out these are smart TVs. We have at least 400 smart TVs [00:27:35.05] [00:27:35.05] in this dataset; at least, because there are a lot more we haven't accounted for. [00:27:39.21] [00:27:41.05] Even within this small dataset of 400 smart TVs, [00:27:45.20] [00:27:45.20] you have smart TVs from a number of vendors. [00:27:50.21] [00:27:52.05] We looked at the registered domains contacted by the smart TVs, and [00:27:56.07] [00:27:56.07] about 22 percent of them are known [00:27:58.19] [00:27:58.19] advertising and tracking services, based on some blacklist. [00:28:02.06] [00:28:05.07] Audience poll time. Let's do a poll here: the most TVs [00:28:11.01] [00:28:11.01] talk to which advertising and tracking company? [00:28:15.21] [00:28:15.21] Here's the poll: A if you think it's Google, B [00:28:20.21] [00:28:20.21] if you think it's Amazon, C Facebook, and D others. [00:28:26.10] [00:28:26.10] I'll give you about 15 seconds to pick which is the most popular [00:28:30.04] [00:28:31.07] advertising and tracking company for
[00:28:32.21] [00:28:34.12] for smart TVs. I'll take a quick water break. [00:28:38.11] [00:28:38.11] Folks that have read this paper, [00:28:40.18] [00:28:41.21] or the authors of this paper here... [00:28:44.01] [00:28:52.02] The poll is still active; I'm going to count down 5 seconds. [00:28:55.03] [00:28:56.05] Five, four, three, two, one. All right, the poll is closed. [00:29:03.05] [00:29:04.23] So the most popular answer is D, [00:29:07.13] [00:29:08.21] others, followed by A, Google. No, I'm [00:29:13.23] [00:29:13.23] sorry, the most popular answer is actually B, Amazon, [00:29:18.15] [00:29:19.16] followed by others, then Google, and finally Facebook. OK, [00:29:25.20] [00:29:25.20] folks think Amazon is the winner, and then others. [00:29:31.15] [00:29:33.02] It turns out that Google is actually the biggest. [00:29:34.15] [00:29:36.13] It's doubleclick.net, which 34 percent of smart TVs talk to. There are other popular ones [00:29:42.04] [00:29:42.04] like Scorecard Research. One thing about these two companies is that [00:29:46.09] [00:29:47.23] these companies are [00:29:49.06] [00:29:51.10] also popular in the web world as well. You know, if you visit [00:29:54.14] [00:29:54.14] nytimes.com or cnn.com, chances are [00:29:57.18] [00:29:57.18] you will see the web pages talking to these two companies. But [00:30:03.17] [00:30:03.17] there are other companies that are less seen in the web and app world, like [00:30:08.09] [00:30:08.09] Comcast; in particular, this particular domain is responsible for [00:30:13.07] [00:30:13.07] mostly video tracking on TVs. So that's smart TV privacy. [00:30:17.22] [00:30:19.15] Any questions here that I missed? OK, no more questions. [00:30:24.16] [00:30:24.16] Beyond privacy issues, the third example of [00:30:28.21] [00:30:28.21] insight we have taken from our dataset is unused local ports, and here's what I mean.
[00:30:33.02] [00:30:34.10] Imagine you have a light bulb in your house and you want to control it; [00:30:38.04] [00:30:38.04] you want to use your cell phone to turn it on and off. One way is for [00:30:43.00] [00:30:43.00] the traffic to go through the cloud and back, but there are other ways. [00:30:46.06] [00:30:47.06] To save on bandwidth, for instance, some smart devices will listen on local ports, [00:30:52.21] [00:30:52.21] you know, opening up a port for, say, HTTP, [00:30:58.02] [00:30:58.02] and devices can be discovered through, say, UPnP, [00:31:01.18] [00:31:01.18] and you make a direct connection to your light bulb and control it [00:31:05.22] [00:31:07.10] without ever going to the cloud. That's one way. But we found that in many [00:31:13.07] [00:31:13.07] instances such devices also listen on other local ports, like, say, [00:31:18.11] [00:31:18.11] SSH, which were never used in any communications we found. And [00:31:24.01] [00:31:24.01] why do we bother about SSH? Well, some devices are known to have [00:31:29.13] [00:31:29.13] the SSH port open, protected by weak passwords, like root and password, [00:31:34.01] [00:31:35.16] and an attacker can potentially get shell access to such smart devices. [00:31:40.04] [00:31:41.07] That's just one example of the kind of ports being open that are not used; [00:31:45.22] [00:31:45.22] by not used I mean we did not observe any communication. [00:31:48.18] [00:31:49.23] In general, we found these ports open on the local network by conducting a SYN scan. [00:31:54.23] [00:31:54.23] Some ports, like HTTP, are for device control, et cetera, and [00:31:59.19] [00:31:59.19] here I show the number of devices in the dataset with these ports open.
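The local-port check can be sketched in a few lines of Python. Note the swap: the talk describes a SYN scan, which needs raw sockets and elevated privileges, so this sketch uses a full TCP connect instead. The function name `open_tcp_ports`, the timeout, and the port list in the usage note are all illustrative assumptions.

```python
import socket


def open_tcp_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` on `host` that accept a TCP connection."""
    found = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                found.append(port)
    return found
```

For example, `open_tcp_ports("192.168.1.23", [22, 23, 80, 445])` would probe SSH, telnet, HTTP, and SMB on a hypothetical device address. Only scan devices you own or are authorized to test.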
[00:32:03.22] [00:32:05.14] Of these open ports, some are never used. For [00:32:09.04] [00:32:09.04] example, SSH: it is open on a percentage of devices, but [00:32:13.14] [00:32:13.14] we do not see any traffic to the SSH port in 100 percent of cases. [00:32:18.14] [00:32:18.14] And we want to highlight 3 examples: SSH, telnet, and SMB. [00:32:22.01] [00:32:23.04] SMB is the Windows protocol for file sharing. [00:32:26.14] [00:32:27.23] These protocols are associated with potential security [00:32:31.10] [00:32:31.10] vulnerabilities, and the fact that these ports are open but [00:32:35.11] [00:32:35.11] barely used suggests that, you know, [00:32:37.21] [00:32:37.21] maybe it's not a good idea to open these ports in the first place, and these devices [00:32:42.01] [00:32:43.05] are prone to such attacks. So [00:32:48.09] [00:32:48.09] that's local ports. There's a comment: [00:32:51.08] [00:32:53.02] "Run those plug-ins on a browser, you'll be shocked [00:32:56.10] [00:32:56.10] at the collusion between websites and ad trackers." [00:33:00.10] [00:33:01.15] Gregory, that's a good point. Basically, in the web world, [00:33:05.06] [00:33:06.20] websites talk to trackers. [00:33:08.15] [00:33:10.01] That's also the case for smart TVs, which I'll talk about in a little bit. But [00:33:14.11] [00:33:14.11] let's go back to the IoT Inspector dataset. These are 3 examples [00:33:18.22] [00:33:18.22] of insights you can gather from our dataset, and [00:33:22.21] [00:33:22.21] in general there's a lot more we can do. The dataset opens up, [00:33:26.21] [00:33:26.21] you know, this dataset opens up new opportunities for [00:33:30.21] [00:33:30.21] future research in different areas. Let's start with security and privacy, for instance.
[00:33:36.19] [00:33:38.05] A number of collaborators are using it. For instance, we're building a firewall; [00:33:42.22] [00:33:42.22] we want to see if there are ways to build rules for IoT [00:33:47.09] [00:33:47.09] devices so we can block certain potentially malicious connections. [00:33:51.23] [00:33:51.23] That's one example use case we're building, and because we have such a large view [00:33:55.10] [00:33:56.14] of how the devices behave on the network, we can potentially use, [00:34:02.04] [00:34:02.04] well, we are actually using, the dataset to build a prototype of the firewall. [00:34:07.04] [00:34:08.08] The next is home network measurement and protection; that's in collaboration with [00:34:11.12] [00:34:11.12] folks at Stanford University, actually, [00:34:13.09] [00:34:14.23] for those who know this. [00:34:16.02] [00:34:17.13] The motivation here is that we're increasingly working from home, and [00:34:22.07] [00:34:22.07] our homes are getting smarter. [00:34:24.02] [00:34:24.02] A few years ago we only had just laptops, phones, and [00:34:28.04] [00:34:28.04] tablets, but right now we have smart devices, and we have devices from the enterprise. [00:34:33.23] [00:34:33.23] But the home environment isn't traditionally built for [00:34:36.22] [00:34:36.22] security because, unlike your corporate environment, your [00:34:41.18] [00:34:41.18] home is basically an open area for potential abuse. [00:34:46.08] [00:34:46.08] You don't run a Cisco router in your home to, say, block certain traffic; you [00:34:50.21] [00:34:50.21] typically have a consumer router that basically doesn't do anything. That's number one. Number two, [00:34:55.02] [00:34:56.07] as I showed in a previous slide, many smart devices have ports open, [00:35:00.05] [00:35:01.19] like telnet ports open; potentially a malicious piece of software can [00:35:06.15] [00:35:06.15] 
communicate between your computer and another smart device, and [00:35:12.05] [00:35:12.05] we might observe lateral movement of malware from your computer [00:35:16.19] [00:35:16.19] to smart devices, and your smart device may be compromised. And also, one last piece, for [00:35:22.02] [00:35:22.02] privacy: many devices have open ports; maybe they're used for [00:35:27.22] [00:35:27.22] an important reason, that's usability. There was recent news [00:35:32.23] [00:35:32.23] a few days ago that shows how your Alexa can [00:35:37.01] [00:35:37.01] automatically discover new HP devices and [00:35:39.17] [00:35:39.17] make connections between devices such as HP printers and [00:35:44.03] [00:35:44.03] Alexa, so that you can use your voice to control the printer. The fact that [00:35:50.01] [00:35:50.01] these smart devices have open ports that welcome [00:35:55.01] [00:35:55.01] communication just tells you it's an IoT world; these devices want [00:35:58.20] [00:35:58.20] to talk to each other. But at the same time, there are opportunities [00:36:04.23] [00:36:04.23] for privacy leaks. [00:36:08.15] [00:36:10.04] A web page, for [00:36:11.03] [00:36:11.03] instance, can potentially use JavaScript to scan a local network and [00:36:15.14] [00:36:17.02] do cross-origin requests on the smart devices, and [00:36:20.02] [00:36:20.02] control these devices or extract
[00:36:21.16] [00:36:22.17] information from these devices, as we have shown in a previous paper. So how secure [00:36:27.06] [00:36:27.06] and private home networks actually are, we don't know, and [00:36:32.21] [00:36:32.21] we're planning to use our dataset to find out the answer. And [00:36:36.07] [00:36:36.07] finally, we're working with folks from CMU to understand how users [00:36:41.05] [00:36:41.05] perceive threats and what kind of devices they are buying, by demographics. So [00:36:46.01] [00:36:46.01] these are the kinds of research in security and privacy. Also, we see the IoT Inspector [00:36:51.19] [00:36:51.19] dataset as a good tool for machine learning and networking. Here's what I mean. [00:36:55.22] [00:36:57.10] If you know ImageNet, ImageNet is a large collection of images for [00:37:02.10] [00:37:02.10] vision training. For networking, there's no such dataset [00:37:09.13] [00:37:09.13] that allows you to look at network traffic and tell us, you know, what
[00:37:14.17] [00:37:14.17] device generates this traffic, or whether this traffic is normal or not. And [00:37:20.10] [00:37:20.10] we want to build this dataset: the same way ImageNet provides a large [00:37:23.21] [00:37:23.21] labeled dataset for vision researchers, we want a labeled dataset for networking [00:37:29.02] [00:37:29.02] researchers. One area is actually called device [00:37:33.14] [00:37:33.14] identification, which I'm starting to work on with [00:37:36.14] [00:37:36.14] folks at Microsoft, where we're trying to identify, given some traffic, [00:37:40.20] [00:37:40.20] can we tell what device this is? It's a simple question. [00:37:46.04] [00:37:46.04] There's actually a lot of research in this field, but the work I've seen so [00:37:49.12] [00:37:49.12] far is in a closed environment, where the researchers buy [00:37:54.07] [00:37:54.07] dozens of devices, do training and cross-validation on these [00:37:58.11] [00:37:58.11] dozens of devices in the lab, and that's it. We want this to be an open-world problem, [00:38:02.16] [00:38:02.16] where, given a new device, can we tell you what its identity is? [00:38:06.13] [00:38:07.17] This I think is important because it potentially allows network administrators to know [00:38:11.11] [00:38:11.11] what devices get plugged into a network and assign appropriate policies. Imagine [00:38:15.21] [00:38:15.21] someone plugging an unknown camera into the network; as [00:38:21.05] [00:38:21.05] a sysadmin you want to block this device or isolate this device so [00:38:25.08] [00:38:25.08] that it doesn't start attacking all your other devices on the network. [00:38:29.14] [00:38:30.15] That's device identification, and so that's one kind of label in the dataset.
[00:38:33.21] [00:38:35.01] The other kinds of labels include activity labels. For example, [00:38:39.20] [00:38:39.20] when you open a fridge door, what kind of traffic do you generate? Or [00:38:43.06] [00:38:43.06] when you turn on a light bulb, what kind of traffic are you going to generate? These labels [00:38:47.14] [00:38:47.14] are useful in helping researchers determine if there are, say, anomalies. [00:38:52.03] [00:38:53.13] These are examples of research in machine learning. [00:38:56.09] [00:38:56.09] And health: I'm actually working with researchers from the [00:38:58.23] [00:38:58.23] Department of Preventive Medicine at Northwestern University, where we try to [00:39:03.21] [00:39:03.21] infer the wellbeing of users. It turns out that [00:39:07.01] [00:39:08.08] traffic in some cases encodes human activities, like sleeping or not, or [00:39:13.08] [00:39:13.08] even sometimes overeating or not, and we're trying to use network traffic [00:39:18.21] [00:39:18.21] as a proxy for human wellbeing. And there are other use cases, like [00:39:24.06] [00:39:24.06] building a testbed of IoT devices so that students can learn IoT [00:39:29.13] [00:39:30.19] security and privacy without actually touching a device physically. And we're also [00:39:35.18] [00:39:35.18] continuing software development of IoT Inspector with the community, so [00:39:40.03] [00:39:40.03] that hopefully we can go beyond 5000 users and gain a lot more users. [00:39:45.12] [00:39:49.02] I'm going to take a quick water break here in case there are any questions. [00:39:53.10] [00:40:00.13] One limitation of IoT Inspector is that it doesn't capture payload. [00:40:04.08] [00:40:05.11] You know, you've seen this slide before, where the Roku TV
[00:40:09.06] [00:40:09.06] is talking to a number of advertising and tracking companies. The problem is that, [00:40:13.18] [00:40:13.18] based on how IoT Inspector is built, from a user's and a researcher's perspective you [00:40:18.17] [00:40:18.17] don't know what data is being sent; you don't know what smart TV apps [00:40:23.05] [00:40:23.05] send it, because as researchers all we have is just the headers. [00:40:27.21] [00:40:29.12] You can't answer these questions. So to answer these questions, we have to actually [00:40:34.02] [00:40:34.02] take these devices, suspicious devices you see in the IoT [00:40:38.14] [00:40:38.14] Inspector dataset, and bring them to the lab for analysis. [00:40:44.12] [00:40:44.12] That brings me to the second half of this talk, on analyzing smart TVs. [00:40:48.21] [00:40:50.11] Here's a typical scenario. Say, for [00:40:54.11] [00:40:54.11] instance, we have a computer running tcpdump, [00:40:56.09] [00:40:58.16] and you have a TV like Roku. In my case I want to study [00:41:03.19] [00:41:03.19] how apps interact with third-party advertising and tracking companies. So [00:41:08.20] [00:41:08.20] that's a Roku; I open the Asian movie app on Roku. [00:41:15.18] [00:41:15.18] I use my remote control to interact with it; at the same time I look at the network [00:41:19.17] [00:41:19.17] traffic to see how the app is behaving in terms of network traffic. It so [00:41:25.05] [00:41:25.05] happens that one of the third-party advertising trackers it talks to is SpotXchange, [00:41:29.21] [00:41:29.21] a pretty big advertising company also in the web environment, and [00:41:35.12] [00:41:35.12] it turns out that some of the traffic is unencrypted. I look at the traffic and [00:41:40.01] [00:41:40.01] I find out that the name of the movie I'm watching [00:41:45.03] [00:41:45.03] is part of the plaintext traffic sent
from my Roku TV [00:41:50.04] [00:41:50.04] to some third-party advertising and tracking service. A little creepy. [00:41:54.09] [00:41:56.05] But this is just for one app on one TV. [00:41:59.05] [00:42:00.08] How do we analyze the traffic of thousands of Roku apps at scale, or [00:42:05.17] [00:42:05.17] thousands of Amazon Fire TV apps at scale? And [00:42:10.13] [00:42:10.13] in this case we happened to have just cleartext traffic. What about traffic [00:42:13.23] [00:42:13.23] that is encrypted? How do we decrypt the traffic? To answer these [00:42:19.01] [00:42:19.01] two questions, we want to build a smart TV crawler that achieves two goals. One, [00:42:24.00] [00:42:24.00] it should behave like us humans: it should be able to install and [00:42:28.19] [00:42:28.19] launch apps using a remote control and play videos. And [00:42:32.16] [00:42:32.16] it should also be able to decrypt traffic and identify any private [00:42:36.19] [00:42:36.19] information being sent to some third-party advertising and tracking services. [00:42:40.13] [00:42:41.14] In achieving these goals there are some complications, and I'm going to [00:42:46.09] [00:42:46.09] address the challenges with respect to web and mobile research. So [00:42:51.11] [00:42:51.11] to interact with web pages and mobile apps, there are tools with known feedback. [00:42:55.18] [00:42:57.05] For instance, if you want to interact with elements on web pages, you use [00:43:00.03] [00:43:00.03] Selenium; if you want to interact with Android apps, you use Android Debug Bridge. [00:43:05.09] [00:43:05.09] These are popular tools; they provide you with known feedback. If you [00:43:09.04] [00:43:09.04] want to click a button on an HTML page, you know whether this click is [00:43:13.11] [00:43:13.11] successful or not; you know what's been rendered on screen. But for [00:43:18.07] [00:43:18.07] smart TV
[00:43:18.07] [00:43:18.07] research, things are a little bit different. There are no known tools, and [00:43:23.06] [00:43:23.06] platforms like Roku and Amazon provide a limited API; [00:43:27.23] [00:43:27.23] all they offer is control of the remote control, which basically allows you [00:43:32.05] [00:43:32.05] to move the cursor left, right, up, down, click OK, click the back button, and that's it. [00:43:38.13] [00:43:39.16] These APIs tend to provide limited feedback: if you click the OK [00:43:43.21] [00:43:43.21] button using this API, you can't know whether the TV [00:43:47.18] [00:43:47.18] receives it or what's been drawn on the TV screen. That's a challenge. [00:43:52.11] [00:43:54.01] The other challenge is actually analyzing privacy. In the web and [00:43:58.05] [00:43:58.05] mobile cases, if you want to decrypt the traffic, sure, [00:44:01.03] [00:44:01.03] just change the root cert and there you go, you can man-in-the-middle the traffic. [00:44:05.09] [00:44:05.09] For smart TVs, it's much harder to change the root cert. Sometimes it's [00:44:10.07] [00:44:10.07] easy; sometimes it's virtually impossible, because we're talking about proprietary [00:44:13.19] [00:44:14.22] platforms here.
[00:44:15.21] [00:44:17.02] So to address these challenges, we built a general-purpose [00:44:21.18] [00:44:21.18] smart TV crawler. We used this tool to conduct the first-ever large-scale [00:44:26.13] [00:44:26.13] analysis of smart TV apps on Roku and Amazon, which actually have the biggest [00:44:31.02] [00:44:31.02] market share in the United States in terms of smart TVs. And we find [00:44:35.12] [00:44:35.12] private information being shared with third-party advertising and tracking companies, and [00:44:39.23] [00:44:39.23] we also identify tracking of children without parental consent, [00:44:44.15] [00:44:44.15] in potential violation of the Children's Online Privacy Protection Act, or COPPA. [00:44:49.15] [00:44:49.15] I actually gave this talk to folks from the Federal Trade Commission, and [00:44:54.00] [00:44:54.00] hopefully they are looking at the issue. These are the contributions. [00:44:58.15] [00:45:00.22] So how does the smart TV crawler work? How does it emulate human-TV [00:45:05.12] [00:45:05.12] interactions in the first place? Well, we built the system in such a way that [00:45:10.14] [00:45:10.14] we have a crawler running on a computer, and we have a smart TV;
[00:45:15.14] [00:45:15.14] it could be Roku, it could be Amazon, it could be Samsung. It's general purpose. And [00:45:20.10] [00:45:20.10] the computer here, just like in the typical scenario, creates [00:45:25.08] [00:45:25.08] a Wi-Fi hotspot and intercepts the traffic. The computer also [00:45:30.11] [00:45:30.11] uses the platform-specific remote control commands to emulate the real [00:45:35.18] [00:45:35.18] remote control, basically the left, up, right, OK buttons. So [00:45:40.10] [00:45:40.10] this orange arrow here [00:45:44.12] [00:45:44.12] is one-directional; it goes down, because we can issue these button [00:45:49.09] [00:45:49.09] presses but don't get feedback. We don't know what's being drawn on the screen; [00:45:53.18] [00:45:53.18] we don't know if the OK button actually succeeds. [00:45:58.05] [00:45:59.14] Given this particular problem, we have to get feedback from the TV. So what we do is [00:46:03.18] [00:46:03.18] we pipe the TV's HDMI output to an HDMI capture card, [00:46:08.11] [00:46:08.11] which costs about 300 bucks on Amazon, and this card pipes the audio and [00:46:13.07] [00:46:13.07] video back to us via USB, so that to the computer the TV looks like a webcam. [00:46:17.19] [00:46:19.10] So in that way we're able to emulate the remote control and [00:46:24.20] [00:46:24.20] interact with the TV, and start playing [00:46:27.14] [00:46:27.14] videos. There are a lot more technical issues; for [00:46:31.02] [00:46:31.02] the sake of time I'm going to just skip them and show you a [00:46:34.17] [00:46:35.17] live video of a crawler interacting with a single Roku app. So on Roku's home screen, [00:46:41.21] [00:46:41.21] our crawler automatically opens an app.
[00:46:47.00] [00:46:48.04] It goes to the main content, goes through the menus, and plays a video. [00:46:51.23] [00:46:53.04] There are a lot more challenges, like what buttons to press, [00:46:56.06] [00:46:56.06] how do we make sure that it's a video being played, but [00:46:58.17] [00:46:58.17] more of that after this talk; I can talk about this offline, or there's more [00:47:02.15] [00:47:02.15] in the paper. So now that we're able to emulate the interactions with [00:47:07.14] [00:47:07.14] TVs, the next question is how do you detect privacy leaks. [00:47:11.06] [00:47:12.08] So we have a system [00:47:13.06] [00:47:15.05] where we seed each account with personal information. So [00:47:19.14] [00:47:19.14] many platforms like Roku and Amazon require you to sign up for an account [00:47:24.00] [00:47:24.00] and everything. So we sign up; I happen to be [00:47:27.13] [00:47:27.13] "Macy Leaf 47", you know, that's my fake persona. [00:47:31.08] [00:47:33.07] It's never used anywhere. And to detect any private information being [00:47:37.07] [00:47:37.07] shared on the wire, I look for information like whether my email address or [00:47:42.05] [00:47:42.05] my ZIP code or my credit card number is shared in cleartext. If not, I look for [00:47:47.15] [00:47:47.15] hashes of the email address or credit card number being shared over cleartext. [00:47:52.22] [00:47:52.22] Why do we care about hashes? The third parties get hashes; [00:47:58.08] [00:47:58.08] the third parties don't really know what my real email address is. Why do we care? [00:48:01.21] [00:48:03.14] The problem with email hashing is that the third party will be able to [00:48:07.13] [00:48:07.13] obtain a somewhat persistent identifier about this person, [00:48:11.13] [00:48:12.20] you know, even with the hashed email address.
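The search for plaintext and hashed identifiers might look roughly like the Python sketch below. The talk does not say which hash functions were searched for; MD5, SHA-1, and SHA-256 are assumptions here, as are the names `identifier_variants` and `find_leaks` and the convention of lowercasing a value before hashing it.

```python
import hashlib


def identifier_variants(value):
    """Return the plaintext plus hex digests of a persona value (assumed hashes)."""
    raw = value.encode("utf-8")
    variants = {"plain": value}
    for name in ("md5", "sha1", "sha256"):
        variants[name] = hashlib.new(name, raw).hexdigest()
    return variants


def find_leaks(payload, identifiers):
    """List (label, encoding) pairs whose value or hash appears in the payload."""
    hits = []
    lowered = payload.lower()
    for label, value in identifiers.items():
        # Lowercase first so "A@B.com" and "a@b.com" map to the same needle.
        for encoding, needle in identifier_variants(value.lower()).items():
            if needle in lowered:
                hits.append((label, encoding))
    return hits
```

Because the same email address always produces the same digest, a match on any device running this persona points back to the same account, which is why a hash still works as an identifier.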
[00:48:16.16] [00:48:16.16] If there is a hash of this email address on the TV, [00:48:19.12] [00:48:19.12] and if this hash is again identified on, say, my smartphone [00:48:24.12] [00:48:24.12] using the same email address, they can link the activity on the TV with that on [00:48:28.23] [00:48:28.23] the smartphone, and they can do cross-device tracking. So that's why I look for [00:48:34.01] [00:48:34.01] my personal information shared either in cleartext or [00:48:38.16] [00:48:38.16] hashed, or sometimes hashes of hashes, throughout the network traffic. [00:48:43.13] [00:48:43.13] In cases where the traffic is encrypted, you want to man-in-the-middle the connection. But [00:48:48.03] [00:48:48.03] how do we decrypt the traffic? [00:48:50.00] [00:48:52.14] I'll break it down in terms of Amazon and Roku, because Amazon is a special case where we [00:48:56.17] [00:48:56.17] were able to root the device and change the root certificates, [00:49:01.12] [00:49:01.12] and for apps that actually do cert pinning, we used Frida to bypass the cert pinning [00:49:07.18] [00:49:07.18] on a per-app basis, and we were able to basically decrypt almost all traffic. [00:49:12.09] [00:49:14.00] This technique will presumably work on other Android-based smart TVs, but [00:49:19.04] [00:49:19.04] we're talking Android smart TVs. For [00:49:20.11] [00:49:20.11] other smart TVs, we don't have a generic solution, for say Roku or [00:49:24.22] [00:49:24.22] the other platforms that we can't root. In that case we first look for [00:49:30.16] [00:49:30.16] traffic that is unencrypted; in fact, about 47 percent of Roku apps have [00:49:35.21] [00:49:35.21] at least one port 80 connection that is unencrypted to ad and [00:49:40.06] [00:49:40.06] tracking services. For connections that are actually encrypted, we [00:49:45.12] [00:49:45.12] have this new technique, opportunistic man-in-the-middle. Here's how it works.
[00:49:50.08] [00:49:51.18] Let's say you have an app that talks to 3 different advertising servers, [00:49:57.14] [00:49:57.14] and many apps are written in such a way that they use some third-party [00:50:02.10] [00:50:02.10] encryption libraries or advertising libraries to talk to these ad servers. And [00:50:06.19] [00:50:06.19] it just so happens that the first two connections are based on one [00:50:11.02] [00:50:11.02] library that validates certificates, and the last one doesn't really validate certificates, [00:50:15.15] [00:50:15.15] maybe because of developer error in configuring the libraries, or because of how [00:50:20.14] [00:50:20.14] the libraries are built or specified. But for us, we don't know [00:50:25.22] [00:50:25.22] what connections validate certificates and what don't. [00:50:31.05] [00:50:31.05] If we just blindly man-in-the-middle, the first two connections will not go through, and [00:50:36.18] [00:50:36.18] many smart TV apps simply crash because they can't talk to the ad server. [00:50:40.06] [00:50:41.18] What we do here is this: we find [00:50:47.13] [00:50:47.13] the maximum subset of connections that we can [00:50:53.18] [00:50:53.18] man-in-the-middle without crashing the app. In that case we find that it's [00:50:59.22] [00:50:59.22] actually this particular connection: if you man-in-the-middle this particular connection, [00:51:03.14] [00:51:04.14] the app still runs. So in doing so we are able to decrypt [00:51:08.23] [00:51:10.17] at least one connection from 5 percent of the apps. [00:51:13.17] [00:51:16.21] Five percent; why do we care? [00:51:19.23] [00:51:22.16] Well, the 5 percent includes NBC, Fox News, and other apps [00:51:27.23] [00:51:27.23] that have a potentially huge audience base, and we were shocked that [00:51:33.11] [00:51:33.11] these popular apps use libraries that do not validate certificates.
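The subset search just described can be sketched as a greedy loop in Python. This is not the crawler's actual code: `app_survives` is a hypothetical callback that relaunches the app while intercepting the given hosts and reports whether it still runs, and the greedy strategy assumes each host fails or succeeds independently of the others.

```python
def max_interceptable(hosts, app_survives):
    """Greedily grow the set of hosts to intercept while the app keeps running.

    `app_survives(hosts_to_intercept)` is a hypothetical oracle: it should
    relaunch the app with interception enabled for exactly those hosts and
    return True if the app still works.
    """
    intercepted = set()
    for host in hosts:
        candidate = intercepted | {host}
        if app_survives(candidate):
            # Interception went unnoticed, so keep decrypting this host.
            intercepted = candidate
    return intercepted
```

In the three-server example from the talk, the two hosts reached through the validating library would be rejected and only the third would end up in the returned set.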
[00:51:38.08] [00:51:39.15] That shows two things. Number one, [00:51:43.16] [00:51:43.16] they potentially use some vulnerable code or libraries. [00:51:51.04] [00:51:53.03] Number two, because we are able to decrypt this traffic, we were able to find out what's [00:51:57.01] [00:51:57.01] being sent, what private data is being sent to ad and tracking services. [00:52:01.19] [00:52:01.19] Here are examples of the kinds of data being sent from Roku and [00:52:07.04] [00:52:07.04] the number of apps that do so, and the same for Amazon. And [00:52:11.03] [00:52:11.03] I'll highlight a few numbers here. One is the serial number. We care about the serial number [00:52:15.18] [00:52:15.18] because it is a number that doesn't change; it's tied to the hardware. [00:52:21.10] [00:52:21.10] Other IDs, like, say, the ad ID, can be reset, [00:52:25.03] [00:52:26.07] but serial numbers don't change, so trackers can know that [00:52:31.14] [00:52:31.14] it is you, it is this device, even after you do a factory reset or [00:52:35.12] [00:52:35.12] after you log in using a different email address. Amazon is [00:52:40.22] [00:52:40.22] similar. And we'll highlight a particular field, the Wi-Fi SSID. Basically what [00:52:45.23] [00:52:45.23] this slide says is that apps on Roku and Amazon share your Wi-Fi [00:52:50.16] [00:52:50.16] SSID with some third-party advertising and tracking services. Why do you care? [00:52:56.02] [00:52:56.02] Say my SSID is my network name, 347. Why do we care if [00:53:01.05] [00:53:01.05] some other companies know that my SSID is this network name, 347? [00:53:06.02] [00:53:08.18] Have you heard of the website called WiGLE?
[00:53:11.03] [00:53:12.04] It is an open-source dataset that basically translates Wi-Fi SSIDs to [00:53:16.03] [00:53:16.03] precise geolocations around the world. Google's geolocation API also [00:53:22.02] [00:53:22.02] allows you to translate Wi-Fi SSIDs into precise geolocation. [00:53:26.23] [00:53:28.09] In fact, we did this project at Princeton; we just entered our Wi-Fi SSID into [00:53:32.13] [00:53:33.23] Google's geolocation API, and Google's API [00:53:38.09] [00:53:38.09] was able to pinpoint us down to the correct building, Sherrerd Hall. [00:53:42.06] [00:53:43.12] So, to say the least, that leaks your location pretty badly. And the [00:53:48.13] [00:53:48.13] next question is: 2 percent of apps, why do we care? [00:53:52.04] [00:53:52.04] Well, this 2 percent includes CNN and [00:53:55.20] [00:53:55.20] the Discovery Channel; again, apps with a huge audience [00:54:00.19] [00:54:00.19] base, apps that are surely viewed by hundreds of thousands [00:54:05.14] [00:54:05.14] of users, we don't know. In all these cases, users' [00:54:10.02] [00:54:11.10] Wi-Fi SSIDs are leaked to some third-party ad and tracking services, who not [00:54:15.23] [00:54:15.23] only have the IP address of the users, which already reveals geolocation, but [00:54:19.21] [00:54:19.21] also potentially the exact location. So that's pretty bad. [00:54:23.14] [00:54:28.23] And I'll take a quick water break in case anybody has questions. [00:54:32.15] [00:54:32.15] [00:54:34.14] [00:54:40.21] Back. [00:54:41.09] [00:54:46.14] With all these privacy issues, you may ask, what can you do about it? Some smart TV
[00:54:51.19] [00:54:51.19] sets, like Roku: in the menu there is an option in the settings to turn off ad tracking, or [00:54:56.13] [00:54:56.13] limit ad tracking. In the case of Amazon, [00:55:00.15] [00:55:00.15] maybe down a menu there's an option to turn off interest-based advertising. [00:55:04.14] [00:55:06.22] Is that the end of the story? Is it all good? [00:55:10.21] [00:55:10.21] Let's say, you know, let's say we turn off tracking, and [00:55:14.23] [00:55:14.23] we're back to the same, you know, this is what we showed earlier in a previous slide. What [00:55:19.16] [00:55:19.16] happens if you disable ad tracking? Which of these numbers will become 0? So [00:55:24.20] [00:55:24.20] let's do a quick poll here, poll A, B, C. Let's look at the poll. [00:55:31.00] [00:55:31.00] You know, select A if you think that if you disable ad tracking, [00:55:35.22] [00:55:37.09] 0 of the apps, 0 percent of the apps on Roku, will send out the ad ID. [00:55:41.08] [00:55:42.11] Select B if you think that the same thing happens with the serial number, and so on [00:55:47.22] [00:55:47.22] and so forth. OK, a new poll is live. [00:55:53.18] [00:55:56.04] Let's [00:55:56.16] [00:55:58.05] see who answers. Here, countdown. [00:56:01.15] [00:56:02.23] Ten seconds for folks to take the poll, and [00:56:07.06] [00:56:07.06] nine, eight, [00:56:10.22] [00:56:12.17] six, five, four, three, two, [00:56:18.11] [00:56:18.11] one, stop. OK, the majority of folks think that [00:56:25.23] [00:56:25.23] if you turn off ad tracking, 0 percent of the apps on Roku will send out the ad ID, [00:56:31.23] [00:56:33.07] followed equally by answers B, C, and D. [00:56:35.21] [00:56:41.05] Reveal time. [00:56:41.21] [00:56:43.03] If you turn off ad tracking, only this number becomes zero. All the rest remain. [00:56:48.20] [00:56:50.04] They're, like, the same. You know, there are some ups and downs, but [00:56:54.10] [00:56:54.10] the ad ID becomes 0. So the implication is that if you turn off ad tracking,
[00:57:00.07] [00:57:01.18] your Roku TV will not send the ad ID, perhaps, but [00:57:06.08] [00:57:06.08] it will still send other information to advertising and tracking companies. [00:57:11.20] [00:57:11.20] Whether these advertising and tracking companies actually use the data [00:57:14.12] [00:57:14.12] being sent, we don't know, but at least we know that apps are still sending this information. [00:57:18.20] [00:57:23.21] In summary, we built a general-purpose crawler. We conducted one of the first [00:57:28.15] [00:57:28.15] large-scale analyses of the apps on Roku and Amazon, and the [00:57:34.07] [00:57:34.07] same system is actually being used to crawl Samsung as well. We found private data [00:57:39.10] [00:57:39.10] being shared with third-party advertising and tracking companies, and platform policy didn't allow this. On top of [00:57:43.20] [00:57:43.20] that, we found tracking of children without consent: basically, children-oriented [00:57:48.15] [00:57:48.15] apps that are collecting such information without parental consent, which, I'd say, [00:57:53.05] [00:57:53.05] is a violation of COPPA, the Children's Online Privacy [00:57:56.04] [00:57:56.04] Protection Act. And I gave a talk at [00:57:58.18] [00:57:58.18] the Federal Trade Commission about this issue, and hopefully they're working on this. [00:58:01.17] [00:58:03.01] And with that, the last slide basically summarizes what you can do [00:58:07.19] [00:58:07.19] with the IoT Inspector dataset, how to use Inspector as a platform, and [00:58:12.08] [00:58:12.08] what we learned from our smart TVs. And I'm happy to take any questions. [00:58:18.01] [00:58:18.01] Thank you for the questions and thank you for doing the polls. [00:58:20.18] [00:58:25.17] Gregory asks if your papers are linked on my website. Yes, the answer's yes. I'm typing [00:58:30.17] [00:58:30.17] the website address, let's see. Now, here. [00:58:35.02] [00:58:39.12] I'm actually just curious, since you've done so much work in the IoT
[00:58:43.04] [00:58:43.04] security environment, if you want to talk about what you think [00:58:46.17] [00:58:48.01] we should be doing as a community, kind of moving forward. You talked about working [00:58:51.01] [00:58:51.01] with the FTC, or at least talking with the FTC, be that as it may. But [00:58:55.09] [00:58:55.09] what are the technical, sociotechnical, kind of, big directions that you think [00:59:01.02] [00:59:01.02] we can be going in? My biggest interest is actually [00:59:06.04] [00:59:06.04] transparency. Bringing transparency, as in, you know, showing the world, whether [00:59:11.02] [00:59:11.02] it's policymakers or consumers or companies, that, you know, I know what you [00:59:15.18] [00:59:15.18] guys are doing. And, you know, in fact, that's why I want to keep [00:59:21.11] [00:59:22.19] democratizing, you know, [00:59:24.03] [00:59:25.16] the data, using IoT Inspector. You know, consumers should be able to find [00:59:30.01] [00:59:30.01] out more about their home network, their smart devices, and, even better, [00:59:33.13] [00:59:33.13] researchers can actually conduct more measurements of home networks and smart [00:59:37.15] [00:59:37.15] devices using Inspector. So transparency, I think, is one of the key points [00:59:42.22] [00:59:42.22] I'm working on. You know, with transparency, that brings us empirical evidence, and [00:59:47.01] [00:59:47.01] with evidence, then we can talk about actions, you know, whether it [00:59:51.18] [00:59:51.18] be technical actions or actions through policymakers. [00:59:55.00] [00:59:57.13] That's an excellent answer. Yeah, definitely. I think there was a question, too. [01:00:02.14] [01:00:04.02] How about the other question: [01:00:07.07] [01:00:08.23] is there a blog that tracks this research? There's a website, so [01:00:13.03] [01:00:13.03] on this website you can actually see links to papers as well as blog write-ups or [01:00:18.21] [01:00:18.21] even the newspaper reports.
[01:00:20.11] [01:00:24.02] Gregory asks: if I install IoT Inspector, how do you opt in, and [01:00:27.06] [01:00:27.06] how is the data shared with you? OK, here's how it works. If you install IoT Inspector, [01:00:32.04] [01:00:33.16] it is going to ask you for root permission, because we need to do ARP spoofing. And [01:00:39.17] [01:00:39.17] at run time, you know, you can see the source code, at run time it generates a [01:00:45.11] [01:00:45.11] random UUID, a unique identifier, that is sent back to us, and [01:00:50.18] [01:00:50.18] that's the only persistent identifier we have, one [01:00:55.06] [01:00:55.06] that is generated at run time at random when you first install. [01:00:58.23] [01:01:00.03] If you restart the software, it would be the same random identifier. [01:01:03.21] [01:01:03.21] If you, you know, remove the software and reinstall, it will be a different identifier. So [01:01:08.03] [01:01:08.03] that's how we're able to, you know, tie different traffic to the same user. But [01:01:12.07] [01:01:12.07] we don't know your IP address, we don't know anything else about you, and [01:01:17.11] [01:01:17.11] we collect the headers only, except for a few cases, except for [01:01:21.12] [01:01:21.12] the Client Hello of TLS, and DNS. That enables us to do some [01:01:26.15] [01:01:26.15] analysis of TLS security, and we are able to infer [01:01:29.20] [01:01:29.20] what domain names are actually used in their communication. [01:01:32.13] [01:01:34.18] But we take data privacy very, very seriously. [01:01:38.09] [01:01:38.09] The dataset can only be accessed by researchers right now [01:01:42.13] [01:01:42.13] that are on the IRB protocol. It is not open access for now. [01:01:45.23] [01:01:49.02] But that's a. [01:01:49.14] [01:01:50.19] Many good questions. Gregory, just jumping through, was the one. [01:01:54.09] [01:01:57.05] Let's see, Peter asks: [01:02:00.02] [01:02:02.12] why did you choose to study smart TVs out of all the smart home devices?
[01:02:06.11] [01:02:08.03] Well, in the second part of the talk we studied smart TVs because [01:02:13.10] [01:02:13.10] we looked at the IoT Inspector dataset. [01:02:14.23] [01:02:16.14] Among the devices that are most likely to talk to some third-party advertising and tracking [01:02:21.16] [01:02:21.16] services, it turns out that smart TVs just jump out of the dataset here. Though [01:02:29.08] [01:02:29.08] it doesn't mean that other devices don't track you. Here's the reason [01:02:33.20] [01:02:35.12] why. When I say advertising and tracking companies, I'm using a list called the Disconnect [01:02:40.01] [01:02:40.01] blacklist. This blacklist curates [01:02:45.01] [01:02:45.01] advertising and tracking companies that are seen on the web, [01:02:50.06] [01:02:51.19] and it turns out that many of these companies also do smart TV tracking as well. [01:02:55.00] [01:02:56.02] But this list does not document any tracking companies that are IoT [01:03:00.22] [01:03:00.22] only. I don't know of, I'm not aware of, any such list, and [01:03:06.02] [01:03:06.02] because we don't have such a list, our analysis for tracking is actually [01:03:09.18] [01:03:09.18] biased toward devices that talk to the companies in the Disconnect list. [01:03:15.03] [01:03:15.03] That being said, we have some preliminary analysis of potential tracking [01:03:20.19] [01:03:20.19] on the other smart devices that are not smart TVs. Examples include [01:03:25.15] [01:03:27.01] Tuya, a Chinese company that [01:03:29.15] [01:03:30.19] maintains a large [01:03:32.21] [01:03:33.23] control platform. So when your phone wants to turn on a light bulb or [01:03:38.03] [01:03:38.03] a smart plug, the traffic typically, you know, goes through the cloud, [01:03:43.19] [01:03:43.19] through what's known as, you know, sometimes confuse he service. And there are [01:03:48.18] [01:03:50.04] specific cloud providers for IoT,
[01:03:53.03] [01:03:53.03] and one of these is actually called Tuya. So [01:03:55.23] [01:03:57.18] they just handle the traffic, the control traffic, for [01:04:00.05] [01:04:00.05] bulbs, plugs, cameras, many kinds of devices, and [01:04:04.19] [01:04:04.19] potentially they have, you know, information on who the users are, their IP addresses, [01:04:10.22] [01:04:10.22] as well as the habits of users: when they turn on the TV, screen times, or when they [01:04:16.03] [01:04:16.03] turn on lights or turn off light bulbs. Such life habits can potentially be used for, [01:04:20.14] [01:04:20.14] you know, behaviors that users don't have. This is not very well studied, but [01:04:25.09] [01:04:25.09] I think that's the next scary frontier. Let me check if there's a question. [01:04:30.10] [01:04:33.11] OK, cool, then I think we're 5 minutes after the hour. [01:04:36.21] [01:04:39.01] Because. [01:04:39.13] [01:04:41.23] Yes. So thank you so much for [01:04:43.18] [01:04:43.18] giving this presentation, Danny. It was really interesting, and, [01:04:45.21] [01:04:45.21] as I saw, a number of people thought that it was enlightening and [01:04:49.10] [01:04:49.10] kind of terrifying at the same time. Really interesting work. So thank you for [01:04:53.21] [01:04:53.21] giving us this presentation, and thank you everyone else for attending. [01:04:57.21] [01:04:59.12] But yeah, happy to have offline discussions. [01:05:02.19] [01:05:05.06] Great, thank you. [01:05:06.01]