Spam or Ham? Characterizing and Detecting Fraudulent "Not Spam" Reports in Web Mail Systems

Ramachandran, Anirudh; Dasgupta, Anirban; Feamster, Nick; Weinberger, Kilian

dc.contributor.author	Ramachandran, Anirudh
dc.contributor.author	Dasgupta, Anirban
dc.contributor.author	Feamster, Nick
dc.contributor.author	Weinberger, Kilian
dc.contributor.corporatename	Georgia Institute of Technology. College of Computing
dc.contributor.corporatename	Georgia Institute of Technology. School of Computer Science
dc.contributor.corporatename	Washington University (Saint Louis, Mo.)
dc.contributor.corporatename	Yahoo! Research Labs
dc.date.accessioned	2011-04-19T13:54:40Z
dc.date.available	2011-04-19T13:54:40Z
dc.date.issued	2011
dc.description	Research area: Information Security and Cryptography
dc.description.abstract	Web mail providers rely on users to “vote” to quickly and collaboratively identify spam messages. Unfortunately, spammers have begun to use large collections of compromised accounts not only to send spam, but also to vote “not spam” on many spam emails in an attempt to thwart collaborative filtering. We call this practice a vote gaming attack. This attack confuses spam filters, since it causes spam messages to be mislabeled as legitimate; thus, spammer IP addresses can continue sending spam for longer. In this paper, we introduce the vote gaming attack and study the extent of these attacks in practice, using four months of email voting data from a large Web mail provider. We develop a model for vote gaming attacks, explain why existing detection mechanisms cannot detect them, and develop new, efficient detection methods. Our empirical analysis reveals that the bots that perform fraudulent voting differ from those that send spam. We use this insight to develop a clustering technique that identifies bots that engage in vote-gaming attacks. Our method detects tens of thousands of previously undetected fraudulent voters with only a 0.17% false positive rate, significantly outperforming existing clustering methods used to detect bots who send spam from compromised Web mail accounts.	en_US
dc.identifier.uri	http://hdl.handle.net/1853/38592
dc.language.iso	en_US	en_US
dc.publisher	Georgia Institute of Technology	en_US
dc.relation.ispartofseries	SCS Technical Report ; GT-CS-GT-11-06	en_US
dc.subject	Bots	en_US
dc.subject	Detection methods	en_US
dc.subject	Email voting	en_US
dc.subject	Filtering	en_US
dc.subject	Spam messages	en_US
dc.subject	Vote gaming attack	en_US
dc.subject	Web mail accounts	en_US
dc.title	Spam or Ham? Characterizing and Detecting Fraudulent "Not Spam" Reports in Web Mail Systems	en_US
dc.type	Text
dc.type.genre	Technical Report
dspace.entity.type	Publication
local.contributor.corporatename	College of Computing
local.contributor.corporatename	School of Computer Science
local.relation.ispartofseries	College of Computing Technical Report Series
local.relation.ispartofseries	School of Computer Science Technical Report Series
relation.isOrgUnitOfPublication	c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication	6b42174a-e0e1-40e3-a581-47bed0470a1e
relation.isSeriesOfPublication	35c9e8fc-dd67-4201-b1d5-016381ef65b8
relation.isSeriesOfPublication	26e8e5bc-dc81-469c-bd15-88e6f98f741d

Files

Original bundle

Name: GT-CS-11-06.pdf
Size: 597.58 KB
Format: Adobe Portable Document Format

Collections

Research Publications
