TY - CONF
T1 - Unwanted SMTP Paths and Relays
T2 - 2007 2nd International Conference on Communication Systems Software and Middleware
Y1 - 2007
A1 - Palla, Srikanth
A1 - Ram Dantu
KW - Computer science
KW - content-based spam filters
KW - Counterfeiting
KW - Credit cards
KW - email spam filters
KW - emails wantedness analysis
KW - end-to-end path analysis
KW - Filters
KW - IMAP message status flags
KW - information filtering
KW - information filters
KW - Information security
KW - Legislation
KW - Multimedia communication
KW - relay analysis
KW - Relays
KW - SMTP paths analysis
KW - unsolicited e-mail
KW - Unsolicited electronic mail
KW - Web page design
AB -

Based on the social interactions of an email user, incoming email traffic can be divided into categories such as telemarketing, opt-in, family members, and friends. Because they lack knowledge of these categories, most existing spam filters are prone to high false-positive and false-negative rates. Moreover, a majority of spammers obfuscate their email content in order to circumvent content-based spam filters. However, they do not have access to all the fields in the email header. Our classification method is based on the path traversed by an email (instead of content analysis), since we believe that spammers cannot forge all the fields in the email header. We base our classification on three kinds of header analysis: i) end-to-end path analysis, which tries to establish the legitimacy of the path taken by an email and classifies the email as either spam or non-spam; ii) relay analysis, which verifies the trustworthiness of the relays participating in the relaying of emails; and iii) email wantedness analysis, which measures the recipient's wantedness of the sender's emails. In the wantedness analysis, we use the IMAP message status flags (message has been read, deleted, answered, flagged, or draft) as implicit feedback from the user. Finally, we classify incoming emails as i) socially close (such as legitimate emails from family and friends), ii) socially distinct (emails from strangers), iii) spam (for example, emails from telemarketers and spammers), and iv) opt-in. Based on the relation between the spamminess of the path taken by spam emails and the unwantedness values of the spammers, we classify spammers as i) prospective spammers, ii) suspects, iii) recent spammers, and iv) serial spammers. Overall, our method resulted in far fewer false positives than current filters such as SpamAssassin. We achieved a precision of 98.65%, which is better than the precision achieved by SPF and DNSBL blacklists.

JF - 2007 2nd International Conference on Communication Systems Software and Middleware
ER -