ohohlfeld.com : blog

Spamalytics: Who goes for Spam?

November 2, 2008

Filed under: internet, papers, research. Posted by Oliver @ 5:48 pm

Direct marketing is not a new approach: its history dates back to the 19th century, when the first mail-order catalogues were distributed. Nowadays, unsolicited bulk e-mail annoys Internet users world-wide on a daily basis. While distributing mail-order catalogues involved real costs, the marginal cost of sending an e-mail is tiny. Therefore, e-mail based campaigns are profitable even when only a minuscule fraction of recipients goes for the advertised product. The bad news, as highlighted by Kanich et al., is that “a perverse byproduct of this dynamic is that sending as much spam as possible is likely to maximise profit”. To maximise the reach of their advertisements, spammers play a cat-and-mouse game with the developers of anti-spam software: they have to adapt to the latest filtering technologies in order to reach as many people as possible.

However, the continued presence of spam, despite years of energetic deployment of anti-spam technology, demonstrates that spam campaigns are profitable. So the natural question arises: who goes for spam?

This issue is addressed in a paper entitled Spamalytics: An Empirical Analysis of Spam Marketing Conversion, presented at the 15th ACM Conference on Computer and Communications Security on Tuesday, October 28.

Spam Conversion Pipeline (Image source)

The authors are interested in the conversion rate of spam, which is the probability that an unsolicited e-mail will ultimately elicit a sale. To measure it, they infiltrated ongoing spam campaigns sent via the Storm botnet and took measurements at the different stages of the spam conversion pipeline shown in the above figure. In order to understand their methodology, we need to briefly review the way Storm works.

Storm Botnet Architecture (Source: Kanich et al.)

Storm is a peer-to-peer botnet that propagates via spam. The above figure shows the three primary classes of Storm nodes involved in sending spam: worker bots, proxy bots and master servers. While the worker bots are responsible for actually sending the spam, proxy bots act as conduits between workers and master servers. When a user downloads and runs the Storm binary advertised in spam mails, the infected host becomes either a worker bot (if it is not reachable from the Internet, e.g. due to firewall restrictions) or a proxy bot. As the command and control traffic directed to the worker bots is unencrypted and always passes through a proxy bot, a man-in-the-middle attack is possible, and Kanich et al. carry one out in the paper: by rewriting the command and control traffic directed to worker bots, they could change spam templates, dictionaries and addresses to suit their needs.

Their methodology can be summarised as follows. They hosted a set of Storm proxy bots, created duplicates of the websites advertised in the spam and rewrote the command and control traffic so that the worker bots advertised their sites instead of the original ones. Thus, no user received more spam, but some users received spam that was less dangerous than it would otherwise have been.
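The rewriting step can be pictured with a toy example. The sketch below is not Storm's actual wire format, which this post does not describe: the dictionary-shaped update message, its `template` and `recipients` fields, the function name `rewrite_update` and both example URLs are assumptions invented purely for illustration. It only shows the idea of a proxy swapping the advertised link while leaving the recipient list untouched.

```python
from copy import deepcopy


def rewrite_update(update, url_map):
    """Return a copy of a (hypothetical) template update with advertised URLs swapped."""
    rewritten = deepcopy(update)  # never modify the message we were handed
    for original_url, replacement_url in url_map.items():
        rewritten["template"] = rewritten["template"].replace(
            original_url, replacement_url
        )
    return rewritten


# Invented stand-in for a message travelling from a master server to a worker bot.
original = {
    "template": "Great deals at http://pharmacy.example/ today!",
    "recipients": ["alice@example.org", "bob@example.org"],
}

# Map the advertised site to the researchers' duplicate (both URLs are made up).
url_map = {"http://pharmacy.example/": "http://measurement.example/"}

rewritten = rewrite_update(original, url_map)
print(rewritten["template"])
```

The recipient list is copied unchanged, which mirrors the point made above: the same people receive the same number of mails, only the advertised target differs.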

Over the course of their experiment, they rewrote the content of about 470 million spam mails sent in three campaigns: about 347 million in a pharmacy campaign, and about 83 million and 38 million in two Storm self-propagation campaigns themed around postcards and April Fool’s Day, respectively. They received 28 purchases on their fake page for the advertised pharmaceutical product and 541 infections with their fake Storm binary.

[Figure: geographic distribution of the purchases and infections]

This translates into the following conversion rates (caution: the results are not intended to be generalised to other contexts!):

  • 1 in 12,500,000 pharmacy spams led to a purchase.
  • 1 in 265,000 greeting card spams led to an infected machine.
  • 1 in 178,000 April Fool’s Day spams led to an infected machine.
  • 1 in 10 people visiting an infection website downloaded the executable and ran it.
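As a rough cross-check of these orders of magnitude, the rounded counts quoted in this post can simply be divided out. The sketch below is Python; the helper name `one_in` and the variable names are made up for illustration. Because the post gives only the combined infection count (541), the two self-propagation campaigns are pooled here rather than split, so the second figure is an aggregate and not one of the per-campaign rates listed above.

```python
def one_in(sent, conversions):
    """Return N such that, on average, one in N sent messages converted."""
    if conversions <= 0:
        raise ValueError("need at least one conversion to compute a rate")
    return sent / conversions


# Rounded counts as quoted in this post (not the exact figures from the paper).
pharmacy_sent = 347_000_000
pharmacy_purchases = 28

# Postcard and April Fool's Day campaigns pooled, since only the total is given.
self_propagation_sent = 83_000_000 + 38_000_000
infections = 541

pharmacy_one_in = one_in(pharmacy_sent, pharmacy_purchases)
infection_one_in = one_in(self_propagation_sent, infections)

print(f"pharmacy: 1 purchase per {pharmacy_one_in:,.0f} mails")
print(f"self-propagation (pooled): 1 infection per {infection_one_in:,.0f} mails")
```

With these rounded inputs the pharmacy figure comes out at roughly 12.4 million mails per purchase, in line with the 1 in 12,500,000 stated above.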

Much more information can be found in their paper (see below), such as the top-10 most targeted e-mail address domains, filtering statistics at each stage of the conversion pipeline, statistics on the effectiveness of the anti-spam methods deployed by typical free e-mail providers (e.g. Hotmail and Google Mail), the time-to-click distribution (the first users visited the advertised page just 10 seconds after the spam was sent) and the effects of blacklisting.
The paper is very well written and offers new insights into how spam works. Interested readers should therefore consider reading this piece of well-conducted research.

Source: C. Kanich, C. Kreibich, K. Levchenko, B. Enright, G. Voelker, V. Paxson, S. Savage. Spamalytics: An Empirical Analysis of Spam Marketing Conversion. 15th ACM Conference on Computer and Communications Security 2008, Alexandria, VA, USA. [Summary, PDF Paper, BibTeX]

Electricity over IP

March 6, 2008

Filed under: fun, internet, rfc. Posted by Oliver @ 10:26 pm

I discovered RFC 3251 today, which describes Electricity over IP as a parody of the numerous RFCs published by the IETF on the theme “X over Y”, where every “X over Y” is followed by “Y over X” sooner or later. For instance, we once had the usual setup in which IP packets are encapsulated in frames of Ethernet, a widely used link layer protocol (IP over Ethernet). Nowadays, Ethernet over IP is possible as well (RFC 3378: “EtherIP: Tunneling Ethernet Frames in IP Datagrams”), turning the ordinary protocol stack upside down by sending a layer 2 protocol over a layer 3 one.
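The “X over Y” naming can be made concrete with a toy model of encapsulation, in which each protocol data unit simply carries the next one as its payload. This is a minimal Python sketch under loud assumptions: the `Pdu` class and the `stack` and `over` helpers are hypothetical names invented here, and no real header formats are built or parsed.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Pdu:
    """A toy protocol data unit: a protocol name wrapping a payload."""
    protocol: str
    payload: Union["Pdu", bytes]


def stack(pdu):
    """List the protocol names from the outermost to the innermost layer."""
    names = []
    while isinstance(pdu, Pdu):
        names.append(pdu.protocol)
        pdu = pdu.payload
    return names


def over(pdu):
    """Describe the nesting in 'X over Y' form (innermost protocol first)."""
    return " over ".join(reversed(stack(pdu)))


# Classic setup: an IP packet carried inside an Ethernet frame.
ip_over_ethernet = Pdu("Ethernet", Pdu("IP", b"data"))

# EtherIP-style tunnelling: that whole Ethernet frame carried inside another IP packet.
etherip = Pdu("IP", ip_over_ethernet)

print(over(ip_over_ethernet))
print(over(etherip))
```

Wrapping the same structure once more is all it takes to turn the stack upside down, which is why these “Y over X” documents keep appearing.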

Once upon a time, MPLS was used to bypass the “time-intensive” routing process in IP backbone networks by using pre-defined switching paths to guide packets through the network (IP over MPLS). Since June 2007, we have had RFC 4817, which describes how to encapsulate MPLS in IP packets (MPLS over IP).

In the 90s, service providers were deploying ATM networks before these were superseded by the more cost-effective Ethernet technology about a decade later [1]. ATM was once famous partly due to its Quality of Service abilities. Following the general trend, the next step towards an upside-down protocol stack was to introduce the notion of pseudo-wires in order to carry ATM over IP.

The next step is to tunnel Synchronous Optical Networking (SONET) over MPLS (over IP over … ?), which is interesting because SONET is synchronous whereas IP is not, so jitter is easily introduced into the transmission.

By the way, the IETF is not the only one tunnelling the whole protocol stack in every possible direction when needed. Amateur radio operators also encapsulate TCP/IP in AX.25 (the layer 2 protocol used for wireless transmissions in amateur radio), using the 44.0.0.0/8 network (a former class A) to access their own WWW called HamWeb or to chat over IRC instead of using more appropriate solutions. So here we have IP over AX.25, which is just like IP over Ethernet, fine. However, when no radio is available, the radio network can be accessed by tunnelling AX.25 over IP (AXIP). So here we are sending a layer 2 protocol over a layer 3 one again.

As Spam over IP is already old-fashioned, we now have IP over Spam. Yes, online banking can now be encapsulated in Viagra! Note that this approach is new, as layer 3 packets are now encapsulated in layer 7 data! We no longer stick to tunnelling over lower layers but consider higher layers as well. The potential benefit of this encapsulation is that the great (fire)wall of China can be overcome: so much spam is sent and received there that it becomes a high-bandwidth, low-latency channel, as Dan Kaminsky highlighted in his talk.

So what’s next? Well, RFC 3251 gives the definitive answer: Electricity over IP, where IP packets carry electricity in discrete, digitised form. The document is based on the discovery that the distribution network for electricity is not an IP network. So will service providers have to extend their triple play offerings (classical Internet, (IP)TV, telephone) with Electricity over IP, taking the IETF into non-technical areas such as the distribution of electricity? Electricity could be routed “over the Internet to reach remote places which presently do not have electricity connections but have only Internet kiosks (e.g., rural India)”. Well, I suggest reading the document to find out about newly emerging technologies, such as:

  • MPLampS: Mostly Pointless Lamp Switching
  • LER: ‘Low-voltage Electricity Receptor – fancy name for “lamp”‘
  • VPN: Voltage Protected Network

So long,

Oliver — waiting for the first electricity trojan or worm and wondering whether houses are no longer in danger due to electricity thieves.

PS: When will we get tunnelling approaches that consider layer 8 (guess who’s controlling the applications …)? I could imagine IP over long- or short-term memory, implemented by memorising the packet. I wonder how Raptor codes (a state-of-the-art Forward Error Correction technique) would perform when coping with an error process called forgetting.

PPS: RFC 1149 introduced Internet Protocol over Avian Carriers (IPoAC), and RFC 2549 extended it with Quality of Service. Will there be Avian Carriers over IP soon?

[1] Disclaimer (just to prevent anyone from getting me wrong): Yes, I know how ADSL traffic is still carried (although there is a trend to migrate to Ethernet, as this widely deployed technology is cheaper than ATM). Remember that this post is devoted to RFC 3251, which is a joke, so some statements may be a bit “overgeneralised” here ;-) Moreover, I also know the benefits of tunnelling layer 2 protocols over IP; most of them are described in the article “Layer 2 over IP/MPLS” by Chris Metz (Cisco), published in the IEEE Internet Computing journal back in 2001. However, this post is not an article discussing the pros and cons of several technologies.

© 2001-2008 by Oliver Hohlfeld, M.Sc. | Imprint
