ohohlfeld.com : blog

Believe Me: I am the Internet!

January 29, 2010

Filed under: internet — Tags: , , , — Oliver @ 12:50 am

The Internet is typically perceived as one atomic entity. In reality, however, it is composed of roughly 30,000 networks called Autonomous Systems. The glue of the Internet, which provides connectivity between them, is the Border Gateway Protocol (BGP). The protocol itself is old and, in its basic form, relies on trust. This trust can easily be abused, whether through fraud or through misconfiguration, causing parts of the Internet to become unreachable.

I’m currently preparing some classical BGP incidents for tomorrow’s Network Protocols and Architecture class. While I was looking for some of the classical BGP prefix hijacks that have been covered widely in the press, I found some nice presentations illustrating the incidents:

Prefix hijacks are a classic attack and are often exploited by spammers (see slide 17 of our presentation, partly based on Feamster’s Sigcomm paper). An extension of this technique can be used to eavesdrop on traffic by re-routing it. A non-trivial eavesdropping attack that requires trust from the attacker’s upstream provider was presented at DefCon 2008 (see the slides).

A solution can be found in Secure BGP. However, this approach is—like IPv6—not widely deployed.
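To see why a prefix hijack works at all, here is a minimal sketch of longest-prefix matching, the rule by which a more specific announcement attracts traffic. It is a toy model, not BGP itself: real route selection involves many more attributes. The function name `best_route`, the documentation prefix 203.0.113.0/24 and the documentation AS numbers 64500 and 64501 are assumptions chosen for illustration.

```python
import ipaddress

def best_route(routes, destination):
    """Return the (prefix, origin_as) entry with the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [(prefix, origin) for prefix, origin in routes if addr in prefix]
    if not matches:
        return None
    return max(matches, key=lambda entry: entry[0].prefixlen)

# Hypothetical routing table: a documentation prefix announced by its legitimate origin.
routes = [(ipaddress.ip_network("203.0.113.0/24"), 64500)]
before = best_route(routes, "203.0.113.10")

# A second AS announces a more specific half of the same address block.
routes.append((ipaddress.ip_network("203.0.113.0/25"), 64501))
after = best_route(routes, "203.0.113.10")       # now covered by the /25
untouched = best_route(routes, "203.0.113.200")  # outside the /25, still the /24

print(before[1], after[1], untouched[1])
```

In this toy table, the address inside the /25 is drawn to the second AS as soon as the more specific prefix appears, while the rest of the /24 still reaches the original origin.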

Further resources:

“Haste ma’n netblock?”

Find Crackz Using Whois

November 10, 2009

Filed under: fun, internet, teaching — Oliver @ 6:25 pm

One of my students reported a funny situation when he was using whois to solve the exercises in my group on Friday:

$ whois google.com

(…)

Server Name: GOOGLE.COM.SUCKS.FIND.CRACKZ.WITH.SEARCH.GULLI.COM
IP Address: 80.190.192.24
Registrar: EPAG DOMAINSERVICES GMBH
Whois Server: whois.enterprice.net
Referral URL: http://www.enterprice.net

Internet Measurement Seminar: Day I

February 25, 2009

Filed under: internet, papers, research, teaching — Tags: , , , , , , — Oliver @ 5:33 pm

Today was the first day of our two-day block seminar on Internet Measurement, in which I supervised two students. The students attending the seminar gave talks on the following topics (papers), each followed by a discussion:

  • Characterizing Files in the Modern Gnutella Network: A Measurement Study [Slides] [Student Paper] [Original Paper]
    Which files are shared on Gnutella and what are their characteristics? Besides studies that derived traces by hosting peers dedicated to provide measurement data, this paper describes data derived from crawls of the Gnutella network.
  • Rarest First and Choke Algorithms Are Enough [Slides] [Student Paper] [Original Paper]
    This paper discusses why BitTorrent performs well and states that the Rarest First algorithm and the Choke algorithm are enough to provide reasonable fairness, diversity of the content pieces and performance. Roughly speaking, those are the key features that differentiate BitTorrent from other peer-to-peer file sharing protocols.
  • Leveraging BitTorrent for End Host Measurements [Slides] [Student Paper] [Original Paper]
    How optimistic unchokes, which are provided by BitTorrent and essential for its functionality, can be exploited to perform end host measurements: a dedicated, modified BitTorrent client called BitProbes downloads two megabytes of data from peers, acting as a free rider that does not upload downloaded data, and uses this communication for conducting host measurements.
    Some points that were discussed: (1) The authors claim that downloading but not storing the data is enough to avoid legal issues. Is that really true? (2) During a sample 7-day crawl, the authors covered about 20% of the autonomous systems (AS) in the Internet. What does this number mean? Is it a high coverage or a low one? For the answer, one has to keep in mind that not all AS (enterprise networks, for example) are likely to host BitTorrent clients.
  • Unconstrained Endpoint Profiling (Googling the Internet) [Slides] [Student Paper] [Original Paper]
    How documents indexed by Google can be used to label IP addresses with the applications run by a particular host.
    The discussion mainly focused on the question of whether the proposed method is really unconstrained, as the title of the paper claims. Some key points: (1) The proposed method relies on Google, but the Google index varies (regional filtering etc.). (2) Existence of the deep web: not every available document is indexed by a particular search engine. (3) How dynamic are IP addresses? What if we want to label IPs of access providers, which usually map to the set of users that used them in the past? (4) Can we trust data provided by third parties (e.g. faked access log files)?
    We agreed that this methodology seems good for discovering trends, but details have to be taken with a pinch of salt.
  • I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System [Slides] [Student Paper] [Original Paper1 Paper 2]
    What kinds of videos are shared on YouTube and what are their access characteristics? See my blog post on this from October 2007.
  • The Flattening Internet Topology: Natural Evolution, Unsightly Barnacles or Contrived Collapse? [Slides] [Student Paper] [Original Paper]
    This paper analyses the trend of big content providers building their own WANs and bypassing Tier 1 providers to save transit costs and increase performance, which flattens the Internet topology.
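As a rough illustration of the Rarest First idea mentioned in the list above, the following sketch picks the missing piece that the fewest known peers hold. It is a simplified reading of the idea, not the code of any BitTorrent client; the function name `rarest_first`, the peer names and the piece sets are made up, and ties are broken by the lowest piece index only to keep the example deterministic.

```python
from collections import Counter

def rarest_first(my_pieces, peer_pieces):
    """Pick the missing piece held by the fewest peers (ties: lowest index)."""
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces)
    candidates = [p for p in availability if p not in my_pieces]
    if not candidates:
        return None  # nothing left that any peer could give us
    return min(candidates, key=lambda p: (availability[p], p))

# Hypothetical swarm: piece 0 is common, pieces 2 and 3 are each held by one peer.
peers = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
first_choice = rarest_first({0}, peers)
second_choice = rarest_first({0, 2}, peers)
print(first_choice, second_choice)
```

Requesting the rare pieces first spreads them to more peers, which is the diversity of content pieces the summary above refers to.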

For references to the original papers, the student papers (mostly in German) and slides, see the seminar webpage. The talks had a very high quality and the discussions were pretty interesting. So I’m really looking forward to day II.

ASN Resolution in Firefox

February 13, 2009

Filed under: internet — Tags: , , , , — Oliver @ 3:54 pm

Networks in the Internet are, roughly speaking, grouped into Autonomous Systems (AS), each of which has a corresponding Autonomous System Number (ASN) as identifier. Using the routing prefixes (CIDR), IP addresses can be mapped to autonomous systems to find out which organisation administers the network in which a requested service is hosted. Lookups can be done in several ways. A comfortable one is provided by a Firefox plugin called ASNumber: for each accessed website, AS-related information is displayed in the status bar of the browser.

Please note that the resolution is done using a service hosted by the authors of ASNumber: eu.asnumber.networx.ch (if not cached locally). Thus, privacy-related information about one’s browsing behaviour may be logged on an external site, so one may not want to keep the plugin enabled all the time. An example request illustrating the lookup is given by this link.
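One of the other lookup routes is DNS-based. The sketch below assumes the public IP-to-ASN mapping zone `origin.asn.cymru.com` run by Team Cymru, which is not mentioned in this post and has nothing to do with the ASNumber plugin. It only builds the query name and parses a TXT answer; it sends nothing over the network. The record layout (`ASN | prefix | country | registry | date`) is my understanding of that service and should be checked against its documentation; the helper names and the sample record are made up.

```python
import ipaddress

def cymru_query_name(ip):
    """Build the DNS name whose TXT record describes the origin AS of an IPv4 address."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + ".origin.asn.cymru.com"

def parse_origin_txt(txt):
    """Split an 'ASN | prefix | country | registry | date' record into a dict."""
    fields = [field.strip() for field in txt.strip('"').split("|")]
    keys = ("asn", "prefix", "country", "registry", "allocated")
    return dict(zip(keys, fields))

name = cymru_query_name("203.0.113.10")
# Made-up record for illustration only; this is not a real lookup result.
record = parse_origin_txt("64500 | 203.0.113.0/24 | ZZ | example | 2009-02-13")
print(name)
print(record["asn"], record["prefix"])
```

The same privacy caveat applies here: whoever answers the query (and the resolver in between) learns which address was looked up.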

Books on Probability Theory

February 12, 2009

Filed under: Books, internet, math, stochastic — Tags: , , , , , , — Oliver @ 10:56 am

I briefly want to point to some (online) books on probability theory. There are a few good ones available online, like the book by Robert Ash entitled Basic Probability Theory or some books by Robert Gray, e.g. Probability, Random Processes, and Ergodic Properties or Introduction to Statistical Signal Processing. However, when going for printed copies, one may look at books authored by Geoffrey Grimmett, especially Probability and Random Processes. The latter covers the level of probability theory needed in most research fields in computer science in a comprehensive and understandable way. A strong feature of this book is its large collection of exercises (there is even a second book dedicated to exercises only, which is more helpful to the learner as it also contains the solutions to each exercise posed in that book) and good examples illustrating the most important theorems.

Wireless Epidemiology

February 9, 2009

Filed under: internet, papers, research — Tags: , , , — Oliver @ 11:42 am

The debate about wireless security has, so far, focused on preventing people from getting unauthorised access to one’s wireless network. WEP has been shown to be breakable in less than 60 seconds, assuming a certain success probability, by Tews et al. in 2007. Unsecured wireless networks–and WEP can be considered as “unencrypted” due to the work by Tews et al.–are widely considered as a security problem, as unauthorised people may start misusing the network and cause additional costs or legal issues.

A paper by Hu et al. entitled WiFi Epidemiology: Can Your Neighbors’ Router Make Yours Sick? discusses a different issue. People know they might spread the flu virus through airborne infection when living or working closely with other people; the virus exploits this tightly interconnected proximity network. Many computer users have experienced something similar: once their system is infected by a virus or some other kind of malware, it may start contributing to spreading the infection even further. So flu-like epidemics can happen in the digital world as well.

The paper by Hu et al. addresses exactly this issue and transfers it to the wireless domain by asking whether wireless routers, which form a tightly interconnected proximity network in densely populated urban areas, can contribute to spreading malware and thus create “wireless epidemics”. Epidemiology is well understood in other fields. Transferring those results to the wireless domain and highlighting possible security flaws is thus very important, and the paper moves in an interesting direction.

The scenario considered in the paper relies on typical security flaws: i) weak or unchanged default passwords and ii) weak or broken cryptographic systems. Thus, the rate of possible infections can be reduced as follows:

  • Change the default password of your wireless router to a reasonably secure password.
  • Use a state-of-the-art cryptographic standard that is still considered “secure” (currently: WPA). In other words, don’t use WEP any longer.
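To make the effect of the two countermeasures above concrete, here is a deliberately crude sketch. It is not the epidemic model from the paper by Hu et al.: it assumes the worst case in which every vulnerable router within radio range of an infected one gets infected, so the spread reduces to reachability in the proximity graph. The function name `infected_routers` and the five-router street are hypothetical.

```python
from collections import deque

def infected_routers(neighbours, vulnerable, seed):
    """Worst case: every vulnerable router in range of an infected one is infected."""
    if seed not in vulnerable:
        return set()
    infected = {seed}
    queue = deque([seed])
    while queue:
        router = queue.popleft()
        for other in neighbours.get(router, ()):
            if other in vulnerable and other not in infected:
                infected.add(other)
                queue.append(other)
    return infected

# Hypothetical street of five routers; each one only "hears" its direct neighbours.
neighbours = {
    "r1": ["r2"],
    "r2": ["r1", "r3"],
    "r3": ["r2", "r4"],
    "r4": ["r3", "r5"],
    "r5": ["r4"],
}

all_weak = infected_routers(neighbours, {"r1", "r2", "r3", "r4", "r5"}, "r1")
r3_secured = infected_routers(neighbours, {"r1", "r2", "r4", "r5"}, "r1")
print(sorted(all_weak), sorted(r3_secured))
```

In this toy street, securing the single router in the middle cuts the chain, so the routers behind it are never reached.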

Analytical and Numerical Investigation of Ant Behavior Under Crowded Conditions

January 26, 2009

It is often promising to transfer successful concepts from biological to technical domains. Ant Colony Optimisation, whose basic principle is the pheromone attraction of ants on the way from the colony to a food source and back, is a good example: it finds reasonably short paths or tours in graphs, e.g. for addressing NP-complete problems like the TSP. Ant optimisation has also been applied to the problem of finding routes in the Internet by Caro et al. (1998) in a paper entitled Ant colonies for Adaptive Routing in Packet-switched Communications Networks. Peters et al. (2008) address the issue of load-dependent optimisation in their paper entitled Analytical and Numerical Investigation of Ant Behavior Under Crowded Conditions, where they find an ant-based approach promising for reducing congestion in the network by optimising routing algorithms.
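The pheromone principle can be sketched with a small deterministic toy. It is not the algorithm from either of the cited papers: it assumes two alternative paths between colony and food, lets traffic split in proportion to the pheromone on each path, and reinforces each path by its traffic share divided by its length, with a constant evaporation rate. The function name `reinforce` and all parameter values are assumptions.

```python
def reinforce(lengths, iterations, evaporation=0.1):
    """Mean-field pheromone update; returns the final traffic share per path."""
    pheromone = [1.0] * len(lengths)
    for _ in range(iterations):
        total = sum(pheromone)
        shares = [tau / total for tau in pheromone]
        # Evaporate old pheromone, then deposit more on short, busy paths.
        pheromone = [
            (1.0 - evaporation) * tau + share / length
            for tau, share, length in zip(pheromone, shares, lengths)
        ]
    total = sum(pheromone)
    return [tau / total for tau in pheromone]

# Hypothetical double bridge: the second path is twice as long as the first.
start = reinforce([1.0, 2.0], iterations=0)
later = reinforce([1.0, 2.0], iterations=300)
print(start, later)
```

Both paths start with equal shares; because the shorter path receives more pheromone per unit of traffic, its share keeps growing in this model.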

Demographic Data of Social Networks

December 11, 2008

Filed under: Social Network, internet — Tags: , , , — Oliver @ 9:33 pm

I discovered an interesting compilation of data about various popular social networks, obtained from Google Ad Planner and Google Insights. The report is entitled The 2008 Social Network Analysis Report – Geographic – Demographic and Traffic Data Revealed.

The data provided for Facebook seems quite interesting: while the site initially targeted colleges, most of its current users seem to be older, according to the information provided by Google. This is even more visible for the micro-blogging service Twitter. When looking at LinkedIn, the majority of the users seem to be past college age, earn more money and have a higher education. Considering LinkedIn as a “network for professionals”, this is not unexpected. However, one has to rely on the validity of the data provided by a third party.

Mendeley: Social Network Dedicated to Researchers

December 7, 2008

Filed under: Social Network, internet, research — Tags: , , — Oliver @ 4:42 pm

There are a lot of social networks available, dedicated to different needs. However, none of them focuses on researchers as its clientele. This seems to be changing with Mendeley, a social network dedicated to scientists. The site is still in an early beta phase and does not have many users yet, but it already seems promising. Mendeley provides a client (also available for Linux; it runs fine on my 64-bit Ubuntu installation) that allows managing one’s publications and synchronises them with the Mendeley profile.

As I want to explore this new network, I created my Mendeley profile just a couple of hours ago. Unlike Daniel Lemire’s experience, importing my publications from a BibTeX database was fairly easy for me. A feature that I’m currently missing is the option to publish a less detailed CV, as is possible on LinkedIn; when providing details about my education or professional experience, I’m forced to provide dates as well.

Spamalytics: Who goes for Spam?

November 2, 2008

Filed under: internet, papers, research — Tags: , , — Oliver @ 5:48 pm

Spam (Image source)

Direct marketing is not a new approach; its history dates back to the 19th century, when the first mail-order catalogues were distributed. Nowadays, unsolicited bulk e-mail annoys Internet users world-wide on a daily basis. While distributing mail-order catalogues involved some costs, the marginal cost of sending an e-mail is tiny. Therefore, e-mail based campaigns are profitable even when only a negligible fraction of recipients goes for the advertised product. The bad news, as highlighted by Kanich et al., is that “a perverse byproduct of this dynamic is that sending as much spam as possible is likely to maximise profit”. In order to maximise the reach of spam advertisement, spammers play a cat-and-mouse game with the developers of anti-spam software: they have to adapt to the latest spam filtering technologies in order to reach as many people as possible.

However, the continued presence of spam, despite years of energetic deployment of anti-spam technology, demonstrates the profitability of spam campaigns. So the natural question arises: who goes for spam?

This issue is addressed in a paper entitled Spamalytics: An Empirical Analysis of Spam Marketing Conversion, presented at the 15th ACM Conference on Computer and Communications Security on Tuesday, October 28.

Spam Conversion Pipeline (Image source)

The authors are interested in the conversion rate of spam, which is the probability that an unsolicited e-mail will ultimately elicit a sale. To this end, they infiltrate ongoing spam campaigns sent via the Storm botnet to provide measurements for the different stages of the spam conversion pipeline shown in the above figure. In order to understand their methodology, we need to briefly review the way Storm works.

Storm Botnet Architecture (Source: Kanich et al.)

Storm is a peer-to-peer botnet that propagates via spam. The above figure shows the three primary classes of Storm nodes involved in sending spam: worker bots, proxy bots and master servers. While the worker bots are responsible for actually sending the spam, proxy bots act as conduits between workers and master servers. Once the Storm binary advertised in spam mails has been downloaded and run, the infected host becomes either a worker bot (if it is not reachable from the Internet, e.g. due to firewall restrictions) or a proxy bot. As the command and control traffic directed to the worker bots is unencrypted and always passes through a proxy bot, a man-in-the-middle attack is possible and is carried out in the paper by Kanich et al.: by rewriting the command and control traffic directed to worker bots, spam templates, dictionaries and addresses could be changed and adapted to their needs.

Their methodology can be summarised as follows. They hosted a set of Storm proxy bots, created duplicates of websites advertised in spam and rewrote the command and control traffic so that the worker bots advertised their sites instead of the original ones. Thus, no user received more spam, but some users received spam that was less dangerous than it would otherwise have been.

Over the course of their experiment, they rewrote the content of about 470 million spam mails sent in three campaigns: about 347 million spams in a pharmacy campaign, and 83 (38) million in a Storm self-advertisement campaign using postcards (April Fool). They received 28 purchases on the faked page for the advertised pharmaceutical product and 541 infections with the faked Storm binary, geographically distributed as shown below:

This translates into the following conversion rates (caution: results are not intended to be generalised in other contexts!):

  • 1 in 12,500,000 pharmacy spams led to a purchase.
  • 1 in 265,000 greeting card spams led to an infected machine.
  • 1 in 178,000 April Fool’s Day spams led to an infected machine.
  • 1 in 10 people visiting an infection website downloaded the executable and ran it.
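The rates in the list above can be roughly cross-checked against the campaign sizes given earlier in this post. The sketch below uses only the rounded figures quoted here (347, 83 and 38 million mails, 28 purchases), so its results can only approximate the published rates; the function name `one_in` is made up.

```python
def one_in(sent, conversions):
    """Number of messages sent per single conversion."""
    return sent / conversions

# Pharmacy campaign: about 347 million mails, 28 purchases.
pharmacy = one_in(347_000_000, 28)

# Infections implied by the quoted rates and the rounded campaign sizes.
postcard_infections = 83_000_000 / 265_000
april_fool_infections = 38_000_000 / 178_000
implied_total = postcard_infections + april_fool_infections

print(f"pharmacy: 1 in {pharmacy:,.0f}")
print(f"implied infections: {implied_total:.0f}")
```

The pharmacy figure comes out at roughly 1 in 12.4 million, and the implied number of infections lands somewhat below the 541 reported, presumably because the campaign sizes quoted in this post are rounded.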

Much more information can be found in their paper (see below), such as the top-10 most targeted e-mail address domains, filtering statistics at each stage of the conversion pipeline, statistics about the efficiency of anti-spam methods deployed by typical free e-mail providers (e.g. Hotmail and Google Mail), the time-to-click distribution (the first users visited the advertised page 10 seconds (sic!) after the spam was sent) and the effects of blacklisting.
The paper is very well written and offers new insights into how spam works. Interested readers should therefore consider reading this piece of well-conducted research.

Source: C. Kanich, C. Kreibich, K. Levchenko, B. Enright, G. Voelker, V. Paxson, S. Savage. Spamalytics: An Empirical Analysis of Spam Marketing Conversion. 15th ACM Conference on Computer and Communications Security 2008, Alexandria, VA, USA. [Summary, PDF Paper, BibTeX]

© 2001-2008 by Oliver Hohlfeld, M.Sc. | Imprint
