ohohlfeld.com : blog

Internet Measurement Seminar: Day I

February 25, 2009

Filed under: internet, papers, research, teaching. Posted by Oliver @ 5:33 pm

Today was the first day of our two-day block seminar on Internet Measurement, in which I supervised two students. Each of the following topics (papers) was presented in a talk by one of the attending students, followed by a discussion:

  • Characterizing Files in the Modern Gnutella Network: A Measurement Study [Slides] [Student Paper] [Original Paper]
    Which files are shared on Gnutella and what are their characteristics? Unlike studies that derived traces by hosting peers dedicated to providing measurement data, this paper describes data derived from crawls of the Gnutella network.
  • Rarest First and Choke Algorithms Are Enough [Slides] [Student Paper] [Original Paper]
    This paper discusses why BitTorrent performs well and states that the Rarest First algorithm and the Choke algorithm are enough to provide reasonable fairness, diversity of the content pieces, and performance. Roughly speaking, those are the key features that differentiate BitTorrent from other peer-to-peer file sharing protocols.
  • Leveraging BitTorrent for End Host Measurements [Slides] [Student Paper] [Original Paper]
    This paper shows how optimistic unchokes, which BitTorrent provides and which are essential for its functionality, can be exploited to perform end host measurements. A dedicated, modified BitTorrent client called BitProbes acts as a freerider: it downloads two megabytes of data from peers without uploading any downloaded data, and uses this communication to conduct host measurements.
    Some points that were discussed: (1) The authors claim that downloading but not storing the data is enough to avoid legal issues. Is that really true? (2) During a sample 7-day crawl, the authors covered about 20% of the available autonomous systems (AS) in the Internet. What does this number mean? Is it a high coverage or a low one? To answer this, one has to keep in mind that not all AS are likely to host BitTorrent clients (enterprise networks, for example).
  • Unconstrained Endpoint Profiling (Googling the Internet) [Slides] [Student Paper] [Original Paper]
    How documents indexed by Google can be used to label IP addresses with the applications run by a particular host.
    The discussion mainly focused on the question of whether the proposed method is really unconstrained, as the title of the paper claims. Some key points: (1) The proposed method relies on Google, but the Google index varies (regional filtering etc.). (2) Existence of the deep web: not every available document is indexed by a particular search engine. (3) How dynamic are IP addresses? What if we want to label IPs of access providers, where an address usually maps to a set of users who used it in the past? (4) Can we trust data provided by third parties (e.g. faked access log files)?
    We agreed that this methodology seems good to discover trends but details have to be taken with a pinch of salt.
  • I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System [Slides] [Student Paper] [Original Paper 1] [Original Paper 2]
    What kinds of videos are shared on YouTube and what are their access characteristics? See my blog post on this from October 2007.
  • The Flattening Internet Topology: Natural Evolution, Unsightly Barnacles or Contrived Collapse? [Slides] [Student Paper] [Original Paper]
    This paper analyses a trend in which big content providers build their own WANs and bypass Tier 1 providers to save transit costs and increase performance, which flattens the Internet topology.
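
As an aside on the BitTorrent talk above: the core idea of rarest-first piece selection can be sketched in a few lines. This is only a minimal illustration, not the code of any real client. The function name `pick_rarest_piece` and the dictionary of peer piece sets are assumptions made up for this example, and refinements used by real clients (random first piece, strict priority, endgame mode) are left out.

```python
import random
from collections import Counter


def pick_rarest_piece(needed, peer_pieces, rng=random):
    """Return a needed piece that is held by the fewest peers, or None.

    needed      -- set of piece indices we still have to download
    peer_pieces -- hypothetical mapping: peer name -> set of pieces it holds
    """
    # Count how many peers advertise each piece.
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces)

    # Only pieces we still need and that some peer can offer are candidates.
    candidates = [p for p in sorted(needed) if availability[p] > 0]
    if not candidates:
        return None

    # Pick among the least replicated candidates; ties are broken at random
    # so that different peers do not all request the same piece.
    rarest = min(availability[p] for p in candidates)
    return rng.choice([p for p in candidates if availability[p] == rarest])


peers = {"peer_a": {0, 1, 2}, "peer_b": {0, 1}, "peer_c": {0, 3}}
print(pick_rarest_piece({0, 1, 2}, peers))  # piece 2: only one peer holds it
```

Requesting the least replicated piece first keeps the pieces evenly spread across the swarm, which is the "diversity of the content pieces" mentioned in the summary of the paper.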

For references to the original papers, the student papers (mostly in German), and the slides, see the seminar webpage. The talks were of very high quality and the discussions were pretty interesting, so I’m really looking forward to day II.

Scholastica Googelensis

April 21, 2008

Filed under: misc, research. Posted by Oliver @ 6:31 pm

There is bad news. Viruses and worms are subject to constant evolution and we are far from reaching a steady state. New strains of influenza, an infectious disease caused by RNA viruses, are constantly produced by mutation and reassortment (the mixing of genetic material from two similar viruses). In the olden days of computing, when we gazed at EGA graphics, computer users contended with boot sector viruses and other malicious code affecting their programs. These kinds of viruses became less common in later generations, when virus developers focused on exploiting the rich scripting functionality provided by modern office application suites and macro viruses became more widespread. Nowadays, one has to cope with security exploits in hosted software (e.g. phpBB), security leaks in web 2.0 applications (e.g. Facebook applications), phishing, and more.

These are well-known facts. I present them to illustrate that viruses evolve and infect new hosts. The bad news is that research has been infected by a new virus called scholastica googlensis, as Alois Potton highlights in the 3/2008 issue of the PIK journal. Scholastica googlensis causes a linearisation of humans, aiming at a perfect alignment that makes researchers comparable. Reputation is reduced to a single number, the Google Scholar index, which expresses the number of papers written by a given author that are indexed in Google’s database. Only the number counts: publish or perish! Research is scaled down to a single metric. The higher the index, the higher the reputation, and the better the chances before an appointment board filling a vacancy for a full professor. In his column, Alois Potton mentioned the idea of reducing the review process at Dagstuhl seminars to a single one-dimensional number: the Google Scholar index of the author. Life can be pretty simple.

The consequence is that a single company using the PageRank algorithm not only controls the available knowledge (a fact is known if and only if it appears within the first n search results) but also influences the way knowledge is created by impairing the selection process in research.

According to Einstein, everything should be made as simple as possible, but no simpler. Is this metric already far too simple?

© 2001-2008 by Oliver Hohlfeld, M.Sc. | Imprint
