Today we had the last day of our block seminar on Internet Measurement (see my post on day 1) and heard the following talks:
After the talks, we had a fruitful discussion on the presentation style of each talk, which will hopefully be helpful for the attending students. In general, I was really impressed by the quality of the talks. Some were really brilliant, and all of them were unexpectedly good.
I really enjoyed this seminar and hope our advanced seminar on Internet Routing offered next term will be as interesting as this one.
Today was the first day of our two-day block seminar on Internet Measurement, in which I supervised two students. During the seminar, we covered the following topics (papers) in talks given by the attending students, each followed by a discussion:
- Characterizing Files in the Modern Gnutella Network: A Measurement Study [Slides] [Student Paper] [Original Paper]
Which files are shared on Gnutella and what are their characteristics? In contrast to studies that derived traces from peers hosted specifically to provide measurement data, this paper describes data derived from crawls of the Gnutella network.
- Rarest First and Choke Algorithms Are Enough [Slides] [Student Paper] [Original Paper]
This paper discusses why BitTorrent performs well and states that the Rarest First algorithm and the Choke algorithm are enough to provide reasonable fairness, diversity of the content pieces, and performance. Roughly speaking, those are the key features that differentiate BitTorrent from other peer-to-peer file sharing protocols.
- Leveraging BitTorrent for End Host Measurements [Slides] [Student Paper] [Original Paper]
This paper shows how optimistic unchokes, which BitTorrent provides and which are essential to its operation, can be exploited to perform end-host measurements. A dedicated, modified BitTorrent client called BitProbes downloads two megabytes of data from peers, acting as a free rider that does not upload the downloaded data, and uses this communication to conduct host measurements.
Some points that were discussed: (1) The authors claim that downloading but not storing the data is enough to avoid legal issues. Is that really true? (2) During a sample 7-day crawl, the authors covered about 20% of the autonomous systems (ASes) in the Internet. What does this number mean? Is it a high coverage or a low one? For the answer, one has to keep in mind that not all ASes are likely to host BitTorrent clients (enterprise networks, for example).
- Unconstrained Endpoint Profiling (Googling the Internet) [Slides] [Student Paper] [Original Paper]
How documents indexed by Google can be used to label IP addresses with the applications run by a particular host.
The discussion mainly focused on the question of whether the proposed method is really unconstrained, as the title of the paper claims. Some key points: (1) The proposed method relies on Google, but the Google index varies (regional filtering etc.). (2) Existence of the deep web: not every available document is indexed by a particular search engine. (3) How dynamic are IP addresses? What if we want to label IPs of access providers, which usually map to a set of users that used them in the past? (4) Can we trust data provided by third parties (e.g. faked access log files)?
We agreed that this methodology seems good to discover trends but details have to be taken with a pinch of salt.
- I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System [Slides] [Student Paper] [Original Paper1 Paper 2]
What kinds of videos are shared on YouTube, and what are their access characteristics? See my blog post on this from October 2007.
- The Flattening Internet Topology: Natural Evolution, Unsightly Barnacles or Contrived Collapse? [Slides] [Student Paper] [Original Paper]
This paper analyses the trend of big content providers building their own WANs and bypassing Tier 1 providers to save transit costs and increase performance, which flattens the Internet topology.
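As a side note on the Rarest First idea from the BitTorrent talk above, here is a minimal sketch of how such a piece selection could look. It is my own simplified illustration, not code from the paper or from any real client: the function name `rarest_first` and the representation of peers as sets of piece indices are assumptions, and real clients combine this rule with further policies.

```python
import random
from collections import Counter


def rarest_first(needed, peer_pieces, rng=random):
    """Return the needed piece that the fewest connected peers hold.

    needed      -- set of piece indices we still miss (hypothetical representation)
    peer_pieces -- dict mapping a peer id to the set of pieces that peer has
    Returns None if no connected peer offers a piece we need.
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        # Count each needed piece once per peer that holds it.
        counts.update(pieces & needed)
    if not counts:
        return None
    rarest = min(counts.values())
    # Break ties randomly so that peers do not all request the same piece.
    candidates = sorted(p for p, c in counts.items() if c == rarest)
    return rng.choice(candidates)


# Toy example: piece 2 is held by only one peer, so it is requested first.
peers = {"a": {0, 1, 2}, "b": {0, 1}, "c": {0}}
print(rarest_first({0, 1, 2}, peers))  # → 2
```

Requesting the least replicated pieces first is what keeps the piece diversity in the swarm high, which is the property the paper attributes to this algorithm.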
For references to the original papers, the student papers (mostly in German) and slides, see the seminar webpage. The talks had a very high quality and the discussions were pretty interesting. So I’m really looking forward to day II.