ohohlfeld.com : blog
Monitoring Your Academic Field in the Web 2.0

October 8, 2009

Filed under: papers, research — Oliver @ 4:57 pm

Web 2.0 tools can help you follow new publications in a particular field. Besides using Google Scholar to find new papers by monitoring which recent papers cite a certain seminal paper, RSS can help to monitor the field. Daniel Lemire urged researchers in 2005 to make their publications available through RSS, and several services now offer RSS feeds. Here is a brief overview of the ones I use most.

Preprints – arXiv

arXiv, funded by Cornell University and the National Science Foundation, is an archive for preprints of papers in computer science and other fields; the preprints are not peer-reviewed. Currently, arXiv hosts more than half a million articles. Given its popularity, it is worthwhile to follow submissions to categories of personal interest, e.g. Networking and Internet Architecture. More categories are listed on the arXiv home page.

Visiting interesting categories frequently to check for new submissions is painful, but it becomes fairly easy with RSS and a feed reader. The whole procedure is described here. Example feed: Networking and Internet Architecture.
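As a minimal sketch of what a feed reader does with such a feed, the Python snippet below parses an RSS 2.0 document with the standard library and lists the entries it has not seen before. The sample XML, the titles and the example.org links are made up for illustration, and a real feed may use a different RSS flavour or XML namespaces. A real reader would also fetch the feed over HTTP and store the set of seen links between runs.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 document standing in for a downloaded feed.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example preprint category</title>
    <item>
      <title>First example preprint</title>
      <link>http://example.org/abs/1</link>
    </item>
    <item>
      <title>Second example preprint</title>
      <link>http://example.org/abs/2</link>
    </item>
  </channel>
</rss>"""


def new_entries(feed_xml, seen_links):
    """Return (title, link) pairs for items whose link is not in seen_links."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if link and link not in seen_links:
            entries.append((title, link))
    return entries


# Pretend the first preprint was already shown on an earlier run.
seen = {"http://example.org/abs/1"}
fresh = new_entries(SAMPLE_FEED, seen)
for title, link in fresh:
    print(f"{title}: {link}")
```

Only the second item is reported here, because the link of the first one is already in the seen set.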

Journals – IEEE Transactions

Even the IEEE provides RSS feeds for papers in the recent issues of its journals (e.g. Transactions on Multimedia or Transactions on Networking), which makes it very easy to monitor high-impact journals.

By keyword – CiteULike

A keyword search can be monitored using CiteULike, a Web 2.0 service for reference sharing. An example for QoE can be found here (see the RSS feed here).

Plagiarized Papers: Rolex and Mercedes in the Academic World

September 30, 2009

Filed under: papers, research — Oliver @ 7:03 pm

Depending on where one purchases a Rolex watch or other typically high-priced products of famous brands, the price can vary by orders of magnitude. The reason is the existence of cheap replicas, illegal copies of the product. Do such copies exist only in the material world, or can they be an issue in academia as well?

Yes, they can! Today, a colleague of mine discovered that his 2005 Sigmetrics paper [1] has been plagiarised. The copy of his paper was presented at the 8th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2009), held June 1-3, 2009 in Shanghai, China. The authors are with the “electronics and information department, Huazhong University of Science and Technology, Wuhan, China”.
Who are the authors? Liu Wei is an associate professor. Wenqing Cheng has some history of published papers in recent years and is a professor as well. They should have known the rules. Profiles of the remaining authors could not be found.

Besides being published in the conference proceedings by IEEE, the copied version of the paper was selected as an outstanding paper and published by Springer in an additional volume (see [2]). In case you don’t have an ACM or Springer subscription that would allow you to download the original papers, I created excerpts of the first page of each paper (find them here: original, copy). The paper, including figures, has been copied word by word. The copy has been shortened, typeset in Microsoft Word instead of LaTeX and adapted to the new page layout. Funny side note: the references were changed, e.g. [7] has been added as a new reference that did not appear in the original version of the paper.

If you start looking for other papers by the authors, you might find [3]. To have some more fun, copy some phrases of the abstract and look them up using Google, which will lead to a result like this, showing parts of a book [4] published in 2007. The abstract has been copied almost word by word from two chapters of the book (the first sentence from page 327 and the remainder from page 373). Only minor things have been changed: “In this chapter” now reads “In this paper”.
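The manual phrase lookup described above can be approximated offline once both texts are at hand. The following Python sketch is an illustration only, not the procedure used in this case: it measures which fraction of the word 5-grams of a suspect text also occur in a source text. The function names and the three sample sentences are made up; only the swap of “chapter” for “paper” mirrors the example above.

```python
import re


def word_ngrams(text, n=5):
    """Return the set of lower-cased word n-grams in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_fraction(suspect, source, n=5):
    """Fraction of the suspect's n-grams that also occur in the source."""
    suspect_grams = word_ngrams(suspect, n)
    if not suspect_grams:
        return 0.0
    return len(suspect_grams & word_ngrams(source, n)) / len(suspect_grams)


# Made-up sentences; only the chapter/paper swap mirrors the case above.
source = "In this chapter we review sensing techniques for cognitive radio networks."
suspect = "In this paper we review sensing techniques for cognitive radio networks."
unrelated = "Buffers of constant size are sufficient for a large number of voice flows."

print(shared_fraction(suspect, source))
print(shared_fraction(unrelated, source))
```

Swapping a single word breaks only the three 5-grams that contain it, so 4 of the 7 5-grams of the suspect sentence still match, while the unrelated sentence shares none. Long runs of shared n-grams between independently written texts are unlikely, which is why a handful of copied phrases is often enough to find the source.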

What a disgrace! Or, a very special way of saying that a paper or book is good … ;-)

(We have taken initial action in those cases. Is there anyone willing to review other papers of those authors for possible cases of plagiarism?)

Update (2009-10-01): I checked another paper: the abstract of [5] has been taken from [6, Chapter 8.4, Page 221] (I can’t download the entire paper due to a lack of subscription). You can use Google Book Search to verify this. I have already e-mailed the affected authors and editors.

[1] Original: Florin Ciucu, Almut Burchard and Jörg Liebeherr: “A network service curve approach for the stochastic analysis of networks” (2005)

[2] Copy: Deah J. Kadhim, Saba Q. Jobbar, Wei Liu, and Wenqing Cheng: “The Stochastic Network Calculus Methodology” (2009)

[3] Copy: Nawaf Hadhal Kamil, Deah J. Kadhim, Wei Liu, Wenqing Cheng, “Signal Processing Techniques for Robust Spectrum Sensing,” fcc, pp.120-123, 2009 ETP International Conference on Future Computer and Communication, 2009

[4] Original: Fitzek F H P., Katz M., “Cognitive Wireless Networks: Concepts, Methodologies and Visions Inspiring the Age of Enlightenment of Wireless Communications”, ISBN 978-1-4020-5978-0, 1st: Springer, pp. 714, 2007

[5] Copy: Deah J. Kadhim, Wei Liu, Wenqing Cheng, “Ultra Wideband Cognitive Network Objective Issues,” fcc, pp.35-38, 2009 ETP International Conference on Future Computer and Communication, 2009

[6] Original: Hossain, E., and V. K. Bhargava (Eds.), Cognitive Wireless Communications Networks, Springer Publication, 2007

ITC Submission Accepted

May 14, 2009

Filed under: papers, publications, research — Oliver @ 7:28 am

Our paper submitted to the 21st International Teletraffic Congress (ITC 21) has been accepted for publication. Abstract:

The migration of voice communication from the Public Switched Telephone Network to the Internet pushes the need to adequately size network resources such as buffers and capacity. This paper addresses the problem of how these resources should be scaled in the number of voice flows N in order to guarantee predefined packet loss probabilities and end-to-end delays. By deriving non-asymptotic buffer overflow probabilities at both edge and interior network nodes, the paper demonstrates that O(1) buffers are sufficient to ensure probabilistic packet loss constraints at all utilizations. Also, by deriving end-to-end delay bounds, the paper shows that the required per-flow capacities decrease as O(1/N) when probabilistic end-to-end delay guarantees are sought. Numerical examples illustrate that statistical multiplexing dominates the effect of scheduling in multi-nodes scenarios with high capacities.

PIK Journal Paper Online

May 1, 2009

My journal paper entitled Stochastic Packet Loss Model to Evaluate QoE Impairments that appeared in issue 1 / 2009 of the PIK journal is now online.

Leaving for KiVS 2009

March 2, 2009

Filed under: conferences, papers, publications, research, talks — Oliver @ 5:08 pm

I’m leaving for KiVS’09 (conference on communication in distributed systems), where I will give a talk entitled “Stochastic Packet Loss Model to Evaluate QoE Impairments” in the award session (I will receive the master’s thesis award from the communication in distributed systems group). Although there is a paper deadline approaching, I hope to provide a little blog coverage of the conference.

Internet Measurement Seminar: Day I

February 25, 2009

Filed under: internet, papers, research, teaching — Oliver @ 5:33 pm

Today was the first day of our two-day block seminar on Internet Measurement, in which I supervised two students. The students attending the seminar presented the following papers, and each talk was followed by a discussion of the topic:

  • Characterizing Files in the Modern Gnutella Network: A Measurement Study [Slides] [Student Paper] [Original Paper]
    Which files are shared on Gnutella and what are their characteristics? Whereas other studies derived traces by hosting peers dedicated to providing measurement data, this paper describes data derived from crawls of the Gnutella network.
  • Rarest First and Choke Algorithms Are Enough [Slides] [Student Paper] [Original Paper]
    This paper discusses why BitTorrent performs well and states that the Rarest First algorithm and the Choke algorithm are enough to provide reasonable fairness, diversity of the content pieces and performance. Roughly speaking, those are the key features that differentiate BitTorrent from other peer-to-peer file sharing protocols.
  • Leveraging BitTorrent for End Host Measurements [Slides] [Student Paper] [Original Paper]
    How optimistic unchokes, which BitTorrent provides and which are essential to its operation, can be exploited to perform end-host measurements. A dedicated, modified BitTorrent client called BitProbes downloads two megabytes of data from peers, acting as a free rider that does not upload the downloaded data, and uses this communication to conduct host measurements.
    Some points that were discussed: (1) The authors claim that downloading but not storing the data is enough to avoid legal issues. Is that really true? (2) During a sample 7-day crawl, the authors covered about 20% of the autonomous systems (ASes) in the Internet. What does this number mean? Is it a high coverage, or a low one? For the answer, one has to keep in mind that not all ASes are likely to host BitTorrent clients (e.g. enterprise networks).
  • Unconstrained Endpoint Profiling (Googling the Internet) [Slides] [Student Paper] [Original Paper]
    How documents indexed by Google can be used to label IP addresses with the applications run by a particular host.
    The discussion mainly focused on the question of whether the proposed method is really unconstrained, as the title of the paper claims. Some key points: (1) The proposed method relies on Google, but the Google index varies (regional filtering etc.). (2) Existence of the deep web: not every available document is indexed by a particular search engine. (3) How dynamic are IP addresses? What if we want to label IPs of access providers, which usually map to a set of users that used them in the past? (4) Can we trust data provided by third parties (e.g. faked access log files)?
    We agreed that this methodology seems good to discover trends but details have to be taken with a pinch of salt.
  • I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System [Slides] [Student Paper] [Original Paper1 Paper 2]
    What kind of videos are shared on YouTube and what are their access characteristics? See my blog post on this from October 2007.
  • The Flattening Internet Topology: Natural Evolution, Unsightly Barnacles or Contrived Collapse? [Slides] [Student Paper] [Original Paper]
    This paper analyses the trend of big content providers building their own WANs and bypassing Tier 1 providers to save transit costs and increase performance, which flattens the Internet topology.

For references to the original papers, the student papers (mostly in German) and slides, see the seminar webpage. The talks were of very high quality and the discussions were pretty interesting. So I’m really looking forward to day II.

Wireless Epidemiology

February 9, 2009

Filed under: internet, papers, research — Oliver @ 11:42 am

The debate about wireless security has, so far, focused on preventing people from getting unauthorised access to one’s wireless network. WEP has been shown to be breakable in less than 60 seconds, assuming a certain success probability, by Tews et al. in 2007. Unsecured wireless networks–and WEP can be considered as “unencrypted” due to the work by Tews et al.–are widely considered as a security problem, as unauthorised people may start misusing the network and cause additional costs or legal issues.

A paper by Hu et al. entitled WiFi Epidemiology: Can Your Neighbors’ Router Make Yours Sick? discusses a different issue. People know that they might spread the flu virus through airborne infection when living or working close to other people; the virus exploits this tightly interconnected proximity network. Many computer users have experienced that, once their system is infected by a virus or another kind of malware, it might start spreading the malware even further. So flu-like epidemics can happen in the digital world as well.

The paper by Hu et al. addresses exactly this issue and transfers it to the wireless domain by asking whether wireless routers, which form a tightly interconnected proximity network in densely populated urban areas, can contribute to spreading malware and thus create “wireless epidemics”. Epidemiology is well understood in other fields. Transferring those results to the wireless domain and highlighting possible security flaws is thus very important, and the paper moves in an interesting direction.

The scenario considered in the paper relies on typical security flaws: i) weak or unchanged default passwords and ii) weak or broken cryptographic systems. Thus, the risk of infection can be reduced as follows:

  • Change the default password of your wireless router to a reasonably secure password.
  • Use a state-of-the-art cryptographic standard that is still considered “secure” (currently: WPA). In other words, don’t use WEP any longer.
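To illustrate why these two measures matter for the neighbourhood as a whole, here is a toy Python sketch. It is not the model used by Hu et al.: it treats the routers as a graph of radio neighbours and assumes, pessimistically, that every vulnerable router in range of an infected one becomes infected too. The five-router topology and all names are made up.

```python
from collections import deque


def spread(neighbors, vulnerable, seed):
    """Return the set of routers reached by malware starting at seed.

    neighbors maps each router to the routers within radio range;
    only routers in vulnerable can be infected and pass the malware on.
    """
    if seed not in vulnerable:
        return set()
    infected = {seed}
    queue = deque([seed])
    while queue:
        router = queue.popleft()
        for other in neighbors.get(router, ()):
            if other in vulnerable and other not in infected:
                infected.add(other)
                queue.append(other)
    return infected


# Hypothetical street with five routers in a row, each in range of the next.
neighbors = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["D"],
}

all_weak = spread(neighbors, vulnerable={"A", "B", "C", "D", "E"}, seed="A")
c_secured = spread(neighbors, vulnerable={"A", "B", "D", "E"}, seed="A")
print(sorted(all_weak))
print(sorted(c_secured))
```

With all routers vulnerable, the infection starting at A reaches the whole street. Securing router C alone stops it after A and B, so D and E are protected although they are still vulnerable themselves.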

Analytical and Numerical Investigation of Ant Behavior Under Crowded Conditions

January 26, 2009

It is often promising to transfer successful concepts from biological to technical domains. Ant Colony Optimisation, whose basic principle is the pheromone attraction of ants on the way from the colony to a food source and back, is a good example: it finds reasonably short paths or tours in graphs, e.g. for addressing NP-complete problems like the TSP. Ant optimisation has also been applied to the problem of finding routes in the Internet by Di Caro and Dorigo (1998) in a paper entitled Ant colonies for Adaptive Routing in Packet-switched Communications Networks. Peters et al. (2008) address the issue of load-dependent optimisation in their paper entitled Analytical and Numerical Investigation of Ant Behavior Under Crowded Conditions, where they find an ant-based approach promising for reducing congestion in the network by optimising routing algorithms.
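The pheromone principle can be shown with a few lines of Python. This is a deliberately simplified sketch and not the algorithm of any of the papers mentioned above: instead of simulating individual ants, it updates the expected pheromone on two parallel routes. The parameters (10 ants, 10% evaporation, 50 rounds) and the route lengths are arbitrary assumptions.

```python
def simulate(lengths, ants=10, evaporation=0.1, rounds=50):
    """Mean-field pheromone update on parallel paths of the given lengths.

    Each round, ants split across paths in proportion to pheromone;
    every path then loses a fraction of its pheromone and gains a deposit
    proportional to the ants on it and inversely proportional to its length.
    """
    pheromone = [1.0] * len(lengths)
    for _ in range(rounds):
        total = sum(pheromone)
        shares = [p / total for p in pheromone]
        pheromone = [
            (1.0 - evaporation) * p + ants * share / length
            for p, share, length in zip(pheromone, shares, lengths)
        ]
    total = sum(pheromone)
    return [p / total for p in pheromone]


# Two hypothetical routes between colony and food: length 1 and length 2.
shares = simulate([1.0, 2.0])
print(shares)
```

Both routes start with the same amount of pheromone, but the shorter one receives a larger deposit per ant, attracts more ants in the next round and ends up with most of the pheromone. This positive feedback is what lets the colony find short paths without central coordination.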

Spamalytics: Who goes for Spam?

November 2, 2008

Filed under: internet, papers, research — Oliver @ 5:48 pm

Spam (Image source)

Direct marketing is not a new approach; its history dates back to the 19th century, when the first mail-order catalogues were distributed. Nowadays, unsolicited bulk e-mail annoys Internet users world-wide on a daily basis. While distributing mail-order catalogues involved some cost, the marginal cost of sending an e-mail is tiny. Therefore, e-mail based campaigns are profitable even when only a tiny fraction of the recipients goes for the advertised product. The bad news, as highlighted by Kanich et al., is that "a perverse byproduct of this dynamic is that sending as much spam as possible is likely to maximise profit". To maximise the reach of their advertisements, spammers play a cat-and-mouse game with the developers of anti-spam software: they have to adapt to the latest spam filtering technologies in order to reach as many people as possible.

However, the presence of spam, despite years of energetic deployment of anti-spam technology, demonstrates the profitability of campaigns using spam. So the natural question arises: who goes for spam?

This issue is addressed in a paper entitled Spamalytics: An Empirical Analysis of Spam Marketing Conversion, presented at the 15th ACM Conference on Computer and Communications Security on Tuesday, October 28.

Spam Conversion Pipeline (Image source)

The authors are interested in the conversion rate of spam, which is the probability that an unsolicited e-mail will ultimately elicit a sale. To measure it, they infiltrate ongoing spam campaigns sent using the Storm botnet and provide measurements for the different stages of the spam conversion pipeline shown in the above figure. In order to understand their methodology, we need to briefly review the way Storm works.

Storm Botnet Architecture (Source: Kanich et al.)

Storm is a peer-to-peer botnet that propagates via spam. The above figure shows the three primary classes of Storm nodes involved in sending spam: worker bots, proxy bots and master servers. While the worker bots are responsible for actually sending the spam, proxy bots act as conduits between workers and master servers. A host that is infected by the Storm binary advertised in spam mails becomes either a worker bot (if it is not reachable from the Internet, e.g. due to firewall restrictions) or a proxy bot. As the command and control traffic directed to the worker bots is unencrypted and always passes through a proxy bot, a man-in-the-middle attack is possible and is carried out in the paper by Kanich et al.: by rewriting the command and control traffic directed to worker bots, they could change spam templates, dictionaries and addresses and adapt them to their needs.

Their methodology can be summarised as follows. They hosted a set of Storm proxy bots, created duplicates of websites advertised in spam and rewrote the command and control traffic so that the worker bots advertised their sites instead of the original ones. Thus, no user received more spam, but some users received spam that was less dangerous than it would otherwise have been.

Over the course of their experiment, they rewrote the content of about 470 million spam mails sent in three campaigns: about 347 million spams in a pharmacy campaign, and 83 million and 38 million in two Storm self-advertisement campaigns using postcards and an April Fool's Day theme, respectively. They received 28 purchases on the faked page for the advertised pharmaceutical product and 541 infections of the faked Storm binary.

Geographic distribution of these conversions (figure)

This translates into the following conversion rates (caution: the results are not intended to be generalised to other contexts!):

  • 1 in 12,500,000 pharmacy spams led to a purchase.
  • 1 in 265,000 greeting card spams led to an infected machine.
  • 1 in 178,000 April Fool’s Day spams led to an infected machine.
  • 1 in 10 people visiting an infection website downloaded the executable and ran it.
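As a rough sanity check, the rates above can be recomputed from the totals quoted in this post. The Python sketch below uses only those rounded totals. The post does not say how the 541 infections split between the postcard and April Fool's Day campaigns, so the two are pooled here, which is an assumption of this example and not a figure from the paper.

```python
def one_in(sent, conversions):
    """Express a conversion rate as 'one conversion per N messages'."""
    return sent / conversions


# Figures as quoted in this post (rounded to millions there).
pharmacy_sent = 347_000_000
pharmacy_purchases = 28
infection_sent = 83_000_000 + 38_000_000  # postcard + April Fool campaigns
infections = 541

print(f"pharmacy: 1 in {one_in(pharmacy_sent, pharmacy_purchases):,.0f}")
print(f"infection (both campaigns pooled): 1 in {one_in(infection_sent, infections):,.0f}")
```

The pharmacy result is about 1 in 12.4 million, in line with the rounded figure in the list. The pooled infection rate is about 1 in 224,000, which lies between the two per-campaign rates, as it must.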

Much more information can be found in their paper (see below), such as the top-10 most targeted e-mail address domains, filtering statistics at each stage of the conversion pipeline, statistics about the efficiency of anti-spam methods deployed by typical free e-mail providers (e.g. Hotmail and Google Mail), the time-to-click distribution (the first users visited the advertised page 10 seconds (sic!) after the spam was sent) and the effects of blacklisting.
The paper is very well written and leads to new insights into how spam works. Interested readers should therefore consider reading this piece of well-conducted research.

Source: C. Kanich, C. Kreibich, K. Levchenko, B. Enright, G. Voelker, V. Paxson, S. Savage. Spamalytics: An Empirical Analysis of Spam Marketing Conversion. 15th ACM Conference on Computer and Communications Security 2008, Alexandria, VA, USA. [Summary, PDF Paper, BibTeX]

SIGCOMM 2008 Papers Available

August 15, 2008

Filed under: conferences, papers, research — Oliver @ 10:47 am

As SIGCOMM 2008, held in Seattle this year, is getting closer, I noticed that the accepted papers are now available online. They can be accessed here. Researchers in my group at Deutsche Telekom Laboratories will present their Time Machine, which allows later inspection of network activity that becomes interesting in retrospect.

Edit: Several papers are reviewed in the blog of Michael Mitzenmacher.

© 2001-2008 by Oliver Hohlfeld, M.Sc. | Imprint