Analysis of BitTorrent Data

State: completed by Robin Stohler

Although video streaming causes the largest share of Internet traffic, peer-to-peer (P2P) file sharing applications still account for a large portion of the total [4]. Internet Service Providers (ISPs) face the challenge of network congestion during peak traffic hours, and P2P traffic makes this harder to manage. Furthermore, overlay networks used by file sharing applications, such as BitTorrent [2], cause unnecessary inter-domain traffic because they are typically not aware of the underlying physical network. The research community has proposed several solutions to reduce inter-domain traffic [1][3][5][10]. However, the evaluation of those solutions is based on limited data, which is not necessarily representative.


The VIOLA project addresses the lack of a time component in BitTorrent data sets. An improved version of the measurement system currently collects roughly 60 GB of data per day. The data contains the IP addresses of peers and the swarms they belong to. Roughly 10,000 swarms are active (larger than 50 peers). The data can be used to answer many questions about BitTorrent and user behavior.
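As a rough illustration of how the active-swarm figure could be reproduced, the following minimal sketch counts distinct peers per swarm and keeps the swarms above a threshold. It assumes the data can be read as (swarm ID, peer IP) pairs; the record layout, the function name `active_swarms`, and the `min_peers` parameter are hypothetical and do not describe the actual VIOLA data format.

```python
from collections import defaultdict


def active_swarms(records, min_peers=50):
    """Return {swarm_id: distinct_peer_count} for swarms larger than min_peers.

    records: iterable of (swarm_id, peer_ip) pairs (assumed layout).
    """
    peers = defaultdict(set)
    for swarm_id, peer_ip in records:
        # A set removes repeated sightings of the same peer in a swarm.
        peers[swarm_id].add(peer_ip)
    return {s: len(p) for s, p in peers.items() if len(p) > min_peers}
```

The same per-swarm peer counts could also be used to select the small swarms instead, by inverting the comparison.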

Some areas of research on the data are:

- Closer investigation of small swarms (the long tail)

- Interarrival times of peers 

- Correlation of user peaks
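To make one of these areas concrete, the interarrival times of peers could be computed roughly as sketched below. This is only an illustration under stated assumptions: the observations are taken to be (swarm ID, peer IP, timestamp in seconds) tuples, a peer's arrival is taken to be the first time it is seen in a swarm, and the function name `interarrival_times` is hypothetical.

```python
from collections import defaultdict


def interarrival_times(observations):
    """Return {swarm_id: [gaps between consecutive peer arrivals]}.

    observations: iterable of (swarm_id, peer_ip, timestamp_seconds)
    tuples (assumed layout). A peer's arrival is its earliest sighting.
    """
    first_seen = defaultdict(dict)
    for swarm_id, peer_ip, ts in observations:
        seen = first_seen[swarm_id]
        # Keep only the earliest timestamp per peer and swarm.
        if peer_ip not in seen or ts < seen[peer_ip]:
            seen[peer_ip] = ts

    gaps = {}
    for swarm_id, seen in first_seen.items():
        arrivals = sorted(seen.values())
        # Differences between consecutive arrival times.
        gaps[swarm_id] = [b - a for a, b in zip(arrivals, arrivals[1:])]
    return gaps
```

The resulting gap lists could then be passed to a statistics package to examine their distribution per swarm.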


If you are interested in working with this data, send an email to Andri Lareida. Include the type of thesis you are interested in, a bit about your background, and your rough time frame.



[1] R. Bindal, P. Cao, W. Chan, J. Medved, G. Suwala, T. Bates, and A. Zhang. Improving Traffic Locality in BitTorrent via Biased Neighbor Selection. 26th IEEE International Conference on Distributed Computing Systems (ICDCS 2006), Lisboa, Portugal, July 2006.

[2] BitTorrent. http://www.bittorrent.com/, last visited: 2014.

[3] D.R. Choffnes and F.E. Bustamante. Taming the Torrent: A Practical Approach to Reducing Cross-ISP Traffic in Peer-to-Peer Systems. ACM SIGCOMM Computer Communication Review, Vol. 38, No. 4, pp 363-374, October 2008.

[4] Cisco Inc. Cisco Visual Networking Index: Forecast and Methodology, 2012-2017. http://www.cisco.com/en/US/solutions/collateral/ns341/ns525/ns537/ns705/ns827/white_paper_c11-481360_ns827_Networking_Solutions_White_Paper.html, January 2013.

[5] A. Mondal, I. Trestian, Z. Qin, and A. Kuzmanovic. P2P as a CDN: A New Service Model for File Sharing. The International Journal of Computer and Telecommunications Networking, Vol. 56, No. 14, pp 3233-3246, September 2012.

[10] S. Oechsner, F. Lehrieder, T. Hoßfeld, F. Metzger, D. Staehle, and K. Pussep. Pushing the Performance of Biased Neighbor Selection through Biased Unchoking. IEEE International Conference on Peer-to-Peer Computing (P2P 2009), Seattle, USA, April 2009.



Final Report

40% Design, 30% Implementation, 30% Documentation
Statistical basics, programming/scripting, Matlab, R, or similar

Supervisors: Andri Lareida
