Distributed Peer-to-peer FastSS Indexing

State: completed by Dalibor Peric

Peer-to-Peer, or P2P, is a form of distributed computing with the objective of sharing computing resources among a variable number of peers, usually home computers. Ideally, there is no need for a central server, which avoids a single point of failure and distributes the load as evenly as possible among peers.

One important issue is how to find information in such networks. Search is usually done only by exact keyword match, which means that if there is a slight variation in spelling, the peer searching for information will not get the full set of expected results.

Fast Similarity Search, or FastSS for short, is an algorithm developed at UZH that addresses this issue. Using special indexes, it supports searching for similar words in addition to the exact search already performed by peer-to-peer lookups. The P2P version of FastSS is currently under development, but how to build the necessary index in a peer-to-peer fashion is still an open challenge.
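To make the idea of a similarity index more concrete, the following is a minimal, single-machine sketch in Java. It assumes the deletion-neighborhood approach commonly associated with FastSS: every indexed word is stored under all variants obtained by deleting up to k characters, a query is expanded the same way, and candidates are then verified with the edit distance. The class name FastSSSketch and all method names are hypothetical and are not taken from the P2PFastSS code base.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; names and structure are assumptions.
class FastSSSketch {
    private final int k;                                    // maximum edit distance
    private final Map<String, Set<String>> index = new HashMap<>();

    FastSSSketch(int k) { this.k = k; }

    // The word itself plus every string obtained by deleting up to k characters.
    static Set<String> deletionNeighborhood(String word, int k) {
        Set<String> out = new HashSet<>();
        collect(word, k, out);
        return out;
    }

    private static void collect(String w, int remaining, Set<String> out) {
        // A variant of a given length always has the same remaining budget,
        // so skipping variants that were already seen loses nothing.
        if (!out.add(w) || remaining == 0) return;
        for (int i = 0; i < w.length(); i++) {
            collect(w.substring(0, i) + w.substring(i + 1), remaining - 1, out);
        }
    }

    // Index a word under each variant in its deletion neighborhood.
    void add(String word) {
        for (String variant : deletionNeighborhood(word, k)) {
            index.computeIfAbsent(variant, v -> new HashSet<>()).add(word);
        }
    }

    // Look up all variants of the query, then verify candidates by edit distance.
    Set<String> search(String query) {
        Set<String> result = new TreeSet<>();
        for (String variant : deletionNeighborhood(query, k)) {
            for (String candidate : index.getOrDefault(variant, Collections.<String>emptySet())) {
                if (editDistance(query, candidate) <= k) result.add(candidate);
            }
        }
        return result;
    }

    // Standard Levenshtein distance using two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int subst = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                cur[j] = Math.min(subst, Math.min(prev[j] + 1, cur[j - 1] + 1));
            }
            prev = cur;
        }
        return prev[b.length()];
    }
}
```

With k = 1 and the words "test", "text", "tent" and "toast" indexed, a search for the misspelled "tesst" returns only "test", while a search for "test" returns "tent", "test" and "text". In a P2P setting the map would presumably be replaced by lookups in the overlay network, which is the part this sketch leaves out.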

EMANICS, an EU-funded Network of Excellence, is investigating these issues as well. To this end, the project-wide testbed EMANICSLab has been established between participating institutes, in which new technologies, services and management concepts can be tested. Additionally, the larger PlanetLab infrastructure is available for hosting tests.

P2PFastSS has been implemented and tested on PlanetLab. The current design defines three tasks: storing, indexing and collecting statistical data. Although any node can perform these tasks, they are currently performed sequentially. The main goal of this work is to parallelize them.

Currently, if a node stores a text, the same node is also responsible for indexing it. Simulations have shown that the indexing node has a high CPU load, while other peers are mostly idle. The parallelization should balance this load. In the new P2PFastSS prototype, a node should be able to store a text on another node, which then starts indexing the text.
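The intended division of work can be illustrated with a small in-process simulation in Java. This is only a sketch under several assumptions: the nodes are plain objects rather than networked peers, the node responsible for a text is chosen by hashing its key, and indexing is reduced to a simple word-to-key map. All class and method names (DelegatedIndexingSketch, Node, Overlay, storeAndIndex, responsibleFor) are hypothetical and do not come from the P2PFastSS prototype.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustrative sketch only; names and structure are assumptions.
class DelegatedIndexingSketch {

    static class Node {
        final String name;
        final Map<String, String> texts = new ConcurrentHashMap<>();
        final Map<String, Set<String>> index = new ConcurrentHashMap<>();
        // Each node indexes on its own worker thread (daemon, so the JVM can exit).
        private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "indexer");
            t.setDaemon(true);
            return t;
        });

        Node(String name) { this.name = name; }

        // Store the text locally, then index it asynchronously on this node.
        CompletableFuture<Void> storeAndIndex(String key, String text) {
            texts.put(key, text);
            return CompletableFuture.runAsync(() -> {
                for (String word : text.toLowerCase().split("\\W+")) {
                    if (!word.isEmpty()) {
                        index.computeIfAbsent(word, w -> ConcurrentHashMap.<String>newKeySet()).add(key);
                    }
                }
            }, worker);
        }
    }

    static class Overlay {
        final List<Node> nodes = new ArrayList<>();

        Overlay(int size) {
            for (int i = 0; i < size; i++) nodes.add(new Node("node-" + i));
        }

        // Assumption: the responsible node is chosen by hashing the key.
        Node responsibleFor(String key) {
            return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
        }

        // Any node may call this; the storing and indexing happen on the responsible node.
        CompletableFuture<Void> store(String key, String text) {
            return responsibleFor(key).storeAndIndex(key, text);
        }
    }
}
```

Because texts with different keys generally map to different nodes, several texts stored at the same time would be indexed in parallel on separate worker threads instead of loading the storing node alone. A caller can wait for one text with overlay.store("doc1", "some text").join().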

Final Report

40% Design, 60% Implementation
Peer-to-peer networks and Java programming knowledge

Supervisors: Dr. Thomas Bocek
