Peer-to-Peer Location-based Search: Engineering a Novel Peer-to-Peer Overlay Network
Supervisor(s) and Committee member(s): First Referee: Prof. Dr.-Ing. Ralf Steinmetz; Second Referee: Prof. Dr.-Ing. Klaus Wehrle
Personalization is a significant feature of Internet services, and exploiting the user's location adds the most value to it. Location-based services have a wide range of applications, from emergency, tracking, and navigation services to informational and entertainment services. In existing centrally managed solutions, the results of location-based search are often incomplete or outdated. Additional information about the searched object (e.g. the menu, facilities, prices) is usually not available, because such a huge amount of data and frequent updates (e.g. the number of free places in a restaurant) would overload the server. In a peer-to-peer solution, each peer is responsible for the information about the object it represents, so information is updated and published directly, without a single point of failure. Because peer-to-peer systems operate at low cost, a wide community can join and publish its services. The main goal of this thesis is to prove the feasibility of engineering a peer-to-peer solution for fully retrievable location-based search. Following an engineering approach, we first examine the most widely used and most frequently cited overlays of different types (unstructured, structured, and hybrid). A comparative evaluation identifies the influence of their design decisions on quality aspects such as efficiency, scalability, robustness, and stability. The findings of this study form the foundation for the design of our solution. The resulting overlay, Globase.KOM, is a structured superpeer-based overlay in the form of a tree enhanced with interconnections. Superpeers are chosen from publicly reachable, static peers with more capacity, spare bandwidth, and good network connectivity. The world projection is divided into non-overlapping rectangular zones. Each zone is assigned to a superpeer located inside it, which is responsible for all peers in the zone. The superpeers form a tree based on the subset relation of their zones.
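The zone hierarchy described above can be illustrated with a minimal sketch. It is not the Globase.KOM implementation: the `Zone` class, the helpers `build_tree` and `responsible_zone`, and all coordinates are hypothetical, and the sketch assumes axis-aligned rectangles in longitude/latitude, exactly one root zone covering the whole map, and sibling zones that do not overlap.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Zone:
    """Axis-aligned rectangular zone (hypothetical layout, not from the thesis)."""
    name: str
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float
    parent: Optional["Zone"] = field(default=None, repr=False)
    children: List["Zone"] = field(default_factory=list, repr=False)

    def area(self) -> float:
        return (self.max_lon - self.min_lon) * (self.max_lat - self.min_lat)

    def contains_zone(self, other: "Zone") -> bool:
        # Subset relation: `other` lies entirely inside this zone.
        return (self.min_lon <= other.min_lon and self.min_lat <= other.min_lat
                and self.max_lon >= other.max_lon and self.max_lat >= other.max_lat)

    def contains_point(self, lon: float, lat: float) -> bool:
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def build_tree(zones: List[Zone]) -> Zone:
    """Link each zone to the smallest zone enclosing it and return the root."""
    root = None
    for zone in zones:
        enclosing = [c for c in zones if c is not zone and c.contains_zone(zone)]
        if enclosing:
            zone.parent = min(enclosing, key=Zone.area)
            zone.parent.children.append(zone)
        else:
            root = zone  # assumption: exactly one zone has no enclosing zone
    return root


def responsible_zone(root: Zone, lon: float, lat: float) -> Zone:
    """Descend from the root to the innermost zone containing the point."""
    current = root
    while True:
        inner = next((c for c in current.children
                      if c.contains_point(lon, lat)), None)
        if inner is None:
            return current
        current = inner


# Illustrative zones only; the coordinates are rough and made up for the example.
world = Zone("world", -180, -90, 180, 90)
europe = Zone("europe", -10, 35, 40, 70)
germany = Zone("germany", 6, 47, 15, 55)
asia = Zone("asia", 60, 0, 150, 60)
root = build_tree([germany, asia, world, europe])

print(responsible_zone(root, 8.65, 49.87).name)
print(responsible_zone(root, 139.7, 35.7).name)
```

In this sketch, a point that falls into no inner zone remains the responsibility of the enclosing zone, which mirrors the idea that a superpeer is responsible for all peers in its zone.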
A further contribution is a clear methodology for evaluating peer-to-peer search overlays, which defines metrics and various workloads that address all crucial quality properties. Additionally, in order to model realistic workloads, we discuss the difference between user behavior in popular file-sharing applications and in VoIP applications such as Skype. As an evaluation tool, we select the simulation framework PeerfactSim.KOM and extend it to support various geographical distributions of peers on a world map and a location-aware churn model. The evaluation results demonstrate the efficiency, good load balancing, scalability, robustness, and stability of the system. Query resolution is significantly faster than in related solutions. Additionally, the location-awareness of the overlay results in an efficient mapping of the logical overlay to the physical underlay, which reduces total transmission delay and unnecessary underlay traffic. Although the tree structure of the overlay suggests that uneven load distribution would be an issue, we show very good load balance, owing to interconnections and careful zone formation, which together reduce the otherwise expected overload of superpeers in the higher tree levels. Our solution scales logarithmically with growing network size. It proves to be robust and stable under simultaneous failures of superpeers in higher tree levels or in the same branch of the tree, and under frequent querying. The biggest influences on the quality of our solution are the choice of identifier space, its management, and the interconnections. A peer identifier encodes the peer's zone of responsibility and its location, which allows smart selection of interconnections and efficient greedy routing. Interconnections make it possible to bypass the superpeers in higher levels of the tree and therefore allow an even load distribution among the superpeers.
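How interconnections can bypass the upper tree levels may be easier to see in a small sketch. The routing rule below is an assumption made for illustration and is not taken from the thesis: a superpeer whose zone covers the target forwards to the child covering it, otherwise it prefers an interconnection whose zone covers the target and only then falls back to its parent. The `SuperPeer` class, the `route` function, and all zone coordinates are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def covers(rect: Rect, lon: float, lat: float) -> bool:
    return rect[0] <= lon <= rect[2] and rect[1] <= lat <= rect[3]


def area(rect: Rect) -> float:
    return (rect[2] - rect[0]) * (rect[3] - rect[1])


@dataclass(eq=False)
class SuperPeer:
    """Hypothetical superpeer with tree links and extra interconnections."""
    name: str
    zone: Rect
    parent: Optional["SuperPeer"] = field(default=None, repr=False)
    children: List["SuperPeer"] = field(default_factory=list, repr=False)
    interconnections: List["SuperPeer"] = field(default_factory=list, repr=False)

    def add_child(self, child: "SuperPeer") -> "SuperPeer":
        child.parent = self
        self.children.append(child)
        return child

    def next_hop(self, lon: float, lat: float) -> Optional["SuperPeer"]:
        """Assumed greedy rule; returns None when this superpeer is responsible."""
        if covers(self.zone, lon, lat):
            # Descend: sibling zones do not overlap, so at most one child matches.
            return next((c for c in self.children
                         if covers(c.zone, lon, lat)), None)
        shortcuts = [p for p in self.interconnections
                     if covers(p.zone, lon, lat)]
        if shortcuts:
            # Prefer the shortcut with the smallest zone around the target.
            return min(shortcuts, key=lambda p: area(p.zone))
        return self.parent  # climb the tree


def route(start: SuperPeer, lon: float, lat: float) -> List[str]:
    """Return the names of the superpeers visited on the way to the target."""
    path = [start.name]
    current = start
    while True:
        nxt = current.next_hop(lon, lat)
        if nxt is None:
            return path
        path.append(nxt.name)
        current = nxt


# Illustrative topology; the coordinates are rough and made up for the example.
world = SuperPeer("world", (-180, -90, 180, 90))
europe = world.add_child(SuperPeer("europe", (-10, 35, 40, 70)))
asia = world.add_child(SuperPeer("asia", (60, 0, 150, 60)))
germany = europe.add_child(SuperPeer("germany", (6, 47, 15, 55)))
japan = asia.add_child(SuperPeer("japan", (129, 30, 146, 46)))

tree_path = route(germany, 139.7, 35.7)      # tree links only
germany.interconnections.append(japan)       # add one shortcut
shortcut_path = route(germany, 139.7, 35.7)  # shortcut available

print(tree_path)
print(shortcut_path)
```

With tree links only, the query climbs to the root and descends again; with the single interconnection, it skips the superpeers in the higher levels, which is the load-relieving effect the paragraph above describes.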
The fast recovery time and small performance variation under extreme churn and critical failures are due to the combination of several maintenance strategies. A prototype confirms the simulation results, and the applicability of the solution is shown through examples from distributed multimedia communication and future peer-to-peer-based collaborative applications.