Peer-to-Peer Architectures for Federated Search of Complex Digital Libraries
The set of providers of digital libraries (DLs) and services on the Web is growing both in absolute numbers and in diversity. From a user's point of view, there should be a single virtual library (a "one-stop shop") comprising all sources relevant to their information needs. Peer-to-peer architectures have been effective at integrating large numbers of very simple DLs, for example, for file sharing. This research project will demonstrate the use of peer-to-peer architectures for federated search across large numbers of complex digital libraries that are integrated only very loosely.
The project is based on the assumption that it is neither possible nor desirable to enforce homogeneity in a large-scale federation of complex digital libraries. DL providers will differ in the schemas they use, the quality of their data, and their degree of cooperation. We will develop transformation methods that take into account the intrinsic imprecision and vagueness of mappings between different schemas. For this purpose, appropriate methods for describing DL schemas and the (uncertain) mappings between them must be developed.
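As an illustration of what an uncertain schema mapping could look like, the following minimal sketch attaches a weight to each attribute mapping and carries that weight into the translated query. The attribute names, the weights, and the `Mapping` and `translate` names are hypothetical and chosen only for this example; the sketch is not the transformation method the project will develop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    source: str         # attribute in the querying peer's schema
    target: str         # attribute in the remote DL's schema
    probability: float  # assumed degree of belief that the mapping holds


def translate(query, mappings):
    """Rewrite {attribute: term} conditions into weighted target conditions.

    Each source attribute may map to several target attributes, so one
    condition can expand into several (target, term, weight) triples.
    """
    translated = []
    for attribute, term in query.items():
        for m in mappings:
            if m.source == attribute:
                translated.append((m.target, term, m.probability))
    return translated


# Hypothetical mappings between two schemas (illustrative values only).
mappings = [
    Mapping("author", "creator", 0.9),
    Mapping("author", "contributor", 0.4),
    Mapping("title", "name", 0.8),
]

print(translate({"author": "smith"}, mappings))
```

A retrieval function on the remote side could then use the weights to discount matches that rely on a less certain mapping.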
There is a growing number of Web services that can be used for improving retrieval results from DLs: mapping services help bridge heterogeneity, and enhancing services provide functions for retrieving additional relevant documents. We will develop methods for service description and service selection so that these services can be incorporated dynamically into the peer-to-peer (P2P) retrieval system.
Large-scale peer-to-peer networks require routing services so that messages reach their desired destinations efficiently. We will develop content-based routing services (resource description, resource selection, and data fusion) for peer-to-peer networks. Content-based routing raises a variety of new issues in the peer-to-peer environment, for example, partial representations of DL contents and a more complex process for deciding whether to satisfy a message locally or route it to another node.
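The decision between answering locally and forwarding can be sketched as follows. This is a deliberately simple illustration under stated assumptions: resource descriptions are plain term counts built from a (possibly partial) sample of documents, the selection score is the share of term occurrences that match the query, and the function names `describe`, `score`, and `route` are hypothetical. It covers resource description and resource selection only; data fusion is not shown.

```python
from collections import Counter


def describe(documents):
    """Resource description: term counts over a sample of a DL's documents."""
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())
    return counts


def score(description, query):
    """Share of term occurrences in the description that match query terms."""
    total = sum(description.values())
    if total == 0:
        return 0.0
    return sum(description[t] for t in query.lower().split()) / total


def route(query, local, neighbours):
    """Return "local" or the name of the neighbour to forward the query to.

    The query is answered locally when the local description scores at least
    as high as the best known neighbour description.
    """
    best = max(neighbours, key=lambda n: score(neighbours[n], query), default=None)
    if best is None or score(local, query) >= score(neighbours[best], query):
        return "local"
    return best


# Hypothetical peers with tiny sample collections (illustrative only).
local = describe(["peer to peer networks", "routing in networks"])
neighbours = {
    "peerA": describe(["digital library metadata schema"]),
    "peerB": describe(["cooking recipes"]),
}

print(route("metadata schema", local, neighbours))
print(route("routing networks", local, neighbours))
```

Because each node only holds sampled descriptions of its neighbours, the scores are estimates, which is one reason the routing decision is harder than in networks that route by exact identifier.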
To make our implementations of these methods available to other researchers and developers, we will implement all methods using the JXTA framework, which is currently used by a number of other projects in the DL and peer-to-peer areas.