[ieee ieee globecom 2006 - san francisco, ca, usa (2006.11.27-2006.12.1)] ieee globecom 2006 -...



Page 1: [IEEE IEEE Globecom 2006 - San Francisco, CA, USA (2006.11.27-2006.12.1)] IEEE Globecom 2006 - ISE02-3: TSR: Temporal Subspace Routing for Peer-to-Peer Data Sharing

TSR: Temporal Subspace Routing for Peer-to-Peer Data Sharing

Sasu Tarkoma
Helsinki Institute for Information Technology

{sasu.tarkoma}@hiit.fi

Abstract— In this paper we present the temporal subspace routing (TSR) technique for peer-to-peer environments that allows transparent exchange of information defined using metadata and queries based on user interests. The system unifies generic semantic matching, routing, caching, and access control. The technique supports continuous queries that are matched against metadata profiles of remote resources. Both queries and profiles are defined as subspaces of a multi-dimensional content space. Matched objects may be downloaded or synchronized. We present a generic data structure with optimizations for matching in this environment and discuss several use cases where the system may be applied. The mechanism utilizes the covering relation between queries and profiles. This allows automatic taxonomies of downloaded profiles and queries. Our main application is peer-to-peer and ad hoc metadata-based resource and file sharing.

I. INTRODUCTION

In recent years, pervasive computing has become a reality with millions of mobile phones and portable devices. A number of core technologies are needed in order to realize the intelligent and adaptive services of tomorrow. Efficient and intelligent data sharing and synchronization are basic properties of current and future applications, especially in pervasive environments. We are faced with the challenge of how to locate nearby important data items and keep them synchronized on different devices.

In this paper we present the temporal subspace routing (TSR) technique for peer-to-peer environments that allows transparent exchange of information defined using metadata and queries based on user interests. The technique supports continuous queries that are matched against metadata profiles of remote resources. Both queries and profiles are defined as subspaces of a multi-dimensional content space. Matched objects may be downloaded or synchronized. We present a data structure with optimizations for matching covering queries in this environment and discuss several use cases where the system may be applied. The mechanism utilizes the covering relation between queries and profiles. This allows automatic taxonomies of downloaded profiles and queries. TSR supports profile caching, profile trailing, and query-defined views on distributed data.

This paper is structured as follows: in Section II we present motivation for this work. Section III presents an overview of the TSR technique. In Section IV we define data queries and profiles, and in Section V we examine the data structure for temporal subspace matching. Section VI presents the TSR model. In Section VII we address access control, and in Section VIII we discuss different network models. Section IX presents the related work, and finally Section X presents the conclusions.

II. MOTIVATION

In this section we discuss two scenarios in which flexible information exchange is needed. We consider the office environment and a traffic scenario, and then give motivation for a system that bridges these scenarios.

A. Office Scenario

We may find a number of potentially useful ad hoc and peer-to-peer interactions in the modern office. In this environment, people have meetings and there is a requirement for information management and tracking. Many of the information management needs, such as notification and context-sensitive operation, may be performed using a centralized server. Indeed, this is a preferred choice for many companies, because of various security requirements.

We have previously demonstrated the smart office scenario¹, which highlights context-aware operations, such as sending a message to every person in a meeting room, or triggering document synchronization after a meeting. As an extension, we have also proposed a context-aware collection and synchronization tracking service for mobile clients using a logically centralized service [14].

B. Traffic Scenario

Consider a busy street with an intersection. In the future, it is expected that vehicles will be able to communicate with each other and with their environment about traffic and road conditions, improving the safety and ease of driving. This requires vehicular networks and wireless capabilities in cars. With radio capability, fixed elements in the environment and other cars could act as sources of useful information. In this kind of operation, the information is typically locality-bound. For example, traffic signs, cars, and other elements could propagate information about accidents, maintenance operations, and traffic jams.

Using a generic peer-to-peer information exchange technology, a vehicle does not necessarily require new hardware. For example, a regular PDA with wireless ad hoc capability could be used. The driver would be able to select the range of information accepted by the device, for example from alerts to local commercials and blog entries.

¹ IEEE WMCSA 2004 Demonstrations

1-4244-0357-X/06/$20.00 © 2006 IEEE. This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE GLOBECOM 2006 proceedings.

C. Between Ad Hoc and Fixed Infrastructure

The restriction to a closed network or a centralized service, as mentioned in the office scenario, severely limits the potential usability of a data dissemination system. An open and scalable infrastructure-free platform promises more flexibility at the cost of manageability and security. In order to gain wider acceptance the system must also include security and logging features.

In this paper, we present the temporal subspace routing system addressing the challenges pertaining to flexible information dissemination in various environments.

III. OVERVIEW

The goal of the system is to allow information exchange between peers that is driven by interests. Client interests are described using queries on published content profiles (metadata). A content profile is a set of discrete or continuous values. A content profile could be a fragment of an RSS feed, contextual metadata, a blog entry, an emergency alert, a reference to a web resource, or an arbitrary description of an object. In Section IV we propose that filters are used to represent both queries and profiles.

We distinguish between push and pull interactions. The pull interaction requires that the clients first exchange their interests and then exchange metadata. The push interaction, on the other hand, allows a client to push metadata directly to another client suspected of being interested in it.

The pull interaction proceeds as follows:

1) Discover a peer.
2) Initiate information exchange by sending a list of local queries or a summary of them. Send a summary of queries sent by other peers, if this information is available.
3) Indicate whether interaction is one-shot (pull) or continuous (push).
4) Receive a set of matching profiles.
5) Receive a set of forward references to peers that may have matching profiles.
6) If continuous operation was indicated, continue to receive matching profiles.
7) Actual data is shared outside the system with a data transport protocol. Supported protocols may be provided in a profile.

The push interaction is similar, but omits the second phase of the pull interaction. Push also requires solutions to prevent Denial of Service attacks. We leave the peer discovery process unspecified and discuss alternatives in Section VIII. The underlying communication abstraction may be based on broadcast, multicast, anycast, or unicast. Information exchange may be driven by epidemic algorithms and gossiping [6], anchor points, or structured or unstructured overlays.

IV. REPRESENTING INFORMATION USING FILTERS

Filters are used in publish/subscribe (pub/sub) systems to specify user interests and make routing decisions. We borrow the general notification data model from pub/sub systems [3]. A filter is a stateless Boolean function. Filters typically have two useful properties: covering and overlapping. Filter F1 is said to cover filter F2, F1 ⊒ F2, if and only if all the notifications that are matched by F2 are also matched by F1. Overlap is defined similarly: two filters overlap when there is some notification that both of them match. It is also possible to merge filters using perfect or imperfect techniques [12].
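The covering and overlap relations can be illustrated with one-dimensional range filters, the same kind of filter used in the figures of this paper. The following is a minimal sketch under that assumption; the class name IntervalFilter and its methods are hypothetical and are not part of the system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntervalFilter:
    """Hypothetical range filter that matches values in [low, high]."""
    low: float
    high: float

    def matches(self, value: float) -> bool:
        # A filter is a stateless Boolean function over notifications.
        return self.low <= value <= self.high

    def covers(self, other: "IntervalFilter") -> bool:
        # F1 covers F2 iff every value matched by F2 is also matched by F1.
        return self.low <= other.low and other.high <= self.high

    def overlaps(self, other: "IntervalFilter") -> bool:
        # Two filters overlap iff at least one value is matched by both.
        return self.low <= other.high and other.low <= self.high


f1 = IntervalFilter(0, 10)
f2 = IntervalFilter(2, 7)
f3 = IntervalFilter(8, 15)
```

Here f1 covers f2, while f1 and f3 only overlap. Covering implies overlap for non-empty intervals, but not the other way around.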

We have specified and implemented a new data structure called the poset-derived forest [13] for content-based routing, which stores only a subset of the relations to optimize filter processing and support frequent updates. Experimental results indicate that the forest is considerably more efficient than an acyclic graph-based poset under frequent updates [13]. We have developed a graphical tool, called the PosetBrowser², for experimenting with various content-based routing data structures. A generic data structure, such as the poset-derived forest, supports the use of arbitrary filter objects as long as the covering relations are defined for the input set. On the other hand, being generic, such structures cannot provide the same performance as matchers that are specific to a filter object type and language.

Figure 1 illustrates various uses for filters. They may be used to represent client interests, context variables and metadata, and access control rules. Semantic relations between filters are useful in reasoning about them.

[Figure 1: filters represent interests, metadata, context, and access control, and are used for reasoning, routing, and matching.]

Fig. 1. Using filters to represent interests and information

V. DATA STRUCTURE FOR SUBSPACE MATCHING

We propose that the poset-derived forest can be used to store both metadata profiles and metadata queries based on the covering relation [14]. Figure 2 illustrates how two forests may be combined to support the matching of profiles and queries with mappings between the elements of each forest. The figure shows profiles and queries defined using intervals. We call this combined data structure the DoubleForest (DF) [12]. In TSR the DF stores both local and remote profiles and queries. The difference is that remote objects are cached and subject to a caching policy, whereas local objects are not. When a local object is removed, it may be cached to keep it available for other peers.

² Available at www.hiit.fi/fuego/fc/demos

We define two sets, P and Q, which denote profiles and queries, respectively. P and Q are the base sets for the two forests. The basic idea is to maintain mappings MPQ and MQP in different directions between the structures. MPQ determines the set of covering queries given a profile, and similarly, MQP determines the set of covered profiles given a query. It is easy to produce change notifications to relevant entities when an element is added or removed from either structure. This approach is general and supports various filtering languages; however, it does not work well if there are only a few covering relations between the elements.

[Figure 2: a query forest (Q) and a profile forest (P) connected by mappings that are updated on add and del.]

Query forest (Q)
Local queries: Q1 = [0,10], Q2 = [12,20]
Cached remote queries: Q3 = [2,7], Q4 = [15,22], Q5 = [16,20]

Profile forest (P)
Local profiles: P1 = [1,5], P2 = [5,10]
Cached remote profiles: P3 = [5,25], P4 = [17,20], P5 = [7,9]

Mappings
MPQ maps a profile in P to a set of queries in Q; MQP maps a query in Q to a set of profiles in P.
MQP(Q1) = {P1, P2, P5}, MQP(Q2) = {P4}, MQP(Q3) = Ø, MQP(Q4) = {P4}, MQP(Q5) = {P4}
MPQ(P1) = {Q1}, MPQ(P2) = {Q1}, MPQ(P3) = Ø, MPQ(P4) = {Q2, Q4, Q5}, MPQ(P5) = {Q1}

Fig. 2. Storing and matching of profiles and queries
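The mappings listed for Fig. 2 can be reproduced with a naive pairwise comparison of the intervals. The sketch below is only a set-based baseline and does not implement the forest structure or its optimizations; the function names covers and build_mappings are illustrative assumptions.

```python
# Intervals copied from Fig. 2; all helper names are hypothetical.
queries = {"Q1": (0, 10), "Q2": (12, 20), "Q3": (2, 7),
           "Q4": (15, 22), "Q5": (16, 20)}
profiles = {"P1": (1, 5), "P2": (5, 10), "P3": (5, 25),
            "P4": (17, 20), "P5": (7, 9)}


def covers(outer, inner):
    """Return True if interval `outer` contains interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def build_mappings(queries, profiles):
    """Compute MQP (query -> covered profiles) and MPQ (profile -> covering queries)."""
    m_qp = {q: set() for q in queries}
    m_pq = {p: set() for p in profiles}
    for q, q_interval in queries.items():
        for p, p_interval in profiles.items():
            if covers(q_interval, p_interval):
                m_qp[q].add(p)
                m_pq[p].add(q)
    return m_qp, m_pq


m_qp, m_pq = build_mappings(queries, profiles)
```

This comparison costs one covering test per query and profile pair, which is the cost the forest-based structure is meant to reduce when many covering relations exist.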

The two supported semantic matching operators are cover and overlap. In essence, the structure computes the transitive closure for profiles given a query when the cover operator is used. We have defined optimizations for the computation of the closure using the mappings of parent and child nodes [12]. If the filter distribution is known beforehand, filters may be preloaded into the data structure to improve the matching performance later.

We have developed an online example of this mechanism called the ContextBrowser¹. The ContextBrowser shows context and metadata queries and profiles graphically and demonstrates real-time collection tracking [14]. Initial results with the DF data structure indicate significant performance benefits compared to naive set-based matching. Scalability may be improved by restricting the result set size. Since the DF processes the most covering profiles first, this property may be used to summarize elements that do not fit into the query result set.

¹ Available at www.hiit.fi/fuego/fc/demos

VI. DATA SHARING USING SUBSPACE ROUTING

The two central features of TSR are temporality and subspace routing. The first means that both profiles and queries have a temporal existence. Caching allows peers to store local copies of profiles. They do not have to always store a profile locally, which leads to different replication strategies [5]. Subspace routing allows expressive matching of profiles to queries using the covering relation.

TABLE I
TEMPORAL SUBSPACE ROUTING OPERATIONS.

Operation          Description
Add(X, Q, T)       X adds the query Q with expiration time T
Add(X, P, T)       X publishes profile P with expiration time T
Del(X, Q)          X removes the query Q (optional)
Del(X, P)          X removes the profile P (optional)
Upcall-add(X, P)   X is notified about new profile P
Upcall-del(X, P)   X is notified about removal of P

Both queries and profiles are represented using filters and they have an associated timeout value. When the current time exceeds the timeout value, the profile will be removed from the system and no longer delivered to new peers. When a query covers a profile, we say that they match, and the profile is delivered to the peer that sent the query.
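The temporal behaviour of the Add and Del operations can be sketched as a small store keyed by (object, owner) pairs. This is an illustration only: the class TemporalStore and its method names are assumptions, time is passed in explicitly rather than read from a clock, and refreshing the expiration time on a repeated add follows the behaviour this paper describes for identical profiles and queries.

```python
class TemporalStore:
    """Hypothetical store of (object, owner) pairs with expiration times."""

    def __init__(self):
        self._expires = {}  # (obj, owner) -> absolute expiration time

    def add(self, owner, obj, expires_at):
        # Re-adding an identical object refreshes its expiration time.
        self._expires[(obj, owner)] = expires_at

    def delete(self, owner, obj):
        # Explicit removal is optional; expiry removes entries eventually.
        self._expires.pop((obj, owner), None)

    def purge(self, now):
        """Remove and return entries whose expiration time has been exceeded."""
        expired = [key for key, t in self._expires.items() if now > t]
        for key in expired:
            del self._expires[key]
        return expired

    def active(self, now):
        """Return the entries that have not yet expired at time `now`."""
        return {key for key, t in self._expires.items() if now <= t}


store = TemporalStore()
store.add("X", "P", expires_at=10)
store.add("X", "P", expires_at=20)  # identical profile: expiration updated
```

A node would call purge periodically and issue an Upcall-del for each returned profile; that wiring is omitted here.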

In systems where data has a point existence, for example pub/sub systems, mobility and disconnected operation support require additional mechanisms. Temporal existence of profiles and queries simplifies the problem of missing profiles due to incompleteness of the topology.

A. Algorithm

Algorithm 1 presents the main idea of TSR. An incoming filter f from interface I is either a query or a profile as indicated by the Op argument, and it is added or removed. Explicit removal is optional for profiles and queries, because the expiration time ensures eventual removal. The addProfile and delProfile functions add, and correspondingly remove, the input filter. The addQuery and delQuery are the corresponding functions for queries. Additional logic not present in the algorithm is needed for caching and access control.

Profile forwarding is performed by the forwardProfile function, which takes a profile, f, and a set of outgoing interfaces, Q, as arguments. Alternatively, the function accepts a set of profiles, P, and an outgoing interface I. A profile is forwarded to any peers that have previously expressed interest in the profile. A profile may be forwarded in two cases. First, when a new profile is sent by a peer. Second, when a new query is received and the query covers a locally stored or cached profile. Any forwarding operations to local interface identifiers result in upcalls to the applications that installed the queries.

Query forwarding follows the content-based routing model: a query is forwarded to all peers excluding those that have already received a covering query or that sent the query being forwarded. This is realized by the forwardQuery functions, which perform the necessary optimizations and may also use hop counters to limit the propagation of queries. Locally installed queries must be forwarded to peers. Remotely installed queries should be forwarded to peers and cached by them.
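The forwarding rule for queries can be sketched as a selection of target peers. The example assumes interval filters and a per-peer record of queries already sent; the function forward_targets and its parameters are hypothetical, and hop counters are left out.

```python
def covers(outer, inner):
    """Return True if interval `outer` contains interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def forward_targets(query, sender, peers, sent_queries):
    """Return the peers that should receive `query`.

    sent_queries maps each peer to the list of queries already sent to it.
    """
    targets = []
    for peer in peers:
        if peer == sender:
            continue  # never send a query back to the peer it came from
        if any(covers(old, query) for old in sent_queries.get(peer, [])):
            continue  # the peer already holds a covering query
        targets.append(peer)
    return targets


targets = forward_targets(
    query=(2, 7),
    sender="A",
    peers=["A", "B", "C"],
    sent_queries={"B": [(0, 10)], "C": [(5, 9)]},
)
```

In this call only peer C is selected: A sent the query, and B already holds the covering query [0,10].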

A profile is removed when it expires. A profile may be added multiple times. When an identical profile or query is added, the expiration time of the (object, identifier) pair is updated. When a query is removed, due to manual deletion or expiration, a deletion message is propagated to active and reachable peers that were previously sent the query and have not received an active covering query.

Algorithm 1 The procMessageTSR algorithm.

PROCMESSAGETSR(Op, f, I)
  if duplicate message or already seen object
    then discard
  if Op = ADD-PROFILE
    then
      ADDPROFILE(f, I)
      let Q ← getCoveringQueries(f)
      remove all queries sent by I from Q
      FORWARDPROFILE(f, Q)
  elseif Op = ADD-QUERY
    then
      ADDQUERY(f, I)
      let P ← getCoveringProfiles(f)
      remove all profiles sent by I from P
      remove all profiles already sent to I from P
      FORWARDPROFILE(P, I)
      if routing enabled
        then FORWARDQUERY(f, I)
  elseif Op = DELQUERY
    then
      DELQUERY(f, I)
      if routing enabled
        then FORWARDDELQUERY(f, I)

If a remote interface that sent an active query is not reachable, all messages to that interface are silently dropped. The peer will update its queries and receive matching profiles when it reconnects with the current node. Hence, the model supports disconnected operation and does not require explicit message buffering.
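The control flow of Algorithm 1 can be made concrete with a simplified sketch. It assumes interval filters, treats only an already stored profile as a duplicate, records forwarding decisions in a list instead of sending messages, and leaves out expiry, caching, access control, and query routing. The class Node and all of its members are hypothetical names.

```python
def covers(outer, inner):
    """Return True if interval `outer` contains interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


class Node:
    """Simplified sketch of Algorithm 1 with query routing disabled."""

    def __init__(self):
        self.profiles = {}      # profile -> interface that sent it
        self.queries = {}       # query -> set of interfaces that sent it
        self.delivered = set()  # (profile, interface) pairs already forwarded
        self.outbox = []        # recorded (interface, profile) decisions

    def forward_profile(self, profile, interfaces):
        for iface in sorted(interfaces):
            if (profile, iface) not in self.delivered:
                self.delivered.add((profile, iface))
                self.outbox.append((iface, profile))

    def proc_message(self, op, f, iface):
        if op == "ADD-PROFILE":
            if f in self.profiles:
                return  # already seen object: discard
            self.profiles[f] = iface
            targets = set()
            for query, senders in self.queries.items():
                if covers(query, f):
                    targets |= senders
            targets.discard(iface)  # do not echo the profile to its sender
            self.forward_profile(f, targets)
        elif op == "ADD-QUERY":
            self.queries.setdefault(f, set()).add(iface)
            for profile, origin in self.profiles.items():
                # Skip profiles that the querying interface sent itself.
                if covers(f, profile) and origin != iface:
                    self.forward_profile(profile, {iface})
        elif op == "DEL-QUERY":
            senders = self.queries.get(f, set())
            senders.discard(iface)
            if not senders:
                self.queries.pop(f, None)


node = Node()
node.proc_message("ADD-QUERY", (0, 10), "A")
node.proc_message("ADD-PROFILE", (1, 5), "B")
node.proc_message("ADD-PROFILE", (1, 5), "C")  # duplicate, discarded
node.proc_message("ADD-QUERY", (0, 6), "C")
```

After these four messages the profile [1,5] has been forwarded once to A and once to C, and never back to B, which sent it.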

Since a profile may be disseminated to a destination through multiple paths, there may be loops in the dissemination topology. The system must be able to detect duplicate profiles, that is, those that have already been routed, to minimize communication and processing overhead. A simple solution performs duplicate detection at the receiver using the DF data structure and syntactic testing using a canonical filter representation and a hashtable. We assume that syntactic equivalence testing is sufficient for duplicate detection.
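Syntactic duplicate detection with a canonical representation and a hashtable can be sketched as follows. The filter encoding (a dictionary from attribute name to a [low, high] range), the use of sorted-key JSON as the canonical form, and the SHA-256 digest are all assumptions made for this illustration; the paper does not specify them.

```python
import hashlib
import json


def canonical(filter_spec):
    """Serialize a filter deterministically so equal filters yield equal text."""
    return json.dumps(filter_spec, sort_keys=True, separators=(",", ":"))


class DuplicateDetector:
    """Hypothetical receiver-side duplicate check based on a hash set."""

    def __init__(self):
        self._seen = set()

    def is_duplicate(self, filter_spec):
        text = canonical(filter_spec)
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in self._seen:
            return True
        self._seen.add(digest)
        return False


detector = DuplicateDetector()
first = detector.is_duplicate({"x": [0, 10], "y": [5, 7]})
second = detector.is_duplicate({"y": [5, 7], "x": [0, 10]})  # same filter, reordered
```

Because the test is purely syntactic, two filters that match the same notifications but are written differently would not be detected as duplicates.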

B. Selective Caching and Trails

The structure of the DF may be used to perform selective caching and store information about profile and query trails. By selective caching, we mean that selected subsets of remote profiles and queries are summarized by an element and pointers to the neighbours that sent the summarized elements. Local queries and profiles are never summarized. These pointers form a trail for profiles, because they may be used to find nodes that have a fresh copy of a profile in memory. A pointer may be the peer that last sent the profile, or it may be the original peer identifier or address.

Selective caching can be performed for both queries and profiles. When a set of queries is summarized by a covering query, the number of active elements in the forest is reduced. A number of false positives may result if pointers stored in the summarizing element are used to forward profiles.
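Query summarization and the resulting false positives can be illustrated with interval filters. In this sketch a summarizing query absorbs the remote queries it covers and keeps pointers to the neighbours that sent them; the function names and the dictionary layout are assumptions and do not reflect the DF implementation.

```python
def covers(outer, inner):
    """Return True if interval `outer` contains interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def summarize_queries(summary, remote_queries):
    """Collapse remote queries covered by `summary` into one element.

    remote_queries maps a query interval to the neighbour that sent it.
    Returns (pointers, remaining): the neighbours behind the summary
    and the queries that stay active.
    """
    pointers = set()
    remaining = {}
    for query, neighbour in remote_queries.items():
        if covers(summary, query):
            pointers.add(neighbour)
        else:
            remaining[query] = neighbour
    return pointers, remaining


def is_false_positive(summary, original_queries, profile):
    """A profile matched by the summary but by none of the summarized queries."""
    return covers(summary, profile) and not any(
        covers(q, profile) for q in original_queries
    )


remote = {(2, 4): "N1", (6, 8): "N2", (12, 20): "N3"}
pointers, remaining = summarize_queries((0, 10), remote)
```

A profile such as [4.5, 5.5] is covered by the summary [0,10] but by neither [2,4] nor [6,8], so forwarding it along the stored pointers would be a false positive.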

There are two ways to perform profile summarization with the DF when the covering operator is used to match elements. In the first option, we employ the same technique for profiles as for queries and summarize a profile using its direct predecessor in the profile forest. In this case, some mappings are lost from the summarized profiles to active queries. In the second option, we summarize a profile using a subset of its direct successors. In this case, no mappings are lost, but there may be false positives.

Figure 3 gives an example of query and profile summarization. In the figure, selective caching generalizes queries and profiles by collapsing them to covering elements.

[Figure 3: two panels, I and II, showing query and profile forests. Annotations: Q3 covers P1; Q4 covers P2; query Q4 is summarized by Q3, shown as Q3 (Q4); P1 (P2, P3, P4) keeps a record of who sent the summarized profiles (profile trail); P5 is a local profile and is never summarized into other profiles.]

Fig. 3. Example of query and profile summarization

C. Performance

We experimented with the DF on a mobile device, the Nokia Communicator 9500, using the native Java environment. The data structure code was originally developed for a desktop and no changes were made to optimize the performance to the small device environment. We used randomly generated (uniform distribution) 2-dimensional range filters in the range [0, 100] as queries and profiles to experiment with a workload that has a moderate amount of covering relations.

Figure 4 presents the performance results with a variable number of profiles (50-250) and a static number of queries (250). The DF was preloaded with the queries and the insertion time of a variable number of profiles was measured in milliseconds.

With this parameter space, the optimized DF has a better performance than the naive set-based algorithm. The insertion time per filter for 50 filters was 337 ms and for 250 filters it was 449 ms. This performance seems to be reasonable on a small device. We note that since DF maintains a more elaborate structure than the naive algorithm, the optimizations may not be effective with very small workloads.


The maximum number of filters in this experiment was 500. This number was observed to be close to the upper limit before the Java heap (4 MB) was exhausted. The memory use can be optimized for the small device and we expect that thousands of filters are manageable. One easy optimization is to partition filters based on their data types or schemas. This reduces both matching and memory costs.

[Figure 4: plot titled "Variable profiles with cover". x-axis: Filters (50 to 250); y-axis: Time (ms), 0 to 180000; series: Optimized DoubleForest and Set-based.]

Fig. 4. Performance results with a variable number of profiles

VII. ACCESS CONTROL

Access control is a basic requirement for information sharing systems. Simple access control can be implemented using filter types and host identifiers. Any query must have an active allowing access control rule.

More expressive access control rules may be realized using filters. In this case, data structure support is needed for the filters. An allowing access control filter covers the set of allowed queries. Allowed queries map to a set of profiles.

Two DFs may be combined into a TripleForest (Figure 5), in which one forest stores access control rules, one queries, and one profiles. Transitive closure is maintained between access control rules and queries, and queries and profiles. For any profile to match with a query from node X, the query must contain an allowing access control filter for X. This data structure has additional overhead pertaining to closure management, but provides a generic unified structure for high-level routing operations.
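The access control check can be sketched with interval filters: a query from a node is admitted only if some allowing rule for that node covers it, and only then are covered profiles returned. The rule layout and the function names are assumptions for illustration, and the sketch uses plain lists rather than the three forests.

```python
def covers(outer, inner):
    """Return True if interval `outer` contains interval `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def allowed(query, node, rules):
    """rules maps a node identifier to its list of allowing filters."""
    return any(covers(rule, query) for rule in rules.get(node, []))


def matching_profiles(query, node, rules, profiles):
    """Return the profiles covered by `query` if the query is allowed for `node`."""
    if not allowed(query, node, rules):
        return []
    return [p for p in profiles if covers(query, p)]


rules = {"X": [(0, 50)]}
profiles = [(1, 5), (20, 30), (60, 70)]
visible = matching_profiles((0, 10), "X", rules, profiles)
```

A query [0,100] from X would be rejected because no rule covers it, and any query from a node without rules is rejected as well.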

VIII. DISCUSSION

The proposed temporal subspace routing mechanism may be useful in different environments, because it addresses challenges with disconnected operation and mobility, and supports expressive information exchange. The mechanism used to connect nodes is critical to the scheme's success.

In a social network [8], [2] information exchange happens between familiar devices either when devices meet or through the Internet. Query and profile caching allow nodes to keep information that may not interest them currently, but is of interest to peers in the social network. Assuming that nodes eventually meet and the profiles do not expire, the information will eventually be diffused throughout the network.

[Figure 5: three forests holding access control rules (A1 to A4), queries (Q1 to Q4), and profiles (P1 to P5). Mappings are maintained between access rules and queries, where a query is allowed if there is an allowing (covering) access rule, and between queries and profiles.]

Fig. 5. TripleForest with access control rule forest

TSR does not solve the basic device discovery problem, but it may be used to do query-driven information dissemination in the ad hoc environment. Selective caching and information trails are envisaged to be useful, because they allow nodes to give advice to other nodes. This feature can be used seamlessly between MANETs (Mobile Ad Hoc Networks) and the Internet.

A reflector node may be introduced as an architectural element to improve information diffusion. The reflector caches all incoming profiles and allows other nodes to install continuous queries. A reflector could be installed on a train or a bus, for example, to allow peer-to-peer information sharing between passengers in a more robust fashion than random encounters.

Unstructured peer-to-peer systems are seen as good candidates to support highly dynamic communication topologies. Typically, these unstructured networks are based on random graphs.

A frequently used technique to maintain randomness in the presence of faults and churn is the view shuffling technique. With this technique, each node frequently changes its links to other nodes [1], [15]. The important parameters in view shuffling are the view size, k, and the fanout, f. The initial view of a node is inherited from some active node in the system. Subsequent view shuffles follow a simple procedure. A node builds a list of the local views of f nodes chosen at random from the current view. A second list contains the nodes that have requested view updates since the last update. The new view is created by choosing k nodes from these two lists [1].
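One shuffle step as described above can be sketched in a few lines. The function shuffle_view, the views_of lookup, and the choice to sample the new view uniformly from the union of the two lists are assumptions; the cited work may select nodes differently.

```python
import random


def shuffle_view(current_view, views_of, requesters, k, f, rng=random):
    """Return a new view of at most k nodes.

    current_view: list of node identifiers known to this node
    views_of: hypothetical lookup, node identifier -> that node's local view
    requesters: nodes that requested view updates since the last shuffle
    """
    # First list: the local views of f nodes chosen at random.
    chosen = rng.sample(current_view, min(f, len(current_view)))
    candidates = set(requesters)  # second list: recent requesters
    for node in chosen:
        candidates.update(views_of[node])
    ordered = sorted(candidates)  # fixed order keeps seeded runs repeatable
    return rng.sample(ordered, min(k, len(ordered)))


views = {"a": ["d", "e"], "b": ["f"], "c": ["g", "h"]}
new_view = shuffle_view(["a", "b", "c"], views, ["z"], k=3, f=2,
                        rng=random.Random(1))
```

With TSR layered on top, each node in the new view would then exchange local queries and matching profiles, as discussed in the following paragraph of the paper.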

View shuffling could also be used with TSR. Nodes in the current view exchange local queries and matching profiles. Each node caches remote queries and profiles. This is performed at every view change. Profiles will eventually be disseminated to every node requesting them, assuming that queries and profiles do not expire during the dissemination phase.

IX. RELATED WORK

A protocol for service discovery in pervasive environments is presented in [4]. This work utilizes peer-to-peer caching of service advertisements and group-based intelligent forwarding of service requests. Services are described using the Web Ontology Language (OWL) and the semantic class/subClass hierarchy of OWL is used to selectively disseminate service requests.

The use of covering relations in TSR is similar to this scheme, but there is no need for an a priori defined hierarchy as long as the filtering language and data types are specified. The covering relation is computed by each peer independently. We also have a data structure for a unified routing solution that combines access control and the caching of profiles. Since the profiles are cached, information is delivered even if the original sender is no longer available.
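
To illustrate how a peer can compute the covering relation locally without a predefined hierarchy, the sketch below assumes a deliberately simple filtering language: a subspace is a set of closed, non-empty numeric intervals, one per constrained attribute, and an unconstrained attribute spans the whole dimension. The type `Subspace` and the function `covers` are hypothetical names for this illustration; the filtering language of TSR may be more expressive.

```python
from typing import Dict, Tuple

# Assumption: attribute name -> closed interval [lo, hi] with lo <= hi.
Subspace = Dict[str, Tuple[float, float]]


def covers(outer: Subspace, inner: Subspace) -> bool:
    """True if every point of `inner` also lies in `outer`."""
    for attr, (lo, hi) in outer.items():
        if attr not in inner:
            # `inner` is unconstrained on this attribute, so it extends
            # beyond the interval that `outer` allows.
            return False
        inner_lo, inner_hi = inner[attr]
        if inner_lo < lo or inner_hi > hi:
            return False
    return True
```

For example, `covers({"price": (0, 100)}, {"price": (10, 20), "year": (2000, 2006)})` holds, while the reverse does not. Only the interval semantics are needed for the comparison, not a shared taxonomy.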

An architecture called 7DS (Seven Degrees of Separation) for information sharing among mobile and wireless devices was presented in [10]. In this system, participants obtain data objects from Internet servers and cache them. The system exploits locality of information access.

A negotiation-based information exchange protocol for sensor networks, SPIN, was proposed in [9]. Nodes use meta-data negotiations to eliminate the transmission of redundant data. Nodes may also take the knowledge of resources into account when making decisions. The protocol addresses three central challenges in sensor networks: implosion, overlap, and resource blindness. The push and pull interactions in TSR are similar to the ADV-REQ-DATA protocol in SPIN [9]; however, TSR is query-driven, considers only metadata, and leaves actual data storage to applications.
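
The three-stage ADV-REQ-DATA handshake mentioned above can be sketched as follows. This is a simplified, single-hop illustration and not the SPIN implementation: the names `SensorNode`, `on_adv`, and `advertise` are hypothetical, meta-data is modelled as a string key, and resource-aware decisions are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class SensorNode:
    name: str
    store: Dict[str, bytes] = field(default_factory=dict)  # meta-data -> data

    def on_adv(self, meta: str) -> bool:
        # Request the data only if it is not already held locally.
        return meta not in self.store


def advertise(sender: SensorNode, meta: str,
              neighbors: List[SensorNode]) -> List[Tuple[str, str, str]]:
    """Run ADV-REQ-DATA with each neighbor; return (message, from, to) records."""
    log: List[Tuple[str, str, str]] = []
    for n in neighbors:
        log.append(("ADV", sender.name, n.name))
        if n.on_adv(meta):
            log.append(("REQ", n.name, sender.name))
            n.store[meta] = sender.store[meta]
            log.append(("DATA", sender.name, n.name))
    return log
```

A neighbor that already holds the advertised item answers the ADV with silence, so the payload is never sent twice to the same node.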

Related work also includes various data diffusion techniques [7], [11]. In magnetic diffusion the data sink functions like a magnet and propagates the magnetic charge to set up the magnetic field. Sensor data is then propagated towards the sink based on this field.

The proposed TSR system differs from previous work because it unifies generic semantic matching, routing, caching, and access control. Most data dissemination systems are tightly coupled with the wireless and ad hoc nature of the network, consider only simple name-value pairs, and do not support more expressive semantic operators. Our system can be used on top of different logical networks, such as random networks, social networks, and proximity-based networks.

X. CONCLUSIONS

In this paper, we presented the temporal subspace routing model for distributed peer-to-peer data sharing. The system is based on the DoubleForest data structure, which computes mappings between queries and profiles based on the covering relation. The system unifies generic semantic matching, routing, caching, and access control. We focused on a general description of the technique and the data structure. It is envisaged that the system can be used on top of different logical networks, such as random networks, social networks, and proximity-based networks. Future work includes experimentation with different underlying networking environments and different application scenarios.

REFERENCES

[1] A. Allavena, A. Demers, and J. E. Hopcroft. Correctness of a gossip based membership protocol. In PODC ’05: Proceedings of the twenty-fourth annual ACM SIGACT-SIGOPS symposium on Principles of distributed computing, pages 292–301, New York, NY, USA, 2005. ACM Press.

[2] P. O. Boykin and V. P. Roychowdhury. Personal email networks: An effective anti-spam tool. CoRR, cond-mat/0402143, 2004.

[3] A. Carzaniga, D. S. Rosenblum, and A. L. Wolf. Design and evaluation of a wide-area event notification service. ACM Transactions on Computer Systems, 19(3):332–383, Aug. 2001.

[4] D. Chakraborty, A. Joshi, Y. Yesha, and T. Finin. Toward distributed service discovery in pervasive computing environments. IEEE Transactions on Mobile Computing, 5(2):97–112, 2006.

[5] E. Cohen and S. Shenker. Replication strategies in unstructured peer-to-peer networks. In Proceedings of the ACM SIGCOMM’02 Conference, 2002.

[6] P. Eugster, R. Guerraoui, A.-M. Kermarrec, and L. Massoulie. From epidemics to distributed computing. IEEE Computer, 37(5):60–67, May 2004.

[7] H.-J. Huang, T.-H. Chang, S.-Y. Hu, and P. Huang. Magnetic diffusion: disseminating mission-critical data for dynamic sensor networks. In MSWiM ’05: Proceedings of the 8th ACM international symposium on Modeling, analysis and simulation of wireless and mobile systems, pages 134–141, New York, NY, USA, 2005. ACM Press.

[8] H. Kautz, B. Selman, and M. Shah. Referral Web: Combining social networks and collaborative filtering. Communications of the ACM, 40(3):63–65, 1997.

[9] J. Kulik, W. Heinzelman, and H. Balakrishnan. Adaptive Protocols for Information Dissemination in Wireless Sensor Networks. In 5th ACM MOBICOM, Seattle, WA, August 1999.

[10] M. Papadopouli and H. Schulzrinne. Seven degrees of separation in mobile ad hoc networks. In Proc. of IEEE Global Telecommunications Conference 2000 (GLOBECOM ’00), 2000.

[11] C. Tang, Z. Xu, and S. Dwarkadas. Peer-to-peer information retrieval using self-organizing semantic overlay networks. In A. Feldmann, M. Zitterbart, J. Crowcroft, and D. Wetherall, editors, SIGCOMM, pages 175–186. ACM, 2003.

[12] S. Tarkoma. Efficient Content-based Routing, Mobility-aware Topologies, and Temporal Subspace Matching. PhD thesis, Department of Computer Science, University of Helsinki, 2006. Available at ethesis.helsinki.fi.

[13] S. Tarkoma and J. Kangasharju. Optimizing Content-based Routers: Posets and Forests. The Journal of Distributed Computing, 2006. To appear.

[14] S. Tarkoma, T. Lindholm, and J. Kangasharju. Collection and object synchronization based on context information. In Mobility Aware Technologies and Applications, Second International Workshop, MATA 2005, Montreal, Canada, October 17-19, 2005, Proceedings, pages 240–251, 2005.

[15] S. Voulgaris, D. Gavidia, and M. Steen. Cyclon: Inexpensive membership management for unstructured P2P overlays. Journal of Network and Systems Management, 13(2):197–217, June 2005.

1-4244-0357-X/06/$20.00 ©2006 IEEE. This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE GLOBECOM 2006 proceedings.