A Lightweight Platform for Web Mashups in Immersive Mirror Worlds


Augmented-reality and mirror-world applications are a new breed of applications that belong to the mixed-reality section of Paul Milgram's virtuality continuum [1]. Although augmented-reality applications can overlay digital artifacts using a see-through display for in situ exploration, mirror-world applications aim to create a realistic, information-enhanced virtual replica of the world [2], enabling remote exploration scenarios. The objects of the mirror worlds are 3D models automatically constructed from digital pictures, videos, or laser or sensory inputs, acquired via rapid drive-thru (that is, map providers drive a specialized car that lets them acquire images while driving). Exposed in augmented imagery, the 3D models enable an immersive experience that lets users engage with the environment in ways not possible in the real world [3].

Although mirror-world applications provide a rich immersive experience and flexibility in a single application, they don't interoperate well with external applications and programming environments. This limits their appeal, because simply extending the default functionality, or reusing parts of it in other contexts, involves learning specific technologies, leading to a fragmented ecosystem. A Web browser, however, provides a generic runtime environment suitable for a wide range of devices, from advanced desktop computers to mobile devices, and it has already proven successful for visualizing geo-tagged information on 2D maps.

Attempts to bring mirror-world immersive experiences into the browser have thus far been limited to the latest browser engines running on high-end desktop computers equipped with hardware 3D acceleration. Although mobile devices are becoming more sophisticated, it will still be some time before they reach the capabilities of their desktop counterparts.

With this in mind, we developed Cloud City Scene, our approach to enabling mirror-world immersive experiences in mainstream mobile and desktop Web browsers. We rely on Web technologies that have traditionally been used for visualizing geo-tagged information on a 2D canvas to achieve an immersive experience in a 3D-like scene. The component is fully integrated in the browser runtime environment, letting Web developers create mashups that can be visualized inside the immersive scene. The client-side system is complemented by a server component that converts geo-tagged data to a format that can be visualized in the scene the Web browser displays.

Cloud City Scene is a lightweight platform that enables visualizations of Web mashups in an immersive mirror-world environment in which annotations blend in with buildings, terrain, and objects, letting users interact with the underlying real-world scene.

Vlad Stirbu, Yu You, Kimmo Roimela, and Ville-Veikko Mattila, Nokia Research Center


Web and Resource-Oriented Mashup Environment

Here, we provide an overview of the system architecture, describe the browser-based rendering library and scripting engine that let application developers visualize Web mashups in augmented and mirror worlds, and present the Web infrastructure that provides the data the rendering engine uses. (For more general information, see the "Related Work in Web Mashups" sidebar.)

System Architecture

The Cloud City Scene system architecture expands the reach of augmented- and mixed-reality applications beyond the established core of dedicated native applications. The architecture, depicted in Figure 1, has two major components: the server back end, responsible for processing the geo-tagged data and exposing the result as Web resources, and the Web applications running on the user's devices, responsible for rendering the information and handling the user interaction.

Figure 1. The Cloud City Scene system architecture. It has two major components: the server back end and the Web applications running on the user's devices.

The Cloud City Scene server takes as input the information provided by services that expose geographically tagged data consumed in native applications, such as panorama textures, terrain meshes, and building models. It then converts them into formats that are appropriate for use in mainstream desktop and mobile Web browsers.

The Cloud City Scene client is a JavaScript library that interacts with the server and displays the information to the user. The client can handle two classes of Web browsers. The first group includes the full-fledged desktop and mobile browsers that support a range of HTML5 technologies and have their own JavaScript engine.

Related Work in Web Mashups

Web mashups emerged as a simple way to create more sophisticated applications by combining, in a homogenous way, data from multiple existing sources [1]. In its simplest form, a mashup aggregates or summarizes different sets of data, or creates alternative user interfaces for a website for situations that weren't envisioned by the designers or that don't fit the primary usage scenario. More elaborate approaches personalize the content to match the specific needs or preferences of each user, or add visualization to raw data, typically overlaid on a map.

A relatively new category of mashups focuses on real-time monitoring, the design goal being to make users aware that the underlying datasets change. The nature of the change might vary with the update frequencies, which range from days to seconds [2]. In line with the client-server architecture of the Web, the mashups can be implemented either in the browser, using various JavaScript libraries for data processing, or on the server.

In this context, mapping mashups belong to the mashups that add visualization to geographically tagged data by displaying them over a map canvas using pins, custom icons, or overlays that contain various geometric shapes. Their popularity, fueled by services such as Google Maps (http://maps.google.com) and OpenStreetMap (www.openstreetmap.org), is captured by the mashup dashboard at ProgrammableWeb (www.programmableweb.com/mashups), where, as of March 2012, close to one in three of all mashups registered on the website is a mapping mashup. Other mashup categories, such as the ones related to photography, travel, or transportation, might also use maps for visualization, which takes the mapping mashup percentage even higher.

Tools that enable the 3D visualization of the world take the mapping mashup to a new level. For example, Google Earth (www.google.com/earth) allows remote exploration of locations on Earth, the Moon, and even Mars, while the geographically tagged data can be mashed up using Keyhole Markup Language (KML) documents [3], or COLLADA documents [4] for imported synthetic 3D models. Similarly, native augmented-reality applications developed by academia or industry, such as Argon Browser (http://argon.gatech.edu) or Layar Reality Browser (www.layar.com), allow in situ exploration of augmented-reality mashups that are presented to users as KHARMA channels [5], a KML extension that lets Web content be geographically positioned and dynamically manipulated using JavaScript, or Layar Vision.

Web mashups have democratized what used to be the domain of custom-built Geographic Information Systems (GIS). Now, even small developers can take geo-tagged data exposed by a plethora of Web services and visualize them on a map. The trend is accelerating as central and local governments expose data of public interest as Open Linked Data (http://linkeddata.org).

References

1. J. Yu et al., "Understanding Mashup Development," IEEE Internet Computing, vol. 12, no. 5, 2008, pp. 44–52.

2. J. Wong and J. Hong, "What Do We 'Mashup' When We Make Mashups?" Proc. 4th Int'l Workshop on End-User Software Engineering (WEUSE 08), ACM, 2008, pp. 35–39.

3. T. Wilson, OGC Keyhole Markup Language, 2.2.0, Open GIS Consortium, 2008.

4. M. Barnes and E.L. Finch, COLLADA - 3D Asset Exchange Schema, Release 1.5.0, Khronos Group, 2008.

5. A. Hill et al., "KHARMA: An Open KML/HTML Architecture for Mobile Augmented Reality Applications," Proc. IEEE Int'l Symp. Mixed and Augmented Reality (ISMAR 10), IEEE, 2010, pp. 233–234.


The second category includes hybrid Web browsers (such as the Nokia Xpress Browser) targeted at resource-constrained mobile devices, in which the JavaScript processing is done in a browser proxy hosted in the network while the optimized results are sent to the mobile device.

The approach lets a wide range of devices consume content that's visualized in realistic 3D-like fashion. Because Cloud City Scene relies on Web technologies, it can target multiple platforms, and application developers can maintain consistency without worrying about device particularities.

The Viewer

Cloud City Scene is a mirror-world application that lets users interact with a realistic 3D replica of the real world. The application relies on a set of HTML5 technologies that are readily supported by the mainstream desktop and mobile Web browsers, allowing its use on a wide range of mobile devices, desktop and laptop computers, and even consumer electronics devices.

The Mirror-World Scene. The Cloud City Scene viewer application uses mirror-world scenes to achieve the immersive viewing experience. Each scene contains information about objects that are visible in the physical proximity of a geographic location that corresponds to a street-level panorama image viewpoint. The objects visible in one scene are represented as polygon masks that are projected against the background panorama image. In addition to the mask, an object can have additional metadata, such as the distance from the viewer and the angle under which the object is seen by the viewer (see Figure 2a). Furthermore, the masks let users interact with each corresponding object in a customizable fashion.

Although the scene is viewed by the human user as a 2D canvas [4], the Cloud City Scene application maintains the information about the objects in a scene graph data structure. This lets the viewer recreate the perception of depth using a z-index, which displays the masks as layers ordered according to their distance from the viewer (see Figure 2b). The panorama image is projected onto the outer layer, followed by the building and terrain masks. The scene can also contain virtual objects that don't exist in the real world.
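
To make the layering concrete, the following sketch orders the masks by distance and maps them to CSS z-index values, with the panorama as the outermost layer. The mask record fields (distance, element) are illustrative assumptions, not the library's actual data structure.

    // Minimal sketch: stack scene masks with CSS z-index according to their
    // distance from the viewer; the projected panorama is the farthest layer.
    // The mask fields (distance, element) are assumed for illustration.
    function layerScene(panoramaElement, masks) {
      panoramaElement.style.zIndex = 0;                              // projected panorama: outer layer
      masks
        .slice()
        .sort(function (a, b) { return b.distance - a.distance; })  // farthest first
        .forEach(function (mask, i) {
          mask.element.style.position = 'absolute';
          mask.element.style.zIndex = i + 1;                         // closer masks stack on top
        });
    }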

We can group the objects in the scene, based on their nature, into the following categories. Building masks represent the buildings visible from the location where the panorama image was taken. Each mask has an associated unique building identifier, which lets the users interact with the corresponding building.

Terrain masks represent the streets visible from the panorama location. The terrain masks let the users navigate from one panorama location to another.

Points-of-interest (POI) placement masks correspond to positions on building facades on which information about the POIs associated with the respective building can be placed. Each mask contains metadata about the distance from the center of the panorama and the viewing angle. The information is used to shrink or tilt the POI representations to create the perception of depth (a small placement sketch follows these category descriptions). The placement masks let the mashup application easily place content with which the user can interact.

Virtual artifacts represent synthetic objects that don't exist in the real world.




A virtual artifact binds together a geographic location, a 3D model, and Web content, letting the object be viewed in the mirror world [5]. The virtual artifact masks let users interact with the corresponding virtual object.
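
As noted above, the placement-mask metadata can drive the depth cue. The sketch below scales a POI element inversely with its distance and tilts it according to the view angle using a CSS transform; the reference distance, the clamping limits, and the exact transform are illustrative assumptions rather than the platform's actual formula.

    // Minimal sketch: shrink and tilt a POI representation using placement-mask
    // metadata (distance from the panorama center, view angle in degrees).
    // The reference distance and clamping limits are arbitrary illustrative choices.
    function placePoi(poiElement, placementMask) {
      var referenceDistance = 10;  // full size at 10 meters (assumption)
      var scale = Math.min(1, referenceDistance / placementMask.distance);
      var tilt = Math.max(-60, Math.min(60, placementMask.viewAngle));
      poiElement.style.transform =
        'perspective(600px) rotateY(' + tilt + 'deg) scale(' + scale + ')';
    }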

The Scripting Engine. Cloud City Scene lets application developers interact programmatically with the rendered scene using a JavaScript API. The API provides a set of utilities that enable scene movement, awareness, and alterations. The movement functionality lets the developer change the scene to a different location or adjust the viewport heading inside a scene. The awareness functionality provides feedback about the objects present in the scene or visible in the field of view. The alteration functionality lets developers insert virtual artifacts into the scene.

To import a virtual artifact, the developer provides the virtual artifact URI, which the back-end server fetches and converts into an object mask rendered into the scene. The library and the back-end server handle the scene placement automatically, taking into account the distance from the viewer and possible occlusions by other objects in the scene. Additionally, developers can register callback functions on all scene objects, which lets them customize the user's interaction with the scene.
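
The sketch below shows how a mashup page might exercise such an API: attach the viewer to a <div>, move the scene, query the visible objects, import a virtual artifact by URI, and register an interaction callback. The CloudCityScene namespace, method names, and option fields are hypothetical stand-ins for the movement, awareness, and alteration utilities described above, not the library's documented interface.

    // Hypothetical usage of the viewer's JavaScript API; all identifiers,
    // URLs, and option fields are illustrative assumptions.
    var scene = new CloudCityScene.Viewer(document.getElementById('scene'));

    // Movement: jump to a location and turn the viewport.
    scene.moveTo({ lat: 51.508, lon: -0.128 });   // illustrative coordinates
    scene.setHeading(90);                         // degrees clockwise from north

    // Awareness: inspect what is currently in the field of view.
    scene.getVisibleObjects().forEach(function (obj) {
      console.log(obj.type, obj.id, obj.distance);
    });

    // Alteration: import a virtual artifact; the back end fetches and projects it.
    scene.addArtifact('http://www.acme-tours.net/signs/stop-12.kml');

    // Interaction: customize what happens when the user taps a building mask.
    scene.on('select', function (obj) {
      if (obj.type === 'building') {
        console.log('Selected building', obj.buildingId);
      }
    });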

The Back-End Infrastructure

The scene metamodel that Cloud City Scene uses is based on the same urban model data as our native Nokia City Scene client (http://betalabs.nokia.com/apps/nokia-city-scene). In Cloud City Scene, this same data is exposed via Web APIs that enable third-party mashup applications to embed their proprietary data into real-world scenes.

In general, there are two sources of geographically tagged data: either general-purpose providers, such as GeoNames, SimpleGeo, and LinkedGeoData, or domain-specific information or content providers, such as local city public services. Most of this data is already available programmatically, so overlaying geo-data on top of a 2D map is possible. However, the limitations that come with presenting information on a 2D map have motivated research into other presentations, such as Cloud City Scene. The back end is a content-processing pipeline that adds a new dimension to existing 2D map-based location-aware applications.

Figure 2. A mirror-world scene: (a) the metadata elements, (b) the scene structure, and (c) a scene example, with building masks (purple), terrain masks (red), and points-of-interest (POI) placement masks (blue), near Portsmouth Square in San Francisco.



To mash up existing Web content, the application developer might dynamically import an artifact into the scene using the client JavaScript API. Or, he or she might register the data with the back end in advance by publishing either the data in the Keyhole Markup Language (KML) format or a URI backlink to the provider's Web API. The latter approach adds an additional cost for real-time handshakes but offers more dynamic features.
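
A minimal sketch of the ahead-of-time registration path is shown below. The endpoint, the JSON payload shape, and the field names are assumptions made for illustration; the article doesn't specify the registration protocol.

    // Minimal sketch: register content with the back end either as a KML
    // document or as a backlink URI to the provider's own Web API.
    // The endpoint and payload shape are hypothetical.
    function registerContent(payload) {
      return fetch('https://cloudcityscene.example/api/registrations', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(payload)
      });
    }

    // Option 1: publish the geo-tagged data itself as a KML document.
    registerContent({ format: 'kml', document: '<kml>...</kml>' });

    // Option 2: publish a backlink; the pipeline fetches fresh data on demand,
    // which costs an extra round trip but keeps the content dynamic.
    registerContent({ format: 'backlink', uri: 'http://www.acme-tours.net/api/stops' });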

The pipeline parses the input data, no matter how it's registered, and translates it into an internal 3D scene graph for server-side rendering (see Figure 3). The scene graph can be an aggregation of multiple input data sources. The rendering module takes the output of the aggregation module and produces an annotated 2D image of the scene as seen from the current viewpoint. Figure 2c exemplifies the output of the rendering module.

Figure 3. Back-end pipeline for the content projection process. The scene graph can be an aggregation of multiple input data sources.

In addition to the projected model, a reference to the original data source can be passed through for the third-party content items. This way, the mashup client can render application-specific information in a proprietary fashion.

Scene Metadata and Server-Side Projection. Cloud City Scene employs three basic layers of information as the background for building mirror-world applications: panoramic imagery, 3D building meshes, and a 3D terrain mesh. Navteq (www.navteq.com) captures the data using special data-collection vehicles, then processes and aligns it into a coherent world model [6]. However, the processing logic in Cloud City Scene is generic and applicable to other data resources and formats.

To convert the 3D scene into an interactive 2D canvas for the client, we first select the panoramic image closest to the desired viewpoint. This image is reprojected into a 360-degree spherical image, normalized so that the x coordinate of the image directly maps to world-space direction; for example, north is always at the center of the image. The 3D scene geometry visible from the chosen viewpoint is then projected into the same 2D Cartesian coordinate system. The visible regions of each building are then segmented into 2D outlines and simplified to polygons with a fairly low number of points each for quick processing in the client. The street network information is used to generate the terrain masks. Size culling is also applied, whereby areas too small for meaningful interaction are dropped altogether.
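
This normalization can be made concrete with a small helper that maps a compass heading to a horizontal pixel position in the 360-degree panorama. The north-at-center convention follows the text, and the heading range of -180 to 180 degrees matches Figure 2b; the image width and the helper itself are illustrative assumptions.

    // Minimal sketch: map a compass heading (degrees, 0 = north) to the x
    // coordinate of a normalized 360-degree panorama whose center column is north.
    function headingToX(headingDeg, imageWidth) {
      var h = ((headingDeg % 360) + 360) % 360;  // wrap to 0..360
      if (h >= 180) { h -= 360; }                // shift to -180..180, as in Figure 2b
      return (h + 180) / 360 * imageWidth;       // -180 -> left edge, 0 -> center
    }

    // Example: with a 4,096-pixel-wide panorama, due east (90 degrees) lands at
    // three quarters of the image width.
    console.log(headingToX(90, 4096));  // 3072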

In addition to the 2D polygons corresponding to visible building and terrain regions, we generate 3D metadata for layering additional content on top of the scene and navigating between different views. For this, we further segment the 3D building masks based on 3D surface orientation information so that, effectively, every individual facade of each building becomes a separate patch of the whole building mask. Each patch stores information about its (average) distance from the current viewpoint and the angle between the patch's normal and the viewing direction.

Aligning Content and Panoramas. Our basic data assets are accurately aligned to world coordinates, but this isn't necessarily true of content we would like to display on top of the basic scene. When content is registered by third-party services with only World Geodetic System (WGS) 84 coordinates (latitude, longitude, and optional altitude), our alignment process searches for the best placement in 3D coordinates in relation to the nearest panorama. If more data, such as buildings, is available, the process can also try to place the content close to the sides of the streets, facing the same direction as the closest building if possible.

This process infers the building model closest to the content coordinate by computing the distance to the center of mass of each surrounding model. The process then identifies the relevant facade for 3D placement by casting a ray toward the closest model and observing which facade is intersected. The process can also infer, if needed, the appropriate content altitude from the terrain elevation, allowing the content altitude to be defined relative to the local ground level.
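
A simplified sketch of that heuristic follows: pick the building whose center of mass lies nearest to the content coordinate, then choose the facade that most directly faces the content point. The data layout (center, facades with normals) is assumed, and a dot-product facing test stands in for the actual ray cast, so this is an approximation of the process described above.

    // Minimal sketch of the placement heuristic: nearest building by center of
    // mass, then the facade whose outward normal points most directly toward
    // the content. A real implementation would cast a ray and test for
    // intersection; the dot-product test below is a stand-in (assumption).
    function findPlacement(content, buildings) {
      function dist2(a, b) {
        return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y);
      }

      // 1. Closest building by squared distance to its center of mass.
      var building = buildings.reduce(function (best, b) {
        return dist2(content, b.center) < dist2(content, best.center) ? b : best;
      });

      // 2. Facade facing the content point most directly.
      var best = { facade: null, score: -Infinity };
      building.facades.forEach(function (f) {
        var score = f.normal.x * (content.x - f.center.x) +
                    f.normal.y * (content.y - f.center.y);
        if (score > best.score) { best = { facade: f, score: score }; }
      });

      return { building: building, facade: best.facade };
    }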

Although this process is heuristic, it lets the system place content on facades simply using 2D geolocation. Once the placement is determined, the next step in the pipeline is to generate the masks with proper perspective projection.

Case Study: Acme Tours Web Application

To gain first-hand experience with our platform, we developed a concept mashup application for Acme Tours, which offers hop-on, hop-off city sightseeing tours. The application is intended to run in mobile Web browsers, presenting information about bus routes and the location of the routes' stops and providing personalized schedules for each bus stop that display live information about bus arrivals and departures.

The prototype environment resembles the one expected in a real-life deployment (see Figure 4). The user's smartphone interacts with three different websites:



• the Acme Tours website hosts the Web application and a set of Web resources that correspond to the bus routes and stops, the information signs that display information in the mirror world, and a Web feed that provides live information about the bus arrivals and departures at each bus station;

• the mapping server provides the JavaScript library and the back-end functionality that lets the routes and bus stops be displayed as an overlay on a 2D map; and

• the Cloud City Scene service provides the JavaScript library that enables the Cloud City Scene viewer and the corresponding back-end functionality, which lets the information-sign artifacts be imported and visualized in the mirror world.

We hosted the Acme Tours and Cloud City Scene services used in the prototype environment, and we used Nokia Maps, a commercially available solution, as the mapping service (http://api.maps.nokia.com/2.0.0/devguide/overview.html).

The Acme Tours application includes a Web-based component that takes the information about the bus routes and stops, exposed as resources by the Acme server, and visualizes them on the 2D map. The application also has a server-based component that takes the information sign exposed by the Acme server so that the Cloud City Scene server can then convert the information into a layer suitable for display in the mirror-world viewer. The representations exposed by the resources hosted by the Acme Tours server use well-known, Web-friendly formats. For example, the bus routes and stops are KML documents that can be visualized, without further processing, and overlaid on the map canvas provided by Nokia Maps. Similarly, the information signs are virtual artifacts, a bundle of KML and COLLADA documents, that can be imported into the scene through the Cloud City Scene server. Finally, the bus arrival feed is a JavaScript Object Notation (JSON) document [7], compliant with the Live Bus Arrival data format (see www.tfl.gov.uk/businessandpartners/syndication), which can be handled directly by JavaScript.
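
Because the feed is plain JSON, the mashup can consume it directly in the browser. The sketch below fetches the feed and lists upcoming arrivals for one stop; the feed URL and the field names are illustrative assumptions rather than the exact Live Bus Arrival schema.

    // Minimal sketch: fetch the bus-arrival feed (JSON) and list upcoming
    // arrivals for one stop. The URL and field names (stopId, lineName,
    // expectedTime) are illustrative, not the exact feed schema.
    function showArrivals(stopId) {
      fetch('http://www.acme-tours.net/feeds/arrivals.json')
        .then(function (response) { return response.json(); })
        .then(function (arrivals) {
          arrivals
            .filter(function (a) { return a.stopId === stopId; })
            .forEach(function (a) {
              console.log(a.lineName + ' expected at ' + a.expectedTime);
            });
        });
    }

    showArrivals('acme-stop-12');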

From a user's perspective, the application appears homogeneous, even if the data and user interface components are combined from different sources. The user starts the application by typing the URI into the address bar or reading the URI from a 2D barcode printed on the ticket. The application then presents the map view that displays the bus routes and the stops operated by Acme Tours (see Figure 5a).

Figure 4. The prototype environment. A smartphone interacts with three different websites—the Acme Tours website, a mapping server, and Cloud City Scene.


Figure 5. Acme Tours mashup application using the Cloud City Scene platform: (a) routes and stops visualization on the map, near Trafalgar Square in London, using Internet Explorer 9, and (b) immersive visualization of an information sign virtual artifact using Safari Mobile.


The device's current location is displayed as a blue dot, letting the user find the closest bus stop.

By tapping on a bus stop marker, the view changes to immersive mode, letting the user explore the bus stop's location by panning the panorama image (see Figure 5b). Each of the scenes that correspond to the bus stop locations is augmented with the Acme Tours information signs that display information about arriving and departing buses, targeted for each user. The information signs are interactive, letting the users visualize an expanded view of the bus stop schedule.

Concept Discussion and Experience

Mashups that bring Web content into immersive mirror worlds can be easily created by application developers who don't have large resources at their disposal (see Table 1). Using the Cloud City Scene platform is as easy as creating a traditional Web mashup that displays geographically tagged data on a map canvas. The JavaScript viewer library takes over the ownership of a <div> element in the application webpage and provides convenient methods for controlling the scene location and viewport heading. Developers can also insert virtual artifacts into the scene and register callbacks that handle the user interaction with the scene objects. The learning curve for integrating the functionality in a Web application is small, because the development follows design patterns that are already established on the Web.

Our case study application uses relatively simple data exposed by one service. However, running the mashup in the browser runtime environment lets developers import data from various sources by including in the webpages links to the appropriate JavaScript libraries that process the data. Additionally, the browser runtime environment gives access in a cross-platform fashion to sensor or device context data using standard APIs, defined by the Device APIs Working Group of the World Wide Web Consortium (W3C). Application developers with access to data from local sensors, such as positioning, accelerometer, or gyroscope sensors, can use the data to create a personal experience for their mashups.
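
For instance, the standard Geolocation API and device orientation events can feed the scene directly, as in the following sketch. The scene.moveTo and scene.setHeading calls reuse the hypothetical viewer API from the earlier example; only navigator.geolocation and the deviceorientation event are standardized.

    // Minimal sketch: personalize the scene with standard browser sensor APIs.
    // navigator.geolocation and 'deviceorientation' are W3C-defined; the
    // scene.* calls are the hypothetical viewer API used earlier.
    navigator.geolocation.getCurrentPosition(function (position) {
      // Jump the mirror-world scene to the user's current location.
      scene.moveTo({
        lat: position.coords.latitude,
        lon: position.coords.longitude
      });
    });

    window.addEventListener('deviceorientation', function (event) {
      // Keep the viewport heading roughly in sync with the device's orientation.
      if (event.alpha !== null) {
        scene.setHeading(360 - event.alpha);  // alpha increases counterclockwise
      }
    });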

Browser-based applications that rely on Cloud City Scene to display geographically tagged information in an immersive mode using street-level panorama images let application developers control how the information is presented to the users. In contrast, native augmented- and mixed-reality applications lack this flexibility, because they have only limited ways of customizing the user interface. For example, the Argon Browser and Layar Reality Browser let developers import content using channels or Layar Vision, respectively. However, the formats by which these augmented-reality browsers import content are tightly coupled with the viewers, which end up creating content silos around each browser application. To create content that can be visualized in multiple clients, application developers must transform the data into the appropriate format on the server.

Another side effect of the tight format coupling is that although the technologies used are Web friendly, developers can't make browser-based mashups, preventing them from offloading processing tasks to the clients. To compensate for this limitation, developers must reserve additional computational resources on their server infrastructure. Alternatively, the immersive experience enabled by Google Earth can be embedded in Web browsers using a browser plug-in. This approach has limitations, because the functionality is available only on the platforms and browsers supported by the plug-in provider (currently desktop browsers running on Windows and Mac OS).

The map view provides a familiar metaphor for visualizing geographically tagged data. Despite recent improvements from major map providers that display landmark buildings using 3D-like wireframes, the map view remains basically a view of the world from above. This visualization modality works best at large scales but is limited at the microscale. The street-level visualization brings better results, because it matches what a user would see in the real world and enables a finer-grained spatial localization.

Table 1. A summary of the features of immersive mashup solutions.

Feature | Cloud City Scene | Google Earth | Argon Browser | Layar Reality Browser
Supported devices | Desktop computers, mobile devices, and hybrid (proxy) browsers | Desktop computers | Mobile devices | Mobile devices
Browser integration | Default browser engine | Plug-in | N/A | N/A
Device sensors access | W3C Device APIs | W3C Device APIs | Native | Native
3D scene awareness | Yes | Yes | No | No
Programming environment | HTML5 | HTML5 | KHARMA | Client software development kit


New solutions from the major map providers (including Google Maps and Nokia Maps) rely on WebGL [8] to create immersive experiences at the city level, using maps augmented with 3D building models, and at the street level. However, these solutions have hardware and software requirements that aren't currently met by mobile devices. Our solution works with technologies used for visualizations on a 2D canvas, but because we have a scene graph that manages the scene's spatial information, we achieve 3D-like awareness. For example, when rendering information about points of interest or inserted objects, we consider the distance from the viewer or possible occlusions.

Cloud City Scene demonstrates that lightweight Web technologies, typically used for information visualization on a 2D canvas, can be used effectively to create a realistic and immersive street-level representation of the physical world inside Web browsers. The 3D-like environment is fully integrated with the browser runtime environment, allowing a large pool of application developers to create Web mashups that visualize Web content in the mirror world. The low system requirements not only provide features similar to those of the more resource-intensive, fully 3D-aware solutions but also expand the reach of mirror-world Web mashups to a wide range of devices.

Moving forward, the key challenges will be to expand the mirror worlds beyond the discrete fixed points where the panorama images were taken to a continuous space where the appropriate imagery is generated on the fly. This will help accommodate both different positions on the surface of the world and different elevations. Additionally, the virtual artifacts visualized in the mirror world must be better integrated into the environment, taking into account environmental factors such as illumination.

Acknowledgment

We acknowledge the Finnish Funding Agency for Technology and Innovation (TEKES) for funding the research presented in this article.

References

1. P. Milgram and F. Kishino, "A Taxonomy of Mixed Reality Visual Displays," IEICE Trans. Information Systems, vol. E77-D, no. 12, 1994, pp. 1321–1329.

2. J. Smart, J. Cascio, and J. Paffendorf, Metaverse Roadmap Overview, 2007; www.metaverseroadmap.org/overview/index.html.

3. C. Dede, "Immersive Interfaces for Engagement and Learning," Science, vol. 323, no. 5910, 2009, pp. 66–69; www.sciencemag.org/content/323/5910/66.abstract.

4. I. Hickson, HTML Canvas 2D Context, World Wide Web Consortium (W3C) working draft, Mar. 2012; www.w3.org/TR/2012/WD-2dcontext-20120329.

5. V. Stirbu, D. Murphy, and Y. You, "Open and Decentralized Platform for Visualizing Web Mash-Ups in Augmented and Mirror Worlds," Proc. 21st Int'l Conf. Companion on World Wide Web (WWW 12), ACM, 2012, pp. 609–610.

6. T. Pylvänäinen et al., "Automatic Alignment and Multi-View Segmentation of Street View Data Using 3D Shape Priors," Proc. 5th Int'l Symp. 3D Data Processing, Visualization and Transmission (3DPVT 10), 2010; http://campwww.informatik.tu-muenchen.de/3DPVT2010/data/media/e-proceeding/papers/paper033.pdf.

7. D. Crockford, The Application/JSON Media Type for JavaScript Object Notation (JSON), IETF RFC 4627, July 2006; www.ietf.org/rfc/rfc4627.txt.

8. C. Marrin, WebGL Specification, The Khronos Group, Feb. 2011; www.khronos.org/registry/webgl/specs/1.0/.

The Authors

Vlad Stirbu is a mobile Web researcher and practitioner. He completed the work described in this article while he was a senior researcher with the Live Mixed Reality team at the Nokia Research Center in Tampere, Finland. His research interests include Web-based service architectures, location-based services, and GUI toolkits. Stirbu received his PhD in software systems from the Tampere University of Technology. He's a member of IEEE and the IEEE Computer Society. Contact him at [email protected].

Yu You is a senior researcher in the Live Mixed Reality team at the Nokia Research Center in Tampere, Finland. His research interests include mobile runtimes, Web technologies, service architectures, and cloud computing. Lately he has been engaged in projects related to the augmented- and mixed-reality domains and location-based services. You received his PhD in information science from the University of Jyväskylä. Contact him at [email protected].

Kimmo Roimela is a principal researcher of computer graphics and mixed reality at the Nokia Research Center in Tampere, Finland. His research interests include real-time rendering, graphics data compression, augmented reality, context-based applications, urban modeling, and physically based illumination. Roimela has an MSc from the Tampere University of Technology. Contact him at [email protected].

Ville-Veikko Mattila is a senior manager at the Nokia Research Center. His research interests include audio and perception-based signal processing, human perception, and user experience, and his later research has focused on mobile multiplayer gaming and mixed and augmented reality. Mattila received his Dr.Tech. in information technology from the Tampere University of Technology. Contact him at [email protected].
