A predictive model for frequently viewed tiles in a Web map. Sterling Quinn, MGIS Candidate, ESRI ArcGIS Server Product Engineer. Mark Gahegan, Faculty Advisor.

Upload: christopher-rimmel

Post on 29-Mar-2015


TRANSCRIPT

Page 1: A predictive model for frequently viewed tiles in a Web map Sterling Quinn MGIS Candidate ESRI ArcGIS Server Product Engineer Mark Gahegan Faculty Advisor

A predictive model for frequently viewed tiles in a Web map

Sterling Quinn, MGIS Candidate

ESRI ArcGIS Server Product Engineer

Mark Gahegan, Faculty Advisor

Page 2

Introduction

This project presents a model for predicting high-traffic areas of a Web map

Model output indicates where server-side cache of map tiles should be created

Page 3

Project objectives

Describe server-side caching of map tiles

Describe the need for selective caching

Present a predictive model for popular areas of the map

Describe ways the model could be used and evaluated

Page 4

Web map optimization and the advent of server-side caching

Page 5

Organizing large maps in manageable “tiles” is not new

Large paper map series are indexed in organized grids

CGIS, a pioneering GIS, used “frames” to organize data (right)

From Tomlinson, Calkins, & Marble, 1976, p. 56.

Page 6

Other techniques for organizing maps in tiles or grid systems

Pyramid technique successively generalizes rasters in groups of four cells (right)

Quadtree structures index datasets in a hierarchy of quadrants

From De Cola & Montagne, 1993, p. 1394.
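
The pyramid idea on this slide can be illustrated with a minimal sketch. It assumes the raster is a plain list of rows and that each group of four cells is generalized by averaging; the averaging rule and the function names `generalize` and `build_pyramid` are illustrative assumptions, not details taken from De Cola & Montagne.

```python
def generalize(raster):
    """Return a raster with half the rows and columns.

    Each output cell is the mean of one 2 x 2 group of input cells
    (averaging is an assumed generalization rule).
    """
    rows, cols = len(raster), len(raster[0])
    if rows % 2 or cols % 2:
        raise ValueError("raster dimensions must be even")
    return [
        [
            (raster[r][c] + raster[r][c + 1]
             + raster[r + 1][c] + raster[r + 1][c + 1]) / 4
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


def build_pyramid(raster):
    """Return all levels, from full resolution to the coarsest reachable one."""
    levels = [raster]
    # Keep halving while both dimensions can still be split into pairs.
    while len(levels[-1]) % 2 == 0 and len(levels[-1][0]) % 2 == 0:
        levels.append(generalize(levels[-1]))
    return levels


levels = build_pyramid([
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 4],
    [3, 3, 4, 4],
])
print(len(levels))   # 3 levels: 4 x 4, 2 x 2, 1 x 1
print(levels[1])     # [[1.0, 2.0], [3.0, 4.0]]
```

Each level holds a quarter of the cells of the level below it, which is the same parent and child relationship a quadtree uses to index quadrants.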

Page 7

The modern map tile

JPG or PNG image

Standard square dimensions (256 x 256 or 512 x 512)

Stored in large “caches” on the server at multiple scales

Page 8

Server-side caching of map tiles is new

Traditional map servers (ArcIMS, WMS) draw the image on the fly, which can take a while if the map is complex

Cached map tiles give extremely fast performance

Tiled maps allow users to retrieve just the needed pieces of the map
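
To show what "just the needed pieces" can mean in practice, here is a small sketch that lists the tiles overlapping a viewport. It assumes pixel coordinates measured from the top-left corner of the full map image at one scale and 256-pixel tiles; the function name `tiles_for_viewport` is hypothetical and not part of any particular map server.

```python
def tiles_for_viewport(left, top, width, height, tile_size=256):
    """Return (column, row) indexes of every tile that overlaps the viewport.

    All arguments are in pixels of the full map image at the current scale,
    with the origin at the top-left corner (an assumed convention).
    """
    first_col = left // tile_size
    first_row = top // tile_size
    last_col = (left + width - 1) // tile_size
    last_row = (top + height - 1) // tile_size
    return [
        (col, row)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


# A 500 x 300 pixel window needs only six tiles, however large the map is.
needed = tiles_for_viewport(left=300, top=100, width=500, height=300)
print(needed)
```

The client requests only these tiles from the cache, so the cost of a pan or zoom depends on the viewport size rather than on the extent of the map.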

Page 9

Advent of tiled maps and server-side caching

Microsoft Terra Server was an early deployment of massive amounts of cached imagery tiles

Google Maps serves cached map tiles with AJAX techniques to create a “seamless” Web mapping experience

Page 10

Tiles in Google Maps are quickly retrieved as you navigate

From Google Maps: http://maps.google.com

Page 11

Many sites have followed Google’s pattern

MapQuest: http://www.mapquest.com

Yahoo Maps: http://maps.yahoo.com

Microsoft Virtual Earth: http://maps.live.com

Page 12

Caching options

Page 13

Current caching options

Current GIS software allows analysts to create tile caches for their own maps: ESRI’s ArcGIS Server, Mapnik, Microsoft MapCruncher

Page 14

Caching can require enormous resources on the server

Caches covering big areas at large scales can include millions of tiles: many gigabytes, or even terabytes, of storage; days, weeks, or sometimes months to generate

Many GIS shops lack resources to maintain large caches
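
A rough back-of-the-envelope sketch shows how quickly a full cache grows. It assumes a tiling scheme in which every scale level has four times as many tiles as the level above it, and a hypothetical average tile size of 15 KB; neither figure comes from the presentation, and real caches vary with the tiling scheme and image format.

```python
def tiles_at_level(level, base_tiles=1):
    """Tiles at one scale level, assuming each level quadruples the count."""
    return base_tiles * 4 ** level


def tiles_in_cache(levels, base_tiles=1):
    """Total tiles across all levels 0 .. levels - 1."""
    return sum(tiles_at_level(level, base_tiles) for level in range(levels))


def storage_gb(tile_count, avg_tile_kb=15):
    """Approximate storage in gigabytes (avg_tile_kb is a hypothetical value)."""
    return tile_count * avg_tile_kb / 1024 ** 2


total = tiles_in_cache(12)
print(total)                        # 5592405 tiles for 12 levels
print(round(storage_gb(total), 1))  # about 80 GB at 15 KB per tile
```

Under these assumptions the deepest level alone accounts for roughly three quarters of all tiles, which is why skipping rarely viewed areas at large scales saves the most time and disk space.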

Page 15

Selective caching as a strategy for saving resources

Administrator can cache only the areas anticipated to be most visited

Remaining areas can be added to the cache “on-demand” when the first user navigates there, or filled with a “Data not available” tile
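
The two options for uncached areas can be sketched as a single lookup routine. This is a minimal illustration, assuming the cache is a dictionary keyed by tile index and that `render` is whatever function draws a tile on the fly; `serve_tile`, `render`, and the placeholder bytes are hypothetical names, not the API of ArcGIS Server or any other product.

```python
def serve_tile(key, cache, render, on_demand=True,
               placeholder=b"DATA NOT AVAILABLE"):
    """Return the image bytes for one tile under a selective-caching policy."""
    if key in cache:
        return cache[key]      # pre-cached, or already requested once
    if on_demand:
        tile = render(key)     # the first visitor waits for the drawing
        cache[key] = tile      # later visitors get the cached copy
        return tile
    return placeholder         # stand-in for a "Data not available" tile


render_calls = []


def render(key):
    """Pretend renderer that records how often it is invoked."""
    render_calls.append(key)
    return b"rendered"


cache = {(0, 0): b"precached"}
serve_tile((0, 0), cache, render)   # served from the pre-built cache
serve_tile((5, 7), cache, render)   # drawn once, then stored
serve_tile((5, 7), cache, render)   # served from the cache this time
print(len(render_calls))            # 1
```

With `on_demand=False` the same call returns the placeholder instead of drawing anything, which trades completeness for a fixed server load.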

Page 16

Benefits of selective caching

Wise because some tiles (ocean, desert) will rarely, if ever, be accessed

Saves time

Saves disk space

Page 17

Implications of selective caching

Requires an admission that some areas are more important than others

Poses challenge of predicting popular areas before the map is released

Page 18

The need for a predictive model

Page 19

Project presents a predictive model for where to pre-cache tiles

“Which places are most interesting?”

Inputs are datasets readily available to a GIS analyst

Output vector features serve as a template for where to pre-cache tiles

Page 20

Purpose of the model

Help majority of users see a fast Web map while minimizing cache creation time and storage space

Page 21

Not a descriptive model

Descriptive model shows where users have already viewed

Microsoft Hotmap is a good example of a descriptive tool (right)

Descriptive models useful for deriving and validating predictive models

From Microsoft Hotmap: http://hotmap.msresearch.us

Page 22

Advantages of a predictive model

Doesn’t require the map to be deployed already

Can include fixed and varying geographic phenomena

Has applications far beyond map caching

Page 23

Proposed methods

Page 24

Study area and conditions

Model predicts frequently viewed places for a general base map

May create models for thematic maps if time allows

Study area of California

Page 25

Input datasets

Populated / developed areas

Road networks

Coastlines

Points of interest

Page 26

Populated / developed areas

Human Influence Index grid by the Socioeconomic Data and Applications Center (SEDAC) at Columbia University

Model selects all grid cells over a certain value
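
The cell selection can be sketched as a simple threshold test. The grid values and the cutoff of 30 below are made up for illustration; the presentation does not state the value the model uses, and `select_cells` is a hypothetical helper rather than a GIS tool.

```python
def select_cells(grid, threshold):
    """Return the (row, col) positions whose value is over the threshold."""
    return {
        (r, c)
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value > threshold
    }


# Toy index grid: higher numbers stand for more human influence.
influence = [
    [5, 40, 12],
    [62, 10, 31],
]
populated = select_cells(influence, threshold=30)
print(sorted(populated))   # [(0, 1), (1, 0), (1, 2)]
```

Raising or lowering the threshold is one direct way to trade cache size against how much of the populated area is covered.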

Page 27

Road networks

Major roads buffered by a given distance

All roads within national parks, monuments, historical sites, and recreation areas, buffered by a given distance
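
A buffer test for a road can be sketched without a GIS library by checking the distance from a location to each road segment. This assumes planar (projected) coordinates in consistent units; the names `distance_to_segment` and `within_buffer` are illustrative, and a production workflow would more likely use the buffer tool of the GIS software.

```python
import math


def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:                       # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(point, polyline, distance):
    """True if the point lies within `distance` of any segment of the road."""
    return any(
        distance_to_segment(point, a, b) <= distance
        for a, b in zip(polyline, polyline[1:])
    )


road = [(0, 0), (10, 0), (10, 10)]
print(within_buffer((5, 2), road, distance=3))   # True: 2 units from the road
print(within_buffer((5, 5), road, distance=3))   # False: 5 units from the road
```

Applying the test to every candidate cell or tile centre yields the set of locations that fall inside the road buffer.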

Page 28

Coastlines

All coastlines buffered by a given distance (wider buffer on inland side)
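
One way to sketch a buffer that is wider on the inland side is to pick the distance limit according to which side of the coastline a location falls on. The sketch assumes planar coordinates and, as a labeled convention of this example only, that coastline vertices are ordered so that land lies to the left of the direction of travel; `near_coast` and the distances used are hypothetical.

```python
import math


def near_coast(point, coastline, inland_dist, seaward_dist):
    """True if the point is inside an asymmetric buffer around the coastline.

    Assumes land is on the left of each segment (vertex order convention).
    """
    px, py = point
    for (ax, ay), (bx, by) in zip(coastline, coastline[1:]):
        dx, dy = bx - ax, by - ay
        length_sq = dx * dx + dy * dy
        if length_sq == 0:
            continue                          # skip repeated vertices
        t = ((px - ax) * dx + (py - ay) * dy) / length_sq
        t = max(0.0, min(1.0, t))
        dist = math.hypot(px - (ax + t * dx), py - (ay + t * dy))
        # Cross product sign tells which side of the segment the point is on.
        cross = dx * (py - ay) - dy * (px - ax)
        limit = inland_dist if cross > 0 else seaward_dist
        if dist <= limit:
            return True
    return False


coast = [(0, 0), (10, 0)]   # heading east, so land (left side) is to the north
print(near_coast((5, 4), coast, inland_dist=5, seaward_dist=2))    # True
print(near_coast((5, -4), coast, inland_dist=5, seaward_dist=2))   # False
```

Near sharp bends the per-segment side test is only approximate, so this should be read as an illustration of the idea rather than a replacement for a proper one-sided buffer tool.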

Page 29

Points of interest

Set of 60 interesting points chosen by model author: mountain peaks, theme parks, sports arenas, etc.

Represents a flexible layer that could be tailored to local needs

Page 30

Deriving the output

Merge all layers together

Clip to California outline (with small buffer)

Remove small holes and polygons

Dissolve into one multipart feature

Simplify to remove unneeded vertices
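
The steps above operate on vector features, but the logic can be sketched with a raster analog in which every layer is a set of grid cells. This is an assumption made to keep the example self-contained: merging becomes a set union, clipping an intersection, and small holes and polygons become small connected groups of cells. Dissolving is implicit in the union, and vertex simplification has no counterpart here. The names `components` and `derive_output` and the `min_cells` parameter are hypothetical.

```python
def components(cells):
    """Split a set of (row, col) cells into 4-connected groups."""
    remaining, groups = set(cells), []
    while remaining:
        stack = [remaining.pop()]
        group = set(stack)
        while stack:
            r, c = stack.pop()
            for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if n in remaining:
                    remaining.remove(n)
                    group.add(n)
                    stack.append(n)
        groups.append(group)
    return groups


def derive_output(layers, study_area, min_cells):
    """Merge layers, clip to the study area, and clean up small pieces."""
    merged = set().union(*layers)            # merge all layers together
    clipped = merged & study_area            # clip to the study area
    # Fill small holes: unselected groups smaller than min_cells.
    for gap in components(study_area - clipped):
        if len(gap) < min_cells:
            clipped |= gap
    # Remove small polygons: selected groups smaller than min_cells.
    kept = [g for g in components(clipped) if len(g) >= min_cells]
    return set().union(*kept)


study_area = {(r, c) for r in range(5) for c in range(5)}
roads = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)}
points = {(4, 4)}                 # an isolated single cell
outside = {(9, 9)}                # falls outside the study area
result = derive_output([roads, points, outside], study_area, min_cells=2)
print(sorted(result))             # the filled 3 x 3 block in the corner
```

In this toy run the one-cell hole at (1, 1) is filled, the isolated cell at (4, 4) is dropped, and the cell outside the study area is clipped away.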

Page 31

Using the model output

Output a vector dataset that can be used as a template for creating cached tiles

Compare model output area with total area to understand percent coverage

Compare model output with actual usage over time

Refine if necessary
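
Both comparisons can be expressed as simple ratios. The sketch below assumes the model output and the study area are available as sets of equally sized cells, so that area can be approximated by a cell count, and that actual usage is a log of requested cell positions; `percent_coverage` and `hit_rate` are hypothetical helper names.

```python
def percent_coverage(predicted, study_area):
    """Share of the study area covered by the model output, in percent."""
    return 100.0 * len(predicted & study_area) / len(study_area)


def hit_rate(requests, predicted):
    """Fraction of logged requests that fall inside the predicted area."""
    if not requests:
        return 0.0
    hits = sum(1 for cell in requests if cell in predicted)
    return hits / len(requests)


study_area = {(r, c) for r in range(10) for c in range(10)}
predicted = {(r, c) for r in range(5) for c in range(5)}
request_log = [(0, 0), (0, 0), (9, 9), (1, 1)]   # made-up usage data

print(percent_coverage(predicted, study_area))   # 25.0
print(hit_rate(request_log, predicted))          # 0.75
```

A model that covers a small percentage of the area while capturing a large fraction of requests is doing its job; a low hit rate is the signal to refine the inputs or buffer distances.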

Page 32

Limitations

Models of world scope should account for Internet connectivity

Input datasets have varying collection dates

Input datasets vary in resolution and precision

Maps with many scales might require multiple iterations and variations of the model

Page 33

Questions?

Page 34

References

De Cola, L. & Montagne, N. (1993). The PYRAMID system for multiscale raster analysis. Computers & Geosciences, 19(10), 1393–1404.

Tomlinson, R. F., Calkins, H. W., & Marble, D. F. (1976). Computer Handling of Geographical Data. Paris: Unesco.