SpatialHadoop: A MapReduce Framework for Spatial Data. Presenter: Zhao Yuliang (赵郁亮), 2015-8-3, ICDE 2015

SpatialHadoop: A MapReduce Framework for Spatial Data

Presenter: Zhao Yuliang (赵郁亮), 2015-8-3, ICDE 2015

Executive Summary

• Propose a full-fledged MapReduce framework with native support for spatial data.

• Propose a new system architecture with four layers: language, operations, MapReduce, and storage.

• SpatialHadoop achieves orders of magnitude better performance than Hadoop for spatial data processing.

Outline

• Introduction

• Related work

• SpatialHadoop Architecture

• Experiments

Introduction

• The amount of spatial data is exploding, produced by devices such as smartphones, satellites, and medical devices.

• Hadoop has been adopted as a solution for scalable processing of huge datasets in many applications, e.g., machine learning, graph processing, and behavioral simulations.

• ESRI has released ‘GIS Tools on Hadoop’.

Introduction

• Parallel-Secondo

• MD-HBase

• Hadoop-GIS

• SpatialHadoop

Related work

• Specific spatial operations

R-tree construction

Range query

kNN query

All NN query

• Systems

Hadoop-GIS

MD-HBase

Parallel-Secondo

SpatialHadoop Architecture

SpatialHadoop Architecture

• Language Layer (Pigeon)

Data types

Spatial functions

kNN query

SpatialHadoop Architecture

• Storage Layer (Indexing): existing techniques for spatial indexing in Hadoop

1) Build only

2) Custom on-the-fly indexing

3) Indexing in HDFS

SpatialHadoop Architecture

• Storage Layer (Indexing): overview of indexing in SpatialHadoop

SpatialHadoop Architecture

Index Building

1) Partitioning

Step 1: Number of partitions

Step 2: Partition boundaries

Step 3: Physical partitioning

2) Local indexing

3) Global indexing
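
The three partitioning steps and the global index can be illustrated with a minimal Python sketch of a uniform grid over points. The slides give only the step names, so everything below is an assumption made for illustration: the function names, the sizing rule (file size inflated by an overhead ratio, then divided by the block size), and the choice of a uniform grid. It is not SpatialHadoop's actual code, and the local index inside each partition is left out.

```python
import math
from collections import defaultdict

def num_partitions(file_size, block_size, overhead=0.25):
    """Step 1 (assumed rule): enough partitions that each fits in one block."""
    return math.ceil(file_size * (1 + overhead) / block_size)

def grid_boundaries(mbr, n):
    """Step 2: split the space (x1, y1, x2, y2) into a uniform grid of >= n cells."""
    x1, y1, x2, y2 = mbr
    cols = math.ceil(math.sqrt(n))
    rows = math.ceil(n / cols)
    w, h = (x2 - x1) / cols, (y2 - y1) / rows
    return [(x1 + c * w, y1 + r * h, x1 + (c + 1) * w, y1 + (r + 1) * h)
            for r in range(rows) for c in range(cols)]

def partition_points(points, cells):
    """Step 3: write each point to the first cell that contains it."""
    parts = defaultdict(list)
    for px, py in points:
        for i, (cx1, cy1, cx2, cy2) in enumerate(cells):
            if cx1 <= px <= cx2 and cy1 <= py <= cy2:
                parts[i].append((px, py))
                break
    return dict(parts)

def build_global_index(cells):
    """Global index: partition id -> boundary rectangle of that partition."""
    return {i: cell for i, cell in enumerate(cells)}

cells = grid_boundaries((0, 0, 4, 4), 4)
parts = partition_points([(1, 1), (3, 1), (1, 3)], cells)
global_index = build_global_index(cells)
```

In this sketch each partition would be stored as one block-sized file, and the small global index would be kept in memory so that queries can decide which partitions to read.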

SpatialHadoop Architecture

Grid file

SpatialHadoop Architecture

R-tree

SpatialHadoop Architecture

R+-tree

SpatialHadoop Architecture

• MapReduce Layer

SpatialHadoop Architecture

• Operations Layer: Range Query, kNN
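
A rough sketch of how a range query might use a two-level index: the global index prunes partitions whose boundary misses the query rectangle, and only the surviving partitions are scanned. The slide names the operation without describing it, so the data layout (a dict from partition id to boundary, and a dict from partition id to points) and the function names are hypothetical. The kNN operation is not sketched here.

```python
def intersects(a, b):
    """True if rectangles a and b, given as (x1, y1, x2, y2), overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def range_query(global_index, partitions, query):
    """Return all points inside the query rectangle.

    global_index: partition id -> boundary rectangle (assumed layout)
    partitions:   partition id -> list of (x, y) points (assumed layout)
    """
    results = []
    for pid, mbr in global_index.items():
        if not intersects(mbr, query):
            continue  # global filter: this partition is never read
        for x, y in partitions.get(pid, []):
            # local filter: a plain scan stands in for a local index lookup
            if query[0] <= x <= query[2] and query[1] <= y <= query[3]:
                results.append((x, y))
    return results

global_index = {0: (0, 0, 2, 2), 1: (2, 0, 4, 2)}
partitions = {0: [(1, 1), (0.5, 1.5)], 1: [(3, 1)]}
hits = range_query(global_index, partitions, (0, 0, 1.2, 1.2))
```

With this query rectangle, partition 1 is skipped entirely by the global filter, which is where the saving over a full scan would come from.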

SpatialHadoop Architecture

• Operations Layer: Spatial Join

Step 1: Global join

Step 2: Local join

Step 3: Duplicate avoidance
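
The three join steps can be sketched in Python as follows. The slides do not say how duplicates are avoided; this sketch assumes the common reference-point technique, in which a pair is reported only by the cell pair whose common region contains the bottom-left corner of the two rectangles' intersection. The function names and the input layout (each file as a list of partitions, with rectangles replicated to every cell they overlap) are assumptions for illustration.

```python
def intersects(a, b):
    """True if rectangles a and b, given as (x1, y1, x2, y2), overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def intersection(a, b):
    """Intersection rectangle of a and b (only meaningful if they overlap)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def contains_point(cell, x, y):
    """Half-open test, so a point on a shared edge belongs to exactly one cell."""
    return cell[0] <= x < cell[2] and cell[1] <= y < cell[3]

def spatial_join(parts_r, parts_s):
    """parts_*: list of (cell boundary, rectangles stored in that cell)."""
    out = []
    for cell_r, recs_r in parts_r:
        for cell_s, recs_s in parts_s:
            if not intersects(cell_r, cell_s):
                continue  # Step 1, global join: skip disjoint partition pairs
            common = intersection(cell_r, cell_s)
            for r in recs_r:  # Step 2, local join: nested loop for brevity
                for s in recs_s:
                    if not intersects(r, s):
                        continue
                    ref_x, ref_y = intersection(r, s)[:2]
                    # Step 3, duplicate avoidance via the reference point
                    if contains_point(common, ref_x, ref_y):
                        out.append((r, s))
    return out

cell_a, cell_b = (0, 0, 2, 2), (2, 0, 4, 2)
r = (1, 0.5, 3, 1.5)    # overlaps both cells, so it is stored in both
s = (1.5, 0.5, 3.5, 1.5)
pairs = spatial_join([(cell_a, [r]), (cell_b, [r])],
                     [(cell_a, [s]), (cell_b, [s])])
```

Without the reference-point check the pair (r, s) would be found four times here, once per combination of cells; with it, only the pair of cells containing the point (1.5, 0.5) reports the result.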

Experiments

• Datasets

TIGER: spatial features in the US, such as streets and rivers (60 GB)

OSM: OpenStreetMap (60 GB)

NASA: 120 billion (4.6 TB)

SYNTH: 2 billion (128 GB, uniform distribution)

• Experiment Environment

Amazon EC2 cluster of up to 100 nodes

Hadoop 1.2.1 on Java 1.6

Experiments

• Evaluation: Range Query

Experiments

• Evaluation: Range Query

Experiments

• Evaluation: kNN

Experiments

• Evaluation: Spatial Join

Experiments

• Evaluation: Index Creation
