GIS AND BIG DATA: THEORY AND BEST PRACTICE CASE STUDIES Dr. Dave Schrader Director – Strategy and Marketing, Teradata October 2012 – University of Redlands

Page 1

GIS AND BIG DATA:

THEORY AND BEST PRACTICE CASE STUDIES

Dr. Dave Schrader

Director – Strategy and Marketing, Teradata

October 2012 – University of Redlands

Page 2

WHO IS TERADATA?

WHAT IS TERADATA’S STRATEGY?

HOW DO BIG DATA AND GEOSPATIAL FIT?

Page 3

3 © Teradata 2012

• Founded 1979, first shipment 1984

• $2.4B a year in revenues, growing 22%

• Leading vendor of Enterprise-sized Data Warehouses (HW, SW, PS)

• Engineering HQ is in Rancho Bernardo

• We sell to the Global 3000, blue chip customer base

• Well-known to all database experts

• Moving from “back office” to “frontline” (Active), increasing # of data types

TERADATA

Page 4

The Teradata Story – History of Big Data

1983: Teradata ships 1st system to Wells Fargo

Jan 1992: Walmart passes 1TB

Jan 2006: WMT loads 1B rows/day, 1 hr latency

June 2012: eBay loads 1TB/minute

More than 25 customers with >25,000 Terabytes at their fingertips

Page 5

What Data is Driving Growth? … The W’s

• More detailed data comes from

> Detailed Customer Behavioral Data
– “Where” in all industries: mobile and geospatial
– “What and When” granularity – e.g., browsing on web, including non-clicks and non-transactions
– Telco: all the detail behind each phone call (BSS, OSS): location
– Social networking data – tweets, blogs

> Detailed Operations Data
– “How” – Process data
– Network congestion, goal planning
– Transportation optimizations in real-time
– Manufacturing: sensor and test data

Page 6

Purpose-Built Teradata Platform Family

Data Mart Appliance (560)
– Purpose: Test & Development -or- Data Marts
– Scalability: SMP, up to 12TB

Extreme Data Appliance (1650)
– Purpose: Strategic Analytics on Extreme Data Volumes
– Scalability: MPP, up to 186PB

Data Warehouse Appliance (2690)
– Purpose: Data Warehouse -or- Departmental Data Marts
– Scalability: MPP, up to 315TB

Extreme Performance Appliance (4600)
– Purpose: Extreme Performance for Operational Analytics
– Scalability: MPP, up to 18TB

Active Intelligent Data Warehouse (66XX)
– Purpose: Enterprise Scale Strategic & Operational Intelligence
– Scalability: MPP, up to 92PB

Also shown: Active Users, Scalability, Flexibility

Page 7

TOP RATING BY GARTNER - DBMS

Why the TOP Rating for Data Warehousing?

Happy Customers!

Superior Technology!

Innovative Users!

Page 8

The Next Generation of Analytics: Trends

• Transaction: Value to the business

• Interaction: EXPERIENCE with the business

• Consumer is CEO of the household

• Consumers making intelligent decisions based upon analytics & perfect economic information

• Format: Structured & MULTI-STRUCTURED Data

• Type: Web, social, location, device, channel

• VOLUME and VELOCITY

Page 9

Teradata and its Acquisitions

Data Warehousing

• Teradata Integrated Data Warehouse

• Operational BI/Intelligence

• Platform Family

• Interoperability & Consulting

Big Data Analytics

• Aster Data

• Extreme Data Appliance

• Partnerships

Business Applications

• Aprimo Applications

• Strategic Partnerships

Page 10

TERADATA + GEOSPATIAL

Page 11

Temptation: Build Analytic Silos, Geospatial Silos

[Diagram labels: Data Warehouse, BIG DATA, OLAP Cubes, Agile Analytics, Data Mining, Geospatial, Application Development. The diagram includes a sample pivot table of Accts and Balances by REG2 and SEG1 for periods M01 to M07.]

Page 12

Analytics for Everyone

[Diagram labels: Data Warehouse, BIG DATA, OLAP Cubes, Agile Analytics, Data Mining, Geospatial, Application Development. The diagram includes a sample pivot table of Accts and Balances by REG2 and SEG1 for periods M01 to M07.]

20-40%+ wasted moving data

Page 13

Teradata Integrated Analytics

• Advanced Analytics: Optimized in-database data mining technology from leading vendors, open source and Teradata

• Temporal: Native temporal support to manage and update time dimension

• Geospatial: Native database geospatial data types and analytics

• Big Data Integration: Analytic platforms and partner tools to analyze unstructured and structured data

• Application Development: Tools and techniques to accelerate development of analytics

• Agile Analytics: In-database data labs to accelerate exploration of new data and ideas

Teradata Database

Teradata Open Parallel Framework: Custom Services, Embedded Services, Virtual Machines

Teradata Purpose-Built Platform Family

Page 14

Native Geospatial Data Types

Spatial Data Integrated with Non-Spatial Data

• Geospatial is a feature that allows us to store, process, consume geospatial data

• Teradata Geospatial based on the ST_Geometry data type

> SQL/MM Standard
> Like numeric or string types native to Teradata
> Location is type ST_Geometry

– Point (x y)

– Line or curve (xy, xy, xy)

– Polygon (xy, xy, xy, xy..)

Geocoded Customer Table Example:

Customer ID (Integer) | Customer Name (Char) | Customer Address (Char) | Customer Type (Char) | Location (ST_Geometry)
38327 | John Smith | 2110 Oak St. San Francisco, CA 94112 | C | Point (37.40113, 122.2091)
39234 | William White | 100 Broadway, Dearborn, MI 21002 | A | Point (42.153, -83.1078)

[Figure labels: point, line, polygon]
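As a rough illustration of the geocoded customer table, the sketch below models a row in Python and renders its location as well-known text (WKT), using the `Point (x y)` form shown on the slide. The `Customer` class and the `point_wkt` helper are hypothetical names introduced here, a plain tuple stands in for the ST_Geometry column, and the coordinates are copied from the slide in the order they appear there.

```python
from dataclasses import dataclass
from typing import Tuple


def point_wkt(x: float, y: float) -> str:
    """Render a point as well-known text, e.g. 'POINT (1.5 2.0)'."""
    return f"POINT ({x} {y})"


@dataclass
class Customer:
    """Hypothetical stand-in for one row of the geocoded customer table."""
    customer_id: int
    name: str
    address: str
    customer_type: str
    location: Tuple[float, float]  # stands in for an ST_Geometry point

    def location_wkt(self) -> str:
        return point_wkt(*self.location)


# Values copied from the slide's example rows.
customers = [
    Customer(38327, "John Smith", "2110 Oak St. San Francisco, CA 94112",
             "C", (37.40113, 122.2091)),
    Customer(39234, "William White", "100 Broadway, Dearborn, MI 21002",
             "A", (42.153, -83.1078)),
]

for c in customers:
    print(c.customer_id, c.customer_type, c.location_wkt())
```

The point of the slide carries over: the location sits in the same row as the non-spatial attributes, so it can be filtered and joined like any other column.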

Page 15

Teradata Geospatial Spatial Methods – sample

High Speed Big Data Analytics

Measurements: ST_Area, ST_Distance, ST_SphericalDistance, ST_SpheroidalDistance, ST_Perimeter, ST_Length

Spatial Relationships: ST_Intersects, ST_Overlaps, ST_Relate, ST_Touches, ST_Within, ST_Contains, ST_Disjoint, ST_Crosses, ST_Equals

Attribute: ST_AsBinary, ST_AsText, ST_CoordDim, ST_Dimension, ST_GeometryType, ST_IsEmpty, ST_IsSimple, ST_IsClosed, ST_NumPoints, ST_SRID, …

Spatial Operator: ST_Buffer, ST_Intersection, ST_Boundary, ST_Difference, ST_Envelope, ST_ExteriorRing, ST_GeometryN, ST_InteriorRingN, ST_Transform
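To give a feel for what a few of these method families compute, here is a minimal Python sketch of planar analogues of an area, a perimeter, an envelope and a distance measurement. These are illustrative helpers written for this transcript, not Teradata's implementations; the function names and the sample rectangle are assumptions, and everything is computed on flat x/y coordinates.

```python
from math import hypot
from typing import List, Tuple

Point = Tuple[float, float]


def _edges(ring: List[Point]):
    """Yield consecutive vertex pairs, closing the ring back to its start."""
    return zip(ring, ring[1:] + ring[:1])


def area(ring: List[Point]) -> float:
    """Planar polygon area via the shoelace formula."""
    total = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in _edges(ring))
    return abs(total) / 2.0


def perimeter(ring: List[Point]) -> float:
    """Sum of the planar edge lengths of the ring."""
    return sum(hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in _edges(ring))


def envelope(ring: List[Point]) -> Tuple[float, float, float, float]:
    """Bounding box as (min_x, min_y, max_x, max_y)."""
    xs = [x for x, _ in ring]
    ys = [y for _, y in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def distance(a: Point, b: Point) -> float:
    """Planar (Euclidean) distance between two points."""
    return hypot(b[0] - a[0], b[1] - a[1])


rectangle = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(area(rectangle))                    # 12.0
print(perimeter(rectangle))               # 14.0
print(envelope(rectangle))                # (0.0, 0.0, 4.0, 3.0)
print(distance((0.0, 0.0), (3.0, 4.0)))   # 5.0
```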

Page 16

Geospatial Queries

Answering ‘Where’

• ST_Geometry functions…

> Measurements
– Distance, surface, perimeter…

> Relationship between two objects
– Intersect, contains, within, adjacent…

> Simplified Example - find top 100 customers by value within the store area boundaries and their distance from the store:

    SELECT TOP 100 C.name, C.address, C.value,
           C.location.ST_Distance(S.location) AS Distance
    FROM cities C, stores S, store_area SA
    WHERE S.id = 1 AND S.id = SA.id
      AND C.location.ST_WITHIN(SA.area)
    ORDER BY 3 DESC;

[Map legend: Customer, Retail Outlet, Distance, Competitor outlet, Mail Campaign Targets, Store Area]
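The logic of the simplified query can be sketched outside the database as follows: keep the customers whose location falls within the store area, compute each one's distance from the store, and return the top rows by value. This is only an illustration under stated assumptions. The `Customer` tuple, `within`, `top_customers` and the sample data are hypothetical, coordinates are treated as planar, and the containment test is a standard ray-casting check rather than the database's ST_Within.

```python
from math import hypot
from typing import List, NamedTuple, Tuple

Point = Tuple[float, float]


class Customer(NamedTuple):
    name: str
    value: float
    location: Point


def within(p: Point, ring: List[Point]) -> bool:
    """Ray-casting point-in-polygon test (boundary points get no special handling)."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def top_customers(customers: List[Customer], store: Point,
                  store_area: List[Point], limit: int = 100):
    """Return (name, value, distance) rows inside the area, highest value first."""
    rows = [
        (c.name, c.value,
         hypot(c.location[0] - store[0], c.location[1] - store[1]))
        for c in customers
        if within(c.location, store_area)
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


# Made-up sample data for illustration only.
store = (5.0, 5.0)
store_area = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
customers = [
    Customer("Ann", 900.0, (5.0, 8.0)),
    Customer("Bob", 1500.0, (2.0, 1.0)),
    Customer("Cy", 3000.0, (12.0, 5.0)),  # lies outside the store area
]

result = top_customers(customers, store, store_area, limit=2)
print(result)
```

Sorting on the value column mirrors the query's `ORDER BY 3 DESC`, since value is the third selected column.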

Page 17

Telco – Retail

Accelerates Analytics with Teradata

Find the 3 closest stores within 50 miles of each customer location.

> Over 30 million customers
> Over 2,200 stores
> Target customers changing frequently

Manual Geospatial Analytics

• Calculate distance between each store and customer

> Calculations based on complex trigonometric functions
> Over 65 billion calculations
> Filter results <= 50 miles
> Retained 1 billion results

In-database Geospatial Analytics

• Teradata Geospatial functions

> Set a 50 mile buffer (filter) for stores
> Identify customers within the buffer
> Calculate spherical distance for those customers

25 times faster

[Figure labels: Store, Store]
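The contrast between the two approaches can be sketched in a few lines of Python. The brute-force pair count follows directly from the slide's figures (30 million customers times 2,200 stores). The filter-then-measure idea is approximated here with a latitude band standing in for the 50 mile buffer, followed by a haversine distance for the survivors. This is a sketch under stated assumptions, not the customer's or Teradata's implementation: the function names, the Earth radius constant and the sample coordinates are all introduced for illustration.

```python
from math import asin, cos, degrees, radians, sin, sqrt
from typing import Dict, List, Tuple

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch
LatLon = Tuple[float, float]

# Distance calculations needed if every customer is paired with every store.
brute_force_pairs = 30_000_000 * 2_200


def spherical_distance(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in miles using the haversine formula."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def closest_stores(customer: LatLon, stores: Dict[str, LatLon],
                   radius_miles: float = 50.0, k: int = 3) -> List[Tuple[str, float]]:
    """Return up to k (store_id, miles) pairs within the radius, nearest first."""
    # Cheap pre-filter: a store within the radius cannot differ in latitude
    # by more than this many degrees, so others skip the trigonometry.
    lat_margin = degrees(radius_miles / EARTH_RADIUS_MILES)
    hits = []
    for store_id, loc in stores.items():
        if abs(loc[0] - customer[0]) > lat_margin:
            continue
        d = spherical_distance(customer, loc)
        if d <= radius_miles:
            hits.append((d, store_id))
    return [(store_id, d) for d, store_id in sorted(hits)[:k]]


# Made-up coordinates for illustration only.
customer = (34.0, -117.0)
stores = {
    "S1": (34.0, -117.1),
    "S2": (34.3, -117.0),
    "S3": (34.6, -117.0),
    "S4": (34.1, -117.0),
    "S5": (36.0, -117.0),  # well beyond 50 miles
}

print(brute_force_pairs)
print(closest_stores(customer, stores))
```

The pre-filter is what saves the work: most store and customer pairs are rejected by a simple comparison, and the trigonometric distance is computed only for candidates that could be within range.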

Page 18

Teradata Geospatial Analytics

• Integrated spatial and non-spatial data

• High speed processing of big data

• Innovation simplified via Data Labs

• Proven by industry leaders

Page 19

Big Data - provides enormous insight…

Customer behavior, calling/browsing habits, their social network…

…keyword use…

… location, travel destinations…

…personal profiles…

…sensor data and metrics…

…Opportunity to move beyond traditional analytics!

Page 20

A major Telco uses real-time analytics to find remedies for dropped mobile phone calls