#Data17

Upload: dangduong

Post on 28-Mar-2018


TRANSCRIPT

Page 1: #Data17 - cdn.tri-digital.com/Tableau/2017/resources/17BS-013_PPT_SyscoAWS...

Building a data lake on AWS: Sysco's journey to predictive analytics

Greg Khairallah, Head of Business Development, Analytics, Amazon Web Services

Navin Advani, Senior Director, Business Technology, Sysco

State of Data Warehousing

Data Warehousing Challenges Today

Exponential Data Growth

Varying Data Types

Need Data Analyzed Faster

Benefits of Using Amazon Redshift

Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Amazon Redshift also includes Redshift Spectrum, allowing you to directly query exabytes of unstructured data in Amazon S3.

Amazon Redshift is: Fast, Simple, Elastic, Secure, Compatible, Low Cost

Financial and management reporting

Payments to suppliers and billing workflows

Web/Mobile clickstream and event analysis

Recommendation and predictive analytics

The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.

The Forrester Wave™: Big Data Warehouse, Q2 2017

Accelerate Migrations from Legacy Systems

"AWS Database Migration Service is the most impressive migration service we've seen" - Gartner

Amazon Redshift

Migrate

Over 1000 unique migrations to Amazon Redshift using DMS

Modernize your analytics platform

Data Lake = flexible set of web services that match your use cases

Why Amazon S3 for data lake

Durable: designed for 11 9s of durability

Available: designed for 99.99% availability

High performance: multipart upload, Range GET

Scalable: store as much as you need; scale storage and compute independently; no minimum usage commitments

Integrated: Amazon EMR, Amazon Redshift, Amazon DynamoDB, Amazon Athena

Easy to use: simple REST API, AWS SDKs, read-after-create consistency, event notification, lifecycle policies
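To make the two design targets on this slide concrete, the sketch below turns "11 9s of durability" and "99.99% availability" into an expected object loss per year and a yearly downtime allowance. It assumes both targets are read as annual figures, and the object count of 10 million is a hypothetical example, not a number from the talk.

```python
from fractions import Fraction

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; leap years ignored


def failure_fraction(nines: int) -> Fraction:
    """Fraction left over after `nines` nines, e.g. 4 -> 1/10,000 (99.99%)."""
    return Fraction(1, 10 ** nines)


def expected_objects_lost_per_year(object_count: int, nines: int = 11) -> Fraction:
    """Average number of objects lost per year when each object is designed
    to survive the year with `nines` nines of durability."""
    return object_count * failure_fraction(nines)


def downtime_minutes_per_year(nines: int = 4) -> float:
    """Minutes per year that fall outside the availability target."""
    return float(MINUTES_PER_YEAR * failure_fraction(nines))


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical object count, for illustration only
    loss = expected_objects_lost_per_year(objects)
    print(f"Expected loss: {float(loss)} objects per year")
    print(f"That is one object every {1 / loss} years on average")
    print(f"99.99% availability allows about {downtime_minutes_per_year():.2f} minutes per year")
```

Exact fractions are used instead of floats so that the eleven-nines arithmetic is not blurred by rounding error.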

Big Data on AWS

Immediate Availability: Deploy instantly. No hardware to procure, no infrastructure to maintain & scale.

Trusted & Secure: Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS.

Broad & Deep Capabilities: Over 70 services and 100s of features to support virtually any big data application & workload.

Hundreds of Partners & Solutions: Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack.

Sysco Foods: An Overview

Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers who prepare meals away from home.

Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries.

For Fiscal Year 2017, which ended July 1, 2017, Sysco generated sales of more than $55 billion.


Current State Challenges

Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.

Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports and transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.

Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.

Scalability & Stability Issues: The reporting team is currently above capacity, with several thousand custom reports running. Performance issues and reporting delays due to data load are causing instabilities.

Future State Goals

Enable Revenue Growth - Better enable business decisions through data visibility and consistency

Improve Operational Efficiency - Increase the efficiency of business processes through data management best practices

Enhanced Customer Experience - Deliver more intuitive information to our internal and external customers through a self-serve reporting model

Enterprise View of Data - Consolidated view of the customers, suppliers, and products data from Sysco SUS and SAP broadline and specialties companies (Canada, Sygma, etc.) in one physical location

Reduce Total Cost of Ownership and Deliver Value Faster - Faster time to market for insights at a lower price

Provide accuracy, timeliness, and fidelity to the BI reporting process

Next-generation architecture that fosters innovation and reduces costs

Change the BI consumption pattern, i.e. move from hindsight to insight-driven reporting

Take manual workload off the team and enable them to become data analysts rather than report creators

Enable decommissioning of triplicated business applications and processes

Benefits of Transition

Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share.

The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau.

Key areas: Merchandising, Supply Chain, Sales & Margin Management

Initiatives: CatMan, Operational Data Insights, RevMan Opportunity Tracking and Cost to Serve

Targeted Insights

• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager

• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)

Category Management

Price Optimization

Operational Productivity Measures

The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation.

PEOPLE

- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training and re-tooling for individual and team growth

PROCESS

- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption

TECHNOLOGY

- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform

Business Value Derived from Data & Analytics

What is SEED (Sysco Ecosystem for Enterprise Data)?

SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.

SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.

SEED, being cloud native, also helps drive our Data Science and Agile journey forward, with the ability to quickly stand up sandbox environments for experimentation.

Demand-driven model with predictable & affordable costs

Stabilization of environments; reduced cost of delivery over time

Broad and deep functionality to support various use cases within data and analytics

Improved agility and quality with powerful tools for data manipulations and migrations

Why SEED

Leveraged the success and cost savings within the Netezza Lexington data mart transformation to go after the crown jewel: Sysco's "EDW Landscape". Based on the success of the Lexington transformation, the leadership team accelerated timelines.

[Architecture diagrams: legacy EDW landscape and the SEED ecosystem on AWS. Labels recovered from the figures:]

Source systems and feeds: SUS (AS/400), SWMS, 3rd party, Canada & Specialty, SAP, Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, Sygma, Freshpoint

Legacy platform: Informatica, BO ETL jobs, NETEZZA, Business Objects, Tableau; direct query, custom reporting, data extraction; 1010 service account, ETL service account, NETEZZA internal account, SAP ETL account

SEED layers: Ingestion/Collection Layer, Storage Layer, ELT Compute Layer, Analyze Layer, Auditing and Monitoring Layer

AWS services: Amazon S3 (raw data, transformed data, reportable data), Amazon Glacier (archive), AWS Lambda, Amazon EMR, AWS Data Pipeline, AWS Glue (post Phase II), Metastore, Amazon Redshift, Amazon Redshift Spectrum, Amazon RDS, Amazon Athena, Amazon CloudWatch, AWS CloudTrail

Consumers: extracts, other BI apps, internal, external, data scientist
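The diagram labels three kinds of data in Amazon S3 (raw, transformed, reportable) plus an Amazon Glacier archive. One way such a layout could be expressed is sketched below. It assumes one key prefix per zone and a date-partitioned key scheme. The prefix spellings, the source and dataset names, and the 90-day threshold are all hypothetical and are not taken from the talk. The rule dictionary follows the general shape of an S3 lifecycle rule, but nothing here calls AWS.

```python
from datetime import date

# Zone names mirror the three S3 data areas named in the diagram.
# The exact prefix spellings are an assumption made for this sketch.
ZONES = ("raw", "transformed", "reportable")


def object_key(zone: str, source: str, dataset: str, load_date: date, filename: str) -> str:
    """Build a date-partitioned object key under one of the zones."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return f"{zone}/{source}/{dataset}/{load_date:%Y/%m/%d}/{filename}"


def archive_rule(zone: str, days: int) -> dict:
    """Describe a lifecycle rule that transitions objects under a zone
    prefix to the GLACIER storage class after `days` days."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return {
        "ID": f"archive-{zone}",
        "Filter": {"Prefix": f"{zone}/"},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


if __name__ == "__main__":
    # Hypothetical source system, dataset, and file name.
    print(object_key("raw", "sus", "sales", date(2017, 10, 9), "part-0000.csv"))
    print(archive_rule("raw", 90))
```

Keeping the zone as the first path segment means a single prefix filter can scope a lifecycle rule, an access policy, or a query to one stage of the pipeline.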

Consolidated focus on data ingestion, consumption, and need for new capabilities led us to the Ecosystem approach

Architecture Simplification (ingestion, consumption, and new capabilities)

• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns

Cost optimization

• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage

User Value

• High-value BI capabilities drive development of the data warehouse
• Timely access to data - hrs/mins versus multiple days/months
• Enablement of advanced analytics

Enhanced reliability & accuracy

• Accurate data delivered via repeatable process
• Errors are identified and corrected before business use

Analytical Use Cases for the Business

Revenue Management

• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-through predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation

Merchandising and Supply Chain

• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis

Marketing

• Share of Wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis

The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas

SEED

• Analytical Sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
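Market basket analysis, one of the merchandising use cases listed above, usually comes down to three numbers per item pair: support, confidence, and lift. The sketch below computes them in plain Python. It is a generic illustration of the technique, not Sysco's implementation, and the four baskets are invented toy data.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(transactions):
    """Return support, confidence, and lift for every item pair (a, b),
    where a sorts before b and confidence is P(b in basket | a in basket)."""
    n = len(transactions)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # de-duplicate and fix the pair order
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    metrics = {}
    for (a, b), together in pair_counts.items():
        support = together / n
        confidence = together / item_counts[a]
        lift = confidence / (item_counts[b] / n)
        metrics[(a, b)] = {"support": support, "confidence": confidence, "lift": lift}
    return metrics


if __name__ == "__main__":
    # Invented toy baskets, for illustration only.
    baskets = [
        ["flour", "butter", "eggs"],
        ["flour", "butter"],
        ["flour", "eggs"],
        ["butter", "napkins"],
    ]
    for pair, m in sorted(pair_metrics(baskets).items()):
        print(pair, {k: round(v, 3) for k, v in m.items()})
```

A lift above 1 suggests the two items appear together more often than their individual frequencies would predict, which is the signal a cross-sell feed would look for.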

Interactive queries, ad-hoc queries, and data extract queries within each use case were evaluated and optimized for SLA requirements

Slow Dashboard Rendering

Memory Utilization reaching limits

Storage Limitations

Needed improved IOPS (Input/Output Operations Per Second)

Needed High Availability

Top most used Sites

Workbooks by Site

Proactive Monitoring and Growth Projection

Current System Specifications

Worker Nodes

• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (High frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB

Primary Node

On Prem: 2 Nodes, 16 Cores, 128 GB RAM

AWS: 3 Nodes, 16 Cores, 244 GB, 3000 IOPS

AWS: 6 Nodes, 40 Cores, 610 GB, 3000 IOPS

| | 2014 | 2015 | 2016 | 2017 | Scale Out |
|---|---|---|---|---|---|
| Total number of Server Users | 64 | 1700 | 3860 | 12713 | 20000 |
| Total number of Active Users | 64 | 1100 | 1375 | 5825 | 12000 |
| Dedicated Core / vCPU capacity | 16 | 40 | 80 vCPU | 80 vCPU | 192 vCPU |
| Concurrent Users | 11 | 55 | 120 | 350 | TBD |
| Max Concurrency | 16 | 60 | 150 | 400 | 960 |
| Number of Workbooks | 8 | 110 | 206 | 671 | TBD |
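The capacity figures above are easier to compare as ratios. The sketch below derives three of them from the slide's numbers: the share of server users who are active, the growth in server users, and maximum concurrency per vCPU. The function names are illustrative. The per-vCPU ratio is computed only for the columns the slide labels in vCPUs, since the 2014 and 2015 capacity entries carry no unit.

```python
PERIODS = ("2014", "2015", "2016", "2017", "Scale Out")

# Values transcribed from the slide; "Scale Out" is the planned target.
SERVER_USERS = dict(zip(PERIODS, (64, 1700, 3860, 12713, 20000)))
ACTIVE_USERS = dict(zip(PERIODS, (64, 1100, 1375, 5825, 12000)))
MAX_CONCURRENCY = dict(zip(PERIODS, (16, 60, 150, 400, 960)))
# Only the periods whose capacity is explicitly given in vCPUs.
VCPUS = {"2016": 80, "2017": 80, "Scale Out": 192}


def active_share(period: str) -> float:
    """Fraction of server users counted as active in the given period."""
    return ACTIVE_USERS[period] / SERVER_USERS[period]


def growth_factor(series: dict, start: str, end: str) -> float:
    """How many times larger the value at `end` is than at `start`."""
    return series[end] / series[start]


def max_concurrency_per_vcpu(period: str) -> float:
    """Maximum concurrent sessions per vCPU for a vCPU-labelled period."""
    return MAX_CONCURRENCY[period] / VCPUS[period]


if __name__ == "__main__":
    for period in PERIODS:
        print(f"{period}: {active_share(period):.1%} of server users active")
    print(f"Server users grew {growth_factor(SERVER_USERS, '2014', '2017'):.1f}x from 2014 to 2017")
    for period in VCPUS:
        print(f"{period}: {max_concurrency_per_vcpu(period):.3f} max concurrent users per vCPU")
```

On these numbers, 2017 and the scale-out target both come to 5 maximum concurrent users per vCPU (400 / 80 and 960 / 192).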

Benefits of moving to SEED on AWS

Scalability & Availability to meet Business Needs

Better Cost Leverage

Improved Capability

Security

Testing before implementation

Governance

AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today

Getting Started

To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.

To learn more and read the guide: Click Here

Quick Start Guide - Tableau

This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.

To learn more about the Quick Start and to get started with Tableau Server on AWS: Click Here

Jumpstarts

AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.

To learn about 47Lining's consulting offer: Click Here

To learn about NorthBay's consulting offer: Click Here

Please complete the session survey from the Session Details screen in your TC17 app

Page 3: # D a t a 1 7 - cdn.tri-digital.comcdn.tri-digital.com/Tableau/2017/resources/17BS-013_PPT_SyscoAWS... · ETL Service account NETEZZA Internal SAP ETL Account Tableau NETEZZA Informatica

State of Data Warehousing

Data Warehousing Challenges Today

Exponential Data Growth Varying Data Types Need Data Analyzed Faster

Benefits of Using Amazon Redshift

Amazon Redshift is a fast fully managed petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools Amazon Redshift also includes Redshift Spectrum allowing you to directly query exabytes of unstructured data in Amazon S3

Amazon Redshift is

Fast Simple Elastic Secure Compatible Low Cost

Financial and management reporting

Payments to suppliers and billing workflows

WebMobile clickstream and event analysis

Recommendation and predictive analytics

The Forrester Wavetrade is copyrighted by Forrester Research Inc Forrester and Forrester Wavetrade are trademarks of Forrester

Research Inc The Forrester Wavetrade is a graphical representation of Forresters call on a market and is plotted using a detailed

spreadsheet with exposed scores weightings and comments Forrester does not endorse any vendor product or service depicted in

the Forrester Wave Information is based on best available resources Opinions reflect judgment at the time and are subject to change

The Forrester Wavetrade Big Data Warehouse Q2 2017

Accelerate Migrations from Legacy Systems

ldquoAWS Database Migration Service is the most

impressive migration service wersquove seenrdquo ndash Gartner

Amazon Redshift

Migrate

Over 1000 unique

migrations to Amazon

Redshift using DMS

Modernize your analytics platformData Lake = flexible set of web services that match your use cases

Designed for 11 9s

of durability

Designed for

9999 availability

Durable Available High performance Multiple upload

Range GET

Store as much as you need

Scale storage and compute

independently

No minimum usage commitments

Scalable

Amazon EMR

Amazon Redshift

Amazon DynamoDB

Amazon Athena

Integrated

Simple REST API

AWS SDKs

Read-after-create consistency

Event notification

Lifecycle policies

Easy to use

Why Amazon S3 for data lake

Big Data on AWS

Immediate Availability Deploy instantly No hardware to

procure no infrastructure to maintain amp scale

Trusted amp Secure Designed to meet the strictest

requirements Continuously audited including certifications

such as ISO 27001 FedRAMP DoD CSM and PCI DSS

Broad amp Deep Capabilities Over 70 services and 100s of

features to support virtually any big data application amp

workload

Hundreds of Partners amp Solutions Get help from a

consulting partner or choose from hundreds of tools and

applications across the entire data management stack

Sysco FoodsAn Overview

Sysco is the global leader in selling marketing and distributing food products to restaurants healthcare and

educational facilities lodging establishments and other customers who prepare meals away from home

Sysco operates 197 distribution facilities serves about half a million customers in 13 countries

For Fiscal Year 2017 that ended July 1 2017 Sysco generated sales of more than $55 billion

COSTA RICA

Current State Challenges

Lack of Analytical Capabilities Lack of business analytical

capabilities to analyze large volume data across category

management customer insights price simulations etc

Reporting Inconsistencies and Long Lead Times Reporting

standards are not defined most reports transactions are tailored to

requests Multiple data source and systems creating spaghetti data

scenarios leading to inconsistencies

Creeping Cost of Ownership Aged and Siloed BI solutions and

processes are slowly increasing the total cost of ownership in storage

infrastructure maintenance and administration

Scalability amp Stability Issues Reporting team is currently above

capacity with several thousands custom reports running Issues with

performance delays in reporting due to data load causing instabilities

Future State Goals

Enable Revenue Growth - Better enable business decisions through data visibility and consistency

Improve Operational Efficiency - Increase the efficiency of business processes through data management best practices

Enhanced Customer Experience - Deliver more intuitive information to our internal and external customers through a self-serve reporting model

Enterprise View of Data - Consolidated view of customer, supplier, and product data from Sysco SUS and SAP broadline and specialty companies (Canada, Sygma, etc.) in one physical location

Reduce Total Cost of Ownership and Deliver Value Faster - Faster time to market for insights at a lower price

Provide accuracy, timeliness, and fidelity to the BI reporting process

Next-generation architecture that fosters innovation and reduces costs

Change the BI consumption pattern, i.e. move from hindsight to insight-driven reporting

Take manual workload off the team and enable them to become data analysts rather than report creators

Enable decommissioning of triplicated business applications and processes

Benefits of Transition

Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share.

The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau.

Merchandising
Initiative: CatMan
Targeted Insights:
• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance

Supply Chain
Initiative: Operational Data Insights
Targeted Insights:
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards

Sales & Margin Management
Initiative: RevMan Opportunity Tracking and Cost to Serve
Targeted Insights:
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager

Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and prior year (PY)
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)

Category Management

Price Optimization

Operational Productivity Measures

The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation.

PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training/re-tooling for individual and team growth

PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption

TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform

Business Value Derived from Data & Analytics

What is SEED (Sysco Ecosystem for Enterprise Data)?

SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.

SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.

SEED, being cloud native, also helps drive our Data Science and Agile journey forward, with the ability to quickly stand up sandbox environments for experimentation.

Demand-driven model with predictable & affordable costs

Stabilization of environments; reduced cost of delivery over time

Broad and deep functionality to support various use cases within data and analytics

Improved agility and quality with powerful tools for data manipulations and migrations

Why SEED

Leveraged the success and cost savings of the Netezza Lexington data mart transformation to go after the crown jewel: Sysco's "EDW Landscape". Based on the success of the Lexington transformation, the leadership team accelerated timelines.

[Figure: legacy EDW landscape. Labels: SUS (AS400), SUS & SWMS, 3rd party, Canada & Specialty, SAP; Informatica, BO ETL jobs; 1010 service account, Business Objects, Direct Query, Custom reporting, Data Extraction, ETL service account, NETEZZA internal, SAP ETL account; Tableau, NETEZZA, Informatica.]

[Figure: SEED architecture on AWS.
Layers: Ingestion/Collection Layer, Storage Layer, ELT Compute Layer, Analyze Layer, Auditing and Monitoring Layer.
Data feeds: Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS. Also labeled: Sygma, Freshpoint.
Storage: Amazon S3 (raw data, transformed data, reportable data), Amazon Glacier (archive).
Services: AWS Lambda, Amazon EMR, AWS Data Pipeline, Amazon Redshift, Amazon Redshift Spectrum, Amazon Athena, Amazon RDS, AWS Glue (post Phase II, including the metastore), Amazon CloudWatch, AWS CloudTrail.
Consumers: extracts, other BI apps, internal and external users, data scientists.]
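The storage layer of the SEED architecture keeps raw, transformed, and reportable data in Amazon S3. The deck does not describe how objects are laid out, so the following is only a minimal sketch of one common convention: one key prefix per zone, partitioned by source system and load date. The `DatasetPartition` class, the `next_zone` helper, the prefix names, and the example source `sus` and dataset `sales` are all assumptions made for illustration, not Sysco's actual layout.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Zone names come from the storage layer in the deck; using them as
# S3 key prefixes is an assumption of this sketch.
ZONES = ("raw", "transformed", "reportable")


@dataclass(frozen=True)
class DatasetPartition:
    """One load of one dataset from one source system (hypothetical model)."""
    source: str
    dataset: str
    load_date: date

    def key(self, zone: str, filename: str) -> str:
        """Build the object key for this partition in the given zone."""
        if zone not in ZONES:
            raise ValueError(f"unknown zone: {zone!r}")
        return (
            f"{zone}/{self.source}/{self.dataset}/"
            f"load_date={self.load_date.isoformat()}/{filename}"
        )


def next_zone(zone: str) -> Optional[str]:
    """Return the zone that data is promoted to next, or None at the end."""
    index = ZONES.index(zone)  # raises ValueError for an unknown zone
    return ZONES[index + 1] if index + 1 < len(ZONES) else None


# Hypothetical example: a sales extract from the SUS system.
partition = DatasetPartition(source="sus", dataset="sales", load_date=date(2017, 7, 1))
raw_key = partition.key("raw", "part-0000.csv")
print(raw_key)           # raw/sus/sales/load_date=2017-07-01/part-0000.csv
print(next_zone("raw"))  # transformed
```

Putting the zone first in the key is a choice of this sketch; it would let access rules or archival rules be scoped to a whole zone by prefix.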

Consolidated focus on data ingestion, consumption, and the need for new capabilities led us to the Ecosystem approach

Architecture Simplification (ingestion, consumption, and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns

Cost Optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage

User Value
• High-value BI capabilities drive development of the data warehouse
• Timely access to data - hours/minutes versus multiple days/months
• Enablement of advanced analytics

Enhanced Reliability & Accuracy
• Accurate data delivered via a repeatable process
• Errors are identified and corrected before business use

Analytical Use Cases for the Business

Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-through predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation

Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis

Marketing
• Share of wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis

The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas.

SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
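Market basket analysis, one of the merchandising use cases listed above, is usually built on three pair-level measures: support, confidence, and lift. The deck names the use case but does not show how it was implemented, so the following is a minimal sketch of the textbook calculation in plain Python. The `pair_rules` function and the sample orders are hypothetical and exist only for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets):
    """Return {(a, b): (support, confidence, lift)} for every rule a -> b.

    support    = share of baskets containing both a and b
    confidence = share of baskets with a that also contain b
    lift       = confidence divided by the overall share of baskets with b
    """
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore repeated items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(sorted(items), 2))

    rules = {}
    for (a, b), both in pair_counts.items():
        for x, y in ((a, b), (b, a)):
            support = both / n
            confidence = both / item_counts[x]
            lift = confidence / (item_counts[y] / n)
            rules[(x, y)] = (support, confidence, lift)
    return rules


# Hypothetical orders; the item names are made up for this example.
orders = [
    ["flour", "yeast", "butter"],
    ["flour", "yeast"],
    ["flour", "butter"],
    ["napkins"],
]
rules = pair_rules(orders)
support, confidence, lift = rules[("yeast", "flour")]
print(f"yeast -> flour: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 suggests the two items appear together more often than their individual frequencies would predict. At real order volumes this counting would more likely run in a distributed engine or in SQL than in a single Python process.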

Interactive queries, ad-hoc queries, and data-extract queries within each use case were evaluated and optimized for SLA requirements.

Slow Dashboard Rendering
Memory Utilization reaching limits
Storage Limitations
Needed improved IOPS (Input/Output Operations Per Second)
Needed High Availability

Top most used Sites
Workbooks by Site
Proactive Monitoring and Growth Projection

Current System Specifications

Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (high-frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB

Primary Node

On Prem: 2 Nodes, 16 Cores, 128 GB RAM
AWS: 3 Nodes, 16 Cores, 244 GB, 3000 IOPS
AWS: 6 Nodes, 40 Cores, 610 GB, 3000 IOPS

Metric                           2014   2015   2016      2017      Scale Out
Total number of Server Users     64     1700   3860      12713     20000
Total number of Active Users     64     1100   1375      5825      12000
Dedicated Core / vCPU capacity   16     40     80 vCPU   80 vCPU   192 vCPU
Concurrent Users                 11     55     120       350       TBD
Max Concurrency                  16     60     150       400       960
Number of Workbooks              8      110    206       671       TBD
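Two ratios help when reading the growth figures above: the share of server users who are active, and the maximum concurrency supported per unit of compute capacity. Neither ratio appears in the deck; the sketch below derives both from the table values. The capacity row is treated as a single series even though only its last three values are labeled vCPU, which is an assumption of this example.

```python
# Figures copied from the Tableau Server growth table in the deck.
PERIODS = ["2014", "2015", "2016", "2017", "Scale Out"]
SERVER_USERS = [64, 1700, 3860, 12713, 20000]
ACTIVE_USERS = [64, 1100, 1375, 5825, 12000]
# Assumption: treated as one series; the deck labels only the last three as vCPU.
CAPACITY = [16, 40, 80, 80, 192]
MAX_CONCURRENCY = [16, 60, 150, 400, 960]


def ratios(numerators, denominators):
    """Element-wise ratio of two equally long lists."""
    return [n / d for n, d in zip(numerators, denominators)]


active_share = dict(zip(PERIODS, ratios(ACTIVE_USERS, SERVER_USERS)))
concurrency_per_unit = dict(zip(PERIODS, ratios(MAX_CONCURRENCY, CAPACITY)))

for period in PERIODS:
    print(
        f"{period:>9}: {active_share[period]:.1%} of users active, "
        f"{concurrency_per_unit[period]:.2f} max concurrent users per capacity unit"
    )
```

With these figures, max concurrency per capacity unit is 5.0 for both the 2017 column and the scale-out column (400/80 and 960/192), so the scale-out projection is consistent with holding that ratio constant.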

Benefits of moving to SEED on AWS

Scalability & Availability to meet Business Needs

Better Cost Leverage

Improved Capability

Security

Testing before implementation

Governance

AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today.

Getting Started
To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.

Quick Start Guide - Tableau
This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.

Jumpstarts
AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.

Please complete the session survey from the Session Details screen in your TC17 app.


Getting StartedTo assist you in getting

started on Amazon Redshift

AWS has developed a guide

to help you begin your data

warehouse transformation

To learn more and read the guide

Click Here

Quick Start Guide ndash

TableauThis Quick Start helps you

deploy a modern enterprise

data warehouse (EDW)

environment that is based on

Amazon Redshift and

includes the analytics and

data visualization capabilities

of Tableau Server

To learn more about the Quick Start

and to get started with Tableau Server

on AWS

Click Here

JumpstartsAWS Partners 47Lining and

NorthBay have both

developed jumpstart

consulting offers for

customers to demonstrate the

effectiveness of their modern

data warehouse solutions

To learn about 47Liningrsquos consulting

offer

Click Here

To learn about NorthBayrsquos consulting

offer

Click Here

Please complete

the session survey

from the Session

Details screen in

your TC17 app

The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.

The Forrester Wave™: Big Data Warehouse, Q2 2017

Accelerate Migrations from Legacy Systems

“AWS Database Migration Service is the most impressive migration service we’ve seen” – Gartner

Amazon Redshift

Migrate

Over 1,000 unique migrations to Amazon Redshift using DMS

Modernize your analytics platform

Data Lake = flexible set of web services that match your use cases

Why Amazon S3 for data lake

Durable
• Designed for 11 9s of durability

Available
• Designed for 99.99% availability

High performance
• Multiple upload
• Range GET

Scalable
• Store as much as you need
• Scale storage and compute independently
• No minimum usage commitments

Integrated
• Amazon EMR
• Amazon Redshift
• Amazon DynamoDB
• Amazon Athena

Easy to use
• Simple REST API
• AWS SDKs
• Read-after-create consistency
• Event notification
• Lifecycle policies
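To make the "11 9s" durability figure concrete, here is a minimal sketch of the arithmetic. It assumes that eleven nines can be read as an average annual loss rate of 1e-11 per object; the object count of 10 million and the helper names are illustrative assumptions, not figures or code from this deck.

```python
from decimal import Decimal

# Assumption: "11 9s" durability means 99.999999999% of objects survive a year,
# i.e. an average annual loss rate of 1e-11 per object.
DURABILITY = Decimal("0.99999999999")
ANNUAL_LOSS_RATE = Decimal(1) - DURABILITY


def expected_annual_loss(object_count: int) -> Decimal:
    """Expected number of objects lost per year for a given object count."""
    return Decimal(object_count) * ANNUAL_LOSS_RATE


def years_per_lost_object(object_count: int) -> Decimal:
    """Average number of years between single-object losses."""
    return Decimal(1) / expected_annual_loss(object_count)


if __name__ == "__main__":
    objects = 10_000_000  # hypothetical data lake size, for illustration only
    print("expected objects lost per year:", expected_annual_loss(objects))
    print("years per lost object:", years_per_lost_object(objects))
```

Under these assumptions, storing 10 million objects corresponds to an expected loss of about one object every 10,000 years.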

Big Data on AWS

Immediate Availability: Deploy instantly. No hardware to procure, no infrastructure to maintain & scale.

Trusted & Secure: Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS.

Broad & Deep Capabilities: Over 70 services and 100s of features to support virtually any big data application & workload.

Hundreds of Partners & Solutions: Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack.

Sysco Foods: An Overview

Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers who prepare meals away from home.

Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries.

For Fiscal Year 2017, which ended July 1, 2017, Sysco generated sales of more than $55 billion.

Current State Challenges

Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.

Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports/transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.

Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.

Scalability & Stability Issues: The reporting team is currently above capacity, with several thousand custom reports running. Performance issues and reporting delays due to data loads are causing instabilities.

Future State Goals

Enable Revenue Growth – Better enable business decisions through data visibility and consistency

Improve Operational Efficiency – Increase the efficiency of business processes through data management best practices

Enhanced Customer Experience – Deliver more intuitive information to our internal and external customers through a self-serve reporting model

Enterprise View of Data – Consolidated view of customer, supplier, and product data from Sysco SUS and SAP broadline and specialty companies (Canada, Sygma, etc.) in one physical location

Reduce Total Cost of Ownership and Deliver Value Faster – Faster time to market for insights at a lower price

Provide accuracy, timeliness, and fidelity to the BI reporting process

Next-generation architecture that fosters innovation and reduces costs

Change the BI consumption pattern, i.e., move from hindsight to insight-driven reporting

Take manual workload off the team and enable them to become data analysts rather than report creators

Enable decommissioning of triplicated business applications and processes

Benefits of Transition

Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share.

The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau.

Merchandising
Initiative: CatMan
Targeted Insights:
• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance

Supply Chain
Initiative: Operational Data Insights
Targeted Insights:
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards

Sales & Margin Management
Initiative: RevMan Opportunity Tracking and Cost to Serve
Targeted Insights:
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager

• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)

Category Management

Price Optimization

Operational Productivity Measures

The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation.

PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training / re-tooling for individual and team growth

PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption

TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform

Business Value Derived from Data & Analytics

What is SEED (Sysco Ecosystem for Enterprise Data)?

SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.

SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.

SEED, being cloud native, also helps drive our Data Science and Agile journey forward with the ability to quickly stand up sandbox environments for experimentation.

Demand-driven model with predictable & affordable costs

Stabilization of environments; reduced cost of delivery over time

Broad and deep functionality to support various use cases within data and analytics

Improved agility and quality with powerful tools for data manipulations and migrations

Why SEED

Leveraged the success and cost savings of the Netezza Lexington data mart transformation to go after the crown jewel: Sysco’s “EDW Landscape”. Based on the success of the Lexington transformation, the leadership team accelerated timelines.

[Diagram labels, legacy EDW landscape: SUS (AS400); SUS & SWMS; 3rd party; Canada & Specialty; SAP; Informatica; BO ETL jobs; 1010 service account; ETL service account; NETEZZA internal; SAP ETL account; Business Objects; Tableau; Direct Query; Custom reporting; Data Extraction; NETEZZA]

[Architecture diagram labels. Layers: Ingestion/Collection Layer; Storage Layer; ELT Compute Layer; Analyze Layer; Auditing and Monitoring Layer; Consumers. Sources: Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS, Sygma, Freshpoint. Services: Amazon S3 (Raw Data, Transformed Data, Reportable Data); AWS Lambda; Amazon EMR; AWS Data Pipeline; AWS Glue (post Phase II); Amazon Redshift; Amazon Redshift Spectrum; Amazon RDS; Amazon Athena; Amazon Glacier (archive); Metastore; Amazon CloudWatch; AWS CloudTrail. Consumers: Extracts; Other BI apps; Internal; External; Data Scientist]
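The storage labels in the diagram above (raw, transformed, and reportable data in Amazon S3) suggest data moving through three zones. As a minimal sketch of how such zones might be expressed as object-key prefixes: the prefix names, the `build_key` and `promote` helpers, the `dt=` date partition, and the example source and file names are all hypothetical and are not taken from the SEED implementation.

```python
from datetime import date
from enum import Enum


class Zone(Enum):
    """Hypothetical S3 prefixes for the three storage zones named in the diagram."""
    RAW = "raw"
    TRANSFORMED = "transformed"
    REPORTABLE = "reportable"


def build_key(zone: Zone, source: str, dataset: str, load_date: date, filename: str) -> str:
    """Build an object key of the form zone/source/dataset/dt=YYYY-MM-DD/filename."""
    return "/".join([
        zone.value,
        source.lower(),
        dataset.lower(),
        f"dt={load_date.isoformat()}",
        filename,
    ])


def promote(key: str, target: Zone) -> str:
    """Return the key the same object would have in another zone."""
    _, rest = key.split("/", 1)  # drop the current zone prefix
    return f"{target.value}/{rest}"


if __name__ == "__main__":
    raw_key = build_key(Zone.RAW, "SWMS", "Inventory", date(2017, 10, 1), "part-0000.csv")
    print(raw_key)
    print(promote(raw_key, Zone.TRANSFORMED))
```

Keeping the zone as the leading prefix is one possible design choice: it lets lifecycle rules or access policies be scoped per zone, at the cost of rewriting keys when data is promoted.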

A consolidated focus on data ingestion, consumption, and the need for new capabilities led us to the Ecosystem approach

Architecture Simplification (Ingestion, consumption, and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns

Cost optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage

User Value
• High-value BI capabilities drive development of the data warehouse
• Timely access to data – hrs/mins versus multiple days/months
• Enablement of advanced analytics

Enhanced reliability & accuracy
• Accurate data delivered via repeatable process
• Errors are identified and corrected before business use

Analytical Use Cases for the Business

Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-through predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation

Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis

Marketing
• Share of Wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis

The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas.

SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
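Market basket analysis, one of the merchandising use cases listed above, is commonly described with three pairwise measures: support, confidence, and lift. The following is a minimal sketch of those standard definitions in plain Python; the `pair_metrics` function and the sample orders are invented for illustration and do not represent Sysco data or the method used in SEED.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(baskets):
    """Compute support, confidence, and lift for every ordered item pair (lhs -> rhs)."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate items within one basket
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    metrics = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]     # P(rhs | lhs)
            lift = confidence / (item_counts[rhs] / n)  # confidence relative to P(rhs)
            metrics[(lhs, rhs)] = {
                "support": support,
                "confidence": confidence,
                "lift": lift,
            }
    return metrics


if __name__ == "__main__":
    # Hypothetical orders, invented for illustration only.
    orders = [
        ["flour", "butter", "eggs"],
        ["flour", "butter"],
        ["eggs", "napkins"],
        ["flour", "eggs"],
    ]
    for rule, values in sorted(pair_metrics(orders).items()):
        print(rule, values)
```

A lift above 1 indicates that the two items appear together more often than their individual frequencies would suggest, which is the kind of signal a cross-sell feed could use.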

Interactive queries, ad-hoc queries, and data-extract queries within each use case were evaluated and optimized for SLA requirements

Slow Dashboard Rendering

Memory Utilization reaching limits

Storage Limitations

Needed improved IOPS (Input/Output Operations Per Second)

Needed High Availability

Top most used Sites

Workbooks by Site

Proactive Monitoring and Growth Projection

Current System Specifications

Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (high-frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB

Primary Node

On Prem: 2 Nodes, 16 Cores, 128 GB RAM

AWS: 3 Nodes, 16 Cores, 244 GB, 3000 IOPS

AWS: 6 Nodes, 40 Cores, 610 GB, 3000 IOPS

Metric | 2014 | 2015 | 2016 | 2017 | Scale Out
Total number of Server Users | 64 | 1700 | 3860 | 12713 | 20000
Total number of Active Users | 64 | 1100 | 1375 | 5825 | 12000
Dedicated Core / vCPU capacity | 16 | 40 | 80 vCPU | 80 vCPU | 192 vCPU
Concurrent Users | 11 | 55 | 120 | 350 | TBD
Max Concurrency | 16 | 60 | 150 | 400 | 960
Number of Workbooks | 8 | 110 | 206 | 671 | TBD
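The capacity table above can be read as simple ratios. The sketch below divides max concurrency by vCPU capacity for each period and computes period-over-period growth in active users. It assumes the 2014 and 2015 capacity values (16 and 40, which carry no unit in the table) are comparable to the later vCPU figures; the function names are illustrative.

```python
# Figures transcribed from the capacity table (2014, 2015, 2016, 2017, Scale Out).
PERIODS = ["2014", "2015", "2016", "2017", "Scale Out"]
VCPU_CAPACITY = [16, 40, 80, 80, 192]   # assumption: first two values treated as vCPUs
MAX_CONCURRENCY = [16, 60, 150, 400, 960]
ACTIVE_USERS = [64, 1100, 1375, 5825, 12000]


def per_vcpu(values, vcpus):
    """Divide each period's value by that period's vCPU capacity."""
    return [v / c for v, c in zip(values, vcpus)]


def growth_factors(values):
    """Growth factor between each pair of consecutive periods."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


if __name__ == "__main__":
    for period, ratio in zip(PERIODS, per_vcpu(MAX_CONCURRENCY, VCPU_CAPACITY)):
        print(f"{period}: {ratio} max concurrent users per vCPU")
    for period, factor in zip(PERIODS[1:], growth_factors(ACTIVE_USERS)):
        print(f"{period}: active users x{factor:.2f} vs previous period")
```

With these figures, max concurrency per vCPU works out to 1.0, 1.5, 1.875, 5.0, and 5.0, so the Scale Out column holds the 2017 ratio constant while increasing capacity from 80 to 192 vCPU.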

Benefits of moving to SEED on AWS

Scalability & Availability to meet Business Needs

Better Cost Leverage

Improved Capability

Security

Testing before implementation

Governance

AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today.

Getting Started
To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.
To learn more and read the guide: Click Here

Quick Start Guide – Tableau
This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.
To learn more about the Quick Start and to get started with Tableau Server on AWS: Click Here

Jumpstarts
AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.
To learn about 47Lining’s consulting offer: Click Here
To learn about NorthBay’s consulting offer: Click Here

Please complete the session survey from the Session Details screen in your TC17 app

Page 8: # D a t a 1 7 - cdn.tri-digital.comcdn.tri-digital.com/Tableau/2017/resources/17BS-013_PPT_SyscoAWS... · ETL Service account NETEZZA Internal SAP ETL Account Tableau NETEZZA Informatica

Modernize your analytics platformData Lake = flexible set of web services that match your use cases

Designed for 11 9s

of durability

Designed for

9999 availability

Durable Available High performance Multiple upload

Range GET

Store as much as you need

Scale storage and compute

independently

No minimum usage commitments

Scalable

Amazon EMR

Amazon Redshift

Amazon DynamoDB

Amazon Athena

Integrated

Simple REST API

AWS SDKs

Read-after-create consistency

Event notification

Lifecycle policies

Easy to use

Why Amazon S3 for data lake

Big Data on AWS

Immediate Availability Deploy instantly No hardware to

procure no infrastructure to maintain amp scale

Trusted amp Secure Designed to meet the strictest

requirements Continuously audited including certifications

such as ISO 27001 FedRAMP DoD CSM and PCI DSS

Broad amp Deep Capabilities Over 70 services and 100s of

features to support virtually any big data application amp

workload

Hundreds of Partners amp Solutions Get help from a

consulting partner or choose from hundreds of tools and

applications across the entire data management stack

Sysco FoodsAn Overview

Sysco is the global leader in selling marketing and distributing food products to restaurants healthcare and

educational facilities lodging establishments and other customers who prepare meals away from home

Sysco operates 197 distribution facilities serves about half a million customers in 13 countries

For Fiscal Year 2017 that ended July 1 2017 Sysco generated sales of more than $55 billion

COSTA RICA

Current State Challenges

Lack of Analytical Capabilities Lack of business analytical

capabilities to analyze large volume data across category

management customer insights price simulations etc

Reporting Inconsistencies and Long Lead Times Reporting

standards are not defined most reports transactions are tailored to

requests Multiple data source and systems creating spaghetti data

scenarios leading to inconsistencies

Creeping Cost of Ownership Aged and Siloed BI solutions and

processes are slowly increasing the total cost of ownership in storage

infrastructure maintenance and administration

Scalability amp Stability Issues Reporting team is currently above

capacity with several thousands custom reports running Issues with

performance delays in reporting due to data load causing instabilities

Future State Goals

Enable Revenue Growth - Better enable business decisions through

data visibility and consistency

Improve Operational Efficiency - Increase the efficiency of business

processes through data management best practices

Enhanced Customer Experience ndash Deliver more intuitive information

to our internal and external customers through self-serve reporting

model

Enterprise View Of Data - Consolidated view of the customers

suppliers and products data from Sysco SUS and SAP broadline and

specialties companies (Canada Sygma etc) in one physical location

Reduce Total Cost of Ownership and Deliver Value Faster ndash

Faster time to market for insights at a lower price

Provide accuracy timeliness and fidelity to the BI reporting process

Next generation architecture that fosters innovation and reduce costs

Change the BI consumption pattern ie move from hindsight to insight driven reporting

Take manual work load off the team and enable them becoming data analyst rather than report

creators

Enable decommissioning of triplicated business applications and processes

Benefits of Transition

Due to competitive market pressures there was a big push to streamline operating costs and the three key areas below

helped unlock savings drive top line growth and market share

The three year plan was enabled by quick actionable insights that were derived using tools like Tableau

Merchandising | Supply Chain | Sales & Margin Management

Initiative: CatMan | Operational Data Insights | RevMan Opportunity Tracking and Cost to Serve

Targeted Insights

• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager

• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)

Category Management

Price Optimization

Operational Productivity Measures

The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation.

PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training / re-tooling for individual and team growth

PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption

TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform

Business Value Derived from Data & Analytics

What is SEED (Sysco Ecosystem for Enterprise Data)?

SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.

SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.

SEED, being cloud native, inherently also helps drive our Data Science and Agile journeys forward, with the ability to quickly stand up sandbox environments for experimentation.

Demand-driven model with predictable & affordable costs

Stabilization of environments and reduced cost of delivery over time

Broad and deep functionality to support various use cases within data and analytics

Improved agility and quality with powerful tools for data manipulations and migrations

Why SEED

Leveraged the success and cost savings of the Netezza Lexington data mart transformation to go after the crown jewel: Sysco's "EDW Landscape". Based on the success of the Lexington transformation, the leadership team accelerated timelines.

[Diagram: EDW landscape. Source systems: SUS (AS400), SWMS, 3rd party, Canada & Specialty, SAP. Other labels: Informatica, BO ETL jobs, NETEZZA, NETEZZA internal, Business Objects, Tableau, direct query, custom reporting, data extraction, 1010 service account, ETL service account, SAP ETL account.]

[Diagram: SEED architecture on AWS.
Sources: Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS, Sygma, Freshpoint
Ingestion / Collection Layer
Storage Layer: Amazon S3 (raw data, transformed data, reportable data), Amazon Glacier archive, metastore (AWS Glue, post Phase II)
ELT Compute Layer: AWS Lambda, Amazon EMR, AWS Data Pipeline, AWS Glue (post Phase II)
Analyze Layer: Amazon Redshift, Amazon Redshift Spectrum, Amazon RDS, Amazon Athena
Auditing and Monitoring Layer: Amazon CloudWatch, AWS CloudTrail
Consumers: extracts, other BI apps, internal, external, data scientist]
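The three S3 data stages named in the architecture (raw, transformed, reportable) could be kept apart with a key-naming convention. The sketch below is only an illustration under stated assumptions: the prefix layout, the `DatasetPartition` type, and the `s3_key` and `promote` helpers are hypothetical and are not taken from the presentation. Date-partitioned prefixes of this kind are a common layout for data queried through Athena or Redshift Spectrum.

```python
from dataclasses import dataclass
from datetime import date

# Zone names come from the architecture slide; the key layout is an assumption.
ZONES = ("raw", "transformed", "reportable")


@dataclass(frozen=True)
class DatasetPartition:
    """One day's load of one dataset from one source system (hypothetical type)."""
    source: str      # e.g. "sus" or "swms" (illustrative values)
    dataset: str     # e.g. "sales"
    load_date: date


def s3_key(zone: str, part: DatasetPartition, filename: str) -> str:
    """Build an object key such as raw/sus/sales/year=2017/month=10/day=09/f.csv."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    d = part.load_date
    return (
        f"{zone}/{part.source}/{part.dataset}/"
        f"year={d.year:04d}/month={d.month:02d}/day={d.day:02d}/{filename}"
    )


def promote(key: str) -> str:
    """Return the key the same object would have in the next zone."""
    zone, rest = key.split("/", 1)
    idx = ZONES.index(zone)  # raises ValueError for an unknown zone
    if idx == len(ZONES) - 1:
        raise ValueError("object is already in the final zone")
    return f"{ZONES[idx + 1]}/{rest}"
```

With this convention, an ELT job reading from one zone and writing to the next only needs to swap the leading prefix, so the same partition path can be traced through all three stages.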

A consolidated focus on data ingestion, consumption, and the need for new capabilities led us to the Ecosystem approach.

Architecture Simplification (ingestion, consumption, and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns

Cost optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage

User Value
• High-value BI capabilities drive development of the data warehouse
• Timely access to data - hours/minutes versus multiple days/months
• Enablement of advanced analytics

Enhanced reliability & accuracy
• Accurate data delivered via repeatable process
• Errors are identified and corrected before business use

Analytical Use Cases for the Business

Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-thru predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation

Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis

Marketing
• Share of wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis

The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas.

SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data

Interactive queries, ad-hoc queries, and data extract queries within each use case were evaluated and optimized for SLA requirements.

Slow dashboard rendering

Memory utilization reaching limits

Storage limitations

Needed improved IOPS (Input/Output Operations Per Second)

Needed high availability

Top most used Sites

Workbooks by Site

Proactive Monitoring and Growth Projection

Current System Specifications

Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (high-frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB

Primary Node

On Prem: 2 nodes, 16 cores, 128 GB RAM
AWS: 3 nodes, 16 cores, 244 GB, 3000 IOPS
AWS: 6 nodes, 40 cores, 610 GB, 3000 IOPS

| Metric | 2014 | 2015 | 2016 | 2017 | Scale Out |
| Total number of Server Users | 64 | 1700 | 3860 | 12713 | 20000 |
| Total number of Active Users | 64 | 1100 | 1375 | 5825 | 12000 |
| Dedicated Core / vCPU capacity | 16 | 40 | 80 vCPU | 80 vCPU | 192 vCPU |
| Concurrent Users | 11 | 55 | 120 | 350 | TBD |
| Max Concurrency | 16 | 60 | 150 | 400 | 960 |
| Number of Workbooks | 8 | 110 | 206 | 671 | TBD |
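The growth figures in the table above can be turned into year-over-year factors and active-user shares with a few lines of Python. This is a minimal sketch for reading the table, not part of the presentation: the numbers are copied from the slide, while the function names and the choice of metrics are assumptions made for illustration.

```python
# Values copied from the capacity slide (2014 to 2017 columns).
YEARS = [2014, 2015, 2016, 2017]
SERVER_USERS = [64, 1700, 3860, 12713]
ACTIVE_USERS = [64, 1100, 1375, 5825]

# "Scale Out" column targets from the same slide.
SCALE_OUT_SERVER_USERS = 20000
SCALE_OUT_ACTIVE_USERS = 12000


def growth_factors(series):
    """Year-over-year multipliers: each value divided by the previous one."""
    return [later / earlier for earlier, later in zip(series, series[1:])]


def active_share(active, total):
    """Fraction of licensed server users who are active, per year."""
    return [a / t for a, t in zip(active, total)]


if __name__ == "__main__":
    for year, factor in zip(YEARS[1:], growth_factors(SERVER_USERS)):
        print(f"{year}: server users grew {factor:.2f}x over the prior year")
    for year, share in zip(YEARS, active_share(ACTIVE_USERS, SERVER_USERS)):
        print(f"{year}: {share:.0%} of server users were active")
    target = SCALE_OUT_ACTIVE_USERS / SCALE_OUT_SERVER_USERS
    print(f"Scale-out target: {target:.0%} active")
```

Tracking these two ratios over time is one simple way to support the proactive monitoring and growth projection that the slide calls for.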

Benefits of moving to SEED on AWS

Scalability & Availability to meet Business Needs

Better Cost Leverage

Improved Capability

Security

Testing before implementation

Governance

AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today.

Getting Started
To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.

Quick Start Guide - Tableau
This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.

Jumpstarts
AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.

Please complete the session survey from the Session Details screen in your TC17 app


Why Amazon S3 for data lake

Durable: Designed for 11 9s of durability

Available: Designed for 99.99% availability

High performance: Multiple upload, Range GET

Scalable: Store as much as you need. Scale storage and compute independently. No minimum usage commitments.

Integrated: Amazon EMR, Amazon Redshift, Amazon DynamoDB, Amazon Athena

Easy to use: Simple REST API, AWS SDKs, read-after-create consistency, event notification, lifecycle policies
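To make the durability and availability design targets concrete, the short sketch below works through the arithmetic. It assumes "11 9s" means a 99.999999999% chance that a given object survives a year, and that 99.99% availability is measured over a 365-day year. The function names are illustrative, and the results are design-target arithmetic, not measured figures.

```python
from fractions import Fraction

# Design targets quoted on the slide, expressed as exact fractions.
ANNUAL_OBJECT_LOSS_RATE = Fraction(1, 10**11)   # 1 - 0.99999999999 (eleven nines)
AVAILABILITY = Fraction(9999, 10000)            # 99.99%

MINUTES_PER_YEAR = 365 * 24 * 60                # assumes a 365-day year


def expected_annual_object_loss(object_count: int) -> Fraction:
    """Expected number of objects lost per year at the design durability."""
    return object_count * ANNUAL_OBJECT_LOSS_RATE


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between single-object losses."""
    return 1 / expected_annual_object_loss(object_count)


def max_annual_downtime_minutes() -> Fraction:
    """Minutes of unavailability per year allowed by the availability target."""
    return (1 - AVAILABILITY) * MINUTES_PER_YEAR


if __name__ == "__main__":
    n = 10_000_000
    print(f"{n:,} objects: one expected loss every "
          f"{float(years_per_lost_object(n)):,.0f} years")
    print(f"99.99% availability allows about "
          f"{float(max_annual_downtime_minutes()):.2f} minutes of downtime per year")
```

Exact fractions are used instead of floats because subtracting 0.99999999999 from 1 in floating point does not give exactly 1e-11.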

Big Data on AWS

Immediate Availability: Deploy instantly. No hardware to procure, no infrastructure to maintain & scale.

Trusted & Secure: Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS.

Broad & Deep Capabilities: Over 70 services and 100s of features to support virtually any big data application & workload.

Hundreds of Partners & Solutions: Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack.

Sysco Foods: An Overview

Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers who prepare meals away from home.

Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries.

For Fiscal Year 2017, which ended July 1, 2017, Sysco generated sales of more than $55 billion.


Current State Challenges

Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.

Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports/transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.

Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.

Scalability & Stability Issues: The reporting team is currently above capacity, with several thousand custom reports running. Performance issues and reporting delays due to data loads are causing instability.



Page 11: # D a t a 1 7 - cdn.tri-digital.comcdn.tri-digital.com/Tableau/2017/resources/17BS-013_PPT_SyscoAWS... · ETL Service account NETEZZA Internal SAP ETL Account Tableau NETEZZA Informatica

Sysco FoodsAn Overview

Sysco is the global leader in selling marketing and distributing food products to restaurants healthcare and

educational facilities lodging establishments and other customers who prepare meals away from home

Sysco operates 197 distribution facilities serves about half a million customers in 13 countries

For Fiscal Year 2017 that ended July 1 2017 Sysco generated sales of more than $55 billion

COSTA RICA

Current State Challenges

Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.

Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined, and most reports and transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.

Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.

Scalability & Stability Issues: The reporting team is currently above capacity, with several thousand custom reports running. Issues with performance and delays in reporting due to data load are causing instabilities.

Future State Goals

Enable Revenue Growth - Better enable business decisions through data visibility and consistency

Improve Operational Efficiency - Increase the efficiency of business processes through data management best practices

Enhanced Customer Experience - Deliver more intuitive information to our internal and external customers through a self-serve reporting model

Enterprise View of Data - Consolidated view of the customers, suppliers, and products data from Sysco SUS and SAP broadline and specialties companies (Canada, Sygma, etc.) in one physical location

Reduce Total Cost of Ownership and Deliver Value Faster - Faster time to market for insights at a lower price

Provide accuracy, timeliness, and fidelity to the BI reporting process

Next-generation architecture that fosters innovation and reduces costs

Change the BI consumption pattern, i.e., move from hindsight to insight-driven reporting

Take manual workload off the team and enable them to become data analysts rather than report creators

Enable decommissioning of triplicated business applications and processes

Benefits of Transition

Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share.

The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau.

Merchandising | Supply Chain | Sales & Margin Management

Initiative: CatMan | Operational Data Insights | RevMan Opportunity Tracking and Cost to Serve

Targeted Insights
• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager

• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)

Category Management

Price Optimization

Operational Productivity Measures

The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation

PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training / re-tooling for individual and team growth

PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption

TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform

Business Value Derived from Data & Analytics

What is SEED (Sysco Ecosystem for Enterprise Data)?

SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.

SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.

SEED, being cloud native, inherently also helps drive the Data Science and our Agile journey forward with the ability to quickly stand up sandbox environments for experimentation.

Demand-driven model with predictable & affordable costs

Stabilization of environments; reduced cost of delivery over time

Broad and deep functionality to support various use cases within data and analytics

Improved agility and quality with powerful tools for data manipulations and migrations

Why SEED

Leveraged the success and cost savings within the Netezza Lexington data mart transformation to go after the crown jewel, Sysco's "EDW Landscape". Based on the success of the Lexington transformation, the leadership team accelerated timelines.

[Diagram: legacy EDW landscape. Labels: SUS (AS400), SUS & SWMS, 3rd party, Canada & Specialty, SAP, Informatica, BO ETL Jobs, 1010 service account, Business Objects, Direct Query, Custom reporting, Data Extraction, ETL Service account, NETEZZA Internal, SAP ETL Account, Tableau, NETEZZA]

[Diagram: SEED architecture on AWS]

Layers: Ingestion / Collection Layer, Storage Layer, ELT Compute Layer, Analyze Layer, Auditing and Monitoring Layer, Consumers

Source labels: Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS, Sygma, Freshpoint

Amazon S3: Raw Data, Transformed Data, Reportable Data

AWS components: AWS Lambda, Amazon EMR, AWS Data Pipeline, Amazon Redshift, Amazon Redshift Spectrum, Amazon RDS, Amazon Athena, Amazon CloudWatch, AWS CloudTrail, Amazon Glacier (archive), Metastore, AWS Glue (post Phase II)

Consumer labels: Extracts, Other BI apps, Internal, External, Data Scientist
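
The S3 labels in the diagram (Raw data, Transformed Data, Reportable Data) suggest that datasets move through three storage zones. The Python sketch below models only that progression. Only the three zone names come from the slides. Their ordering, the key layout, the `LakeObject` type, the `promote` helper, and the sample feed and file names are hypothetical illustrations, not Sysco's actual bucket design.

```python
from dataclasses import dataclass

# Zone names come from the slide's S3 labels; their ordering and the key
# layout below are assumptions made for illustration only.
ZONES = ("raw", "transformed", "reportable")


@dataclass(frozen=True)
class LakeObject:
    """One file in the lake, identified by zone, source feed, and load date."""
    zone: str
    source: str       # hypothetical feed name, e.g. "sales"
    load_date: str    # ISO date string, e.g. "2017-10-01"
    filename: str

    def key(self) -> str:
        # Hypothetical S3 object key: <zone>/<source>/load_date=<date>/<file>
        return f"{self.zone}/{self.source}/load_date={self.load_date}/{self.filename}"


def promote(obj: LakeObject) -> LakeObject:
    """Return the same file placed in the next zone (raw -> transformed -> reportable)."""
    position = ZONES.index(obj.zone)  # raises ValueError for an unknown zone
    if position == len(ZONES) - 1:
        raise ValueError(f"{obj.zone!r} is already the final zone")
    return LakeObject(ZONES[position + 1], obj.source, obj.load_date, obj.filename)


raw = LakeObject("raw", "sales", "2017-10-01", "part-0000.csv")
transformed = promote(raw)
reportable = promote(transformed)
print(raw.key())
print(reportable.key())
```

In a real pipeline the compute services named in the diagram (AWS Lambda, Amazon EMR, AWS Data Pipeline) would presumably do the work between zones. This sketch covers the naming convention only.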

Consolidated focus on data ingestion, consumption, and the need for new capabilities led us to the Ecosystem approach

Architecture Simplification (Ingestion, consumption, and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns

Cost optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage

User Value
• High-valued BI capabilities drive development of the Data-warehouse
• Timely access to data - hrs/mins versus multiple days/months
• Enablement of advanced analytics

Enhanced reliability & accuracy
• Accurate data delivered via repeatable process
• Errors are identified and corrected before business use

Analytical Use Cases for the Business

Revenue Management
• Margins review by market
• Predictive Pricing simulations with external economic data
• Pass-thru predictive pricing analysis at all levels of the organization
• Descriptive model for Customer Segmentation

Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast Vendor Price changes
• Market basket analysis
• Warehouse Performance Analysis

Marketing
• Share of Wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis

The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas

SEED
• Analytical Sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data

Interactive queries, ad-hoc queries, and data extract queries within each use case were evaluated and optimized for SLA requirements

Slow Dashboard Rendering

Memory Utilization reaching limits

Storage Limitations

Needed improved IOPS (Input/Output Operations Per Second)

Needed High Availability

Top most used Sites

Workbooks by Site

Proactive Monitoring and Growth Projection

Current System Specifications

Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (High frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB

Primary Node

On Prem: 2 Nodes, 16 Cores, 128 GB RAM
AWS: 3 Nodes, 16 Cores, 244 GB, 3000 IOPS
AWS: 6 Nodes, 40 Cores, 610 GB, 3000 IOPS

                                 2014   2015   2016      2017      Scale Out
Total number of Server Users     64     1700   3860      12713     20000
Total number of Active Users     64     1100   1375      5825      12000
Dedicated Core / vCPU capacity   16     40     80 vCPU   80 vCPU   192 vCPU
Concurrent Users                 11     55     120       350       TBD
Max Concurrency                  16     60     150       400       960
Number of Workbooks              8      110    206       671       TBD
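
The growth table is easier to read with a couple of derived ratios. The short Python sketch below computes the share of server users who are active and the peak concurrency per unit of compute capacity. The input figures are transcribed from the table. The two ratios are illustrative calculations that the slides do not report, and treating "cores" (2014, 2015) and "vCPU" (2016 onward) as one capacity unit is a simplifying assumption.

```python
# Figures transcribed from the growth table; "Scale Out" is the projected column.
PERIODS = ["2014", "2015", "2016", "2017", "Scale Out"]
server_users = [64, 1700, 3860, 12713, 20000]
active_users = [64, 1100, 1375, 5825, 12000]
# The table mixes cores and vCPUs; they are treated as one unit here (assumption).
capacity = [16, 40, 80, 80, 192]
max_concurrency = [16, 60, 150, 400, 960]

# Fraction of licensed server users who are active in each period.
active_share = {p: a / s for p, a, s in zip(PERIODS, active_users, server_users)}

# Peak concurrent users carried by each core/vCPU in each period.
peak_per_unit = {p: m / c for p, m, c in zip(PERIODS, max_concurrency, capacity)}

for p in PERIODS:
    print(f"{p:>9}: active share {active_share[p]:.1%}, "
          f"peak users per capacity unit {peak_per_unit[p]:.2f}")
```

On these figures, both 2017 and the scale-out projection come to 5 peak concurrent users per vCPU. The slides do not say whether that ratio was used as a sizing rule.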

Benefits of moving to SEED on AWS

Scalability & Availability to meet Business Needs

Better Cost Leverage

Improved Capability

Security

Testing before implementation

Governance

Page 14: # D a t a 1 7 - cdn.tri-digital.comcdn.tri-digital.com/Tableau/2017/resources/17BS-013_PPT_SyscoAWS... · ETL Service account NETEZZA Internal SAP ETL Account Tableau NETEZZA Informatica

bull Cost Per

Piece

dashboard

bull Summary

view of

comparison

results

bull Allows to

compare to

plan and PY

bull Provides

ability to drill

down to

department

(Warehouse

Delivery

Maintenance)

Category Management

Price Optimization

Operational Productivity Measures

The roadmap consisted of improvements across the three dimensions of people

process and technology in order to achieve a successful transformation

PEOPLE

- Centralization amp restructuring of the

BI org

- Strategic insourcing of key roles

- Training re-tooling for individual

and team growth

PROCESS

- Adoption of an Agile delivery model

- Data Governance

- Continuous process improvements

- Change management to help with

adoption

TECHNOLOGY

- Additional capability at a lower cost

- Consolidate toolsets

- Easier access to non-USBL data

- Stabilize the existing platform

Business Value Derived from

Data amp Analytics

What is SEED (Sysco Ecosystem for Enterprise Data)

SEED is a AWS based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward

while also modernizing our technology landscape to enable scalable enterprise wide data discovery amp insights

SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security

SEED being cloud native inherently also helps drive the Data Science and our Agile journey forward with the ability to quickly

stand up sandbox environments for experimentation

Demand driven model with predictable amp affordable costs

Stabilization of environments reduced cost of delivery over time

Broad and deep functionality to support various use cases within data and analytics

Improved agility and quality with powerful tools for data manipulations and migrations

Why SEED

Leveraged the success and cost savings within Netezaa Lexington data mart transformation to go after the crown jewel

Syscorsquos ldquoEDW Landscaperdquo Based on the success of Lexington transformation leadership team accelerated timelines

SUS(AS40

0)SUS

amp SWMS

SUS(AS40

0)3rd party

SUS(AS40

0)

CANADA

amp

SpecialtyIn

form

atica

B

O E

TL J

ob

s

SUS(AS40

0)SAP

1010 service account

Business Objects

Direct Query

Custom reporting Data

Extraction

ETL Service account

NETEZZA Internal

SAP ETL Account

Tableau

NETEZZA

Informatica

Arrow-Steam NPD HAVI

WMS IDS DPR

Sales Inventory

Master Data

SWMS

Amazon S3

Raw data Transformed

Data Reportable

Data

AWS Lambda Amazon EMR AWS Data Pipeline

Amazon

Redshift

Amazon RDS

Extracts

Amazon

Athena

Other BI apps

Internal

External

Data Scientist

ELT Compute Layer

Storage Layer Analyze LayerIngestion

Collection

Layer

Auditing and Monitoring Layer

Amazon CloudWatch

Extracts

Consumers

Sygma

Freshpoint

AWS Glue -

post phase II

AWS CloudTrail

Amazon Glacier

archive Metastore

AWS Glue -

post Phase II

Amazon

Redshift Spectrum

Consolidated focus on data ingestion, consumption and need for new capabilities led us to the Ecosystem approach

Architecture Simplification (Ingestion, consumption and new capabilities)

- Movement from capacity-driven model to a demand-driven model for predictable costs
- Handle mixed loads by offloading processing (ETL) to a distributed environment
- Simplify and regulate data movement across systems
- Allow for addition of data types from transactions, interactions and observations currently not in the EDW
- Usage-driven consumption design patterns

Cost optimization

- CAP-EX and OP-EX reduction
- Sustainable support solution that allows for reduction in MS costs
- Reduction in number of tools to deploy and manage

User Value

- High-value BI capabilities drive development of the data warehouse
- Timely access to data – hrs/mins versus multiple days/months
- Enablement of advanced analytics

Enhanced reliability & accuracy

- Accurate data delivered via repeatable process
- Errors are identified and corrected before business use

Analytical Use Cases for the Business

Revenue Management

- Margins review by market
- Predictive Pricing simulations with external economic data
- Pass thru predictive pricing analysis at all levels of the organization
- Descriptive model for Customer Segmentation

Merchandising and Supply Chain

- Assortment optimization at scale
- Track vendor cost components of items
- Lotting using decision trees
- Forecast Vendor Price changes
- Market basket analysis
- Warehouse Performance Analysis

Marketing

- Share of Wallet
- Machine learning for future promotions
- Cross-sell opportunity feeder
- Churn analysis

The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas

SEED

- Analytical Sandboxes
- Quicker time to market
- R integration
- Better performing retrievals
- Large data sets
- Unstructured data

Interactive queries, ad-hoc queries and data extract queries within each use case were evaluated and optimized for SLA requirements

Slow Dashboard Rendering

Memory Utilization reaching limits

Storage Limitations

Needed improved IOPS (Input/Output Operations Per Second)

Needed High Availability

Top most used Sites

Workbooks by Site

Proactive Monitoring and Growth Projection

Current System Specifications

Worker Nodes

- EC2 Instance Type: c4.2xlarge
- Operating System: Windows 2012 R2
- vCPU: 8 (High frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
- Cores: 4
- RAM: 15 GB

Primary Node

On Prem: 2 Nodes, 16 Cores, 128 GB RAM

AWS: 3 Nodes, 16 Cores, 244 GB, 3000 IOPS

AWS: 6 Nodes, 40 Cores, 610 GB, 3000 IOPS

                                2014      2015      2016      2017      Scale Out
Total number of Server Users    64        1700      3860      12713     20000
Total number of Active Users    64        1100      1375      5825      12000
Dedicated Core vCPU capacity    16        40        80 vCPU   80 vCPU   192 vCPU
Concurrent Users                11        55        120       350       TBD
Max Concurrency                 16        60        150       400       960
Number of Workbooks             8         110       206       671       TBD
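The growth figures above lend themselves to a quick arithmetic check. The sketch below derives two ratios from the table: the share of server users who are active, and max concurrency per unit of capacity. The ratios and the names used in the code are our own illustration, not metrics from the slides. Note that the capacity values for 2014 and 2015 carry no unit in the source, so the per-unit figures for those years may not be comparable with the later vCPU values.

```python
from dataclasses import dataclass

VCPUS_PER_CORE = 2  # from the worker-node spec: 8 vCPU on 4 cores


@dataclass(frozen=True)
class Period:
    label: str
    server_users: int
    active_users: int
    capacity: int          # "Dedicated Core vCPU capacity" row, as given
    max_concurrency: int

    @property
    def active_share(self) -> float:
        """Fraction of server users who are active."""
        return self.active_users / self.server_users

    @property
    def max_concurrency_per_unit(self) -> float:
        """Max concurrency divided by the capacity figure for the period."""
        return self.max_concurrency / self.capacity


PERIODS = [
    Period("2014", 64, 64, 16, 16),
    Period("2015", 1700, 1100, 40, 60),
    Period("2016", 3860, 1375, 80, 150),
    Period("2017", 12713, 5825, 80, 400),
    Period("Scale Out", 20000, 12000, 192, 960),
]

for p in PERIODS:
    print(f"{p.label:>9}: active share {p.active_share:.0%}, "
          f"max concurrency per capacity unit {p.max_concurrency_per_unit:.2f}")

# The 6-node AWS tier lists 40 cores; at 2 vCPU per core that is 80 vCPU.
print(40 * VCPUS_PER_CORE)
```

For 2017 and the scale-out target the second ratio is 5 in both cases (400 / 80 and 960 / 192), which suggests, though the slides do not say so, that the scale-out capacity was sized to hold the 2017 concurrency per vCPU constant.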

Benefits of moving to SEED on AWS

Scalability & Availability to meet Business Needs

Better Cost Leverage

Improved Capability

Security

Testing before implementation

Governance

AWS and Featured Partners offer guides, Quick Starts and Jumpstarts to help you get started today

Getting Started

To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation

To learn more and read the guide: Click Here

Quick Start Guide – Tableau

This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server

To learn more about the Quick Start and to get started with Tableau Server on AWS: Click Here

Jumpstarts

AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions

To learn about 47Lining's consulting offer: Click Here

To learn about NorthBay's consulting offer: Click Here

Please complete the session survey from the Session Details screen in your TC17 app

Page 19: # D a t a 1 7 - cdn.tri-digital.comcdn.tri-digital.com/Tableau/2017/resources/17BS-013_PPT_SyscoAWS... · ETL Service account NETEZZA Internal SAP ETL Account Tableau NETEZZA Informatica

Consolidated focus on data ingestion consumption and need for new capabilities led us to the Ecosystem approach

Architecture Simplification

(Ingestion consumption and new

capabilities)

bull Movement from capacity driven

model to a demand driven model for

predictable costs

bull Handle mixed loads by offloading

processing (ETL) to a distributed

environment

bull Simplify and regulate data

movement across systems

bull Allow for addition of data types from

transactions interaction and

observations currently not in the

EDW

bull Usage driven consumption design

patterns

Cost optimization

bull CAP-EX and OP-EX reduction

bull Sustainable support solution that allows

for reduction in MS costs

bull Reduction in number of tools to deploy

and mange

User Value

bull High valued BI capabilities drive

development of the Data-warehouse

bull Timely access to data ndash hrs mins

versus multiple daysmonths

bull Enablement of advanced analytics

Enhanced reliability amp accuracy

bull Accurate data delivered via repeatable

process

bull Errors are identified and corrected

before business use

Analytical Use Cases

for the Business Revenue Management

bull Margins review by market

bull Predictive Pricing simulations with

external economic data

bull Pass thru predictive pricing analysis at all

levels of the organization

bull Descriptive model for Customer

Segmentation

Merchandising and Supply Chain

bull Assortment optimization at scale

bull Track vendor cost components of items

bull Lotting using decision trees

bull Forecast Vendor Price changes

bull Market basket analysis

bull Warehouse Performance Analysis

Marketing

bull Share of Wallet

bull Machine learning for future promotions

bull Cross-sell opportunity feeder

bull Churn analysis

The capabilities of SEED allow for the enablement of advanced analytics use

cases already defined and requested by the various functional areas

SEED

bull Analytical Sandboxes

bull Quicker time to market

bull R integration

bull Better performing retrievals

bull Large data sets

bull Unstructured data

Interactive queries Ad-hoc queries and data extracts queries within each use cases were evaluated and

optimized for SLA requirements

Slow Dashboard Rendering

Memory Utilization reaching limits

Storage Limitations

Needed improved IOPS (InputOutput

Operations Per Second)

Needed High Availability

Top most used Sites

Workbooks by Site

Proactive Monitoring

and

Growth Projection

Current System Specifications

Worker Nodes

bull EC2 Instance Type c42xlarge

bull Operating System Windows 2012 R2

bull vCPU 8 (High frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)

bull Cores 4

bull RAM 15GB

Primary Node

On Prem2 Nodes16 Cores128 GB RAM

AWS3 Nodes16 Cores244 GB3000 IOPS

AWS6 Nodes40 Cores610 GB3000 IOPS

| Metric | 2014 | 2015 | 2016 | 2017 | Scale Out |
|---|---|---|---|---|---|
| Total number of Server Users | 64 | 1700 | 3860 | 12713 | 20000 |
| Total number of Active Users | 64 | 1100 | 1375 | 5825 | 12000 |
| Dedicated Core vCPU capacity | 16 | 40 | 80 vCPU | 80 vCPU | 192 vCPU |
| Concurrent Users | 11 | 55 | 120 | 350 | TBD |
| Max Concurrency | 16 | 60 | 150 | 400 | 960 |
| Number of Workbooks | 8 | 110 | 206 | 671 | TBD |
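The growth figures above lend themselves to a quick sanity check. The sketch below is illustrative only: it copies the numbers from the table and derives two ratios that the slides do not state themselves, max concurrency per vCPU and the share of server users who are active. The names `ratio_by_period`, `concurrency_per_vcpu`, and `active_share` are hypothetical, chosen for this example.

```python
# Figures copied from the growth table; "Scale Out" is the projected column.
YEARS = ["2014", "2015", "2016", "2017", "Scale Out"]
VCPU = [16, 40, 80, 80, 192]
MAX_CONCURRENCY = [16, 60, 150, 400, 960]
SERVER_USERS = [64, 1700, 3860, 12713, 20000]
ACTIVE_USERS = [64, 1100, 1375, 5825, 12000]


def ratio_by_period(numerators, denominators):
    """Return {period: numerator / denominator} for each column of the table."""
    return {
        year: num / den
        for year, num, den in zip(YEARS, numerators, denominators)
    }


# Derived metrics (an assumption of this sketch, not stated in the slides).
concurrency_per_vcpu = ratio_by_period(MAX_CONCURRENCY, VCPU)
active_share = ratio_by_period(ACTIVE_USERS, SERVER_USERS)

for year in YEARS:
    print(
        f"{year:>9}: {concurrency_per_vcpu[year]:.2f} max concurrency per vCPU, "
        f"{active_share[year]:.0%} of server users active"
    )
```

With these inputs the 2017 and Scale Out columns both come out at 5 max-concurrency units per vCPU, which is consistent with (though the slides do not confirm) a scale-out target sized by holding the 2017 ratio constant.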

Benefits of moving to SEED on AWS

Scalability & Availability to meet Business Needs

Better Cost Leverage

Improved Capability

Security

Testing before implementation

Governance

AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today.

Getting Started

To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation. To learn more, read the guide.

Quick Start Guide – Tableau

This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server. To learn more about the Quick Start and to get started with Tableau Server on AWS, see the Quick Start guide.

Jumpstarts

AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.

Please complete the session survey from the Session Details screen in your TC17 app.
