# d a t a 1 7 -...
TRANSCRIPT
D a t a 1 7
Building a data lake on AWS: Sysco's journey to predictive analytics
Greg Khairallah, Head of Business Development, Analytics, Amazon Web Services
Navin Advani, Senior Director, Business Technology, Sysco
State of Data Warehousing
Data Warehousing Challenges Today
Exponential Data Growth; Varying Data Types; Need Data Analyzed Faster
Benefits of Using Amazon Redshift
Amazon Redshift is a fast, fully managed, petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools. Amazon Redshift also includes Redshift Spectrum, allowing you to directly query exabytes of unstructured data in Amazon S3.
Amazon Redshift is: Fast, Simple, Elastic, Secure, Compatible, Low Cost
Financial and management reporting
Payments to suppliers and billing workflows
Web/Mobile clickstream and event analysis
Recommendation and predictive analytics
The Forrester Wave™ is copyrighted by Forrester Research, Inc. Forrester and Forrester Wave™ are trademarks of Forrester Research, Inc. The Forrester Wave™ is a graphical representation of Forrester's call on a market and is plotted using a detailed spreadsheet with exposed scores, weightings, and comments. Forrester does not endorse any vendor, product, or service depicted in the Forrester Wave. Information is based on best available resources. Opinions reflect judgment at the time and are subject to change.
The Forrester Wave™: Big Data Warehouse, Q2 2017
Accelerate Migrations from Legacy Systems
"AWS Database Migration Service is the most impressive migration service we've seen." - Gartner
Amazon Redshift
Migrate
Over 1,000 unique migrations to Amazon Redshift using DMS
Modernize your analytics platform
Data Lake = flexible set of web services that match your use cases
Why Amazon S3 for data lake
Durable: Designed for 11 9s of durability
Available: Designed for 99.99% availability
High performance: Multiple upload, Range GET
Scalable: Store as much as you need; scale storage and compute independently; no minimum usage commitments
Integrated: Amazon EMR, Amazon Redshift, Amazon DynamoDB, Amazon Athena
Easy to use: Simple REST API, AWS SDKs, read-after-create consistency, event notification, lifecycle policies
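The slide lists lifecycle policies among the S3 features. As a rough illustration only (not something shown in the talk), the sketch below builds a lifecycle configuration as a plain Python dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` call accepts for its `LifecycleConfiguration` argument. The `raw/` prefix, the 90-day threshold, and the helper name `build_lifecycle_rules` are assumptions made for the example.

```python
import json


def build_lifecycle_rules(prefix: str, archive_after_days: int) -> dict:
    """Build a lifecycle configuration that moves objects under `prefix`
    to the GLACIER storage class after `archive_after_days` days.

    The prefix and the day count are hypothetical inputs, not values
    taken from the presentation.
    """
    return {
        "Rules": [
            {
                "ID": f"archive-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": archive_after_days, "StorageClass": "GLACIER"}
                ],
            }
        ]
    }


# Hypothetical example: archive raw-zone objects after 90 days.
config = build_lifecycle_rules("raw/", 90)
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, a dictionary like this could be passed along with a bucket name to apply the rule; that step is left out here so the sketch runs without AWS access.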
Big Data on AWS
Immediate Availability: Deploy instantly. No hardware to procure, no infrastructure to maintain & scale.
Trusted & Secure: Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS.
Broad & Deep Capabilities: Over 70 services and 100s of features to support virtually any big data application & workload.
Hundreds of Partners & Solutions: Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack.
Sysco Foods: An Overview
Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers who prepare meals away from home.
Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries.
For Fiscal Year 2017, which ended July 1, 2017, Sysco generated sales of more than $55 billion.
Current State Challenges
Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.
Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports/transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.
Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.
Scalability & Stability Issues: The reporting team is currently above capacity, with several thousand custom reports running. Issues with performance and delays in reporting due to data load are causing instabilities.
Future State Goals
Enable Revenue Growth - Better enable business decisions through data visibility and consistency
Improve Operational Efficiency - Increase the efficiency of business processes through data management best practices
Enhanced Customer Experience - Deliver more intuitive information to our internal and external customers through a self-serve reporting model
Enterprise View of Data - Consolidated view of customer, supplier, and product data from Sysco SUS and SAP broadline and specialty companies (Canada, Sygma, etc.) in one physical location
Reduce Total Cost of Ownership and Deliver Value Faster - Faster time to market for insights at a lower price
Provide accuracy, timeliness, and fidelity to the BI reporting process
Next-generation architecture that fosters innovation and reduces costs
Change the BI consumption pattern, i.e. move from hindsight-driven to insight-driven reporting
Take manual workload off the team and enable them to become data analysts rather than report creators
Enable decommissioning of triplicated business applications and processes
Benefits of Transition
Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share.
The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau.
Areas: Merchandising | Supply Chain | Sales & Margin Management
Initiative: CatMan | Operational Data Insights | RevMan, Opportunity Tracking and Cost to Serve
Targeted Insights
• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager
[Dashboard screenshots with callouts:]
• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)
Category Management
Price Optimization
Operational Productivity Measures
The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation.
PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training/re-tooling for individual and team growth
PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption
TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform
Business Value Derived from Data & Analytics
What is SEED (Sysco Ecosystem for Enterprise Data)?
SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.
SEED, being cloud native, also helps drive our Data Science and Agile journeys forward with the ability to quickly stand up sandbox environments for experimentation.
Demand-driven model with predictable & affordable costs
Stabilization of environments; reduced cost of delivery over time
Broad and deep functionality to support various use cases within data and analytics
Improved agility and quality with powerful tools for data manipulations and migrations
Why SEED?
Leveraged the success and cost savings of the Netezza Lexington data mart transformation to go after the crown jewel, Sysco's "EDW Landscape". Based on the success of the Lexington transformation, the leadership team accelerated timelines.
[Diagram: legacy EDW landscape. Labels recoverable from the figure: SUS (AS400), SWMS, 3rd party, Canada & Specialty, SAP; Informatica and BO ETL jobs; 1010 service account, ETL service account, NETEZZA internal, SAP ETL account; Business Objects, Direct Query, Custom reporting, Data Extraction, Tableau; NETEZZA, Informatica]
[Diagram: SEED architecture on AWS. Labels recoverable from the figure, grouped by apparent layer:]
Sources: Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS, Sygma, Freshpoint
Ingestion / Collection Layer
Storage Layer: Amazon S3 (Raw data, Transformed Data, Reportable Data); Amazon Glacier archive
ELT Compute Layer: AWS Lambda, Amazon EMR, AWS Data Pipeline; AWS Glue (post Phase II)
Analyze Layer: Amazon Redshift, Amazon Redshift Spectrum, Amazon RDS, Amazon Athena; Metastore (AWS Glue, post Phase II)
Auditing and Monitoring Layer: Amazon CloudWatch, AWS CloudTrail
Consumers: Extracts, Other BI apps, Internal, External, Data Scientist
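The architecture figure labels three kinds of data in Amazon S3: raw, transformed, and reportable. The talk does not show how objects are named, so the following is only a sketch of one common way to lay out such zones as key prefixes with date partitions. The zone names mirror the figure labels; the source system `sus`, the dataset `sales`, the file name, and the helper `object_key` are all hypothetical.

```python
from datetime import date

# Zone names follow the labels in the figure; everything else is illustrative.
ZONES = ("raw", "transformed", "reportable")


def object_key(zone: str, source: str, dataset: str,
               load_date: date, filename: str) -> str:
    """Return an S3 object key of the form
    <zone>/<source>/<dataset>/year=YYYY/month=MM/day=DD/<filename>."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    return (
        f"{zone}/{source}/{dataset}/"
        f"year={load_date:%Y}/month={load_date:%m}/day={load_date:%d}/"
        f"{filename}"
    )


# Hypothetical example: one day's sales extract landing in the raw zone.
key = object_key("raw", "sus", "sales", date(2017, 10, 10), "part-0000.csv")
print(key)  # raw/sus/sales/year=2017/month=10/day=10/part-0000.csv
```

Keeping the zone as the leading prefix means access policies and lifecycle rules can be scoped per zone, and the `year=/month=/day=` segments follow the partition naming that query engines such as Athena can pick up.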
A consolidated focus on data ingestion, consumption, and the need for new capabilities led us to the Ecosystem approach.
Architecture Simplification (ingestion, consumption, and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns
Cost optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage
User Value
• High-value BI capabilities drive development of the data warehouse
• Timely access to data - hrs/mins versus multiple days/months
• Enablement of advanced analytics
Enhanced reliability & accuracy
• Accurate data delivered via a repeatable process
• Errors are identified and corrected before business use
Analytical Use Cases for the Business
Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-thru predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation
Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis
Marketing
• Share of wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis
The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas.
SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better-performing retrievals
• Large data sets
• Unstructured data
Interactive queries, ad-hoc queries, and data-extract queries within each use case were evaluated and optimized for SLA requirements.
Slow Dashboard Rendering
Memory Utilization reaching limits
Storage Limitations
Needed improved IOPS (Input/Output Operations Per Second)
Needed High Availability
Top most used Sites
Workbooks by Site
Proactive Monitoring and Growth Projection
Current System Specifications
Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (High frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB
Primary Node
Configurations shown:
• On Prem: 2 nodes, 16 cores, 128 GB RAM
• AWS: 3 nodes, 16 cores, 244 GB, 3000 IOPS
• AWS: 6 nodes, 40 cores, 610 GB, 3000 IOPS

| Metric | 2014 | 2015 | 2016 | 2017 | Scale Out |
| Total number of Server Users | 64 | 1700 | 3860 | 12713 | 20000 |
| Total number of Active Users | 64 | 1100 | 1375 | 5825 | 12000 |
| Dedicated Core / vCPU capacity | 16 | 40 | 80 vCPU | 80 vCPU | 192 vCPU |
| Concurrent Users | 11 | 55 | 120 | 350 | TBD |
| Max Concurrency | 16 | 60 | 150 | 400 | 960 |
| Number of Workbooks | 8 | 110 | 206 | 671 | TBD |
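To make the growth figures above easier to compare, here is a small worked calculation using only the numbers from the table. It is an illustrative sketch, not an analysis presented in the talk: the ratio helper and variable names are assumptions, and the capacity row mixes cores (2014, 2015) with vCPUs (2016 onward), so the per-unit figures for the early years are not directly comparable with the later ones.

```python
# Values copied from the table; "Scale Out" is the projected column.
periods = ["2014", "2015", "2016", "2017", "Scale Out"]
server_users = [64, 1700, 3860, 12713, 20000]
active_users = [64, 1100, 1375, 5825, 12000]
capacity = [16, 40, 80, 80, 192]  # cores for 2014-2015, vCPUs afterwards
max_concurrency = [16, 60, 150, 400, 960]


def ratios(numerators, denominators):
    """Element-wise ratio of two equally long lists, rounded to 2 places."""
    return [round(n / d, 2) for n, d in zip(numerators, denominators)]


# Share of registered server users who are active in each period.
active_share = ratios(active_users, server_users)

# Maximum concurrent sessions per unit of compute capacity.
concurrency_per_unit = ratios(max_concurrency, capacity)

for period, share, per_unit in zip(periods, active_share, concurrency_per_unit):
    print(f"{period:>9}: active share {share:.2f}, "
          f"max concurrency per core/vCPU {per_unit:.2f}")
```

One thing the arithmetic brings out: the 2017 column (400 / 80) and the Scale Out column (960 / 192) both come to 5 maximum concurrent sessions per vCPU, which suggests the projection simply holds the 2017 ratio constant.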
Benefits of moving to SEED on AWS
Scalability & Availability to meet Business Needs
Better Cost Leverage
Improved Capability
Security
Testing before implementation
Governance
AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today.
Getting Started: To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.
Quick Start Guide - Tableau: This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.
Jumpstarts: AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.
Please complete the session survey from the Session Details screen in your TC17 app.
State of Data Warehousing
Data Warehousing Challenges Today
Exponential Data Growth Varying Data Types Need Data Analyzed Faster
Benefits of Using Amazon Redshift
Amazon Redshift is a fast fully managed petabyte-scale data warehouse that makes it simple and cost-effective to analyze all your data using your existing business intelligence tools Amazon Redshift also includes Redshift Spectrum allowing you to directly query exabytes of unstructured data in Amazon S3
Amazon Redshift is
Fast Simple Elastic Secure Compatible Low Cost
Financial and management reporting
Payments to suppliers and billing workflows
WebMobile clickstream and event analysis
Recommendation and predictive analytics
The Forrester Wavetrade is copyrighted by Forrester Research Inc Forrester and Forrester Wavetrade are trademarks of Forrester
Research Inc The Forrester Wavetrade is a graphical representation of Forresters call on a market and is plotted using a detailed
spreadsheet with exposed scores weightings and comments Forrester does not endorse any vendor product or service depicted in
the Forrester Wave Information is based on best available resources Opinions reflect judgment at the time and are subject to change
The Forrester Wavetrade Big Data Warehouse Q2 2017
Accelerate Migrations from Legacy Systems
ldquoAWS Database Migration Service is the most
impressive migration service wersquove seenrdquo ndash Gartner
Amazon Redshift
Migrate
Over 1000 unique
migrations to Amazon
Redshift using DMS
Modernize your analytics platformData Lake = flexible set of web services that match your use cases
Designed for 11 9s
of durability
Designed for
9999 availability
Durable Available High performance Multiple upload
Range GET
Store as much as you need
Scale storage and compute
independently
No minimum usage commitments
Scalable
Amazon EMR
Amazon Redshift
Amazon DynamoDB
Amazon Athena
Integrated
Simple REST API
AWS SDKs
Read-after-create consistency
Event notification
Lifecycle policies
Easy to use
Why Amazon S3 for data lake
Big Data on AWS
Immediate Availability Deploy instantly No hardware to
procure no infrastructure to maintain amp scale
Trusted amp Secure Designed to meet the strictest
requirements Continuously audited including certifications
such as ISO 27001 FedRAMP DoD CSM and PCI DSS
Broad amp Deep Capabilities Over 70 services and 100s of
features to support virtually any big data application amp
workload
Hundreds of Partners amp Solutions Get help from a
consulting partner or choose from hundreds of tools and
applications across the entire data management stack
Sysco FoodsAn Overview
Sysco is the global leader in selling marketing and distributing food products to restaurants healthcare and
educational facilities lodging establishments and other customers who prepare meals away from home
Sysco operates 197 distribution facilities serves about half a million customers in 13 countries
For Fiscal Year 2017 that ended July 1 2017 Sysco generated sales of more than $55 billion
COSTA RICA
Current State Challenges
Lack of Analytical Capabilities Lack of business analytical
capabilities to analyze large volume data across category
management customer insights price simulations etc
Reporting Inconsistencies and Long Lead Times Reporting
standards are not defined most reports transactions are tailored to
requests Multiple data source and systems creating spaghetti data
scenarios leading to inconsistencies
Creeping Cost of Ownership Aged and Siloed BI solutions and
processes are slowly increasing the total cost of ownership in storage
infrastructure maintenance and administration
Scalability amp Stability Issues Reporting team is currently above
capacity with several thousands custom reports running Issues with
performance delays in reporting due to data load causing instabilities
Future State Goals
Enable Revenue Growth - Better enable business decisions through
data visibility and consistency
Improve Operational Efficiency - Increase the efficiency of business
processes through data management best practices
Enhanced Customer Experience ndash Deliver more intuitive information
to our internal and external customers through self-serve reporting
model
Enterprise View Of Data - Consolidated view of the customers
suppliers and products data from Sysco SUS and SAP broadline and
specialties companies (Canada Sygma etc) in one physical location
Reduce Total Cost of Ownership and Deliver Value Faster ndash
Faster time to market for insights at a lower price
Provide accuracy timeliness and fidelity to the BI reporting process
Next generation architecture that fosters innovation and reduce costs
Change the BI consumption pattern ie move from hindsight to insight driven reporting
Take manual work load off the team and enable them becoming data analyst rather than report
creators
Enable decommissioning of triplicated business applications and processes
Benefits of Transition
Due to competitive market pressures there was a big push to streamline operating costs and the three key areas below
helped unlock savings drive top line growth and market share
The three year plan was enabled by quick actionable insights that were derived using tools like Tableau
Merchandising Supply ChainSales amp Margin
Management
Initiative CatMan Operational Data Insights
RevMan Opportunity
Tracking and Cost to
Serve
Targeted Insights
bull Broker Performance
bull Category Attribute Analysis
bull Category Conversion
bull Category Compliance
bull Innovation Items Scorecard
bull Marketing associate compliance
bull Inbound amp Outbound
Productivity
bull Cost per Piece
bull Service Level
bull Warehouse Efficiency
bull DriverDelivery Scorecards
bull eCommerce Penetration
and Adoption
bull Opportunity Tracker
bull Price Management Tool
bull Deal Manager
• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)
Category Management
Price Optimization
Operational Productivity Measures
The roadmap consisted of improvements across the three dimensions of people, process and technology in order to achieve a successful transformation
PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training/re-tooling for individual and team growth
PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption
TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform
Business Value Derived from Data & Analytics
What is SEED (Sysco Ecosystem for Enterprise Data)?
SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security
SEED, being cloud native, also helps drive Data Science and our Agile journey forward, with the ability to quickly stand up sandbox environments for experimentation
Demand-driven model with predictable & affordable costs
Stabilization of environments; reduced cost of delivery over time
Broad and deep functionality to support various use cases within data and analytics
Improved agility and quality with powerful tools for data manipulations and migrations
Why SEED
Leveraged the success and cost savings of the Netezza Lexington data mart transformation to go after the crown jewel: Sysco's "EDW Landscape". Based on the success of the Lexington transformation, the leadership team accelerated timelines.
[Diagram: legacy EDW landscape]
Sources: SUS (AS400), SWMS, 3rd party, Canada & Specialty, SAP
Loading: Informatica and BO ETL jobs
Access paths: 1010 service account, Business Objects, Direct Query, custom reporting, data extraction, ETL service account, NETEZZA internal, SAP ETL account, Tableau
Platform: NETEZZA, Informatica
[Diagram: SEED architecture on AWS]
Ingestion/Collection Layer: Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS, Sygma, Freshpoint
Storage Layer: Amazon S3 (raw data, transformed data, reportable data), Amazon Glacier archive, Metastore
ELT Compute Layer: AWS Lambda, Amazon EMR, AWS Data Pipeline
Analyze Layer: Amazon Redshift, Amazon Redshift Spectrum, Amazon RDS, Amazon Athena, extracts
Consumers: other BI apps, internal, external, data scientist
Auditing and Monitoring Layer: Amazon CloudWatch, AWS CloudTrail
Post Phase II: AWS Glue
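The SEED architecture above keeps raw, transformed and reportable data in Amazon S3 and uses Amazon Glacier as the archive tier. As a minimal sketch of how such tiering might be expressed, the snippet below builds a lifecycle configuration in the dictionary shape that the S3 API expects. The `raw/` prefix, the 90-day threshold and the rule ID are illustrative assumptions; the talk does not state Sysco's actual settings.

```python
import json


def glacier_archive_rule(prefix: str, days: int) -> dict:
    """Build one lifecycle rule that transitions objects under `prefix` to Glacier."""
    if days < 0:
        raise ValueError("days must be non-negative")
    return {
        "ID": f"archive-{prefix.strip('/')}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
    }


# Hypothetical values: archive the raw zone after 90 days.
lifecycle_configuration = {"Rules": [glacier_archive_rule("raw/", 90)]}

if __name__ == "__main__":
    print(json.dumps(lifecycle_configuration, indent=2))
```

With boto3 installed and credentials configured, a dictionary of this shape could be passed as the `LifecycleConfiguration` argument of the S3 client's `put_bucket_lifecycle_configuration` call, together with a `Bucket` name.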
Consolidated focus on data ingestion, consumption and the need for new capabilities led us to the Ecosystem approach
Architecture Simplification (ingestion, consumption and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions and observations currently not in the EDW
• Usage-driven consumption design patterns
Cost optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage
User Value
• High-value BI capabilities drive development of the data warehouse
• Timely access to data - hours/minutes versus multiple days/months
• Enablement of advanced analytics
Enhanced reliability & accuracy
• Accurate data delivered via repeatable process
• Errors are identified and corrected before business use
Analytical Use Cases for the Business
Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-through predictive pricing analysis at all levels of the organization
• Descriptive model for Customer Segmentation
Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse Performance Analysis
Marketing
• Share of Wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis
The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas
SEED
• Analytical Sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
Interactive queries, ad-hoc queries and data extract queries within each use case were evaluated and optimized for SLA requirements
Slow Dashboard Rendering
Memory Utilization reaching limits
Storage Limitations
Needed improved IOPS (Input/Output Operations Per Second)
Needed High Availability
Top most used Sites
Workbooks by Site
Proactive Monitoring and Growth Projection
Current System Specifications
Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (high-frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB
Primary Node
On Prem: 2 Nodes, 16 Cores, 128 GB RAM
AWS: 3 Nodes, 16 Cores, 244 GB, 3000 IOPS
AWS: 6 Nodes, 40 Cores, 610 GB, 3000 IOPS

                                 2014   2015   2016      2017      Scale Out
Total number of Server Users     64     1700   3860      12713     20000
Total number of Active Users     64     1100   1375      5825      12000
Dedicated Core / vCPU capacity   16     40     80 vCPU   80 vCPU   192 vCPU
Concurrent Users                 11     55     120       350       TBD
Max Concurrency                  16     60     150       400       960
Number of Workbooks              8      110    206       671       TBD
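As a quick worked reading of the usage table above, the sketch below computes the share of server users who are active in each year and the overall growth in server users from 2014 to 2017. The figures are copied from the table; the function and variable names are illustrative and not part of the talk.

```python
# Figures copied from the usage table (columns 2014 to 2017).
years = [2014, 2015, 2016, 2017]
server_users = [64, 1700, 3860, 12713]
active_users = [64, 1100, 1375, 5825]


def active_share(active: int, total: int) -> float:
    """Fraction of registered server users who are active."""
    return active / total


def growth_factor(series: list) -> float:
    """Ratio of the last value in a series to the first."""
    return series[-1] / series[0]


if __name__ == "__main__":
    for year, total, active in zip(years, server_users, active_users):
        print(f"{year}: {active_share(active, total):.1%} of {total} server users active")
    print(f"Server users grew {growth_factor(server_users):.0f}x from 2014 to 2017")
```

Ratios like these are one way to sanity-check the "Scale Out" column, where 12000 active users are projected against 20000 server users.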
Benefits of moving to SEED on AWS
Scalability & Availability to meet Business Needs
Better Cost Leverage
Improved Capability
Security
Testing before implementation
Governance
AWS and Featured Partners offer guides, Quick Starts and Jumpstarts to help you get started today
Getting Started
To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation
Quick Start Guide - Tableau
This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server
Jumpstarts
AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions
Please complete the session survey from the Session Details screen in your TC17 app
Why Amazon S3 for data lake
Durable
• Designed for 11 9s of durability
Available
• Designed for 99.99% availability
High performance
• Multiple upload
• Range GET
Scalable
• Store as much as you need
• Scale storage and compute independently
• No minimum usage commitments
Integrated
• Amazon EMR
• Amazon Redshift
• Amazon DynamoDB
• Amazon Athena
Easy to use
• Simple REST API
• AWS SDKs
• Read-after-create consistency
• Event notification
• Lifecycle policies
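To make the durability and availability design targets above concrete, here is a small arithmetic sketch. It reads "11 9s" as an annual per-object durability of 99.999999999% and reads the availability target as 99.99%; the 10 million object count is a hypothetical input, not a figure from the talk.

```python
HOURS_PER_YEAR = 365 * 24


def expected_annual_losses(object_count: int, durability_nines: int) -> float:
    """Expected number of objects lost per year at the given number of nines."""
    annual_loss_probability = 10.0 ** -durability_nines
    return object_count * annual_loss_probability


def allowed_downtime_minutes(availability: float) -> float:
    """Minutes per year a service may be unavailable at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR * 60


if __name__ == "__main__":
    losses = expected_annual_losses(10_000_000, 11)  # hypothetical object count
    print(f"Expected objects lost per year: {losses:.6f}")
    print(f"Roughly one object every {1 / losses:,.0f} years")
    print(f"Downtime budget at 99.99%: {allowed_downtime_minutes(0.9999):.2f} minutes per year")
```

Under these assumptions, storing 10 million objects corresponds to an expected loss of about one object every 10,000 years, and 99.99% availability corresponds to a little under an hour of downtime per year.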
Big Data on AWS
Immediate Availability: Deploy instantly. No hardware to procure, no infrastructure to maintain & scale
Trusted & Secure: Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM and PCI DSS
Broad & Deep Capabilities: Over 70 services and 100s of features to support virtually any big data application & workload
Hundreds of Partners & Solutions: Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack
Sysco Foods: An Overview
Sysco is the global leader in selling, marketing and distributing food products to restaurants, healthcare and educational facilities, lodging establishments and other customers who prepare meals away from home
Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries
For Fiscal Year 2017, which ended July 1, 2017, Sysco generated sales of more than $55 billion
Current State Challenges
Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.
Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports/transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies
Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance and administration
Scalability & Stability Issues: Reporting team is currently above capacity, with several thousand custom reports running. Issues with performance; delays in reporting due to data load causing instabilities
Accelerate Migrations from Legacy Systems
“AWS Database Migration Service is the most impressive migration service we’ve seen” - Gartner
Migrate to Amazon Redshift
Over 1000 unique migrations to Amazon Redshift using DMS
Modernize your analytics platform
Data Lake = flexible set of web services that match your use cases
Why Amazon S3 for data lake
Durable: Designed for 11 9s of durability
Available: Designed for 99.99% availability
High performance: Multipart upload; Range GET
Scalable: Store as much as you need; Scale storage and compute independently; No minimum usage commitments
Integrated: Amazon EMR; Amazon Redshift; Amazon DynamoDB; Amazon Athena
Easy to use: Simple REST API; AWS SDKs; Read-after-create consistency; Event notification; Lifecycle policies
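Range GET lets a client fetch part of an object by sending a standard HTTP Range header, so a large object can be read in parallel slices. As a minimal sketch (the function name and the slice size are assumptions for illustration and do not come from the slides), the header values for fixed-size slices could be computed like this:

```python
from typing import List


def byte_ranges(object_size: int, part_size: int) -> List[str]:
    """Split an object of object_size bytes into HTTP Range header values.

    Each value has the form "bytes=start-end", where end is inclusive.
    """
    if object_size < 0 or part_size <= 0:
        raise ValueError("object_size must be >= 0 and part_size must be > 0")
    ranges = []
    for start in range(0, object_size, part_size):
        end = min(start + part_size, object_size) - 1  # inclusive end byte
        ranges.append(f"bytes={start}-{end}")
    return ranges


if __name__ == "__main__":
    # A 10-byte object read in 4-byte slices.
    print(byte_ranges(10, 4))
```

Each value would be sent as the Range header of a separate GET request. Issuing the requests and reassembling the slices is left out of this sketch.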
Big Data on AWS
Immediate Availability: Deploy instantly. No hardware to procure, no infrastructure to maintain & scale.
Trusted & Secure: Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS.
Broad & Deep Capabilities: Over 70 services and 100s of features to support virtually any big data application & workload.
Hundreds of Partners & Solutions: Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack.
Sysco Foods: An Overview
Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers who prepare meals away from home.
Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries.
For Fiscal Year 2017, which ended July 1, 2017, Sysco generated sales of more than $55 billion.
Current State Challenges
Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.
Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports and transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.
Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.
Scalability & Stability Issues: Reporting team is currently above capacity, with several thousand custom reports running. Issues with performance and delays in reporting due to data loads are causing instabilities.
Future State Goals
Enable Revenue Growth - Better enable business decisions through data visibility and consistency
Improve Operational Efficiency - Increase the efficiency of business processes through data management best practices
Enhanced Customer Experience - Deliver more intuitive information to our internal and external customers through a self-serve reporting model
Enterprise View of Data - Consolidated view of customer, supplier, and product data from Sysco SUS and SAP broadline and specialty companies (Canada, Sygma, etc.) in one physical location
Reduce Total Cost of Ownership and Deliver Value Faster - Faster time to market for insights at a lower price
Provide accuracy, timeliness, and fidelity to the BI reporting process
Next-generation architecture that fosters innovation and reduces costs
Change the BI consumption pattern, i.e., move from hindsight to insight-driven reporting
Take manual workload off the team and enable them to become data analysts rather than report creators
Enable decommissioning of triplicated business applications and processes
Benefits of Transition
Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share.
The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau.
Merchandising
Initiative: CatMan
Targeted Insights:
• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance
Supply Chain
Initiative: Operational Data Insights
Targeted Insights:
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards
Sales & Margin Management
Initiative: RevMan, Opportunity Tracking, and Cost to Serve
Targeted Insights:
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager
• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)
Category Management
Price Optimization
Operational Productivity Measures
The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation.
PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training and re-tooling for individual and team growth
PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption
TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform
Business Value Derived from Data & Analytics
What is SEED (Sysco Ecosystem for Enterprise Data)?
SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward, while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.
Being cloud native, SEED also helps drive our Data Science and Agile journey forward, with the ability to quickly stand up sandbox environments for experimentation.
Demand-driven model with predictable & affordable costs
Stabilization of environments; reduced cost of delivery over time
Broad and deep functionality to support various use cases within data and analytics
Improved agility and quality, with powerful tools for data manipulations and migrations
Why SEED
Leveraged the success and cost savings of the Netezza Lexington data mart transformation to go after the crown jewel: Sysco’s “EDW Landscape”. Based on the success of the Lexington transformation, the leadership team accelerated timelines.
[Diagram: legacy EDW landscape. Labels: SUS (AS400); SUS & SWMS; 3rd party; Canada & Specialty; SAP; Informatica; BO ETL jobs; 1010 service account; ETL service account; SAP ETL account; NETEZZA; NETEZZA internal; Business Objects; Tableau; direct query; custom reporting; data extraction]
[Diagram: SEED architecture on AWS]
Sources: Arrow-Steam, NPD, HAVI, WMS, IDS, DPR, Sales, Inventory, Master Data, SWMS, Sygma, Freshpoint
Layers: Ingestion / Collection Layer; Storage Layer; ELT Compute Layer; Analyze Layer; Auditing and Monitoring Layer
Amazon S3 zones: Raw data; Transformed data; Reportable data
Services: AWS Lambda; Amazon EMR; AWS Data Pipeline; Amazon Redshift; Amazon Redshift Spectrum; Amazon RDS; Amazon Athena; Amazon CloudWatch; AWS CloudTrail; Amazon Glacier (archive); AWS Glue (post Phase II); Metastore (AWS Glue, post Phase II)
Consumers: Extracts; Other BI apps; Internal; External; Data Scientist
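The Amazon S3 storage layer above is labeled with three zones: raw, transformed, and reportable data. Zones like these are commonly kept apart by object key prefix. The sketch below shows one possible key layout. The prefix names, the date partitioning, the sample source and dataset names, and the `zone_key` helper are all assumptions for illustration and are not taken from Sysco's implementation.

```python
from datetime import date

# Hypothetical prefix per storage zone; only the zone names follow the diagram.
ZONE_PREFIXES = {
    "raw": "raw",
    "transformed": "transformed",
    "reportable": "reportable",
}


def zone_key(zone: str, source: str, dataset: str, day: date, filename: str) -> str:
    """Build an object key such as raw/sus/sales/2017/10/09/part-0001.csv."""
    if zone not in ZONE_PREFIXES:
        raise ValueError(f"unknown zone: {zone}")
    return "/".join([
        ZONE_PREFIXES[zone],
        source,
        dataset,
        f"{day:%Y/%m/%d}",  # date-partitioned path segment (an assumption)
        filename,
    ])


if __name__ == "__main__":
    print(zone_key("raw", "sus", "sales", date(2017, 10, 9), "part-0001.csv"))
```

Keeping the zone as the first path segment would let lifecycle rules or access policies target a whole zone by prefix, which is one reason such a layout is often chosen.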
Consolidated focus on data ingestion, consumption, and the need for new capabilities led us to the Ecosystem approach
Architecture Simplification (Ingestion, consumption, and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns
Cost optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage
User Value
• High-valued BI capabilities drive development of the data warehouse
• Timely access to data - hrs/mins versus multiple days/months
• Enablement of advanced analytics
Enhanced reliability & accuracy
• Accurate data delivered via repeatable process
• Errors are identified and corrected before business use
Analytical Use Cases for the Business
Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-thru predictive pricing analysis at all levels of the organization
• Descriptive model for Customer Segmentation
Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse Performance Analysis
Marketing
• Share of Wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis
The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas.
SEED
• Analytical Sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
Interactive queries, ad-hoc queries, and data extract queries within each use case were evaluated and optimized for SLA requirements.
Modernize your analytics platformData Lake = flexible set of web services that match your use cases
Designed for 11 9s
of durability
Designed for
9999 availability
Durable Available High performance Multiple upload
Range GET
Store as much as you need
Scale storage and compute
independently
No minimum usage commitments
Scalable
Amazon EMR
Amazon Redshift
Amazon DynamoDB
Amazon Athena
Integrated
Simple REST API
AWS SDKs
Read-after-create consistency
Event notification
Lifecycle policies
Easy to use
Why Amazon S3 for data lake
Big Data on AWS
Immediate Availability Deploy instantly No hardware to
procure no infrastructure to maintain amp scale
Trusted amp Secure Designed to meet the strictest
requirements Continuously audited including certifications
such as ISO 27001 FedRAMP DoD CSM and PCI DSS
Broad amp Deep Capabilities Over 70 services and 100s of
features to support virtually any big data application amp
workload
Hundreds of Partners amp Solutions Get help from a
consulting partner or choose from hundreds of tools and
applications across the entire data management stack
Sysco FoodsAn Overview
Sysco is the global leader in selling marketing and distributing food products to restaurants healthcare and
educational facilities lodging establishments and other customers who prepare meals away from home
Sysco operates 197 distribution facilities serves about half a million customers in 13 countries
For Fiscal Year 2017 that ended July 1 2017 Sysco generated sales of more than $55 billion
COSTA RICA
Current State Challenges
Lack of Analytical Capabilities Lack of business analytical
capabilities to analyze large volume data across category
management customer insights price simulations etc
Reporting Inconsistencies and Long Lead Times Reporting
standards are not defined most reports transactions are tailored to
requests Multiple data source and systems creating spaghetti data
scenarios leading to inconsistencies
Creeping Cost of Ownership Aged and Siloed BI solutions and
processes are slowly increasing the total cost of ownership in storage
infrastructure maintenance and administration
Scalability amp Stability Issues Reporting team is currently above
capacity with several thousands custom reports running Issues with
performance delays in reporting due to data load causing instabilities
Future State Goals
Enable Revenue Growth - Better enable business decisions through
data visibility and consistency
Improve Operational Efficiency - Increase the efficiency of business
processes through data management best practices
Enhanced Customer Experience ndash Deliver more intuitive information
to our internal and external customers through self-serve reporting
model
Enterprise View Of Data - Consolidated view of the customers
suppliers and products data from Sysco SUS and SAP broadline and
specialties companies (Canada Sygma etc) in one physical location
Reduce Total Cost of Ownership and Deliver Value Faster ndash
Faster time to market for insights at a lower price
Provide accuracy timeliness and fidelity to the BI reporting process
Next generation architecture that fosters innovation and reduce costs
Change the BI consumption pattern ie move from hindsight to insight driven reporting
Take manual work load off the team and enable them becoming data analyst rather than report
creators
Enable decommissioning of triplicated business applications and processes
Benefits of Transition
Due to competitive market pressures there was a big push to streamline operating costs and the three key areas below
helped unlock savings drive top line growth and market share
The three year plan was enabled by quick actionable insights that were derived using tools like Tableau
Merchandising Supply ChainSales amp Margin
Management
Initiative CatMan Operational Data Insights
RevMan Opportunity
Tracking and Cost to
Serve
Targeted Insights
bull Broker Performance
bull Category Attribute Analysis
bull Category Conversion
bull Category Compliance
bull Innovation Items Scorecard
bull Marketing associate compliance
bull Inbound amp Outbound
Productivity
bull Cost per Piece
bull Service Level
bull Warehouse Efficiency
bull DriverDelivery Scorecards
bull eCommerce Penetration
and Adoption
bull Opportunity Tracker
bull Price Management Tool
bull Deal Manager
bull Cost Per
Piece
dashboard
bull Summary
view of
comparison
results
bull Allows to
compare to
plan and PY
bull Provides
ability to drill
down to
department
(Warehouse
Delivery
Maintenance)
Category Management
Price Optimization
Operational Productivity Measures
The roadmap consisted of improvements across the three dimensions of people
process and technology in order to achieve a successful transformation
PEOPLE
- Centralization amp restructuring of the
BI org
- Strategic insourcing of key roles
- Training re-tooling for individual
and team growth
PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with
adoption
TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform
Business Value Derived from
Data amp Analytics
What is SEED (Sysco Ecosystem for Enterprise Data)
SEED is a AWS based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward
while also modernizing our technology landscape to enable scalable enterprise wide data discovery amp insights
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security
SEED being cloud native inherently also helps drive the Data Science and our Agile journey forward with the ability to quickly
stand up sandbox environments for experimentation
Demand driven model with predictable amp affordable costs
Stabilization of environments reduced cost of delivery over time
Broad and deep functionality to support various use cases within data and analytics
Improved agility and quality with powerful tools for data manipulations and migrations
Why SEED
Leveraged the success and cost savings within Netezaa Lexington data mart transformation to go after the crown jewel
Syscorsquos ldquoEDW Landscaperdquo Based on the success of Lexington transformation leadership team accelerated timelines
SUS(AS40
0)SUS
amp SWMS
SUS(AS40
0)3rd party
SUS(AS40
0)
CANADA
amp
SpecialtyIn
form
atica
B
O E
TL J
ob
s
SUS(AS40
0)SAP
1010 service account
Business Objects
Direct Query
Custom reporting Data
Extraction
ETL Service account
NETEZZA Internal
SAP ETL Account
Tableau
NETEZZA
Informatica
Arrow-Steam NPD HAVI
WMS IDS DPR
Sales Inventory
Master Data
SWMS
Amazon S3
Raw data Transformed
Data Reportable
Data
AWS Lambda Amazon EMR AWS Data Pipeline
Amazon
Redshift
Amazon RDS
Extracts
Amazon
Athena
Other BI apps
Internal
External
Data Scientist
ELT Compute Layer
Storage Layer Analyze LayerIngestion
Collection
Layer
Auditing and Monitoring Layer
Amazon CloudWatch
Extracts
Consumers
Sygma
Freshpoint
AWS Glue -
post phase II
AWS CloudTrail
Amazon Glacier
archive Metastore
AWS Glue -
post Phase II
Amazon
Redshift Spectrum
Consolidated focus on data ingestion consumption and need for new capabilities led us to the Ecosystem approach
Architecture Simplification
(Ingestion consumption and new
capabilities)
bull Movement from capacity driven
model to a demand driven model for
predictable costs
bull Handle mixed loads by offloading
processing (ETL) to a distributed
environment
bull Simplify and regulate data
movement across systems
bull Allow for addition of data types from
transactions interaction and
observations currently not in the
EDW
bull Usage driven consumption design
patterns
Cost optimization
bull CAP-EX and OP-EX reduction
bull Sustainable support solution that allows
for reduction in MS costs
bull Reduction in number of tools to deploy
and mange
User Value
bull High valued BI capabilities drive
development of the Data-warehouse
bull Timely access to data ndash hrs mins
versus multiple daysmonths
bull Enablement of advanced analytics
Enhanced reliability amp accuracy
bull Accurate data delivered via repeatable
process
bull Errors are identified and corrected
before business use
Analytical Use Cases for the Business
Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-through predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation
Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis
Marketing
• Share of wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis
The capabilities of SEED enable advanced analytics use cases already defined and requested by the various functional areas
SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
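Market basket analysis, one of the merchandising use cases listed above, asks which items tend to be ordered together. The sketch below shows the simplest form of that idea, pair support, meaning the fraction of orders that contain both items of a pair. It is not taken from the talk: the function name and the order data are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return, for each unordered item pair, the fraction of baskets containing both."""
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each pair is counted once per basket, in a stable order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical orders, invented for this example.
orders = [
    ["flour", "butter", "eggs"],
    ["flour", "butter"],
    ["eggs", "napkins"],
    ["flour", "eggs"],
]
support = pair_support(orders)
for pair, value in sorted(support.items(), key=lambda kv: -kv[1]):
    print(pair, value)
```

On real order volumes this counting would normally be pushed down into the warehouse or a distributed job rather than run in a single Python process.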
Interactive queries, ad-hoc queries, and data extract queries within each use case were evaluated and optimized for SLA requirements
Slow dashboard rendering
Memory utilization reaching limits
Storage limitations
Needed improved IOPS (Input/Output Operations Per Second)
Needed high availability
Top most used Sites
Workbooks by Site
Proactive Monitoring and Growth Projection
Current System Specifications
Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (high-frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB
Primary Node
On Prem: 2 nodes, 16 cores, 128 GB RAM
AWS: 3 nodes, 16 cores, 244 GB, 3000 IOPS
AWS: 6 nodes, 40 cores, 610 GB, 3000 IOPS
Metric                            2014   2015   2016      2017      Scale Out
Total number of Server Users      64     1700   3860      12713     20000
Total number of Active Users      64     1100   1375      5825      12000
Dedicated Core / vCPU capacity    16     40     80 vCPU   80 vCPU   192 vCPU
Concurrent Users                  11     55     120       350       TBD
Max Concurrency                   16     60     150       400       960
Number of Workbooks               8      110    206       671       TBD
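The growth figures above are easier to compare as ratios. The short sketch below computes two of them from the numbers on the slide: the share of server users who are active, and max concurrency per unit of dedicated capacity. The helper names are illustrative, and treating the capacity row as one comparable unit across all columns is an assumption, since the slide mixes cores and vCPUs.

```python
# Figures copied from the growth table; "Scale Out" is the planned target column.
periods = ["2014", "2015", "2016", "2017", "Scale Out"]
server_users = [64, 1700, 3860, 12713, 20000]
active_users = [64, 1100, 1375, 5825, 12000]
capacity = [16, 40, 80, 80, 192]  # dedicated cores / vCPUs exactly as listed
max_concurrency = [16, 60, 150, 400, 960]


def active_share(active: int, total: int) -> float:
    """Fraction of registered server users who are active."""
    return active / total


def concurrency_per_unit(concurrency: int, units: int) -> float:
    """Max concurrent users per dedicated core or vCPU."""
    return concurrency / units


for i, period in enumerate(periods):
    share = active_share(active_users[i], server_users[i])
    per_unit = concurrency_per_unit(max_concurrency[i], capacity[i])
    print(f"{period:>9}: active share {share:.0%}, max concurrency per unit {per_unit:.2f}")
```

With these figures the 2017 and Scale Out columns both come to 5 max concurrent users per vCPU (400/80 and 960/192), which may indicate the target was sized at the 2017 ratio, although the slide does not say so.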
Benefits of moving to SEED on AWS
Scalability & Availability to meet Business Needs
Better Cost Leverage
Improved Capability
Security
Testing before implementation
Governance
AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today
Getting Started
To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.
Quick Start Guide - Tableau
This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.
Jumpstarts
AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.
Please complete the session survey from the Session Details screen in your TC17 app
Big Data on AWS
Immediate Availability: Deploy instantly. No hardware to procure, no infrastructure to maintain & scale.
Trusted & Secure: Designed to meet the strictest requirements. Continuously audited, including certifications such as ISO 27001, FedRAMP, DoD CSM, and PCI DSS.
Broad & Deep Capabilities: Over 70 services and 100s of features to support virtually any big data application & workload.
Hundreds of Partners & Solutions: Get help from a consulting partner or choose from hundreds of tools and applications across the entire data management stack.
Sysco Foods: An Overview
Sysco is the global leader in selling, marketing, and distributing food products to restaurants, healthcare and educational facilities, lodging establishments, and other customers who prepare meals away from home.
Sysco operates 197 distribution facilities and serves about half a million customers in 13 countries.
For fiscal year 2017, which ended July 1, 2017, Sysco generated sales of more than $55 billion.
Current State Challenges
Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.
Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports/transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.
Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.
Scalability & Stability Issues: The reporting team is currently above capacity, with several thousand custom reports running. Issues with performance and delays in reporting due to data load are causing instabilities.
Future State Goals
Enable Revenue Growth - Better enable business decisions through data visibility and consistency
Improve Operational Efficiency - Increase the efficiency of business processes through data management best practices
Enhanced Customer Experience - Deliver more intuitive information to our internal and external customers through a self-serve reporting model
Enterprise View of Data - Consolidated view of customer, supplier, and product data from Sysco SUS and SAP, broadline and specialty companies (Canada, Sygma, etc.) in one physical location
Reduce Total Cost of Ownership and Deliver Value Faster - Faster time to market for insights at a lower price
Provide accuracy, timeliness, and fidelity to the BI reporting process
Next-generation architecture that fosters innovation and reduces costs
Change the BI consumption pattern, i.e. move from hindsight to insight-driven reporting
Take manual workload off the team and enable them to become data analysts rather than report creators
Enable decommissioning of triplicated business applications and processes
Benefits of Transition
Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share.
The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau.
Merchandising
Initiative: CatMan
Targeted Insights
• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance
Supply Chain
Initiative: Operational Data Insights
Targeted Insights
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards
Sales & Margin Management
Initiative: RevMan Opportunity Tracking and Cost to Serve
Targeted Insights
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager
• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)
Category Management
Price Optimization
Operational Productivity Measures
The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation
PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training and re-tooling for individual and team growth
PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption
TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform
Business Value Derived from Data & Analytics
What is SEED? (Sysco Ecosystem for Enterprise Data)
SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights.
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security.
SEED, being cloud native, inherently also helps drive the Data Science and our Agile journey forward with the ability to quickly stand up sandbox environments for experimentation.
Demand-driven model with predictable & affordable costs
Stabilization of environments; reduced cost of delivery over time
Broad and deep functionality to support various use cases within data and analytics
Improved agility and quality with powerful tools for data manipulations and migrations
Why SEED?
Leveraged the success and cost savings of the Netezza Lexington data mart transformation to go after the crown jewel, Sysco’s “EDW Landscape”. Based on the success of the Lexington transformation, the leadership team accelerated timelines.
Syscorsquos ldquoEDW Landscaperdquo Based on the success of Lexington transformation leadership team accelerated timelines
SUS(AS40
0)SUS
amp SWMS
SUS(AS40
0)3rd party
SUS(AS40
0)
CANADA
amp
SpecialtyIn
form
atica
B
O E
TL J
ob
s
SUS(AS40
0)SAP
1010 service account
Business Objects
Direct Query
Custom reporting Data
Extraction
ETL Service account
NETEZZA Internal
SAP ETL Account
Tableau
NETEZZA
Informatica
Arrow-Steam NPD HAVI
WMS IDS DPR
Sales Inventory
Master Data
SWMS
Amazon S3
Raw data Transformed
Data Reportable
Data
AWS Lambda Amazon EMR AWS Data Pipeline
Amazon
Redshift
Amazon RDS
Extracts
Amazon
Athena
Other BI apps
Internal
External
Data Scientist
ELT Compute Layer
Storage Layer Analyze LayerIngestion
Collection
Layer
Auditing and Monitoring Layer
Amazon CloudWatch
Extracts
Consumers
Sygma
Freshpoint
AWS Glue -
post phase II
AWS CloudTrail
Amazon Glacier
archive Metastore
AWS Glue -
post Phase II
Amazon
Redshift Spectrum
Consolidated focus on data ingestion consumption and need for new capabilities led us to the Ecosystem approach
Architecture Simplification (ingestion, consumption, and new capabilities)
• Movement from a capacity-driven model to a demand-driven model for predictable costs
• Handle mixed loads by offloading processing (ETL) to a distributed environment
• Simplify and regulate data movement across systems
• Allow for addition of data types from transactions, interactions, and observations currently not in the EDW
• Usage-driven consumption design patterns
Cost Optimization
• CAP-EX and OP-EX reduction
• Sustainable support solution that allows for reduction in MS costs
• Reduction in number of tools to deploy and manage
User Value
• High-value BI capabilities drive development of the data warehouse
• Timely access to data - hrs/mins versus multiple days/months
• Enablement of advanced analytics
Enhanced Reliability & Accuracy
• Accurate data delivered via repeatable process
• Errors are identified and corrected before business use
Analytical Use Cases for the Business
Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-through predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation
Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis
Marketing
• Share of Wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis
The capabilities of SEED enable advanced analytics use cases already defined and requested by the various functional areas
SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
Interactive queries, ad-hoc queries, and data-extract queries within each use case were evaluated and optimized for SLA requirements
Slow Dashboard Rendering
Memory Utilization reaching limits
Storage Limitations
Needed improved IOPS (Input/Output Operations Per Second)
Needed High Availability
Top most used Sites
Workbooks by Site
Proactive Monitoring and Growth Projection
Current System Specifications
Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (high-frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB
Primary Node
On Prem: 2 Nodes, 16 Cores, 128 GB RAM
AWS: 3 Nodes, 16 Cores, 244 GB, 3000 IOPS
AWS: 6 Nodes, 40 Cores, 610 GB, 3000 IOPS

                                2014   2015   2016      2017      Scale Out
Total number of Server Users    64     1700   3860      12713     20000
Total number of Active Users    64     1100   1375      5825      12000
Dedicated Core vCPU capacity    16     40     80 vCPU   80 vCPU   192 vCPU
Concurrent Users                11     55     120       350       TBD
Max Concurrency                 16     60     150       400       960
Number of Workbooks             8      110    206       671       TBD
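The growth figures on this slide are easier to compare as ratios. The short sketch below copies the numbers from the table and derives period-over-period growth, the share of server users who are active, and max concurrency per unit of capacity. Treating the capacity row as one comparable unit across all columns is an assumption, since the slide labels only some of the values as vCPU.

```python
# Figures copied from the slide; "Scale Out" is the projected column.
periods = ["2014", "2015", "2016", "2017", "Scale Out"]
server_users = [64, 1700, 3860, 12713, 20000]
active_users = [64, 1100, 1375, 5825, 12000]
vcpu_capacity = [16, 40, 80, 80, 192]  # assumed to be comparable units
max_concurrency = [16, 60, 150, 400, 960]


def growth_factors(values):
    """Return the ratio of each value to the value in the previous period."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


def ratios(numerators, denominators):
    """Return the element-wise ratio of two equally long series."""
    return [n / d for n, d in zip(numerators, denominators)]


if __name__ == "__main__":
    active_share = ratios(active_users, server_users)
    concurrency_per_unit = ratios(max_concurrency, vcpu_capacity)
    for period, share, per_unit in zip(periods, active_share, concurrency_per_unit):
        print(f"{period:>9}: {share:.0%} of server users active, "
              f"{per_unit:.2f} max concurrency per capacity unit")
    for period, factor in zip(periods[1:], growth_factors(server_users)):
        print(f"{period:>9}: server users grew {factor:.1f}x over the previous period")
```

On these figures, the projected scale-out column keeps the same max-concurrency-to-capacity ratio as 2017 (400/80 and 960/192 are both 5).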
Benefits of moving to SEED on AWS
Scalability & Availability to meet Business Needs
Better Cost Leverage
Improved Capability
Security
Testing before implementation
Governance
AWS and Featured Partners offer guides, Quick Starts, and Jumpstarts to help you get started today
Getting Started
To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.
Quick Start Guide - Tableau
This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.
Jumpstarts
AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.
Please complete the session survey from the Session Details screen in your TC17 app
Current State Challenges
Lack of Analytical Capabilities: Lack of business analytical capabilities to analyze large-volume data across category management, customer insights, price simulations, etc.
Reporting Inconsistencies and Long Lead Times: Reporting standards are not defined; most reports/transactions are tailored to requests. Multiple data sources and systems create spaghetti data scenarios, leading to inconsistencies.
Creeping Cost of Ownership: Aged and siloed BI solutions and processes are slowly increasing the total cost of ownership in storage, infrastructure, maintenance, and administration.
Scalability & Stability Issues: The reporting team is currently above capacity, with several thousand custom reports running. Performance issues and reporting delays due to data loads are causing instabilities.
Future State Goals
Enable Revenue Growth - Better enable business decisions through data visibility and consistency
Improve Operational Efficiency - Increase the efficiency of business processes through data management best practices
Enhanced Customer Experience - Deliver more intuitive information to our internal and external customers through a self-serve reporting model
Enterprise View of Data - Consolidated view of the customer, supplier, and product data from Sysco SUS and SAP broadline and specialty companies (Canada, Sygma, etc.) in one physical location
Reduce Total Cost of Ownership and Deliver Value Faster - Faster time to market for insights at a lower price
Provide accuracy, timeliness, and fidelity to the BI reporting process
Next-generation architecture that fosters innovation and reduces costs
Change the BI consumption pattern, i.e., move from hindsight-driven to insight-driven reporting
Take manual workload off the team and enable them to become data analysts rather than report creators
Enable decommissioning of triplicated business applications and processes
Benefits of Transition
Due to competitive market pressures, there was a big push to streamline operating costs, and the three key areas below helped unlock savings and drive top-line growth and market share
The three-year plan was enabled by quick, actionable insights that were derived using tools like Tableau
Merchandising
Initiative: CatMan
Targeted Insights:
• Broker Performance
• Category Attribute Analysis
• Category Conversion
• Category Compliance
• Innovation Items Scorecard
• Marketing associate compliance
Supply Chain
Initiative: Operational Data Insights
Targeted Insights:
• Inbound & Outbound Productivity
• Cost per Piece
• Service Level
• Warehouse Efficiency
• Driver/Delivery Scorecards
Sales & Margin Management
Initiative: RevMan Opportunity Tracking and Cost to Serve
Targeted Insights:
• eCommerce Penetration and Adoption
• Opportunity Tracker
• Price Management Tool
• Deal Manager
• Cost Per Piece dashboard
• Summary view of comparison results
• Allows comparison to plan and PY
• Provides ability to drill down to department (Warehouse, Delivery, Maintenance)
Category Management
Price Optimization
Operational Productivity Measures
The roadmap consisted of improvements across the three dimensions of people, process, and technology in order to achieve a successful transformation
PEOPLE
- Centralization & restructuring of the BI org
- Strategic insourcing of key roles
- Training/re-tooling for individual and team growth
PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with adoption
TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform
Business Value Derived from Data & Analytics
What is SEED (Sysco Ecosystem for Enterprise Data)?
SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security
SEED, being cloud native, inherently also helps drive our Data Science and Agile journeys forward with the ability to quickly stand up sandbox environments for experimentation
Demand-driven model with predictable & affordable costs
The roadmap consisted of improvements across the three dimensions of people
process and technology in order to achieve a successful transformation
PEOPLE
- Centralization amp restructuring of the
BI org
- Strategic insourcing of key roles
- Training re-tooling for individual
and team growth
PROCESS
- Adoption of an Agile delivery model
- Data Governance
- Continuous process improvements
- Change management to help with
adoption
TECHNOLOGY
- Additional capability at a lower cost
- Consolidate toolsets
- Easier access to non-USBL data
- Stabilize the existing platform
Business Value Derived from
Data amp Analytics
What is SEED (Sysco Ecosystem for Enterprise Data)
SEED is a AWS based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward
while also modernizing our technology landscape to enable scalable enterprise wide data discovery amp insights
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security
SEED being cloud native inherently also helps drive the Data Science and our Agile journey forward with the ability to quickly
stand up sandbox environments for experimentation
Demand driven model with predictable amp affordable costs
Stabilization of environments reduced cost of delivery over time
Broad and deep functionality to support various use cases within data and analytics
Improved agility and quality with powerful tools for data manipulations and migrations
Why SEED
Leveraged the success and cost savings within Netezaa Lexington data mart transformation to go after the crown jewel
Syscorsquos ldquoEDW Landscaperdquo Based on the success of Lexington transformation leadership team accelerated timelines
SUS(AS40
0)SUS
amp SWMS
SUS(AS40
0)3rd party
SUS(AS40
0)
CANADA
amp
SpecialtyIn
form
atica
B
O E
TL J
ob
s
SUS(AS40
0)SAP
1010 service account
Business Objects
Direct Query
Custom reporting Data
Extraction
ETL Service account
NETEZZA Internal
SAP ETL Account
Tableau
NETEZZA
Informatica
Arrow-Steam NPD HAVI
WMS IDS DPR
Sales Inventory
Master Data
SWMS
Amazon S3
Raw data Transformed
Data Reportable
Data
AWS Lambda Amazon EMR AWS Data Pipeline
Amazon
Redshift
Amazon RDS
Extracts
Amazon
Athena
Other BI apps
Internal
External
Data Scientist
ELT Compute Layer
Storage Layer Analyze LayerIngestion
Collection
Layer
Auditing and Monitoring Layer
Amazon CloudWatch
Extracts
Consumers
Sygma
Freshpoint
AWS Glue -
post phase II
AWS CloudTrail
Amazon Glacier
archive Metastore
AWS Glue -
post Phase II
Amazon
Redshift Spectrum
Consolidated focus on data ingestion consumption and need for new capabilities led us to the Ecosystem approach
Architecture Simplification
(Ingestion consumption and new
capabilities)
bull Movement from capacity driven
model to a demand driven model for
predictable costs
bull Handle mixed loads by offloading
processing (ETL) to a distributed
environment
bull Simplify and regulate data
movement across systems
bull Allow for addition of data types from
transactions interaction and
observations currently not in the
EDW
bull Usage driven consumption design
patterns
Cost optimization
bull CAP-EX and OP-EX reduction
bull Sustainable support solution that allows
for reduction in MS costs
bull Reduction in number of tools to deploy
and mange
User Value
bull High valued BI capabilities drive
development of the Data-warehouse
bull Timely access to data ndash hrs mins
versus multiple daysmonths
bull Enablement of advanced analytics
Enhanced reliability amp accuracy
bull Accurate data delivered via repeatable
process
bull Errors are identified and corrected
before business use
Analytical Use Cases for the Business
Revenue Management
• Margins review by market
• Predictive pricing simulations with external economic data
• Pass-thru predictive pricing analysis at all levels of the organization
• Descriptive model for customer segmentation
Merchandising and Supply Chain
• Assortment optimization at scale
• Track vendor cost components of items
• Lotting using decision trees
• Forecast vendor price changes
• Market basket analysis
• Warehouse performance analysis
Marketing
• Share of wallet
• Machine learning for future promotions
• Cross-sell opportunity feeder
• Churn analysis
The capabilities of SEED allow for the enablement of advanced analytics use cases already defined and requested by the various functional areas
SEED
• Analytical sandboxes
• Quicker time to market
• R integration
• Better performing retrievals
• Large data sets
• Unstructured data
Interactive queries, ad-hoc queries and data extract queries within each use case were evaluated and optimized for SLA requirements
Slow Dashboard Rendering
Memory Utilization reaching limits
Storage Limitations
Needed improved IOPS (Input/Output Operations Per Second)
Needed High Availability
Top most used Sites
Workbooks by Site
Proactive Monitoring and Growth Projection
Current System Specifications
Worker Nodes
• EC2 Instance Type: c4.2xlarge
• Operating System: Windows 2012 R2
• vCPU: 8 (high-frequency Intel Xeon E5-2666 v3 (Haswell) processors optimized specifically for EC2)
• Cores: 4
• RAM: 15 GB
Primary Node
On Prem: 2 nodes, 16 cores, 128 GB RAM
AWS: 3 nodes, 16 cores, 244 GB, 3000 IOPS
AWS: 6 nodes, 40 cores, 610 GB, 3000 IOPS
                               2014   2015   2016    2017   Scale Out
Total number of Server Users     64   1700   3860   12713       20000
Total number of Active Users     64   1100   1375    5825       12000
Dedicated Core vCPU capacity     16     40     80      80         192
Concurrent Users                 11     55    120     350         TBD
Max Concurrency                  16     60    150     400         960
Number of Workbooks               8    110    206     671         TBD
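The growth figures above can be turned into two ratios that make the scale-out plan easier to read: the share of server users who are active, and max concurrency per vCPU. This is a small illustrative calculation only. The assignment of numbers to columns is reconstructed from flattened slide text and is an assumption, as are the variable names.

```python
# Figures transcribed from the slide's growth table. The column alignment is
# reconstructed from flattened text, so treat it as an assumption.
periods = ["2014", "2015", "2016", "2017", "Scale Out"]
server_users = [64, 1700, 3860, 12713, 20000]
active_users = [64, 1100, 1375, 5825, 12000]
vcpu_capacity = [16, 40, 80, 80, 192]
max_concurrency = [16, 60, 150, 400, 960]

# Share of server users who are active, in percent, rounded to one decimal.
active_share = [round(100 * a / s, 1) for a, s in zip(active_users, server_users)]

# Max concurrency divided by dedicated vCPU capacity for each period.
concurrency_per_vcpu = [m / v for m, v in zip(max_concurrency, vcpu_capacity)]

for row in zip(periods, active_share, concurrency_per_vcpu):
    print("{:<10} active {:>5.1f}%   max concurrency per vCPU {:.3f}".format(*row))
```

Under this reading of the table, the ratio of max concurrency to vCPU capacity is the same for 2017 and for the Scale Out column (400/80 and 960/192), which would be consistent with the projection holding the 2017 ratio constant. The slides do not state this.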
Benefits of moving to SEED on AWS
Scalability & Availability to meet Business Needs
Better Cost Leverage
Improved Capability
Security
Testing before implementation
Governance
AWS and Featured Partners offer guides, Quick Starts and Jumpstarts to help you get started today
Getting Started
To assist you in getting started on Amazon Redshift, AWS has developed a guide to help you begin your data warehouse transformation.
To learn more and read the guide: Click Here
Quick Start Guide – Tableau
This Quick Start helps you deploy a modern enterprise data warehouse (EDW) environment that is based on Amazon Redshift and includes the analytics and data visualization capabilities of Tableau Server.
To learn more about the Quick Start and to get started with Tableau Server on AWS: Click Here
Jumpstarts
AWS Partners 47Lining and NorthBay have both developed jumpstart consulting offers for customers to demonstrate the effectiveness of their modern data warehouse solutions.
To learn about 47Lining's consulting offer: Click Here
To learn about NorthBay's consulting offer: Click Here
Please complete the session survey from the Session Details screen in your TC17 app
What is SEED (Sysco Ecosystem for Enterprise Data)?
SEED is an AWS-based ecosystem that allows Sysco to unlock the value from our data and drive our analytics journey forward while also modernizing our technology landscape to enable scalable, enterprise-wide data discovery & insights
SEED is envisioned to scale with evolving business needs and provides a foundation for data governance and data security
SEED, being cloud native, inherently also helps drive the Data Science and our Agile journey forward with the ability to quickly stand up sandbox environments for experimentation
Demand-driven model with predictable & affordable costs
Stabilization of environments, reduced cost of delivery over time
Broad and deep functionality to support various use cases within data and analytics
Improved agility and quality with powerful tools for data manipulations and migrations
Why SEED
Leveraged the success and cost savings within the Netezza Lexington data mart transformation to go after the crown jewel: Sysco's "EDW Landscape". Based on the success of the Lexington transformation, the leadership team accelerated timelines.