The Importance of the ETL Process
DESCRIPTION
This presentation is part of LearnItFirst's SQL Server 2012: A Comprehensive Introduction course. The video that contains this presentation can be watched here: https://www.youtube.com/watch?v=FzayiGi97bc

ETL is one of the more important concepts covered in the last couple of chapters of this course. Even if you are not working with ETL today, chances are high that at some point you will be asked to. This video explains a scenario in which a business has several different databases, and how to make them all fit together.

Highlights from this slideshow:
- What is Microsoft's ETL tool?
- What enables a business to use so many different databases?
- What different options do you have for building a dashboard?
- Needs of the organization versus the requirements of the vendor

TRANSCRIPT
Chapter: SQL Server 2012 Integration Services
Course: SQL Server 2012 - A Comprehensive Introduction
Course ID: 170
Instructor: Scott Whigham
Chapter 16: Video # 2
The Importance of the ETL Process
SQL Server 2012 Integration Services (SSIS) is Microsoft’s ETL tool
– Extract, Transform, and Load
Most businesses have data in more than one format
– How does one business happen to use so many different databases?
Let’s walk through a likely scenario and see how this happens:
– 2001: The “AdventureWorks” company launches a web store to complement its brick-and-mortar stores
• ASP-based website
• SQL Server 2000 backend
• Customers are encouraged to phone questions in or to send an email
Things change...
– 2001: Launch with SQL 2000
– 2003: AdventureWorks buys a competitor
• Competitor used a PHP/MySQL ticketing system
• AdventureWorks management chooses to adopt this system for customer ticketing rather than build or buy an alternative
AdventureWorks timeline:
| Year | Usage | Data Source |
|------|-------|-------------|
| 2001 | Website | MS SQL Server 2000 |
| 2003 | Customer Ticket System | MySQL 3.23 |
Needs change...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: The company is growing – time for more “stuff”:
• A PHP/MySQL project management system is installed
• A marketing mailer application with contact mgmt is purchased
AdventureWorks timeline:
| Year | Usage | Data Source |
|------|-------|-------------|
| 2001 | Website | MS SQL Server 2000 |
| 2003 | Customer Ticket System | MySQL 3.23 |
| 2004 | Project Management | MySQL 4.0 |
| 2004 | Marketing mailer | MS Access |
Markets change...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: A new ASP.NET website is rolled out with a SQL Server 2005 backend
• Major upgrade from SQL Server 2000 -> 2005
AdventureWorks timeline:
| Year | Usage | Data Source |
|------|-------|-------------|
| 2001 | Website | MS SQL Server 2000 |
| 2003 | Customer Ticket System | MySQL 3.23 |
| 2004 | Project Management | MySQL 4.0 |
| 2004 | Marketing mailer | MS Access |
| 2005 | Website upgrade | MS SQL Server 2005 |
Trends change...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: Upgraded website to SQL 2005
– 2008: Website sales popularity causes “growing pains”
• A new supply chain management app is purchased
• A new employee management/HR/payroll package is purchased
AdventureWorks timeline:
| Year | Usage | Data Source |
|------|-------|-------------|
| 2001 | Website | MS SQL Server 2000 |
| 2003 | Customer Ticket System | MySQL 3.23 |
| 2004 | Project Management | MySQL 4.0 |
| 2004 | Marketing mailer | MS Access |
| 2005 | Website upgrade | MS SQL Server 2005 |
| 2008 | Supply chain mgmt | MS SQL Server 2008 |
| 2008 | Employee/HR/Payroll | DB2 |
The world grows smaller...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: Upgraded website to SQL 2005
– 2008: Added supply chain mgmt and HR/payroll packages
– 2010: Website sales continue to gain popularity, particularly overseas
• A new shipping database is purchased
• Employee expenses are now tracked in custom MS Excel spreadsheets
AdventureWorks timeline:
| Year | Usage | Data Source |
|------|-------|-------------|
| 2001 | Website | MS SQL Server 2000 |
| 2003 | Customer Ticket System | MySQL 3.23 |
| 2004 | Project Management | MySQL 4.0 |
| 2004 | Marketing mailer | MS Access |
| 2005 | Website upgrade | MS SQL Server 2005 |
| 2008 | Supply chain mgmt | MS SQL Server 2008 |
| 2008 | Employee/HR/Payroll | DB2 |
| 2010 | Shipping | *.csv file downloaded monthly |
| 2010 | Employee expense tracking | MS Excel |
It’s 2012 and company executives + management have been playing a game lately...
– You know this one, don’t you?
The world grows smaller...
– 2001: Launch with SQL 2000
– 2003: PHP/MySQL 3.23 ticketing system
– 2004: PHP/MySQL 4.0 project management
– 2005: Upgraded website to SQL 2005
– 2008: Added supply chain mgmt and HR/payroll packages
– 2010: New shipping database, employee expense tracking
– 2012: Executives want a B.I. solution
• You name it, they want it
• But... there’s no budget for software purchases...
No budget for new software = more opportunities for you!
– You decide:
• ... to create a relational OLAP data warehouse to store all the company’s historic data in a unified way
• ... to create a multidimensional database with multiple cubes (to facilitate fast browsing of analytics)
• ... to install Excel 2013 on all CxO and management machines, and to teach them how to build pivot tables and pivot charts
• ... to investigate Reporting Services as a way to build internal web dashboards and subscription-based reporting
– On-the-job experience, here we come!
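
To make the "relational data warehouse" idea above a little more concrete, here is a minimal sketch of a star-style layout: one fact table that references small dimension tables, queried with a simple aggregate. It uses Python's built-in sqlite3 module purely as a stand-in for SQL Server, and every table name, column name, and row is a made-up assumption for illustration, not part of the course material.

```python
import sqlite3


def build_warehouse():
    """Create a tiny in-memory warehouse: two dimensions and one fact table.

    All names and rows here are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_source   (source_id   INTEGER PRIMARY KEY, system TEXT);
        CREATE TABLE fact_sales (
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            source_id   INTEGER REFERENCES dim_source(source_id),
            sale_year   INTEGER,
            amount      REAL
        );
        INSERT INTO dim_customer VALUES (1, 'Customer A'), (2, 'Customer B');
        INSERT INTO dim_source   VALUES (1, 'Website (2001)'), (2, 'Website (2005)');
        INSERT INTO fact_sales   VALUES
            (1, 1, 2004, 100.0),
            (1, 2, 2006, 250.0),
            (2, 2, 2006, 75.0);
        """
    )
    return conn


def sales_by_year(conn):
    """Total sales per year, regardless of which source system recorded them."""
    query = """
        SELECT sale_year, SUM(amount)
        FROM fact_sales
        GROUP BY sale_year
        ORDER BY sale_year
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for year, total in sales_by_year(build_warehouse()):
        print(year, total)
```

The point of the layout is that historic data from the old and new websites sits in one fact table, so a single query can span both systems.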
The company data is all “loosely connected”
– A customer makes a small order via the website
– The same customer submits a “Help!” ticket
– Customer rep. has to make an order for a replacement part
– Sales person takes customer to an entertainment event
– Customer now makes a large order
– Key question: how did we acquire this customer?
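
One way to picture the "loosely connected" problem above: each system records the same customer slightly differently, so answering the key question means matching records on a cleaned-up key and sorting them into one timeline. The sketch below does that in plain Python. The three source lists, their field names, and the choice of email as the matching key are all assumptions made up for this example.

```python
from datetime import date

# Hypothetical records, one list per source system (fields are illustrative).
web_orders = [
    {"email": "Pat@Example.com", "date": date(2011, 3, 1), "total": 40.0},
    {"email": "pat@example.com", "date": date(2011, 9, 15), "total": 5200.0},
]
tickets = [
    {"contact": "pat@example.com ", "opened": date(2011, 3, 10), "subject": "Help!"},
]
expenses = [
    {"client_email": "PAT@EXAMPLE.COM", "date": date(2011, 6, 2), "item": "Event tickets"},
]


def customer_key(raw_email):
    """Normalize an email so the same customer matches across systems."""
    return raw_email.strip().lower()


def build_timeline(email):
    """Collect every event for one customer, oldest first."""
    key = customer_key(email)
    events = []
    for order in web_orders:
        if customer_key(order["email"]) == key:
            events.append((order["date"], "website", f"order {order['total']:.2f}"))
    for ticket in tickets:
        if customer_key(ticket["contact"]) == key:
            events.append((ticket["opened"], "ticketing", ticket["subject"]))
    for expense in expenses:
        if customer_key(expense["client_email"]) == key:
            events.append((expense["date"], "expenses", expense["item"]))
    return sorted(events)


if __name__ == "__main__":
    for when, system, detail in build_timeline("pat@example.com"):
        print(when.isoformat(), system, detail)
```

In this toy data the earliest event comes from the website, which is the kind of answer the key question is after. Real matching is rarely this clean, which is where the Transform step of ETL earns its keep.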
Integration Services is your ETL tool
1. You Extract the data from the source to a staging area
   • Optional, but typically an MS SQL Server relational database
2. You make any changes to the data (a.k.a. a Transformation)
   • Either in motion or in the staging area
3. You Load the data into the relational data warehouse
4. You process the cube(s)
– SSIS is your “one stop shop” for all of this!
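
Steps 1 through 3 above can be sketched in a few lines of code. This is not SSIS and not how an SSIS package is built; it is a rough Python illustration of the same flow, using the built-in csv and sqlite3 modules with an in-memory database standing in for both the staging area and the warehouse. The CSV contents, table names, and cleaning rules are assumptions invented for the example, and step 4 (cube processing) is left out.

```python
import csv
import io
import sqlite3

# Hypothetical monthly shipping export; column names and rows are illustrative.
SHIPPING_CSV = """order_id,country,ship_cost
1001, us ,12.50
1002,DE,30.00
1003,de,
"""


def extract(conn, csv_text):
    """Extract: copy the raw rows into a staging table, unchanged."""
    conn.execute(
        "CREATE TABLE staging_shipping (order_id TEXT, country TEXT, ship_cost TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["order_id"], r["country"], r["ship_cost"]) for r in reader]
    conn.executemany("INSERT INTO staging_shipping VALUES (?, ?, ?)", rows)
    return len(rows)


def transform(conn):
    """Transform: clean the data while it sits in the staging area."""
    # Standardize country codes (trim whitespace, upper-case).
    conn.execute("UPDATE staging_shipping SET country = UPPER(TRIM(country))")
    # Drop rows that have no shipping cost.
    conn.execute("DELETE FROM staging_shipping WHERE TRIM(ship_cost) = ''")


def load(conn):
    """Load: move the cleaned rows into a typed warehouse table."""
    conn.execute(
        "CREATE TABLE fact_shipping "
        "(order_id INTEGER PRIMARY KEY, country TEXT, ship_cost REAL)"
    )
    conn.execute(
        "INSERT INTO fact_shipping "
        "SELECT CAST(order_id AS INTEGER), country, CAST(ship_cost AS REAL) "
        "FROM staging_shipping"
    )
    conn.commit()


def run_etl(csv_text=SHIPPING_CSV):
    """Run extract, transform, and load in order; return the loaded rows."""
    conn = sqlite3.connect(":memory:")
    extract(conn, csv_text)
    transform(conn)
    load(conn)
    return conn.execute(
        "SELECT order_id, country, ship_cost FROM fact_shipping ORDER BY order_id"
    ).fetchall()


if __name__ == "__main__":
    for row in run_etl():
        print(row)
```

Here the transformation happens in the staging area; the "in motion" alternative would clean each row in Python before it is inserted anywhere.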
Your final step is to build a dashboard
– Reporting Services or PowerPivot?
– Power View or Excel?
– SharePoint or email?
– On-demand or subscription-based?
Your dashboard is a hit!
In the next video…
– How to Install and Configure SSIS 2012
“A painter paints pictures on canvas. But musicians paint their pictures on silence.”
- Leopold Stokowski