Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Upload: muriel-hodges

Post on 12-Jan-2016

TRANSCRIPT

Page 1: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Data Administration

Data Warehouse Environment (DWE) Implementation

8/19/04

Page 2: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Original Plan for DWE (diagram dated 8/11/2004)

• Source Data: IDMS, Oracle, flat files

• External Data: census data, benchmark, salary surveys, economic data

• Data Staging Area: extract data, transform data, quality assurance, create metadata

• Metadata

• Ad Hoc Query Repository: copy of source data, operational, daily updates, all elements, minimum number of years of data

• Data Warehouse: cleansed, subset of detail data, subset of summary data, multiple years of data, periodic updates, strategic

• Data Mart #1 (Course Management) and Data Mart #2 (Resource Management): subset of DW, summarized in specific manner, tactical

• WebFOCUS (Reporting), Operational Enterprise: Ad Hoc and Operational Reports. "Tell me what happened?"

• EIS: "Tell me everything I need to know and what is important, but do it quickly and easily!"

• SAS Data Mining Server

• OLAP Server

• Other user questions shown on the diagram: "Tell me what may happen, or what is interesting?"; "Give me information to help me achieve specific goals!"; "Tell me what happened and why?"

Legend:

1) Wide black border indicates physical servers.

2) Narrow black border indicates no decision on whether it will be a separate physical server.

3) Gray background indicates BI or analytical software servers.

Page 3: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DWE Terms

• Source Data: Operational data from internal systems, such as IDMS (FES, FRS, HRS, SIS), Oracle, etc.

• External Data: Data from systems external to the University, such as economic and census data collected by the government.

• Data Staging Area: Storage and processing area for data extracted from the internal and external systems prior to loading into the Warehouse, Data Marts or Ad Hoc Query Repository. Some of the data will remain un-cleansed, as an exact replica of the data in the online systems, for subsequent loading into the Ad Hoc Query Repository. Other data will be cleansed and transformed before being moved to the Data Warehouse and Data Marts for analysis. Some data will be located in multiple places and in multiple forms and aggregations. (Also known as an ETL, or Extract, Transform and Load, server.)

• Metadata: A term used for data that describes or specifies other data. It is used to define all of the characteristics of data required to build databases and applications, and to support knowledge workers and information producers. This includes data element name, meaning, format, domain values, business integrity rules, relationships, owner, etc.
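
As a rough illustration of the staging flow described under Data Staging Area, the Python sketch below routes extracted records two ways: an un-cleansed replica bound for the Ad Hoc Query Repository and a cleansed copy bound for the Data Warehouse. The record layout, the field names and the cleansing rules are assumptions made for this example; they are not taken from the University's systems or from any ETL product.

```python
from copy import deepcopy


def cleanse(record):
    """Return a cleansed copy of one record.

    Hypothetical rules: trim text, turn blanks into None,
    and upper-case any field whose name ends in "_CD".
    """
    cleaned = {}
    for field, value in record.items():
        if isinstance(value, str):
            value = value.strip()
            if value == "":
                value = None
            elif field.endswith("_CD"):
                value = value.upper()
        cleaned[field] = value
    return cleaned


def stage(source_records):
    """Split extracted records into a replica set and a cleansed set."""
    replica = deepcopy(source_records)                # exact copy for the Ad Hoc Query Repository
    cleansed = [cleanse(r) for r in source_records]   # transformed copy for the Data Warehouse
    return replica, cleansed


# Hypothetical extracted record
extracted = [{"STUDENT_ID": " 1001 ", "GENDER_CD": "f", "MAJOR_NM": ""}]
replica, cleansed = stage(extracted)
```

Keeping the replica untouched is what lets the repository mirror the online systems while the warehouse copy is free to be standardized.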

Page 4: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DWE Terms

• Ad Hoc Query Repository: A collection of enterprise data from multiple sources, used for ad hoc and operational reporting where the most current, un-standardized source data is required. The Repository will typically contain only one or two years of the most recent data, unless regulatory or statutory requirements dictate otherwise. (Also known as an Operational Data Store or ODS.)

• Data Warehouse: An enterprise-wide, cross-functional, cross-organizational database typically comprised of data extracted, cleansed and/or summarized from multiple online transaction processing systems, and other stores of data (Purdue University; Stanford University). It is designed for query and analysis, typically contains historical data, and is used to present information to support decision-making, tactical and strategic business processes. A data warehouse tends to start from an analysis of what data already exists and how it can be collected in such a way that the data can later be used. In general, a data warehouse tends to be a strategic, but somewhat unfinished concept; a data mart tends to be tactical and aimed at meeting an immediate need. (Improving Data Warehouse and Business Information Quality, Larry P. English, 1999.)

Page 5: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DWE Terms

• Data Mart: A subset of enterprise data from the Data Warehouse that is summarized and stored in an optimal fashion for analysis and presentation of information to support trend analysis and tactical decisions and processes. Data Marts are typically designed based on an analysis of user needs to answer specific questions in the pursuit of specific goals. The scope can be that of a complete data subject such as Student, or of a particular business area or line of business, such as Enrollment. (Improving Data Warehouse and Business Information Quality, Larry P. English, 1999.)

• Enterprise Reporting: A category of software technology that enables the development, organization, sharing, execution, delivery and scheduling of reports via a web platform.
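
To make the Data Mart idea of a "summarized subset" concrete, here is a minimal Python sketch that rolls detail rows up into one count per term and course. The row layout, term code and course codes are hypothetical and serve only to show the shape of the summarization.

```python
from collections import defaultdict


def summarize_enrollment(detail_rows):
    """Roll detail rows up to one enrollment count per (term, course)."""
    totals = defaultdict(int)
    for row in detail_rows:
        totals[(row["term"], row["course"])] += 1
    return dict(totals)


# Hypothetical warehouse detail: one row per student enrollment
warehouse_detail = [
    {"student": "A", "term": "2004F", "course": "MA113"},
    {"student": "B", "term": "2004F", "course": "MA113"},
    {"student": "A", "term": "2004F", "course": "ENG101"},
]
mart_summary = summarize_enrollment(warehouse_detail)
```

The mart keeps only the aggregated counts, which is why it answers a specific question quickly but cannot replace the detail held in the warehouse.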

Page 6: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DW Terms (Continued)

• On-Line Analytical Processing (OLAP): A category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information that has been transformed from raw data to reflect the real dimensionality of the enterprise as understood by the user. OLAP helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios. This is achieved through use of an OLAP Server. (http://www.moulton.com/olap/olap.glossary.html) Functionality includes multi-dimensional analysis, slicing, drill-down and rotation.

• Data Mining: A class of database applications that look for hidden patterns in a group of data. For example, data mining software can help retail companies find customers with common interests. The term is commonly misused to describe software that presents data in new ways. True data mining software doesn't just change the presentation, but actually discovers previously unknown relationships among the data. (http://www.webopedia.com/TERM/d/data_mining.html)
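
The OLAP operations named above (slicing, drill-down and rotation) can be sketched on a tiny in-memory fact table. This is plain Python rather than an OLAP Server, and the colleges, departments and enrollment figures are invented for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (college, department, term, enrollment)
facts = [
    ("Arts", "English", "Fall", 120),
    ("Arts", "History", "Fall", 80),
    ("Arts", "English", "Spring", 100),
    ("Engineering", "Civil", "Fall", 60),
]


def slice_by_term(rows, term):
    """Slice: fix the term dimension to a single value."""
    return [r for r in rows if r[2] == term]


def roll_up(rows, key_index):
    """Total enrollment along one dimension, chosen by column index."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[key_index]] += r[3]
    return dict(totals)


def drill_down(rows, college):
    """Drill down: break one college's total into its departments."""
    return roll_up([r for r in rows if r[0] == college], 1)


def rotate(rows):
    """Rotation (pivot): totals keyed by term first, then by college."""
    table = defaultdict(dict)
    for college, _dept, term, n in rows:
        table[term][college] = table[term].get(college, 0) + n
    return dict(table)
```

For example, `roll_up(facts, 0)` gives totals per college, and `drill_down(facts, "Arts")` shows which departments make up the Arts total.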

Page 7: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DW Terms (Continued)

• Executive Information System (EIS): An application developed to provide senior management direct access to information relevant to an organization’s goals and performance, such as a dashboard. These applications are developed to gather, analyze and integrate internal and external data to provide management with insight into key performance indicators, potential problems, and changes in the environment. Typical features include extensive use of graphics, simple navigational controls, automatic replacement of report contents, drill-down analysis, trend analysis capabilities, exception reporting or alerts, graphical charts with links to underlying reports, provision of data from multiple sources, and the highlighting of information an executive feels is critical. (The Data Warehouse Lifecycle Toolkit, Ralph Kimball, et al.)

Page 8: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

What is a Decision Support System? (Covansys)

Components of a Decision Support System:

• EIS: High-level summarized data for top executives (“pre-programmed dashboard”)

• Data Mart: Addresses specific subject area

• Data Warehouse: Collection of integrated, subject-oriented databases (historical)

• Operational Data Store: Time-current, integrated databases (tactical, power users)

Page 9: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Original Plan for DWE (diagram dated 8/11/2004): the same diagram as on Page 2, shown again for comparison with the current environment.

Page 10: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Current DWE (diagram dated 8/17/2004)

• Source Data: IDMS, Oracle, flat files

• External Data: census data, benchmark, salary surveys, economic data

• Data Staging Area: extract data, transform data, quality assurance, create metadata

• Metadata

• Ad Hoc Query Repository: copy of source data, operational, daily updates, all elements, minimum number of years of data

• Data Warehouse: cleansed, subset of detail data, subset of summary data, multiple years of data, periodic updates, strategic

• Data Mart #1 (Course Management): subset of DW, summarized in specific manner, tactical

• WebFOCUS (Reporting), Operational Enterprise: Ad Hoc and Operational Reports. "Tell me what happened?"

• SAS Data Mining Server

• Other user questions shown on the diagram: "Tell me what may happen, or what is interesting?"; "Give me information to help me achieve specific goals!"; "Tell me what happened and why?"

Legend:

1) Wide black border indicates physical servers.

2) Narrow border indicates no decision on whether it will be a separate physical server.

3) Red border indicates under development.

4) Gray background indicates BI or analytical software servers.

Page 11: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Data Warehouse Subject Areas

To be added: Hospital data

Note: Asset Management includes facilities, network and properties management.

• Staff Applicants

• Faculty Applicants

• Assets (Tangible)

• Vendors

• Research

• Faculty

• Courses

• Room Scheduling

• Students

• Alumni

• Staff

• Applicants

• Prospects

• Department/College

• Donors

• Accounts (Funding)

• Grants

Page 12: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DWE Current Resources

– Query Repository
Production: PowerEdge 6650, 4 x 2.8GHz CPU, 4GB RAM, 1.2TB storage, Windows Server 2003
Development: PowerEdge 2650, 1 x 3.0GHz CPU, 2GB RAM, 252GB storage, Windows Server 2003
Software: Oracle Enterprise

– ETL
Production: Dell PowerEdge 6650, 4 x 2.0GHz CPU, 2TB storage, Windows 2000 Advanced Server
Development: Dell PowerEdge 6650, 2 x 2.0GHz CPU, 1TB storage, Windows 2000 Advanced Server
Software: Informatica PowerCenter

– Enterprise Reporting
Production: PowerEdge 2650, 2 x 2.8GHz CPU, 4GB RAM, 291GB storage, Windows 2003 Server Standard
Development: PowerEdge 2550, 2 x 1.27GHz CPU, 1GB RAM, 220GB storage, Windows 2000 Server
Software: WebFOCUS

– Statistical Analysis
Server: Dell PowerEdge 2550, 2 x 1.4GHz CPU, 4GB RAM, 144GB storage, Windows 2000
Software: SAS Enterprise Miner, Enterprise Guide, etc.

Page 13: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DWE Tasks

– DBA (1-2 FTE) – Design Oracle DB, write/run ETL jobs and provide production support (e.g. monitor system and DB performance, enforce security, schedule backups, etc.)

– Data Administration (2-3 FTE) – Serve as the user interface, develop requirements documents for all DW projects and new views, evaluate data quality, develop specialized reports, test, train users and coordinate projects

– Reporting (1-2 FTE) – Develop enterprise reports

– All – Infrastructure design (with Systems staff), and tool evaluation (ETL, OLAP and desktop reporting) with help from the C/S group.

Page 14: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Implementation Strategy - Educate Users

• Basics – “What is a Data Warehouse?” Create a “single-source-of-truth.” “What it’s not!” (It is not all the data, with daily updates and online storage.)

• Change in culture – “Let’s make better decisions based on objective analysis of data.”

• Set realistic expectations - No silver bullet. It can help you make better decisions, but you still have to be responsible for implementing those decisions.

• Focus on institutional goals – “What is it we need to achieve? What metrics do we need to evaluate our progress in attaining goals?”

• Importance of business sponsors – Make timely business decisions and support requests for necessary resources.

Page 15: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Implementation Strategy - Requirements

• Develop DWE in a phased approach.

• Develop detailed requirements documents with users and institutional administrators for applications within the DWE (DW/DM and reports).

Page 16: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Course Management (I.V.C.)

Business Functions and Goals

• Optimize course offerings to meet student need.

Improvement Opportunities

• Increase number of high demand courses/sections

• Increase maximum enrollment in sections

• Eliminate or reduce frequency of low demand courses

• Improve course meeting patterns and delivery mode

Performance Measures

• # and % decrease of students who do not get any section of the course requested

• # and % decrease of low demand courses

• # and % increase in enrollment

• % usage of classroom capacity

• % decrease in length of time to graduate

• # and % increase in courses taught through preferred mode

Business Questions

• What are the characteristics of high/low demand courses?

• What characteristics of the student are related to demand?

• What courses can be eliminated?

• Which courses should/can be moved to smaller/larger facilities?

• What impact does the meeting time and location have on demand?

• What improvements can be made with/without additional money?

Data Model, Data Mart/Warehouse (American Management Systems, Inc.)

Diagram labels: Courses, Student, Enrollment, Defines, Economic Data, Course Demand, College Budgets, Degree Reqs., Facilities, Available Faculty
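
The "# and %" performance measures on this slide all compare a current value against a baseline. A minimal Python sketch of that arithmetic follows; the function names and the sample figures are assumptions for illustration, not part of the requirements document.

```python
def change_measure(baseline, current):
    """Return (count_change, percent_change) relative to the baseline.

    Negative values mean a decrease. The percent is None when the
    baseline is 0, because a percent change is undefined in that case.
    """
    diff = current - baseline
    pct = None if baseline == 0 else 100.0 * diff / baseline
    return diff, pct


def capacity_usage(enrolled, seats):
    """Percent of classroom capacity in use."""
    return 100.0 * enrolled / seats


# Hypothetical figures: 200 students shut out last term, 150 this term
shut_out_change = change_measure(200, 150)
room_usage = capacity_usage(45, 60)
```

With these sample figures, the number of shut-out students falls by 50, a 25% decrease, and a 60-seat room with 45 students is at 75% of capacity.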

Page 17: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Implementation Strategy – Data Quality

• Focus on improving data quality, and establishing standards for data view and element names and data content.

Page 18: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Implementation Strategy – Enterprise Reports

• Gather user input on most important reports required by many users, and develop these reports with an enterprise reporting tool that allows us to deliver pre-defined parameter-driven reports via the web.

Page 19: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

2001-2002: Infrastructure and Planning

1. Create IDMS data dump to Oracle

2. Implement WebFOCUS

3. Purchase data mining tools and server for IR

4. Create views for Query Repository (ad hoc reporting repository)

5. Establish enterprise standards for key data – Analysis and recommendations are ongoing

6. Identify and prioritize data mart development – Course Management Data Mart top priority for Data Stewards

Page 20: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

2001-2002: Infrastructure and Planning (Continued)

7) Initiate GASB – Phase I

8) Initiate data quality projects

9) Review Desktop Reporting Tools – Ongoing review and testing of:

• Brio

• Crystal Reports

• SAS

• WebFOCUS

Page 21: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

2002-2003: Data Mart Development, etc.

1. Complete GASB – Phase I

2. Implement SAS data mining server

3. Conduct data quality projects – vendor, facilities, FRS, TA data

4. Select and purchase ETL tool

5. Begin requirements on Course Management DM

6. Define standards for data view and element names

Page 22: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

2003-2004: DWE Upgrades and User Support

1) Implement ETL tool

2) Upgrade database servers

3) Create Metadata application – “Data about data”

4) Conduct SAS data mining project on freshmen data

5) Provide user and technical training on reporting tools, support listservs and web page

6) Purchase enterprise reporting tool and develop reports

Page 23: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

2003-2004: DWE Upgrades and User Support (Continued)

7) Create new data views with standardized names

8) Complete GASB - Phase II

9) Continue development of the Course Management DM requirements

10) Initiate development of the requirements for the Resource Management DM

Page 24: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

2004-2005: SAP, etc.

1) Complete standardization of remaining data views

2) Create additional enterprise reports

3) Evaluate SAP Business Warehouse (BW)

4) Conduct extensive data quality analysis for SAP

Page 25: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Reporting Web Site and Metadata

1) Reporting URL: https://reporting.uky.edu/

2) Metadata URL: http://iweb.uky.edu/RptDataDesc/

3) Metadata directions URL: http://www.uky.edu/IS/DataAdmin/DOCS/metadata/MetadataDirections.pdf

4) Data element standards URL: http://www.uky.edu/IS/DataAdmin/DOCS/ware/IUUN0020-QRVE/QRVE-NamingStds/DataElementNamingStds.pdf

5) Data Administration URL: http://www.uky.edu/IT/DataAdmin/

Page 26: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Naming Standards

1) All data view names start with “V_”.

2) All standard element names are composed of words:

– Prime (required) – describes the subject area of the data (e.g. account, student, department, course),

– Qualifier (optional) – further defines and distinguishes the “prime” and “class” words (e.g. gender, ethnic, first),

– Class (required) – describes the major classifications or types of data (e.g. name, date, code, amount).

3) Standard Name: “Prime”_“Qualifier”_“Class”, using standard abbreviations

4) Example: View - V_POSTN; Element - POSTN_BEG_DT
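
The naming rules above are mechanical enough to sketch in a few lines of Python. The helpers below are hypothetical, not part of any University tool. The allowed character set (upper-case letters and digits) is an assumption inferred from the POSTN_BEG_DT example, not from the standards document itself.

```python
import re

# Assumed shape of a standard element name: PRIME[_QUALIFIER]_CLASS,
# each word an upper-case abbreviation joined by underscores.
ELEMENT_NAME = re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z][A-Z0-9]*)+$")


def view_name(prime):
    """Build a data view name: every view name starts with 'V_'."""
    return "V_" + prime.upper()


def element_name(prime, class_word, qualifier=None):
    """Build Prime_Qualifier_Class; the qualifier is optional."""
    parts = [prime, qualifier, class_word] if qualifier else [prime, class_word]
    name = "_".join(p.upper() for p in parts)
    if not ELEMENT_NAME.match(name):
        raise ValueError(f"not a valid standard element name: {name!r}")
    return name


example_view = view_name("postn")
example_element = element_name("postn", "dt", qualifier="beg")
```

Here `example_view` reproduces the slide's V_POSTN and `example_element` reproduces POSTN_BEG_DT. A real implementation would also look words up in the approved abbreviation list, which is not reproduced here.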

Page 27: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Current Query Repository Data

1) UKFRS_FOC and UKHRS_FOC: to be used by WebFOCUS.

2) UKFRS_SYB: will be removed within 3-4 months.

3) GASB: non-standard views used by OC in producing institutional financial statements.

4) UKFRS_RPT, UKHRS_RPT, UKSIS_RPT and UKSIS_FAMSBR: standardized views will be created over the next couple of months, and old views will be removed 90 days after new views are available. Purchasing views in UKFRS_RPT are in development. UKHRS_RPT also contains standard Labor Distribution views.

5) UKHRS_STAT_RPT: HRS Stat File standard views are currently in development and being tested.

Page 28: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DWE/SAP Issues

1. How does the SAP Business Warehouse functionality compare to what we originally planned for the DWE?

2. Will the SAP BW replace our Data Warehouse/Marts?

3. Should we continue our plans for the historical legacy data in the DWE, and use the SAP BW for data “from this point forward”?

4. Can/how do we “merge/join” historical data with the new data in SAP?

5. What are our options to “interface” the SAP BW with our DWE (API, etc.)?

6. Should the SAP BW feed our DWE or vice versa?

Page 29: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

DWE/SAP Issues (Continued)

7. How much (years of data) should we load into the SAP OLTP system?

8. How much (years of data) should we load directly to the SAP BW?

9. What level of detail data should be loaded into the SAP BW, if the corresponding data is not available in the OLTP system?

10. Should we continue with the “data mart” concept within the SAP environment?

11. How easy is it to add new functionality to the SAP BW (data, reports, “cubes”, etc.)?

Page 30: Data Administration Data Warehouse Environment (DWE) Implementation 8/19/04

Data Administration

QUESTIONS?