Master Data Management Instructor: Pankaj Mehra Teaching Assistant: Raghav Gautam Lec. 4 April 8, 2010 ISM 158


What is master data management?

• Processes and technologies for creating the go-to source of consistent, integrated information about core business entities
• Master data
– About: customers, products, parts, employees, suppliers
– What: entities, global ID, attributes, taxonomy
• Standard global schema
• Centralized governance

Why MDM?

• Supporting single view of … business imperatives
• Gain visibility and control over vital information
– Cleanse, standardize, consolidate
– Apply data governance
• Examples
– MDI: improve procurement and distribution by removing duplications, errors and inconsistencies in supply chain data
– HLS: track physician outreach sales activities for compliance reporting

MDM Problem Statement

• The goal: Create and maintain a high-quality view for the whole enterprise, across all its functions, of mission-critical information objects

• Starting with: duplicate, inconsistent or incomplete records in locally governed silos, each with its own quality control and data model
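
The step from siloed duplicates to a single enterprise view can be sketched in a few lines of Python. This is a minimal illustration, not a method from the lecture: the field names (`name`, `email`, `phone`), the choice of normalized email as the match key, and the "keep the longest value" survivorship rule are all assumptions made for the example.

```python
from collections import defaultdict

def normalize(record):
    """Cleanse and standardize one silo record (hypothetical field names)."""
    return {
        "name": " ".join(record.get("name", "").split()).title(),
        "email": record.get("email", "").strip().lower(),
        "phone": "".join(ch for ch in record.get("phone", "") if ch.isdigit()),
    }

def consolidate(silo_records):
    """Group records by normalized email and merge each group into one master record."""
    groups = defaultdict(list)
    for rec in silo_records:
        clean = normalize(rec)
        groups[clean["email"]].append(clean)

    masters = []
    for global_id, (email, recs) in enumerate(sorted(groups.items()), start=1):
        merged = {"global_id": global_id, "email": email}
        for field in ("name", "phone"):
            # Survivorship rule (an assumption): keep the longest value seen.
            merged[field] = max((r[field] for r in recs), key=len)
        masters.append(merged)
    return masters

# Three records from different silos; the first two describe the same customer.
records = [
    {"name": "ada  lovelace", "email": "Ada@Example.com ", "phone": "555-0100"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": ""},
    {"name": "Grace Hopper", "email": "grace@example.com", "phone": "(555) 0199"},
]
masters = consolidate(records)
for master in masters:
    print(master)
```

Real MDM products match on many attributes with fuzzy and rule-based logic; exact matching on one key is used here only to keep the sketch short. Records with no email would all fall into one group, so a production rule set would need a fallback key.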

Key Elements of MDM Solutions - 1

• Business Logic
– Complex rules capturing the strategic analytics and data quality intent of the business
– Complex rules capturing regulatory intent
• Example
– A defense signals agency in Australia records as many cell phone calls as it can
– Rules define the entities of interest

Key Elements of MDM Solutions - 2

• Data integration tools
– Data discovery
– Extraction, Transformation & Loading (ETL)
– Data lifecycle management
• Example: a market campaign management project needs to optimize the allocation of advertising dollars
– Composite Discovery Server could help you locate the right source for “PC sales data by geography”
– Informatica PowerCenter 9 or Composite Integration Server will let you set up complex information extraction and transformation steps using a visual query language
– Database archiving tools from IBM/Princeton Softech will let you sample and manage the retention of data from diverse sources
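
The extract, transform and load steps named above can be sketched with the Python standard library alone. This is an illustrative stand-in, not how the commercial tools work: the CSV layout, the column names, and the `pc_sales_by_geo` table are assumptions invented for the example.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from an operational system.
RAW = """region,units,unit_price
 emea ,10,500.0
APJ,4,650.5
emea,6,500.0
"""

def extract(text):
    """Extract: read rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: standardize region codes and aggregate revenue per region."""
    totals = {}
    for row in rows:
        region = row["region"].strip().upper()
        revenue = int(row["units"]) * float(row["unit_price"])
        totals[region] = totals.get(region, 0.0) + revenue
    return sorted(totals.items())

def load(conn, totals):
    """Load: write the aggregated rows into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pc_sales_by_geo "
        "(region TEXT PRIMARY KEY, revenue REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO pc_sales_by_geo VALUES (?, ?)", totals)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
result = conn.execute(
    "SELECT region, revenue FROM pc_sales_by_geo ORDER BY region"
).fetchall()
print(result)
```

Keeping the three stages as separate functions mirrors the way ETL tools let each step be scheduled and tested independently.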

The lifecycle of data

[Figure: data lifecycle diagram. Input: business transactions. Stores: production databases; warehouse (refreshed frequently); data marts; test & dev; historical archive (long-term retention). Processes: extraction, transformation & loading (nightly); archiving (weekly); subsetting (as needed).]

Data Discovery Tools show what/how much is out there

Semantic Technologies and Policy Engines automate complex tasks

[Figure: a discover, classify, apply policy cycle. Labels: storage resource management; application resource management; business process resource management; feature extraction; category metadata; semantic metadata (meaning); special platform; capture at source; migrate to platform; integrate on demand; manage in place.]

Key Elements of MDM Solutions - 3

• Entity Taxonomies
– Describe how entity names, attribute names, attribute values are to be interpreted
• Ontologies can define more complex semantics

Source: Wand, Inc. catalog
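
One way to picture a taxonomy that says how entity names, attribute names and attribute values are to be interpreted is a set of lookup tables applied to each incoming record. The sketch below is a toy under stated assumptions: every synonym and canonical term in `TAXONOMY` is invented for illustration and is not taken from the Wand catalog.

```python
# Hypothetical taxonomy: local terms on the left, canonical terms on the right.
TAXONOMY = {
    "entity_names": {"cust": "Customer", "client": "Customer", "sku": "Product"},
    "attribute_names": {"ctry": "country", "nation": "country", "nm": "name"},
    "attribute_values": {
        "country": {"usa": "US", "united states": "US", "deutschland": "DE"},
    },
}

def interpret(entity, attributes, taxonomy=TAXONOMY):
    """Map a silo's entity name, attribute names and values to canonical terms."""
    canonical_entity = taxonomy["entity_names"].get(entity.lower(), entity)
    canonical = {}
    for name, value in attributes.items():
        attr = taxonomy["attribute_names"].get(name.lower(), name)
        known_values = taxonomy["attribute_values"].get(attr, {})
        # Unknown values pass through unchanged rather than being guessed.
        canonical[attr] = known_values.get(str(value).strip().lower(), value)
    return canonical_entity, canonical

entity, attrs = interpret("client", {"nm": "Acme", "Nation": "United States"})
print(entity, attrs)
```

Flat lookups like these cover naming and value conventions only; relationships between concepts are where an ontology would be needed.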

Key Elements of MDM Solutions - 4

• Common Data Model
– Capturing core entities in a standard schema
– Allows long-term enterprise-wide investment in quality and analytics regimens
• The HP Enterprise Data Warehouse consolidates customer, product, and sales data from thousands of operational systems and in turn consolidates hundreds of data marts

Focusing on differentiation through industry data models

• ADRM and other providers are helping standardize the schema of common data types across and within industries

• Equally potent open-source initiatives are part of the Semantic Web and Linked Open Data work
– E.g. Dublin Core

[Figure: an 80% universal data model shared across industry models: Tech, Retail, FSI, Govt/Defense, Comm/media, MDI, Energy.]

Differentiating from the Competition

• Ultimately, data quality improvement is achieved through going the extra mile using every trick in the book
• What helps?
– Statistics
– Semantics
• Example:
– Statistical analysis of whether data missing from resource utilization traces of supply chain management applications is MAR (missing at random) or NMAR (not missing at random)
• A Systematic Approach for Improving the Quality of IT Data. Martin Arlitt, Keith Farkas, Subu Iyer, Preethi Kumaresan, Sandro Rafaeli. HP Laboratories, HPL-2008-83, Jul 6, 2008. http://www.hpl.hp.com/techreports/2008/HPL-2008-83.pdf
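
A first statistical check on missing trace data is whether the missingness rate depends on something that was observed. The sketch below is not the procedure from the HP Labs report: the synthetic trace, the `load` and `cpu` field names, and the permutation test are assumptions chosen for illustration. A small p-value suggests the gaps are not completely random and depend on the observed variable, which is compatible with MAR; NMAR, where missingness depends on the unseen value itself, cannot be confirmed from observed data alone.

```python
import random

def missing_rate_by_group(rows, group_key, value_key):
    """Fraction of rows with a missing value, per group of an observed variable."""
    counts = {}
    for row in rows:
        total, missing = counts.get(row[group_key], (0, 0))
        counts[row[group_key]] = (total + 1, missing + (row[value_key] is None))
    return {group: m / t for group, (t, m) in counts.items()}

def permutation_p_value(rows, group_key, value_key, trials=1000, seed=0):
    """Permutation test: is the spread in missing rates larger than chance?"""
    rng = random.Random(seed)

    def spread(rs):
        rates = missing_rate_by_group(rs, group_key, value_key).values()
        return max(rates) - min(rates)

    observed = spread(rows)
    groups = [r[group_key] for r in rows]
    values = [r[value_key] for r in rows]
    hits = 0
    for _ in range(trials):
        rng.shuffle(groups)  # break any link between group and missingness
        shuffled = [{group_key: g, value_key: v} for g, v in zip(groups, values)]
        if spread(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)

# Synthetic utilization trace: samples are lost far more often under peak load.
rows = (
    [{"load": "peak", "cpu": None if i < 40 else 0.9} for i in range(100)]
    + [{"load": "offpeak", "cpu": None if i < 5 else 0.3} for i in range(100)]
)
rates = missing_rate_by_group(rows, "load", "cpu")
p_value = permutation_p_value(rows, "load", "cpu")
print(rates, p_value)
```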

Where to learn more

• Whitepapers from suppliers of MDM technology:
– Informatica/Siperian
– IBM/Initiate
• Industry analysts: Gartner, in particular
• Wikipedia: http://en.wikipedia.org/wiki/Master_Data_Management (chase the See Also links)
• Learn about industry-standard data models
– ADRM.net, IBM xyz Industry Frameworks
– Learn about ACORD and insurance industry

In the next lecture …

• Guest lecture by Dr. Julie Ward

Questions?

NEWS PRESENTATION