Master Data Management
Instructor: Pankaj Mehra
Teaching Assistant: Raghav Gautam
Lec. 4
April 8, 2010
ISM 158
What is master data management?
• Processes and technologies for creating the go-to source of consistent, integrated information about core business entities
• Master Data
About:
– Customers
– Products
– Parts
– Employees
– Suppliers
What:
– Entities
– Global ID
– Attributes
– Taxonomy
Standard global schema
Centralized governance
Why MDM?
• Supporting single view of … business imperatives
• Gain visibility and control over vital information
– Cleanse, standardize, consolidate
– Apply data governance
• Examples
– MDI
• Improve procurement and distribution by removing duplications, errors and inconsistencies in supply chain data
– HLS
• Track physician outreach sales activities for compliance reporting
MDM Problem Statement
• The goal: Create and maintain a high-quality view for the whole enterprise, across all its functions, of mission-critical information objects
• Starting with: duplicate, inconsistent or incomplete records in locally governed silos, each with its own quality control and data model
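As an illustration of the problem statement above, here is a minimal Python sketch of cleansing, standardizing and consolidating duplicate records from two silos into master records. The silo names, fields, sample data, the choice of email as the match key, and the "first non-empty value wins" survivorship rule are all assumptions made for this example; real MDM products use much richer matching and merging logic.

```python
from collections import defaultdict

# Hypothetical records from two locally governed silos (illustrative data).
crm_records = [
    {"name": " Acme Corp. ", "email": "Sales@Acme.com", "phone": None},
    {"name": "Globex", "email": "info@globex.com", "phone": "555-0100"},
]
billing_records = [
    {"name": "ACME CORP", "email": "sales@acme.com", "phone": "555-0199"},
]

def standardize(record):
    """Cleanse one record: trim whitespace, drop trailing dots, normalize case."""
    return {
        "name": record["name"].strip().rstrip(".").upper(),
        "email": record["email"].strip().lower(),
        "phone": record["phone"],
    }

def consolidate(*silos):
    """Group standardized records by email and merge each group into one master record."""
    groups = defaultdict(list)
    for silo in silos:
        for record in silo:
            clean = standardize(record)
            groups[clean["email"]].append(clean)
    masters = []
    for global_id, (email, dupes) in enumerate(sorted(groups.items()), start=1):
        merged = {"global_id": global_id, "email": email}
        for field in ("name", "phone"):
            # Survivorship rule (an assumption): keep the first non-empty value.
            merged[field] = next((d[field] for d in dupes if d[field]), None)
        masters.append(merged)
    return masters

masters = consolidate(crm_records, billing_records)
for master in masters:
    print(master)
```

The two Acme records differ in spelling, case and completeness, yet end up as one master record that carries the phone number only the billing silo knew about.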
Key Elements of MDM Solutions - 1
• Business Logic
– Complex rules capturing the strategic analytics and data quality intent of the business
– Complex rules capturing regulatory intent
• Example
– A defense signals agency in Australia records as many cell phone calls as it can
– Rules define the entities of interest
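One simple way to picture business logic of this kind is as a set of named rules evaluated against each record. The sketch below is only an illustration: the rule names, record fields and thresholds are hypothetical and are not taken from any specific MDM product.

```python
# Each rule is a (name, predicate) pair; rule names and fields are hypothetical.
RULES = [
    ("customer must have a global ID", lambda r: bool(r.get("global_id"))),
    ("country must be a 2-letter code", lambda r: len(r.get("country", "")) == 2),
    ("credit limit must be non-negative", lambda r: r.get("credit_limit", 0) >= 0),
]

def violations(record, rules=RULES):
    """Return the names of all rules that the record fails."""
    return [name for name, check in rules if not check(record)]

good = {"global_id": "C-001", "country": "AU", "credit_limit": 5000}
bad = {"global_id": "", "country": "Australia", "credit_limit": -10}

print(violations(good))
print(violations(bad))
```

Keeping the rules as data, separate from the code that applies them, is what lets the business change its quality or regulatory intent without rewriting the pipeline.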
Key Elements of MDM Solutions - 2
• Data integration tools
– Data discovery
– Extraction, Transformation & Loading (ETL)
– Data lifecycle management
• A market campaign management project needs to optimize the allocation of advertising dollars
– Composite Discovery Server could help you locate the right source for “PC sales data by geography”
– Informatica PowerCenter 9 or Composite Integration Server will let you set up complex information extraction and transformation steps using a visual query language
– Database archiving tools from IBM/Princeton Softech will let you sample and manage the retention of data from diverse sources
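The extraction, transformation and loading steps can be sketched in a few lines of Python using only the standard library. This is a toy stand-in for what the commercial tools named above do at scale; the CSV layout, the `pc_sales` table and the sample figures are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical extract: PC sales by geography as a CSV export (illustrative data).
RAW_CSV = """region,units,revenue_usd
 north america ,120,96000
EMEA,80,72000
emea,20,18000
"""

def extract(text):
    """Extract: parse rows from the source export."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: standardize region names and aggregate duplicate regions."""
    totals = {}
    for row in rows:
        region = row["region"].strip().upper()
        units, revenue = totals.get(region, (0, 0))
        totals[region] = (units + int(row["units"]), revenue + int(row["revenue_usd"]))
    return [(region, u, r) for region, (u, r) in sorted(totals.items())]

def load(conn, rows):
    """Load: write the cleaned rows into a warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pc_sales "
        "(region TEXT PRIMARY KEY, units INTEGER, revenue_usd INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO pc_sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
result = conn.execute(
    "SELECT region, units, revenue_usd FROM pc_sales ORDER BY region"
).fetchall()
print(result)
```

The transform step is where inconsistent spellings of the same region are reconciled before anything reaches the warehouse.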
The lifecycle of data
[Diagram labels: business transactions; production databases; extraction, transformation & loading (nightly); warehouse (refreshed frequently); data marts; subsetting (as needed); test & dev; archiving (weekly); historical archive (long-term retention)]
Semantic Technologies and Policy Engines automate complex tasks
[Diagram labels: discover, apply policy, classify; storage resource management, application resource management, business process resource management; feature extraction, category metadata, semantic metadata (meaning); special platform, capture at source, migrate to platform, integrate on demand, manage in place]
Key Elements of MDM Solutions - 3
• Entity Taxonomies
– Describe how entity names, attribute names, attribute values are to be interpreted
• Ontologies can define more complex semantics
Source: Wand, Inc. catalog
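To make the role of a taxonomy concrete, the following sketch maps local attribute names and values onto canonical terms and walks a small category hierarchy. The vocabulary here is invented for illustration and is not taken from the Wand catalog or any other published taxonomy.

```python
# Hypothetical taxonomy fragment: canonical attribute names and value vocabularies.
ATTRIBUTE_SYNONYMS = {"colour": "color", "col": "color", "prod_type": "category"}
VALUE_SYNONYMS = {
    "color": {"grey": "gray", "charcoal": "gray"},
    "category": {"notebook": "laptop", "portable pc": "laptop"},
}
# Parent links give the "is-a" hierarchy of categories.
PARENT = {"laptop": "computer", "desktop": "computer", "computer": "product"}

def interpret(record):
    """Map local attribute names and values onto the taxonomy's canonical terms."""
    canonical = {}
    for name, value in record.items():
        attr = ATTRIBUTE_SYNONYMS.get(name.lower(), name.lower())
        val = str(value).strip().lower()
        canonical[attr] = VALUE_SYNONYMS.get(attr, {}).get(val, val)
    return canonical

def ancestors(category):
    """Walk up the hierarchy from a category to the root."""
    path = []
    while category in PARENT:
        category = PARENT[category]
        path.append(category)
    return path

item = interpret({"Colour": "Charcoal", "prod_type": "Notebook"})
print(item)
print(ancestors(item["category"]))
```

An ontology would go further than this parent-link table, for example by stating constraints or relationships between entities, which a flat synonym map cannot express.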
Key Elements of MDM Solutions - 4
• Common Data Model
– Capturing core entities in a standard schema
– Allows long-term enterprise-wide investment in quality and analytics regimens
• The HP Enterprise Data Warehouse consolidates customer, product, and sales data from thousands of operational systems and in turn consolidates hundreds of data marts
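A common data model can be pictured as one standard record type plus a small adapter per operational system. The sketch below assumes two hypothetical source layouts (`crm` and `billing`) and invented field names; it is not a description of the HP warehouse schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Customer:
    """Core entity in the standard schema (field names are assumptions)."""
    source: str
    source_id: str
    name: str
    country: str

# One adapter per operational system maps its local layout to the common model.
def from_crm(row):
    return Customer("crm", str(row["id"]), row["full_name"], row["country_code"])

def from_billing(row):
    return Customer("billing", str(row["acct"]), row["acct_name"], row["ctry"])

customers = [
    from_crm({"id": 7, "full_name": "Acme Corp", "country_code": "US"}),
    from_billing({"acct": "A-19", "acct_name": "Globex", "ctry": "DE"}),
    from_billing({"acct": "A-20", "acct_name": "Initech", "ctry": "US"}),
]

# Analytics are written once against the common model, not once per source.
by_country = {}
for c in customers:
    by_country.setdefault(c.country, []).append(c.name)
print(by_country)
```

Because every downstream report depends only on `Customer`, adding another operational system means writing one more adapter rather than touching the analytics.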
Focusing on differentiation through industry data models
• ADRM and other providers are helping standardize the schema of common data types across and within industries
• Equally potent open-source initiatives are part of the Semantic Web and Linked Open Data work
– E.g. Dublin Core
[Diagram labels: 80% universal data model; Tech, Retail, FSI, Govt/Defense, Comm/media, MDI, Energy, …]
Differentiating from the Competition
• Ultimately, data quality improvement is achieved through going the extra mile using every trick in the book
• What helps?
– Statistics
– Semantics
• Example:
– Statistical analysis of whether data missing from resource utilization traces of supply chain management applications is MAR (missing at random) or NMAR (not missing at random)
• A Systematic Approach for Improving the Quality of IT Data
Martin Arlitt, Keith Farkas, Subu Iyer, Preethi Kumaresan, Sandro Rafaeli. HP Laboratories, HPL-2008-83, Jul 6, 2008. http://www.hpl.hp.com/techreports/2008/HPL-2008-83.pdf
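As a rough illustration of the statistical side, the sketch below runs a permutation test on whether gaps in a utilization trace are associated with an observed covariate (here, request load). This is not the method of the HP Labs report cited above, and the trace data, threshold and function names are invented. It also has a clear limit: it can only show that missingness depends on something observed. Deciding between MAR and NMAR ultimately turns on the unobserved values and needs domain knowledge or further modeling.

```python
import random

def missingness_gap(covariate, values, threshold):
    """Difference in missing-value rate between high- and low-covariate samples."""
    high = [v is None for c, v in zip(covariate, values) if c >= threshold]
    low = [v is None for c, v in zip(covariate, values) if c < threshold]
    return sum(high) / len(high) - sum(low) / len(low)

def permutation_p_value(covariate, values, threshold, trials=2000, seed=0):
    """Estimate how often a gap this large arises if missingness ignores the covariate."""
    rng = random.Random(seed)
    observed = abs(missingness_gap(covariate, values, threshold))
    shuffled = list(values)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(missingness_gap(covariate, shuffled, threshold)) >= observed:
            hits += 1
    return hits / trials

# Synthetic traces (illustrative): load is observed, CPU utilization has gaps.
load = list(range(100))
# Case A: samples go missing only under high load.
cpu_load_dependent = [None if (l >= 70 and l % 2 == 0) else 50.0 + 0.1 * l for l in load]
# Case B: samples go missing on a schedule unrelated to load level.
cpu_unrelated = [None if l % 7 == 0 else 50.0 + 0.1 * l for l in load]

p_dependent = permutation_p_value(load, cpu_load_dependent, threshold=70)
p_unrelated = permutation_p_value(load, cpu_unrelated, threshold=70)
print(p_dependent, p_unrelated)
```

A small p-value in case A says the gaps cluster under high load, so simply dropping incomplete samples would bias any utilization estimate toward quiet periods.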
Where to learn more
• Whitepapers from suppliers of MDM technology:
– Informatica/Siperian
– IBM/Initiate
• Industry analysts: Gartner, in particular
• Wikipedia: http://en.wikipedia.org/wiki/Master_Data_Management (chase the See Also links)
• Learn about industry-standard data models
– ADRM.net, IBM xyz Industry Frameworks
– Learn about ACORD and insurance industry