Bringing Agility and Flexibility to Data Design and Integration
Phasic Systems Inc
Delivering Agile Data
www.phasicsystemsinc.com
888-735-1774
Introduction to Phasic Systems Inc
• Bringing Agile capabilities to data lifecycle for business success
• Methods and tools tested and refined over years of in-depth large-scale efforts
• Solve toughest data problems where traditional methods fail
• Based on extensive consulting lessons learned and real-world results
• Began in 2005 to commercialize advanced Agile methods successfully deployed in competitive development contracts
Phasic Systems Inc Management
• Geoffrey Malafsky, Ph.D., Founder and CEO
▫ Research scientist
▫ Supported many organizations in their quest to access the right information at the right time
• Tim Traverso, Sr VP Federal
▫ Technical Director, Navy Deputy CIO
• Marshall Maglothin, Sr VP HealthCare
▫ Sr. Executive, multiple large health care systems
• Deborah Malafsky, Sr VP Business Development
Our Agile Methods
• Why be Agile?
▫ Provide flexibility and adaptability to changing business needs while maintaining accuracy and commonality
▫ Segmented approach is too slow, rigid, and costly
• How?
▫ Treat data lifecycle as one continuous operation from governance to modeling to integration to warehouses to Business Intelligence
▫ Emphasize value produced at each step and overall coordination
▫ Seamlessly fit with existing organization, procedures, tools but add Agility, commonality, flexibility, and reduced cost and time
• We are Agile and comprehensive
▫ Typical 60-90 day engagement
▫ Deliver completed products not just plans or partial results
Methods and Tools
• DataStar Discovery: Agile data governance, standards and design
▫ Add business and security context to data
▫ Flexible, common data definitions/semantics, models
• DataStar Unifier: Agile warehousing and aggregation
▫ Simplified, common semantics using Corporate NoSQL™
▫ Source to target mapping with flexibility, standardization
▫ Aggregate data using all use case and system variations simply and easily into standard or NoSQL databases
“As a COO of a Wall Street firm and a former Vice Admiral in the United
States Navy in charge of a large integrated organization of thousands of people
and numerous IT systems, I have seen firsthand the critical role that high-quality
enterprise data plays in day-to-day operations of an organization. Without
timely access to reliable and trusted data all of our operations were vulnerable
to poor decision making, weak performance, and a failure to compete. With
Phasic Systems Inc.’s agile methodology and technology, we were finally able to
solve our data challenges at a fraction of the time, cost, and organizational
turmoil that all the previous and more expensive, time-consuming approaches
failed to do. Phasic Systems Inc. offers a new and much-needed approach to
this important area of Business Intelligence.”
PSI Customer Testimonial
VADM (ret) J. “Kevin” Moran
The Business Case
Today’s Response Timeline (15 to 27 Months)
• IT Groups
▫ Develop Systems & Applications
▫ Physical Data Models
▫ Databases / Data Warehouse
▫ ETL controls
▫ MDM
• Business Groups
▫ Requirements
▫ Conceptual/Logical Models
▫ Data Quality
▫ Business Rules
▫ Standards
• BI Groups
▫ BI Data Models
▫ Reports
▫ Dashboards
• Users
▫ Capability Problems
▫ New Capabilities
▫ Missing Data
• Stage durations: 3 to 6 Months, 6 to 9 Months, 3 to 6 Months, 3 to 6 Months
Tomorrow’s Initial Response Timeline with PSI (Subsequent Response Timeline – Days)
• Requirements, Conceptual Data Model, Logical Data Model, Business Rules, Standards, BI Data Models, Data Quality
• Develop Systems & Applications, Physical Data Models, Databases / Data Warehouse, ETL controls, MDM
• 2 to 6 Months
Agile: Overcome Hurdles
• Group rivalry
▫ Embrace important business variations; recognize no valid reason to force everyone to use only one view exclusively.
• Terminology confusion
▫ Use a guided framework of well-known concepts to rapidly identify and implement variations as related entities.
• Poor knowledge sharing
▫ Use integrated metadata where important products (business models, data models, glossaries, code lists, and integration rules) are visible, coordinated, and referenceable
• Inflexible designs
▫ Use a hybrid approach (Corporate NoSQL™) for Agile warehousing and integration, blending traditional tables and NoSQL for its immense flexibility and inherent speed
Schemas Are Not Enough
• Must be agile in order to adapt quickly to new business needs
▫ Continuous change is the norm: requirements, consolidation
▫ We must use all the important business variations of key terms (e.g. account, client, policy) – No such thing as a single version for all!
[Figure labels: Governance, Design, MDM, Integration? Which value? Whose? My “customer” or your “customer”? Sales, Accounting; CEO/CFO/CIO; SAP/IBM/ORACLE. How is data used? Source: D. Loshin 2008]
Status Quo: Non-Agile
Agile: Visible, Common
Unified Business Model™
Intuitive, List-based
Real Estate Listing Example
• Seems simple and well-defined
▫ Each house has a type, id, address, etc.
▫ Industry standards: OSCRE, RETS
• Yet, data systems are very different
▫ Data model tied tightly to business workflow
▫ Extensions and “make-it-work” changes added over time
• Similar to customer relationship mgmt, ERP, and many other fields
Semantic Conflict in Real Estate Models
Models compared: NKY, HOMESEEKERS
NKY attribute ‘basement’ does not have a counterpart in HOMESEEKERS
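The gap noted above (an attribute present in one model and absent in the other) can be found mechanically during source-to-target mapping. The following is a minimal sketch, not the DataStar implementation. Only the attribute name `basement` comes from the slide; the other field names, the sample record, and both helper functions are hypothetical.

```python
# Hypothetical attribute sets; only 'basement' (NKY) is taken from the slide.
NKY_FIELDS = {"listing_id", "address", "basement"}
HOMESEEKERS_FIELDS = {"listing_id", "address"}


def unmatched_attributes(source_fields, target_fields):
    """Return source attributes with no same-named attribute in the target."""
    return sorted(set(source_fields) - set(target_fields))


def map_record(record, target_fields):
    """Split a source record into fields the target can hold and leftovers."""
    mapped = {k: v for k, v in record.items() if k in target_fields}
    leftover = {k: v for k, v in record.items() if k not in target_fields}
    return mapped, leftover


# Hypothetical NKY record used only for illustration.
sample = {"listing_id": "L-1", "address": "1 Main St", "basement": "full"}
mapped, leftover = map_record(sample, HOMESEEKERS_FIELDS)
print(unmatched_attributes(NKY_FIELDS, HOMESEEKERS_FIELDS))
print(leftover)
```

Keeping the leftover fields instead of silently dropping them is one way to preserve a business variation until a decision is made on how to represent it in the target.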
Data Value Semantic Errors = Inconsistent, Difficult to Merge, Report, Analyze
Lot_dimensions: implied semantics for size data. Actually has all sorts of data
Semiannual_taxes: implied semantics for numeric data. Actually has all sorts of data
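A quick way to measure this kind of problem is to test each value against the semantics its column name implies. The sketch below is illustrative only: the size pattern, the numeric rule, and all sample values are assumptions, since the slides do not show the actual contents of `Lot_dimensions` or `Semiannual_taxes`.

```python
import re

# Assumed pattern for a size such as "50x120" or "50 x 120.5"; illustrative only.
LOT_SIZE = re.compile(r"^\s*\d+(\.\d+)?\s*[xX]\s*\d+(\.\d+)?\s*$")


def is_lot_dimension(value):
    """True if the value looks like 'width x depth'."""
    return bool(LOT_SIZE.match(value))


def is_numeric_amount(value):
    """True if the value parses as a non-negative number once '$' and ',' are removed."""
    cleaned = value.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned) >= 0
    except ValueError:
        return False


def nonconforming(values, check):
    """Return the values that fail the given semantic check, in input order."""
    return [v for v in values if not check(v)]


# Hypothetical sample values, not taken from the real data sets.
lot_values = ["50x120", "irregular", "see agent", "0.5 acres"]
tax_values = ["1,250.00", "$980", "exempt", "N/A"]
print(nonconforming(lot_values, is_lot_dimension))
print(nonconforming(tax_values, is_numeric_amount))
```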
NKY HomeSeekers Texas
Fully Integrated Metadata for Business, IT, and BI
DataStar Corporate NoSQL™
• Large systems use NoSQL for its flexibility, performance, and adaptability
▫ But, it is poorly suited for corporate use – lacks connection to business
• DataStar Corporate NoSQL™
▫ Blends traditional techniques and NoSQL
▫ Entities come directly from Unified Business Model
▫ Object structure with simple tables
▫ Key-value pairs are basic repeating structure of all tables
▫ Business driven terminology
▫ Easily handles semantic variations & updates w/o changes to logical or physical models
▫ Can be as ‘dimensional’ or ‘normalized’ as desired
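To make the key-value idea concrete, here is a minimal sketch of a simple table whose rows are key-value pairs attached to a business entity. It is an assumption about the general technique, not the actual DataStar Corporate NoSQL™ design: the table name `entity_kv`, its columns, and the sample entity and keys are all hypothetical. It uses Python's standard `sqlite3` module with an in-memory database.

```python
import sqlite3


def make_store():
    """Create an in-memory table whose rows are key-value pairs per entity."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entity_kv (entity TEXT, entity_id TEXT, attr TEXT, val TEXT)"
    )
    return conn


def put(conn, entity, entity_id, attr, val):
    """Add one key-value pair; a new attribute needs no schema change."""
    conn.execute(
        "INSERT INTO entity_kv VALUES (?, ?, ?, ?)", (entity, entity_id, attr, val)
    )


def get(conn, entity, entity_id):
    """Return all key-value pairs for one entity instance as a dict."""
    rows = conn.execute(
        "SELECT attr, val FROM entity_kv"
        " WHERE entity = ? AND entity_id = ? ORDER BY rowid",
        (entity, entity_id),
    ).fetchall()
    return dict(rows)


# Hypothetical entity and keys, for illustration only.
store = make_store()
put(store, "Account", "A1", "owner", "Sales")
put(store, "Account", "A1", "region", "East")  # new attribute, same table layout
print(get(store, "Account", "A1"))
```

In this sketch a new business term becomes a new row rather than a new column, which is one way to absorb semantic variations without touching the physical model. If the same key is stored twice for one entity, `get` keeps the later value.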
Speed & Agility
Position Data Model
Results
• Applied to production data:
▫ Fully cleaned & integrated data, governance approved
▫ Requirement: 500,000 records in 2 hrs on Sun E25K
▫ Actual: 50 minutes on 3-year-old low-cost server
• Governance documents produced and approved
▫ Legacy data models – first time in ten years
▫ Common data model – directly derived from ontology. Position-Resume model
• Standing governance board created with short decision-making monthly meetings
▫ Position-Resume Governance Board
• Process approach and technology applied to new IT systems
Navy HR Data Analysis
• Groups “share” data and control only if they don’t lose project control or funds
• Governance, business process, data engineers create separate designs and don’t know how to coordinate
• Try hard to follow industry guidance but stuck
• Actual data is very different than policy, mgmt awareness
▫ Example 1: Multiple Rate/Rating entries. Person xxxxxx has 5 entries: 4 end on the same date, 2 have start dates after their end dates, 2 start and end on the same days but are different
▫ Example 2: 30 different values used for RACE but only 6 allowed values in the Navy Military Personnel Manual derived from DoD policy
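Both examples describe checks that are easy to automate once the rules are written down. The sketch below shows one possible form of those checks. The record layout, the sample entries, and the allowed code list are hypothetical placeholders; the real Navy data and the real allowed RACE values are not reproduced here.

```python
from collections import Counter
from datetime import date


def start_after_end(entries):
    """Entries whose start date is later than their end date."""
    return [e for e in entries if e["start"] > e["end"]]


def duplicate_spans(entries):
    """(start, end) pairs shared by more than one entry."""
    counts = Counter((e["start"], e["end"]) for e in entries)
    return sorted(span for span, n in counts.items() if n > 1)


def disallowed_values(values, allowed):
    """Distinct values that are not in the allowed code list."""
    return sorted(set(values) - set(allowed))


# Hypothetical Rate/Rating entries for one person.
entries = [
    {"rating": "X1", "start": date(2001, 1, 1), "end": date(2003, 6, 30)},
    {"rating": "X2", "start": date(2001, 1, 1), "end": date(2003, 6, 30)},
    {"rating": "X3", "start": date(2004, 1, 1), "end": date(2003, 6, 30)},
]
# Placeholder codes, not the actual allowed values.
allowed_codes = {"A", "B", "C"}
observed_codes = ["A", "B", "Z", "a"]

print([e["rating"] for e in start_after_end(entries)])
print(duplicate_spans(entries))
print(disallowed_values(observed_codes, allowed_codes))
```

Running checks like these against actual data, rather than against the policy documents, is what exposes the gap between the two.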
Agile Warehousing and BI
Resume Data Model
Key-Value Vocabulary
Resume Identifiers
Key-Value Vocabulary
Competency KSAs