whipping - zango group · pain of data management. without an initial data-architecture effort, an...


Post on 28-Oct-2019



Whipping Data Into Shape

Now’s the time to tackle the ugly problem of reconciling widely distributed data, particularly if you’re considering a service-oriented architecture

By Galen Gruman | Illustration by Campbell Laird

InfoWorld.com, 02.06.06

No one likes data integration. It’s painstaking, hard to automate, and hard to measure in terms of business ROI. Yet it’s required for making systems work together, whether as a result of an acquisition, as part of a migration to new tools, or in an effort to consolidate existing assets.

“The first question is, ‘What database are we going to use as our customer source?’ ” notes John Kolodziejczyk, IT director at Carlson Hotels Worldwide. Rather than keep asking (and answering) that question, the hospitality company devised a common data architecture, and a platform for managing it, for all its applications as part of the migration to a service-oriented architecture. Similarly, ball-bearing manufacturer GGB decided it needed a central product information hub to ensure consistent data mapping among its Oracle e-Business Suite and three aging ERP systems, rather than try to maintain a raft of point-to-point connectors, says Matthias Kenngott, IT director at GGB.

Much enterprise data is either locked away in data stores or encapsulated within applications. Traditionally, applications “know” what the data means and what the results of their manipulations mean, in essence creating a consistent data model, at least locally. As modern enterprises mix and match functions across a variety of applications, however, the data models get mixed together as well, often without the IT developer being aware of it.

“The more you distribute the data, the more likely there will be problems,” says Song Park, director of pricing and availability technology at Starwood Hotels. The result could be what Don DePalma, president of the Common Sense Advisory consultancy, calls “frankendata,” which calls into question the accuracy of the results generated by the services and applications.

“There’s always a context to data. Even when a field is blank, different applications impose different assumptions about what that means,” notes Ron Schmelzer, senior analyst at SOA research company ZapThink.

Ultimately, frankendata can make a set of integrated applications or a vast web of services both unreliable and hard to repair. Many relationships must be traversed to understand not only the original data components but how they were transformed along the way. The antidote to frankendata is to provision data needed for multiple applications as a service, incorporating contextual metadata where needed and reconciling discrepancies among distributed data sources.
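As a rough illustration of what "incorporating contextual metadata" might look like in practice, the Python sketch below attaches a declared meaning for blanks to each field value, so a consuming service does not have to guess. The names (`FieldValue`, `BlankMeaning`, `describe`) and the two source systems are hypothetical and are not drawn from any product or company mentioned in this article.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class BlankMeaning(Enum):
    """What an empty value means in the source application (illustrative set)."""
    UNKNOWN = "value was never collected"
    NOT_APPLICABLE = "field does not apply to this record"
    DECLINED = "customer declined to provide it"


@dataclass(frozen=True)
class FieldValue:
    """A value plus the context a consumer needs to interpret it."""
    name: str
    value: Optional[str]
    source_system: str
    blank_meaning: BlankMeaning = BlankMeaning.UNKNOWN

    def describe(self) -> str:
        if self.value:
            return f"{self.name}={self.value!r} (from {self.source_system})"
        return (f"{self.name} is blank: {self.blank_meaning.value} "
                f"(from {self.source_system})")


# Two hypothetical systems leave the same field blank for different reasons.
crm_fax = FieldValue("fax", None, "crm", BlankMeaning.DECLINED)
erp_fax = FieldValue("fax", "", "erp", BlankMeaning.NOT_APPLICABLE)
print(crm_fax.describe())
print(erp_fax.describe())
```

The point of the sketch is only that the meaning of a blank travels with the data instead of living inside each application's assumptions.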

The SOA Imperative

A twofold advantage of SOA is that creating services that perform oft-used functions reduces redundant development and increases agility by making application functionality available across a variety of systems using standardized interfaces and wrappers. The loosely coupled, abstracted nature of SOA has profound implications for the data that the services use, manipulate, and create.

“Do you divvy it up, or do you provide a central service?” asked Starwood Hotels’ Park when the company began its SOA effort. That question led it down a path many enterprises must travel en route to SOA: a services approach to data based on knowing what data means no matter where it comes from. “SOA raises the fact that data is heterogeneous,” Schmelzer says.

As services exchange data, the potential for mismatches and unmapped transformations grows considerably. “SOA propels this problem into the stratosphere,” Common Sense’s DePalma says. “Put together your first three- or four-way data service,” and you’ll quickly discover the pain of data management. Without an initial data-architecture effort, an SOA won’t scale across the enterprise, says Judith Hurwitz, president of Hurwitz Group.

The solution, according to analysts and consultants, is to develop a data services layer that catalogs the correct data to use and exposes its context to other services. This approach decouples the data logic from the business logic and treats data access and manipulation as a separate set of services invoked by the business processes. Without such a scheme, enterprises will find themselves with loosely coupled business processes that rely on tight data dependencies, eliminating SOA’s core benefit of loose coupling.
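A minimal sketch of that decoupling, in Python, might look like the following. It assumes a toy in-memory catalog; `DataService`, `CatalogEntry`, `approve_order`, and the customer record are invented for illustration and do not represent any vendor's design.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CatalogEntry:
    """Where the authoritative copy of a data item lives, plus its context."""
    fetch: Callable[[str], dict]   # how to read the record by key
    context: Dict[str, str]        # e.g. currency, owning system


class DataService:
    """Hypothetical data services layer: business logic asks by logical name."""

    def __init__(self) -> None:
        self._catalog: Dict[str, CatalogEntry] = {}

    def register(self, logical_name: str, entry: CatalogEntry) -> None:
        self._catalog[logical_name] = entry

    def get(self, logical_name: str, key: str) -> dict:
        entry = self._catalog[logical_name]
        # Return the data together with its context so callers need not guess.
        return {"data": entry.fetch(key), "context": dict(entry.context)}


# Stand-in for a real customer store (an assumption for the example).
_customers = {"c1": {"name": "Acme", "credit_limit": 5000}}

service = DataService()
service.register(
    "customer",
    CatalogEntry(fetch=lambda k: _customers[k],
                 context={"source": "customer_hub", "currency": "USD"}),
)


def approve_order(customer_id: str, amount: int) -> bool:
    """Business logic: knows nothing about where customer data is stored."""
    record = service.get("customer", customer_id)
    return amount <= record["data"]["credit_limit"]


print(approve_order("c1", 1200))
```

If the customer source later moves, only the catalog registration changes; `approve_order` is untouched, which is the loose coupling the paragraph above describes.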

This effort is a change from past data integration approaches. “We used to solve data integration by imposing controls at critical choke points,” ZapThink’s Schmelzer recalls. “SOA eliminates these choke points, so I now have a data integration problem everywhere. That means every data access point has to be able to transform and manage data,” he says.

“Data integration and process integration are inexorably linked,” says Henry Morris, group vice president of integration systems at IDC. “You need to think of services to manage data. Think about the processes that affect the master data wherever it lives,” he advises.

SOA also raises concurrency issues, notes Nikhil Shah, lead architect at the Kanbay International consultancy. For example, how data changes during the process may affect the results, especially in a composite application, as old data is propagated through the process, or when multiple services access the data at different times. Shah recommends that IT implement monitoring services, or at least services that notify other services when changes occur, so that they can determine whether to restart the process or adjust their computations.
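One simple way to picture such a notification service is a store that versions each value and tells subscribers when something changes. The Python sketch below is an assumption-laden toy (single process, in memory, no threading); `NotifyingStore` and `PricingProcess` are hypothetical names, not anything Shah or Kanbay describes.

```python
from typing import Callable, Dict, List, Tuple


class NotifyingStore:
    """Key-value store that tells subscribers when a key changes."""

    def __init__(self) -> None:
        self._data: Dict[str, float] = {}
        self._versions: Dict[str, int] = {}
        self._subscribers: List[Callable[[str, int], None]] = []

    def subscribe(self, callback: Callable[[str, int], None]) -> None:
        self._subscribers.append(callback)

    def put(self, key: str, value: float) -> None:
        self._data[key] = value
        self._versions[key] = self._versions.get(key, 0) + 1
        for callback in self._subscribers:
            callback(key, self._versions[key])

    def get(self, key: str) -> Tuple[float, int]:
        return self._data[key], self._versions[key]


class PricingProcess:
    """Hypothetical long-running process that must notice stale inputs."""

    def __init__(self, store: NotifyingStore) -> None:
        self.store = store
        self.seen: Dict[str, int] = {}
        self.stale = False
        store.subscribe(self.on_change)

    def read(self, key: str) -> float:
        value, version = self.store.get(key)
        self.seen[key] = version
        return value

    def on_change(self, key: str, version: int) -> None:
        # Only inputs this process already read can invalidate its work.
        if key in self.seen and version > self.seen[key]:
            self.stale = True


store = NotifyingStore()
store.put("room_rate", 120.0)
process = PricingProcess(store)
rate = process.read("room_rate")
store.put("room_rate", 135.0)   # changes while the process is mid-flight
print(process.stale)
```

Once `stale` is set, the process can decide whether to restart or merely adjust its computation, which is the choice the paragraph above leaves to each service.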

Moreover, the more granular the data services, the greater the impact orchestration overhead has on processes, which could slow response time and create synchronization issues, Shah says. He advises IT to model data management requirements before a service can consume that data. Generally speaking, the more transactional the service, the more the specific data manipulation should be hard-coded into the business logic, he says.

Another SOA data issue is the “snowplow effect,” which occurs when services pass on the context about their data manipulations to subsequent services in a composite application, says Ken Rugg, vice president of data management at Progress Software, which provides caching technology for data management in SOA environments.

Publishing those transformations can help later services understand the context of the data they are working with, IDC’s Morris says. But that can also flood the system with very large data files and slow down each service. IT needs to consider carefully how much context is passed through as aggregated parameters versus limiting that metadata and having the service interface look for exceptions, Kanbay’s Shah says.
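The trade-off can be made concrete with a small, admittedly artificial Python sketch. It runs the same three-step pipeline twice: once with every step appending its full input and output to the message, and once with each step recording only its name. The step names and conversion factors are placeholders chosen for the example.

```python
import json


def run_pipeline(steps, payload, pass_full_context):
    """Run each step, recording context either in full or as a short summary."""
    envelope = {"data": payload, "context": []}
    for name, transform in steps:
        before = envelope["data"]
        envelope["data"] = transform(before)
        if pass_full_context:
            # Snowplow: each step pushes forward all earlier detail plus its own.
            envelope["context"].append(
                {"step": name, "input": before, "output": envelope["data"]}
            )
        else:
            # Leaner: record only that the step ran.
            envelope["context"].append(name)
    return envelope


# Placeholder transformations; the factors are illustrative, not real rates.
steps = [
    ("convert_currency", lambda rows: [round(r * 1.2, 2) for r in rows]),
    ("add_tax", lambda rows: [round(r * 1.08, 2) for r in rows]),
    ("to_cents", lambda rows: [int(round(r * 100)) for r in rows]),
]
rows = [float(i) for i in range(1, 201)]

full = run_pipeline(steps, rows, pass_full_context=True)
lean = run_pipeline(steps, rows, pass_full_context=False)
print(len(json.dumps(full)), len(json.dumps(lean)))
```

Both runs produce the same data; only the size of the message differs, and the gap widens with every additional step.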

The Return of Master Data

The rise of SOA has given vendors reason to revisit their tools to simplify data management, for both SOA and non-SOA environments. Many are now promoting MDM (master data management) tools to help ensure that applications or services use only correct, current data in the correct context. “Master data” incorporates not only the data itself but attributes, semantics, and context (metadata) needed to understand its meaning for proper use by a variety of systems. (Some vendors call these systems enterprise information integration, or EII, tools.)

Although not new, the concept was largely relegated to after-the-fact data systems such as data warehouses and business intelligence, notes Bill Swanton, research director at AMR Research. Before SOA, enterprises could largely get away without worrying about master data because most information resided in application suites, where the vendors had at least an implicit, internal master data architecture in place. IT could thus focus just on transmitting processed or raw data between application suites (by creating connectors) and allowing the applications to handle most of the contextual issues, he notes.

SOA’s many-to-many architecture no longer allows IT to leave the problem to application vendors and to limited integration conduits. Even non-SOA environments, though, benefit from moving from the one-off approach of creating connectors to a more rationalized data architecture that makes integration simpler, Swanton says.
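A back-of-the-envelope count shows why one-off connectors get expensive. The arithmetic below is general and not a figure from Swanton or AMR Research: it assumes every pair of systems needs its own connector (a two-way connector counted once), compared with one connector per system into a shared hub.

```python
def point_to_point_connectors(systems: int) -> int:
    """One connector per pair of systems (a two-way connector counted once)."""
    return systems * (systems - 1) // 2


def hub_connectors(systems: int) -> int:
    """One connector from each system to a shared hub."""
    return systems


for n in (4, 10, 25):
    print(n, point_to_point_connectors(n), hub_connectors(n))
```

With four systems the difference is small (six connectors versus four); at 25 systems it is 300 versus 25, assuming every pair really does need to exchange data.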

Some providers, including IBM, Informatica, Oracle, and Siperian, approach the issue as an operational data warehouse, providing one or more data hubs that services access both from stores of cleansed data and from services that generate validated data from other applications as a trusted broker. These emulate the hub-and-spoke architecture common to traditional enterprise environments. Others, such as BEA Systems, i2 Technologies, and Xcalia, approach the issue at a more federated level to better mirror the loosely coupled, abstracted nature of an SOA.

Analysts and consultants warn that today’s technology is very immature and at best can help only specific data management processes. “There is no silver bullet,” says Shawn McIntosh, senior manager at consultancy AGSI. For example, Starwood’s Park notes that his IT group is hopeful that IBM’s planned Systems Integration Bus will provide a way to manage the data services in the hotelier’s SOA. “But we can’t wait for the tools to come out,” he says.

Many of the data hubs currently offered are geared to one data subject, such as customer or product information. That’s fine as an initial building block; later, however, IT will have to generalize the hub or work with a federation of specific data hubs, says Satish Krishnaswamy, senior director of MDM business at i2. “We won’t ever get to one single hub, so IT should instead work toward a standard canonical [hierarchical] view” of data across its sources, IDC’s Morris says.

[Figure: Master Data Management Architecture. At its heart, master data management relies on a set of metadata encapsulated as rules. Services and applications call on those rules when working with the data. Diagram labels: internal sources and destinations (transaction apps: ERP, CRM, PLM, SCM, service; content; analytics; unstructured data; Web content; reporting, scoreboards, dashboards); external sources and destinations (global data registries, industry data repositories, trading partners, data cleansing services); master data architecture (stewardship and business rules; taxonomy and relationships; master data repositories for employees, customers, assets, products, suppliers); MDM components and services (organize, validate, workflows, map and move, cleanse and enhance, secure); master data governance. Source: AMR Research]

To make the scope manageable, IT organizations generally define the rules and context for one subject area and then extend the system out to other subject areas over a period of time. That’s what Carlson Hotels is doing, starting with the customer-oriented hub IBM acquired from DWL. According to Carlson’s Kolodziejczyk, however, the hospitality company is not yet sure whether it will extend that hub to include product data or use the product hub IBM acquired from Trigo.

Deciding whether to start with a subject-specific system (such as product information within SCM) or a generalized system depends on how targeted the integration efforts are to specific application suites. It may make more sense to start with a subject-specific hub if your focus is on interactions with an ERP or SCM system, whereas a generic hub makes more sense if your focus is on an SOA in which services interact with a wide variety of applications.

Building a Data Architecture

MDM tools can help, but they do little good if the enterprise doesn’t understand its data. “I see a fair amount of hype around the concept of master data management,” says Fred Cummins, a fellow at EDS. Because centralized data stores deal typically with after-the-fact results, not with states and transactions, the more an MDM system looks like a traditional data warehouse or master database, the less likely it meets the needs of a transactional system, whether in a traditional or SOA environment, Cummins says. “It’s unrealistic to expect that there is one master database that everything reads or feeds. Some of the data is transactional,” concurs Paul Hernacki, CTO of consultancy Definition 6.

For an SOA, MDM tools that simply repackage EAI tools are not very helpful, Cummins says. That’s because an SOA should be driven by business processes, whereas EAI typically focuses on connecting applications together without worrying about the underlying data context for each. Even for traditional integration efforts, “you can’t just put in middleware and off you go,” adds Brian Babineau, an information integration analyst at Enterprise Strategy Group.

“Primarily, it’s a design issue,” echoes ZapThink’s Schmelzer. “We have great tools for databases, messaging, transformation, etc.,” to implement the design, he adds. Designing the architecture and the specific services correctly requires that developers understand all the data used and generated by services and the applications they interact with, a labor-intensive process.

That’s why IT needs a commonly accessible set of data services or at least data mappings. “At some point this will have to be formalized as a repository,” Common Sense’s DePalma says. Critical for an SOA, this approach is also very useful in traditional environments, he adds.

With those mappings created, IT can then focus on building the connectors or services that implement them. IT must understand which mappings should be available to multiple services and applications (and thus implemented as separate processes) and which are endemic to specific business logic and should be encapsulated with that business logic, consultant Hurwitz says.
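To give a sense of what a shared mapping might look like once it is pulled out of individual applications, here is a small Python sketch. The system names, field names, and the `to_canonical` helper are all invented for illustration; the only outside fact used is the pound-to-kilogram conversion factor.

```python
from typing import Callable, Dict, Tuple

# Shared mappings: source system -> {canonical field: (source field, converter)}.
# System and field names are invented for the example.
Mapping = Dict[str, Tuple[str, Callable[[object], object]]]

MAPPINGS: Dict[str, Mapping] = {
    "legacy_erp": {
        "product_id": ("PART_NO", str),
        "weight_kg": ("WT_LB", lambda lb: round(float(lb) * 0.45359237, 3)),
    },
    "new_erp": {
        "product_id": ("item_code", str),
        "weight_kg": ("weight_kg", float),
    },
}


def to_canonical(system: str, record: dict) -> dict:
    """Translate one source record into the canonical shape."""
    mapping = MAPPINGS[system]
    return {
        field: convert(record[source_field])
        for field, (source_field, convert) in mapping.items()
    }


print(to_canonical("legacy_erp", {"PART_NO": 1042, "WT_LB": "10"}))
print(to_canonical("new_erp", {"item_code": "1042", "weight_kg": "4.536"}))
```

Because the mappings live in one table rather than inside each connector, any service can reuse them; a mapping that only one piece of business logic needs could just as reasonably stay inside that logic.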

Many enterprises have avoided such data architecture efforts because there’s no obvious ROI, notes Common Sense’s DePalma. Some remember earlier-generation efforts such as custom data dictionary creation, which also involved understanding the organization’s data architecture; by the time they were completed, they were already outdated. Fortunately, IT can approach the data understanding incrementally, creating the rules and metadata around the information used for specific applications’ or services’ needs, says Marcia Kaufman, partner at Hurwitz Group. Over time, the enterprise will build up a complete data architecture. “It’s a long-term journey,” says Hurwitz.

That data architecture will typically include multiple data models, each oriented to a specific type of subject or process, notes Paul Patrick, chief architect at BEA Systems. That actually helps IT by allowing the data architecture to be developed in stages, plus it limits the mapping required between data models. (A unified data model must account for all possible mappings, whereas a federated model does not.)

Furthermore, IT should concentrate on dealing with exceptions, says ZapThink’s Schmelzer. For example, IT should develop services that check for data that are out of normal bounds, rather than try to develop an enterprisewide ontology that maps out every possible state or relationship, he says.
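An exception-checking service of the kind Schmelzer describes can be quite small. The Python sketch below assumes a fixed table of normal ranges per field; the field names and limits are placeholders, not real business rules from any company quoted here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Bounds:
    low: float
    high: float


# Normal ranges per field; values here are placeholders, not real business rules.
NORMAL_BOUNDS: Dict[str, Bounds] = {
    "nightly_rate": Bounds(20.0, 5000.0),
    "nights": Bounds(1, 365),
}


def find_exceptions(record: Dict[str, float]) -> List[str]:
    """Return a message for each known field that is missing or out of bounds."""
    problems = []
    for field, bounds in NORMAL_BOUNDS.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not bounds.low <= value <= bounds.high:
            problems.append(
                f"{field}: {value} outside [{bounds.low}, {bounds.high}]"
            )
    return problems


print(find_exceptions({"nightly_rate": 149.0, "nights": 3}))
print(find_exceptions({"nightly_rate": 0.0, "nights": 400}))
```

The check says nothing about what a valid reservation is in general; it only flags the values that fall outside what the organization has declared normal, which is a far smaller problem than a full ontology.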

Ultimately, the enterprise should build up layers of data services in which master data is distributed, says William McKnight, senior vice president of data warehousing at consultancy Conversion Services International, although the infrastructure and tools to deliver on this goal aren’t yet mature.

Roll Up Your Sleeves

Provisioning data sources as services across an organization is a monster undertaking. For a traditional integration effort, it means understanding the context within each application and how data is transformed for delivery to other apps. For an SOA, it requires understanding the multiple relationships and dependencies data can have with various business processes. “There are so many variables here,” notes Common Sense’s DePalma.

Analysts and consultants agree that this complexity requires both an up-front investment in modeling data architecture and an ongoing effort to systematically think through data dependencies and context. Discovering the data models and relationships among your systems to create the mappings is about 70 percent of the effort in an SOA’s data architecture, says IDC’s Morris. At GGB, IT director Kenngott says the modeling and discovery effort was about 30 percent of the data-integration effort within its ERP consolidation project.

That initial push is well worth it, argues Starwood’s Park. “Otherwise, you can get pretty far along with your project and discover that you have 10 fields that you don’t need, 10 that you do but didn’t know when you designed the service, and five that are different than you thought. When you have a complex system with hundreds of services, these interfaces have to be nailed down.”

In most organizations, the tough slog of codifying interfaces and reconciling distributed data models is long overdue. But today, with the majority of large organizations pushing ahead with some sort of SOA initiative, the natural inclination to avoid this ugliest of hairballs can no longer be sustained. “The problem is too big to sweep under the rug any more,” Conversion Services’ McKnight says.


Finding a Home for Metadata

Working with data across an enterprise, especially in an SOA environment, requires understanding its context and semantics, not just its format and field attributes. And that means metadata. For developers as well as services to track that metadata, a repository would be useful. Theoretically, such repositories would provide the intermediary services, but with today’s technology, “this is just too ... hard to do,” says Paul Patrick, chief architect at BEA Systems. “No one has assembled the pieces yet.”

The metadata repositories in use today tend to be part of ETL (extract, transform, load) and business intelligence systems, says William McKnight, senior vice president of data warehousing at consultancy Conversion Services International. “They’re complex, mainframe-oriented, very expensive, and not integrated with modern tools,” he says.

“Previous efforts at a metadata repository were a debacle,” says Don DePalma, president of the Common Sense Advisory consultancy. In addition to the high licensing costs, “the work to create an encyclopedia of all applications, with its indeterminate benefit, was too high,” he says. Not only were the tools “too academic,” they asked developers to adhere to very formal processes and methods at a time when “all of this formalization went out the window with the move to HTML” and the quick-and-dirty development of the early Web period, DePalma says.

But vendors are now revisiting the metadata repository concept. Some are incorporating the technology in their information management tools. For example, Xcalia uses an XML table-based metadatabase in its Intermediation Platform, which allows IT to create metadata-based transformation rules so services can interact with data sources in a consistent way that is mindful of the data’s context and semantics. The company hopes to develop a stand-alone metadata repository that allows these rules to be used by multiple applications, says Eric Samson, Xcalia’s CTO. And Informatica uses a metadata repository in its PowerCenter Data Federation data-integration platform, notes Ashutosh Kulkarni, Informatica’s principal product manager.

Other vendors, including BEA Systems and IBM, are also working on less expensive, easier to implement metadata repositories. “The master data management has to be in a repository, whether the architecture is federated, distributed, or centralized,” says Dan Drucker, director of enterprise master data solutions at IBM.

G.G.
