TRANSCRIPT
Center for Information Development CMS/DITA NA, April 2016
AGCO’s Multi-National, Multi-language Conversion to DITA
Abstract
In 2015 AGCO, working with DCL, successfully converted over 70,000 pages to DITA with a small multi-national team, working under the gun to make sure all content was converted before a cutover to a new system. Dowdell and Gross discuss the challenges of managing a project with a multi-national, multi-language team across multiple sets of requirements.
We discuss both the people challenges and the technical challenges of producing consistent content from a variety of materials developed from previously independent companies.
Aside from the conversion issues, we also discuss how to maintain translation efficiency across a complex conversion and how the identification of reusable content allowed us to reduce conversion volume by almost 50%: instead of over 70,000 pages, only about 38,000 pages were converted.
What does Data Conversion Laboratory (DCL) do?
Services DCL provides:
o Scanning and digitizing – we capture content from images, paper, microfilm and retrieve data with unique automated processes to greatly improve accuracy.
o Converting – we convert data captured from any format, automatically and reliably, to XML, HTML, EPUB, and other formats needed to support new uses.
o Enriching – we enhance data to make it more findable and usable, making use of the latest data mining, metadata extraction, XML tagging and other enrichment techniques.
o Automating – we automate where possible, with our extensive collection of tools, to improve content reliability and reduce costs.
o Delivering Content – eBooks, data files, web-ready files, in the precise form you need, for whatever device your clients need.
DCL converts documents from all formats to enhanced digital formats such as XML, SGML and HTML for publishing, databases, eBooks, and distribution over the web.
Fortune 500 Company NYSE Listed
AGCO products are sold through five core brands (Challenger®, Fendt®, GSI®, Massey Ferguson®, and Valtra®) and are distributed globally.
AGCO Corporation is a world leader in the design, manufacture, marketing, and distribution of agricultural equipment.
AGCO has a legacy of brands and technologies.
About AGCO
© 2011 AGCO Corporation. All rights reserved.
AGCO Pressures and Responses
Pressures:
Cost / Margins
Design Anywhere, Build Anywhere
3200 Dealers Globally
Selling in 140 countries
Authoring in 5 languages
Publishing in 32 languages
Financial Segregated by Regions
Lots of Legacy Products, with long service life
High Growth
o Acquisitions
o 25 year old company
Responses:
Automation/Tools Integration
Process Development
Standards
Best Practices
[Figure: timeline ("TIME" axis, up to "Today") of AGCO authoring systems under the theme "Plan, Improve, Consolidate". Sites shown: Suolahti, Finland; Marktoberdorf, Germany; Beauvais, France; Jackson, Minnesota; Hesston/Beloit, Kansas; Canoas, Brazil; Mogi, Brazil; Randers, Denmark; Breganze, Italy; Others. Systems shown: Assorted Desktop Publishing; Custom Topic Based DTD; DITA; AGCO Global Authoring System; Global Reuse.]
Cultures
The Germans: “It must all be done programmatically”
The French: “It won’t work”
The Danish: “There must be a way”
The Italians: “Let’s try this”
The Brazilians: “I need the output now. I can’t wait for you guys”
Facilitated by DCL and AGCO HQ
Colorful Meetings: Diversity in team is a strength, even when it doesn’t initially seem that way.
The Players
1. Convert or Rewrite?
2. Automated (Programmatically) or Manual conversion?
3. Resources - Internal or External?
4. How Do We Retain Multi-Language Investment?
5. Is the Data There in the Source?
6. How Do We Maximize Reuse?
7. How Did We Overcome Technical Obstacles?
The Seven Questions
1. Convert or Rewrite?
o DITA is based on a foundation of semantics and minimalism
o AGCO source was Topic-based content which allowed for a migration path to DITA
o Ask yourself, is your source material up to the challenge?
o If not, is there a workable approach to make it convertible?
Will you have to make the content Task Based?
Does the content need to be Minimalized?
Is the content suitable for conversion?
2. Automated or Manual Conversion?
How much content is there?
o Sometimes it just plain isn’t worth automating – “brute force” wins sometimes
Manual conversion lends itself to content changes and leverage loss
Is time a factor?
Automated conversion allows for better consistency, accuracy, retention of translation memory investment and faster turnaround
In an automated environment using an Agile methodology, changes to specification can be easily incorporated at any point in the process
3. Resources - Internal or External?
Can you have an external resource without the thought:
“We have to make our instructions so detailed ... we should have done this ourselves”
o Topic-based XML source resolved this since conversion was based on tags, not text
o Source material in various languages did not affect conversion software because of structured tagging
o Business rules were defined in collaboration with DCL to address cases where structures in the source did not match DITA structures. For example, one type of source structure contained a graphic followed by steps; DITA has no equivalent structure, so the graphic was placed in the first step.
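A business rule of this kind might be sketched as follows. This is only an illustration, not the converter DCL built: the source element names (`procedure`, `graphic`, `step`) are hypothetical, and placing the image inside the step's `info` element is one assumed way of attaching the graphic to the first step.

```python
import xml.etree.ElementTree as ET

def convert_procedure(src: ET.Element) -> ET.Element:
    """Convert a hypothetical <procedure> (graphic followed by steps) to DITA <steps>."""
    steps = ET.Element("steps")
    pending_graphics = []  # graphics seen but not yet attached to a step
    for child in src:
        if child.tag == "graphic":
            pending_graphics.append(child.get("href"))
        elif child.tag == "step":
            step = ET.SubElement(steps, "step")
            cmd = ET.SubElement(step, "cmd")
            cmd.text = (child.text or "").strip()
            if pending_graphics:
                # Business rule: a graphic that precedes the steps moves
                # into the first step that follows it.
                info = ET.SubElement(step, "info")
                for href in pending_graphics:
                    ET.SubElement(info, "image", {"href": href})
                pending_graphics = []
    return steps

# Hypothetical source fragment: the rule depends on tags only, not on the language of the text.
source = ET.fromstring(
    '<procedure><graphic href="pump.png"/>'
    '<step>Remove the cover.</step><step>Drain the oil.</step></procedure>'
)
dita_steps = convert_procedure(source)
print(ET.tostring(dita_steps, encoding="unicode"))
```

Because the rule keys on element names rather than on text, the same code would apply unchanged to source material in any authoring language.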
Day job?
4. Language Leverage loss?
Translation Efficiency - Tags in source marked “text that was not to be translated”. Such tags were mapped to a DITA equivalent.
Elimination of redundant data – for example, source had metric measures followed by imperial measures, each in its own tag. The decision was made to only retain metric measures since all others could be automatically generated during rendering.
Automated conversion allows you to retain text phrasing and minimize leverage loss.
Get the Language Service Provider (LSP) in early to minimize leverage loss.
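The two mapping decisions above (do-not-translate tags and metric-only measures) can be illustrated with a minimal sketch. The source tag names (`notrans`, `measure`, `metric`, `imperial`) and the `outputclass` value are hypothetical, and the slides do not say which DITA equivalent was chosen; a `ph` element with `translate="no"` is assumed here.

```python
import xml.etree.ElementTree as ET

MM_PER_INCH = 25.4  # exact by international definition

def convert_notrans(src: ET.Element) -> ET.Element:
    """Map a hypothetical <notrans> tag to <ph translate="no"> (assumed equivalent)."""
    ph = ET.Element("ph", {"translate": "no"})
    ph.text = src.text
    return ph

def convert_measure(src: ET.Element) -> ET.Element:
    """Keep only the metric value; the redundant imperial value is dropped."""
    metric = src.find("metric")
    ph = ET.Element("ph", {"outputclass": "metric-" + metric.get("unit")})
    ph.text = metric.text
    return ph

def render_measure(ph: ET.Element) -> str:
    """At rendering time, regenerate the imperial value from the metric one."""
    unit = ph.get("outputclass").split("-", 1)[1]
    if unit != "mm":
        raise ValueError(f"unit not handled in this sketch: {unit}")
    inches = float(ph.text) / MM_PER_INCH
    return f"{ph.text} mm ({inches:.2f} in)"

brand = convert_notrans(ET.fromstring("<notrans>Massey Ferguson</notrans>"))
measure = convert_measure(ET.fromstring(
    '<measure><metric unit="mm">50</metric><imperial unit="in">1.97</imperial></measure>'
))
print(brand.get("translate"), render_measure(measure))
```

Storing a single metric value means translators and translation memory see one measure per occurrence instead of two.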
How do we retain the multi-language investment?
5. Is the Data There in the Source?
How often does information get better after translation?
You will likely have to inject something new during or after conversion
o Enhancing Data
o Resolving Ambiguities
o Semantic Tagging
o Normalizing Content
When does 1+1 = 1.5?
6. How Do We Maximize Reuse?
Finding duplicate, reusable content was a major cost saving for AGCO
o Source file names included a topic id and version number.
o As the manuals were converted, topic ids with their version numbers were automatically logged.
o When the same topic id with the same or earlier version number came in with another manual, it was not converted; if the version number was greater, it was converted.
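The logging rule described above can be sketched roughly as follows. The file-name pattern (`<topic id>_v<version>.xml`) and the manual names are assumptions made for illustration; the slides only say that file names included a topic id and a version number.

```python
import re

# Hypothetical file-name pattern: "<topic id>_v<version>.xml"
NAME_RE = re.compile(r"^(?P<topic_id>.+)_v(?P<version>\d+)\.xml$")

def plan_conversion(manuals):
    """Return the (manual, file name) pairs that actually need converting.

    A topic is skipped when the log already holds the same topic id with
    the same or a later version number.
    """
    converted = {}  # log: topic id -> highest version converted so far
    to_convert = []
    for manual, filenames in manuals:
        for name in filenames:
            match = NAME_RE.match(name)
            if match is None:
                raise ValueError(f"unexpected file name: {name}")
            topic_id = match["topic_id"]
            version = int(match["version"])
            if topic_id in converted and version <= converted[topic_id]:
                continue  # same or earlier version already converted
            converted[topic_id] = version
            to_convert.append((manual, name))
    return to_convert

manuals = [
    ("tractor_a", ["T100_v1.xml", "T200_v2.xml"]),
    ("tractor_b", ["T100_v1.xml", "T200_v1.xml", "T300_v1.xml"]),
    ("tractor_c", ["T100_v2.xml"]),
]
plan = plan_conversion(manuals)
print(f"{len(plan)} of {sum(len(f) for _, f in manuals)} files need conversion")
```

In this toy input, four of six files are converted; the two skipped files are topics that had already come in with an earlier manual at the same or a later version.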
The Case of Human Genome Mapping
7. How Did We Overcome Technical Obstacles?
We found inconsistent tagging of source material among the facilities in different countries
For example, some sets of material had all topics tagged as concepts when in reality, some should have been tasks and references.
After the initial conversion, DCL used AGCO turnaround spreadsheets of topic IDs and titles for document owners’ review and annotation.
These spreadsheets were input into the automated conversion process to retrospectively tag topics.
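Feeding the annotated spreadsheets back into the conversion could look something like the sketch below. The column names (`topic_id`, `title`, `reviewed_type`) and the CSV export are assumptions; the actual spreadsheet layout is not described in the slides.

```python
import csv
import io

VALID_TYPES = {"concept", "task", "reference"}

def load_overrides(csv_text: str) -> dict:
    """Read owner annotations; a blank reviewed_type means 'keep the source tagging'."""
    overrides = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        annotated = row["reviewed_type"].strip().lower()
        if not annotated:
            continue
        if annotated not in VALID_TYPES:
            raise ValueError(f"{row['topic_id']}: unknown topic type {annotated!r}")
        overrides[row["topic_id"].strip()] = annotated
    return overrides

def topic_type(topic_id: str, source_type: str, overrides: dict) -> str:
    """Topic type the converter should emit: the owner's annotation wins."""
    return overrides.get(topic_id, source_type)

# Hypothetical turnaround spreadsheet, exported as CSV after owner review.
sheet = (
    "topic_id,title,reviewed_type\n"
    "T100,Hydraulic system overview,\n"
    "T200,Changing the oil filter,task\n"
    "T300,Torque specifications,Reference\n"
)
overrides = load_overrides(sheet)
print(topic_type("T200", "concept", overrides))
```

Keeping the annotations in a file that the automated process reads means a re-run of the conversion applies every correction again without manual rework.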
Normalized Content:
o Eliminated empty paragraphs
o Modified attribute values
o Modified text
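As an illustration of this kind of normalization, the sketch below removes empty paragraphs and remaps one attribute value. The attribute mapping (`warn` to `warning` on `note`) is a made-up example; the slides do not list which values were modified.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: (element, attribute) -> {old value: new value}
ATTR_VALUE_MAP = {("note", "type"): {"warn": "warning"}}

def normalize(root: ET.Element) -> int:
    """Normalize a topic in place; return the number of empty paragraphs removed."""
    removed = 0
    for parent in list(root.iter()):
        for child in list(parent):
            is_empty_p = (
                child.tag == "p"
                and len(child) == 0
                and not (child.text or "").strip()
            )
            if is_empty_p:
                # Note: any tail text after the removed element is dropped too.
                parent.remove(child)
                removed += 1
    for elem in root.iter():
        for (tag, attr), mapping in ATTR_VALUE_MAP.items():
            if elem.tag == tag and elem.get(attr) in mapping:
                elem.set(attr, mapping[elem.get(attr)])
    return removed

topic_body = ET.fromstring(
    '<conbody><p>Check the oil level.</p><p>  </p><p/>'
    '<note type="warn">Hot surface.</note></conbody>'
)
removed_count = normalize(topic_body)
print(removed_count, ET.tostring(topic_body, encoding="unicode"))
```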
Lesson Learned Summary
Ask the seven questions
Don’t skimp on the leverage loss analysis. And test!
When you think you are done, you may be halfway.
o Test your assumptions.
o Heuristic approach?
Have frequent meetings
o Distance can be overcome … somewhat.
o Minutes and Assignments
Collaboration is key
Contact Information
Data Conversion Laboratory, Inc.
Mark Gross, CEO
[email protected]
(718) 307‐5711
61‐18 190th Street
Fresh Meadows, NY 11365
www.dclab.com
AGCO
Charles Dowdell
Manager, Global Technical Service Information
(770) 232‐8257
[email protected]
River Green Parkway
Duluth, GA 30096
www.AGCOcorp.com