2013 GALA Miami: Breaking into Latin American Markets on a Small Budget

An MT Case Study: Breaking into Latin American Markets on a Small Budget

María Azqueta (SeproTec) & Diego Bartolomé (tauyou)

Upload: tauyou

Post on 18-May-2015

DESCRIPTION

The Latin American market is composed of a mix of various Spanish dialects. If a company really wants to reach a specific audience in Latin America, it must use the right dialect. But how is it possible to translate marketing materials into four or five Spanish dialects without dramatically increasing costs? This session will discuss how a joint effort to create an MT engine for translating international Spanish into specific Latin American dialects (Spanish for Argentina, Chile, Colombia, Mexico, and Puerto Rico) made this challenge feasible, economical, and replicable.

TRANSCRIPT

Page 1: 2013 GALA Miami: Breaking into Latin American Markets on a Small Budget

An MT Case Study: Breaking into Latin American Markets on a Small Budget

María Azqueta (SeproTec) & Diego Bartolomé (tauyou)

Page 2:

Spanish Worldwide

Spanish Language:

• Also known as Castellano.

• Latin-derived Romance language.

• Spanish is one of the six official languages of the United Nations and an official language of the European Union.

Page 3:

Spanish Worldwide

Page 4:

Spanish Worldwide

[Bar chart: native speakers in millions, axis from 0 to 1200. Languages shown: Mandarin Chinese, Spanish, English, Hindi/Urdu. Bar labels: 955 million, 407 million, 360 million, 311 million.]

Spanish is the second most spoken language by number of native speakers.

Page 5:

Spanish Worldwide

• For demographic reasons, the percentage of the world’s population that speaks Spanish as a native language is increasing, while the percentage of Chinese and English speakers is decreasing.

• Within three or four generations, 10% of the world’s population will communicate in Spanish.

• In 2050, the United States will be the world’s foremost Spanish-speaking country.

Page 6:

Spanish on the Internet

• Spanish is the third most widely used language on the Net.

• The use of Spanish on the Net grew by 807.4% between 2000 and 2011.

• Spain and Mexico are among the 20 countries with the highest number of internet users.

• The demand for documents in Spanish is the fourth largest among the world’s languages.

Page 7:

Spanish Worldwide and its Differences

High demand for translations into Spanish.

But… is the same Spanish spoken everywhere?

Page 8:

Spanish Worldwide and its Differences

RAE (Royal Spanish Academy):

– Created in the 18th century, it is widely seen as the arbiter of what is considered standard Spanish.

– It produces authoritative dictionaries and grammar guides.

– Although its decisions are not formally binding, they are widely followed in both Spain and Latin America.

Page 9:

Spanish Worldwide and its Differences

Different dialects and many differences:

• Lexical variations

• Grammatical differences

• Idioms

Page 10:

Spanish Worldwide and its Differences

Market Trend:

‘Neutral’ or ‘International’ Spanish

Latin American Spanish & European Spanish

Page 11:

Why Adapt to the Local Spanish of Each Country?

To reach different markets

People are most likely to buy when a product is advertised in their dialect

Page 12:

Why Adapt to the Local Spanish of Each Country?

EN: Take a card from the deck

ES: Coge una carta de la baraja

Client A (Gaming Industry)

Page 13:

Why Adapt to the Local Spanish of Each Country?

ES: Coge una carta de la baraja

AR: Agarrá una carta del mazo

CL: Toma una carta del naipe

CO: Coge una carta de la baraja

MX: Saca una carta de la baraja

PR: Coge una carta de la baraja

Page 14:

Why Adapt to the Local Spanish of Each Country?

Coger (32 entries) http://rae.es/rae.html

1. tr. Asir, agarrar o tomar. U. t. c. prnl. (To seize, grab, or take.)

31. intr. vulg. Am. Realizar el acto sexual. (Vulgar, Latin America: to have sexual intercourse.)

Page 15:

Page 16:

Advise Clients

If you really want to break into a specific market, you must decide which country you want to target and localize your material for the different Spanish dialects spoken in each individual country.

Page 17:

The Main Problems Clients Face

Page 18:

Is there a cost-efficient solution on the market?

Page 19:

tauyou MT Solution at SeproTec

Hybrid machine translation since January 2011

Languages: EN, ES, PT, GA, FR, IT…

Domains: Legal, Technical…

Glossaries and forbidden words lists

Average translated words per month: 700,000

Page 20:

Initial Brainstorming

MT from EN > different ES dialects

Extensive post-editing would be required

Page 21:

Final Scope of the Project

Human translation + revision

English > Spanish (Spain)

MT of Spanish (Spain) into Spanish from:

• Argentina

• Chile

• Colombia

• Mexico

• Puerto Rico

Page 22:

Initial Approach for Latin American MT

Traditional Workflow

1. Gather translation memories (EN → ES-XX)

2. Add generic material

3. Develop engine

4. Add linguistic pre- and post-processing

5. Improve quality over time

Page 23:

Drawbacks

Varying MT Quality

Depending on the domain and dialect

Initial Inconsistencies among Dialects

Handled with glossaries

Medium Post-Editing Effort

Could be improved over time

Page 24:

New Approach

Translate EN to Standard ES

Via standard high-quality human translation

Convert Standard ES to Latin American Variants

From Spanish to Spanish

Better final quality is achieved

Page 25:

Specifications

Countries

Argentina, Chile, Colombia, Mexico, Puerto Rico

Internal Glossaries to Handle Lexical Variations

It corrects discordance

Idioms

Grammatical Differences

It adapts verb tenses
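
The glossary-driven conversion described on this slide can be illustrated with a minimal sketch. It is not the tauyou engine: the `GLOSSARIES` table, the locale codes, and the `localize` function are hypothetical, and the entries are taken only from the card-game example earlier in the deck. The Argentine imperative "agarrá" is handled here as a plain glossary entry, whereas a real engine would presumably need rules for verb forms and agreement.

```python
import re

# Hypothetical per-dialect glossaries built from the card-game example in this
# deck; a real glossary would be much larger and curated by linguists.
GLOSSARIES = {
    "es-AR": {"coge": "agarrá", "de la baraja": "del mazo"},
    "es-CL": {"coge": "toma", "de la baraja": "del naipe"},
    "es-MX": {"coge": "saca"},
    "es-CO": {},  # the example sentence is unchanged for Colombia
    "es-PR": {},  # and for Puerto Rico
}


def _match_case(replacement: str, original: str) -> str:
    """Capitalize the replacement when the matched text was capitalized."""
    if original[:1].isupper():
        return replacement[:1].upper() + replacement[1:]
    return replacement


def localize(text: str, locale: str) -> str:
    """Replace glossary entries in one pass, trying longer phrases first."""
    glossary = GLOSSARIES[locale]
    if not glossary:
        return text
    phrases = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(
        lambda m: _match_case(glossary[m.group(0).lower()], m.group(0)), text
    )


source = "Coge una carta de la baraja"
for locale in GLOSSARIES:
    print(locale, "->", localize(source, locale))
```

Matching whole phrases before single words keeps multi-word entries such as "de la baraja" from being split by shorter entries.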

Page 26:

Testing the Prototype Engine

Extraction of several texts (fashion, real-estate, human resources, automobile)

Sent to linguists and/or translators in each target country for localization

Performance of the same localizations by the engine

Comparison and contrasting of human and machine localization results
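
The deck does not say how the human and machine localizations were compared. As one possible sketch, assuming a simple word-level comparison, the hypothetical `mismatch_rate` function below aligns the two outputs with Python's standard `difflib` and reports the share of reference words that differ. It approximates a word error rate and is not necessarily the metric behind the figures reported later.

```python
from difflib import SequenceMatcher


def mismatch_rate(reference: str, hypothesis: str) -> float:
    """Fraction of reference words that differ in the hypothesis (word level)."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        return 0.0
    errors = 0
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Count the longer side of each differing span as errors.
            errors += max(i2 - i1, j2 - j1)
    return errors / len(ref)


human = "Agarrá una carta del mazo"           # localization by a linguist
machine = "Agarrá una carta de la baraja"     # hypothetical engine output
print(f"{mismatch_rate(human, machine):.0%}")
```

Here the human text is treated as the reference, which the later slides show is itself an assumption worth questioning.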

Page 27:

First Bug Report

Not all terms were localized

Concordance issues

(masc./fem.; sing./pl.)

Verbal tenses for Argentina

Human vs. Machine MT: 7.78% error rate

Page 28:

First Bug Report

Some terms were changed/localized by the engine, but not by the humans.

(example)

Human error or MT error?

Page 29:

Testing the Prototype Engine

A glossary was created by extracting the terms localized by the linguists/translators.

This glossary was then sent to the same people who localized the texts to verify that all the terms were correctly localized and nothing was missing.
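
One way such a glossary could be bootstrapped is to align each source sentence with its human localization and collect the spans that were replaced. The sketch below is an assumption about how this might be done, not a description of the authors' process; `extract_term_pairs` is a hypothetical helper, and its output would still need the human verification step described above.

```python
from difflib import SequenceMatcher


def extract_term_pairs(source: str, localized: str) -> list[tuple[str, str]]:
    """Return (source span, localized span) pairs where the linguist changed words."""
    src, loc = source.split(), localized.split()
    pairs = []
    matcher = SequenceMatcher(a=src, b=loc, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            pairs.append((" ".join(src[i1:i2]), " ".join(loc[j1:j2])))
    return pairs


# Sentence pair taken from the card-game example in this deck.
candidates = extract_term_pairs(
    "Coge una carta de la baraja",   # Spanish (Spain)
    "Agarrá una carta del mazo",     # Spanish (Argentina)
)
print(candidates)
```

Only replaced spans are collected; pure insertions and deletions are ignored because they have no source or target term to pair.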

Page 30:

Testing the Prototype Engine

The glossary grew by 36.91%!

Page 31:

Testing the Prototype Engine

People can miss things.

Although many different variants of Spanish exist, Spanish speakers understand many terms that are foreign to their own dialect when they read them in context, sometimes to the point of accepting them as their own. I believe that this may be due to the phenomenon of globalization and the internet.

Page 32:

Latest Bug Report

MT: 1.21% error rate
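
To put the two reported figures side by side, the short calculation below computes the relative reduction between the first bug report (7.78%) and the latest one (1.21%). It assumes both rates were measured in the same way, which the slides do not state; `relative_reduction` is a hypothetical helper.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return (before - after) / before * 100


first_report = 7.78   # error rate (%) from the first bug report
latest_report = 1.21  # error rate (%) from the latest bug report

# (7.78 - 1.21) / 7.78 is roughly 0.844, so about 84.4% fewer errors.
print(f"{relative_reduction(first_report, latest_report):.1f}% relative reduction")
```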

Page 33:

Achievements

Very little post-editing needed

Reduced error rate

Shortened deadlines

Significant cost reduction

Page 34:

Conclusions

Human localization is not perfect.

MT is not perfect either.

Combining human and machine translation helps achieve high quality and reduce cost.

Page 35:

Further Work

Improving Glossaries

Through a simple web interface for PE

Extending Spanish Language Coverage

More dialects

Traductor.cervantes.es

Incorporating more languages

English, French and Portuguese

Page 36:

Bibliography

Yule, G. (2006). The Study of Language (3rd ed.). New York: Cambridge University Press.

RAE

Instituto Cervantes

http://www.linguapress.com

Page 37:

THANK YOU FOR YOUR TIME!

María Azqueta

Diego Bartolomé