taus mt showcase, mt@ec for european public administrations and online services, spyridon pilos,...

Post on 20-Aug-2015

555 views

Category:

Technology


3 download

TRANSCRIPT

Wednesday, 4 June

MT@EC for European Public Administrations and Online Services

Spyridon Pilos, European Commission

TAUS Machine Translation Showcase 2014 Dublin (Ireland)

The research within the project MosesCore leading to these results has received funding from the European Union 7th Framework Programme, grant agreement no 288487

 

MT@EC European Commission machine translation

for public administrations and digital services in the European Union

Spyridon Pilos Head of Language applications, IT unit

Directorate-General for Translation (DGT)

Dublin, 4 June 2014

European Commission machine translation

•  European Commission and languages
•  MT@EC: machine translation for EU users
•  What next?

EU official languages over time


EU translation services DGT


Why do we need machine translation?

•  The Commission…
•  DGT has 1700 translators
•  Over 2 M pages translated in 2013
•  But… just to make europa.eu fully multilingual: almost 6.8 M documents to be translated, or 8 500 translators/year!

The result: thousands of non-translated documents (and this does not include user-generated content)
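The scale argument on this slide can be checked with simple arithmetic. The sketch below uses only the figures quoted above. The derived ratios (pages per translator, documents per translator-year, staffing multiple) are illustrative and are not stated on the slide, and the slide mixes two units, pages and documents, so the two averages are not directly comparable.

```python
# Figures quoted on the slide.
TRANSLATORS = 1_700
PAGES_2013 = 2_000_000            # "over 2 M pages", so this is a lower bound
BACKLOG_DOCUMENTS = 6_800_000     # "almost 6.8 M documents"
TRANSLATOR_YEARS_NEEDED = 8_500   # "8 500 translators/year"


def per_head(total: float, people: int) -> float:
    """Average amount of work per person."""
    return total / people


pages_per_translator = per_head(PAGES_2013, TRANSLATORS)
docs_per_translator_year = per_head(BACKLOG_DOCUMENTS, TRANSLATOR_YEARS_NEEDED)
staffing_multiple = TRANSLATOR_YEARS_NEEDED / TRANSLATORS

print(f"{pages_per_translator:.0f} pages per translator in 2013")
print(f"{docs_per_translator_year:.0f} documents per translator-year implied")
print(f"{staffing_multiple:.0f}x the current translator headcount")
```

On these numbers, making europa.eu fully multilingual would take about five times the current translator headcount for a year.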

MT and EC: a long history

Started in the 1970s
•  Eurotra (78-92): research, high expectations
•  Rule-based ECMT (75-97), costly to develop, not scalable (18 language pairs in 20 years; coverage of post-2004 languages never attempted; system shut down in 2010)

Data-driven systems (Statistical MT):
•  cheap and quick to develop… if you have good data
•  EC needs solution for all EU languages… and has good data

EC action plan (2009), Inter-service task force (2010)
•  The goal: MT@EC offering machine translation for all languages to and from English, operational in July 2013

MT for understanding (inbound)

[Diagram: MT between L1 and L2, L3, …, Ln]

Robustness, Coverage

Practically unlimited demand; free web-based services cover much of it

Requirements for MT@EC
•  Provide MT as a (simple and robust) service
•  Optimise quality for understandability (gisting)
•  Deal with many domains, document types, formats, …
•  Scale to huge volumes

Two Usage Scenarios for MT@EC

MT for dissemination (outbound)

Textual quality

[Diagram: MT between L1 and L2, L3, …, Ln]

Publishable quality can only be authored by humans; Translation Memories & CAT-Tools used by professional translators

Requirements for MT@EC
•  Provide MT as a tool within a CAT workflow
•  Develop new ways to incorporate feedback
    - explicit feedback on MT quality, implicit feedback via TM
    - improvements requiring language-specific knowledge
    - towards hybrid approaches
•  Optimise quality for post-editing

Two Usage Scenarios for MT@EC

MT@EC: a European Commission product

•  Released: 26 June 2013 (version 1.0)
•  Languages: all 24 EU official languages, 552 language pairs (61 direct)
•  Technology: statistical machine translation using the open source software Moses, co-funded by EU Framework Programmes for research and innovation
•  Development by DGT: between 2010 and 2013, co-funded by the ISA* programme (action 2.8)

* Interoperability solutions for public administrations
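The figure of 552 language pairs follows from counting every ordered (source, target) combination of the 24 official languages. The sketch below shows that count, plus the number of pairs that involve English. The language-code list is supplied here for illustration. The slide does not say which 61 pairs are direct or how the remaining pairs are served, so the code makes no claim about that.

```python
from itertools import permutations

# ISO 639-1 codes of the 24 official EU languages (list added for illustration).
EU_LANGUAGES = [
    "bg", "cs", "da", "de", "el", "en", "es", "et", "fi", "fr", "ga", "hr",
    "hu", "it", "lt", "lv", "mt", "nl", "pl", "pt", "ro", "sk", "sl", "sv",
]


def directed_pairs(languages):
    """All ordered (source, target) pairs with source != target."""
    return list(permutations(languages, 2))


pairs = directed_pairs(EU_LANGUAGES)
# Pairs translating to or from English.
english_pairs = [(src, tgt) for src, tgt in pairs if "en" in (src, tgt)]

print(len(pairs))          # 24 * 23 = 552
print(len(english_pairs))  # 23 * 2 = 46
```

Since 46 pairs involve English and the slide reports 61 direct pairs, at least 15 of the direct pairs do not involve English.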


•  Delivery:
    - web user interface (human to machine)
    - web services (machine to machine)
•  Security: Host (EC data centre) + access (ECAS) + transfer (sTesta)
•  Special features:
    - Source document format/formatting maintained
    - Specific output formats for translation: tmx and xliff
    - Can translate multiple documents to multiple languages
    - Translation can also be returned by email
    - Indication of quality for language pairs
    - Feedback mechanism


MT@EC description
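The slide lists tmx and xliff as output formats for translators. For readers unfamiliar with TMX, the sketch below builds a minimal TMX 1.4 translation-memory file with Python's standard library. It is not MT@EC's actual output: the tool name, header values and sample segment are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(segments, src_lang, tgt_lang):
    """Return a TMX 1.4 document (as a string) for (source, target) pairs."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "example-sketch",   # hypothetical tool name
        "creationtoolversion": "0.1",       # hypothetical version
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in segments:
        tu = ET.SubElement(body, "tu")      # one translation unit per segment pair
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


doc = build_tmx([("Thank you", "Merci")], "en", "fr")
print(doc)
```

Each translation unit pairs a source segment with its translation, which is the structure a CAT tool imports alongside its existing translation memories.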

Quality evaluation and improvement…

•  “Maturity Check” (April-May 2011)
    - Can baseline MT engines already be used as such?
    - Identify main sources of problems for various languages, cluster them across languages
•  Real-life trial (July 2011-June 2013)
    - Make first MT results available to translators
    - Auto-MT for 10..19 “best” language pairs (now: all)
    - On-demand MT for others (now all languages get MT)
•  Automatic scores
    - BLEU scores for internal tuning and regression testing
    - Can help to identify domains/document types where MT is most useful, but also point to systematic difficulties

… with the help of DGT translators
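To make the "BLEU scores for internal tuning and regression testing" item concrete, here is a minimal sketch of corpus-level BLEU (one reference per segment, uniform weights up to 4-grams, no smoothing) with a simple regression check. It is an illustration only, not DGT's tooling: whitespace tokenisation is a simplification, and the `regressed` helper and its tolerance are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU: clipped n-gram precision times brevity penalty."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Counter intersection keeps the minimum count per n-gram (clipping).
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


def regressed(baseline_score, candidate_score, tolerance=0.01):
    """Hypothetical check: flag a new engine that scores clearly below the baseline."""
    return candidate_score < baseline_score - tolerance


reference = ["the cat sat on the mat"]
print(corpus_bleu(["the cat sat on the mat"], reference))
print(corpus_bleu(["the cat sat on a mat"], reference))
```

Scores like these are only comparable on the same test set and with the same tokenisation, which is why the slide pairs them with human evaluation by DGT translators.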

Maturity check 2011 (EN->X)

[Chart: share of useful vs. useless sentences (0% to 100%) per target language: ES, FR, IT, PT, RO, DE, DA, NL, SV, BG, CS, PL, SK, SL, EL, MT, LT, LV, ET, FI, HU. Language-group labels: Romance, Germanic, Slavic, Hellenic, Semitic, Baltic, Finno-Ugric. Morphology labels: analytic, inflected, highly inflected languages, composita, strong agglutination.]

DGT's SMT maturity check outcome as a (useful/useless) sentences ratio + morphology

Language differences

+ Aid for typing
+ time savings
+ “original” proposed solution
+ guides the terminological research

From the translator's point of view

- gender/numbers and order of words
- can be "fluent", but with mistranslations
- omissions and additions
- risk of error when incorrect terminology suggested
- quality dependent on the quality of the originals


§  … the staff of European institutions and bodies:

§  European Commission
§  European Parliament
§  Council of the European Union
§  European Court of Justice
§  Court of Auditors
§  Economic and Social Committee
§  Committee of the Regions
§  European Central Bank
§  European Investment Bank
§  Translation Centre
§  … and more

MT@EC is already available to…

→ DGT took into account the needs of translators and other staff when designing the service

A web interface for documents…


… and for short text


A private space for users…

… who can also opt for email


MT@EC is also integrated into EC digital services

→ operational

Service    Description/URL
IMI        Internal Market Information System
           http://ec.europa.eu/internal_market/imi-net/index_en.html
SOLVIT     SOLVIT is an on-line problem solving network concerning misapplication of Internal Market law by public authorities.
           http://ec.europa.eu/solvit/

→ DGT supports and advises for better integration on the customer side

Integration into EC digital services → under development (indicative list)

Service    Description/URL
nLex       A common gateway to National Law
           http://eur-lex.europa.eu/n-lex/
TED        TED (Tenders Electronic Daily) is the online version of the 'Supplement to the Official Journal of the European Union', dedicated to European public procurement.
           http://ted.europa.eu/
e-Justice  The future electronic one-stop-shop in the area of justice.
           http://e-justice.europa.eu/
Joinup     Joinup is an open collaborative platform supporting interoperability in Europe.
           https://joinup.ec.europa.eu/

Integration into EC digital services → initiated (indicative list)

Service    Description/URL
ODR        Platform to facilitate the resolution of consumer disputes out-of-court (Alternative Dispute Resolution)
           http://ec.europa.eu/consumers/redress_cons/adr_en.htm
EURES      The European Job Mobility portal networking the European employment services.
           https://ec.europa.eu/eures/
EQF        The portal supporting the implementation of the European Qualifications Framework for lifelong learning.
           http://ec.europa.eu/eqf/home_en.htm
ESCO       The multilingual classification of European Skills, Competences, Qualifications and Occupations; identifies and categorises skills and competences, qualifications and occupations in 22 European languages. Supports EURES and other similar portals.
           https://ec.europa.eu/esco/

MT@EC for public administrations


Free real-life trial in 2014:
•  Staff can have direct free access to the standard MT@EC service (upon request)
•  Organisations can participate in a "customisation" pilot project, where DGT builds specific engines with their data (based on bilateral cooperation agreements)

→ DGT to understand better their needs and constraints and develop appropriate service delivery models

Customisation pilots
•  Pilot A: Connect an information system to the standard MT@EC service.
•  Pilot B: DGT builds custom engines (their data) available to all through MT@EC
•  Pilot C: DGT builds custom engines (their data) available only to them through MT@EC
•  Pilot D: DGT builds custom engines (their data) for them to run on their premises
•  Pilot E: DGT assists them to build their own custom engines to run on their premises

MT@EC: right for the EU

Quality:
•  built on data derived from EU translations (Euramis translation memory system: 800 M segments in 24 languages and annual growth rate > 20%)
•  designed for EU relevant collaboration
•  team of computational linguists working with translators and linguists in DGT
•  work to improve MT for all EU languages

Security

Customer support


MT@EC: what next


•  CEF (Connecting Europe Facility)
    - A funding programme for building and deploying infrastructures.
    - Includes deploying mature technologies to build, enable and operate pan-European Digital Services.
    - Includes an Automated Translation (AT) platform as one of its core building blocks for digital services.
    - A key component of the AT platform is MT@EC.

The automated translation platform


•  To facilitate cross-border information exchange and enable cross-border access to online content and services provided by the digital service infrastructures of the CEF.

•  To offer MT services to EU institutions and public administrations in the Member States.

•  To build on the existing Commission Machine Translation service (MT@EC)

•  Emphasis is placed on secure, quality, customisable machine translation.

→ Follow this space: http://ec.europa.eu/digital-agenda/en/connecting-europe-facility

Thank you

contact: [email protected]

[email protected]