MT@EC European Commission machine translation supporting e-government Spyridon Pilos Head of language applications Directorate-General for Translation MT@Work Brussels, 5.12.2014



European Commission machine translation and public administrations

• MT@EC: a service for the EU
• The context of the free trial
• Implementation
• What next?

EU official languages over time

EU translation services: DGT

Why does the Commission need MT?

• The Commission…
• DGT has 1 700 translators
• Over 2 M pages translated in 2013
• But… just to make europa.eu fully multilingual: almost 6.8 M documents to be translated, or 8 500 translators/year!

The result: thousands of non-translated documents (and this does not include user-generated content)
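As a rough reading of these figures, the arithmetic behind them can be sketched in Python. The variable names are illustrative, and the inputs are the rounded numbers quoted on the slide ("over 2 M", "almost 6.8 M"). Pages and documents are different units, so the two rates below should not be compared directly.

```python
# Rounded figures as quoted on the slide (assumed exact for this sketch).
dgt_translators = 1_700
pages_translated_2013 = 2_000_000
documents_to_translate = 6_800_000
translator_years_needed = 8_500

# Output per DGT translator in 2013, in pages.
pages_per_translator = pages_translated_2013 / dgt_translators

# Throughput implied by "6.8 M documents or 8 500 translators/year".
documents_per_translator_year = documents_to_translate / translator_years_needed

# How many years the whole current DGT workforce would be tied up.
years_of_full_dgt_capacity = translator_years_needed / dgt_translators

print(f"Pages per translator in 2013: {pages_per_translator:.0f}")
print(f"Documents per translator-year implied: {documents_per_translator_year:.0f}")
print(f"Years of full DGT capacity: {years_of_full_dgt_capacity:.0f}")
```

On these numbers, making europa.eu fully multilingual would absorb the entire current translation staff for about five years, which is the gap MT is meant to address.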

There are also interactions with and between actors in the Member States

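[Diagram: administrations, businesses and citizens in Member State X and Member State Y, together with EU administrations, linked by A2A (administration to administration), A2B (administration to business) and A2C (administration to citizen) interactions. The diagram distinguishes a first and a second type of interaction.]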

Vision

Wouldn’t it be great if I could start using a public service in any Member State from any place and obtain the information in my mother tongue?


• ISA = programme for interoperability solutions for public administrations
• EIF = European Interoperability Framework

EIF*: 12 underlying principles

Need for EC action
• Subsidiarity and Proportionality

User needs and expectations
• User Centricity, Inclusion and Accessibility, Security and Privacy, Multilingualism, Administrative Simplification, Transparency, Preservation of Information

Collaboration
• Openness, Reusability, Technological Neutrality and Adaptability, Effectiveness and Efficiency

* European Interoperability Framework: http://ec.europa.eu/isa/documents/isa_annex_ii_eif_en.pdf

The role of Machine Translation

MT is the only viable solution for:
• quick and cheap access to information in foreign languages
• understanding information received in a foreign language that otherwise could not be used or would require substantial time and costs to translate
• making multilingual use of websites possible
• facilitating cross-lingual information search and analytics

That is why machine translation (MT) is a critically important technology for multilingual Europe

MT@EC: a European Commission product

• Released: 26 June 2013 (version 1.0)
• Version 2.0 released on 3 July 2014
• Languages: all 24 EU official languages, 552 language pairs (62 direct)
• Technology: statistical machine translation using the open-source software Moses, co-funded by EU Framework Programmes for research and innovation
• Development by DGT: 2010-2013, co-funded by the ISA* programme (action 2.8)

* Interoperability solutions for public administrations: http://ec.europa.eu/isa/actions/02-interoperability-architecture/2-8action_en.htm
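The figure of 552 language pairs follows from counting every ordered (source, target) combination of the 24 official languages. A minimal Python sketch of that count is given below. The ISO 639-1 codes are an illustrative choice, and the slide does not say how the pairs that are not direct are served, so the sketch only counts them.

```python
from itertools import permutations

# ISO 639-1 codes of the 24 EU official languages (illustrative labels).
EU_LANGUAGES = [
    "bg", "cs", "da", "de", "el", "en", "es", "et", "fi", "fr", "ga", "hr",
    "hu", "it", "lt", "lv", "mt", "nl", "pl", "pt", "ro", "sk", "sl", "sv",
]

# A language pair is directional: en->fr and fr->en count separately.
language_pairs = list(permutations(EU_LANGUAGES, 2))

DIRECT_PAIRS = 62  # figure quoted on the slide
non_direct_pairs = len(language_pairs) - DIRECT_PAIRS

print(len(EU_LANGUAGES), "languages")
print(len(language_pairs), "language pairs")
print(non_direct_pairs, "pairs not listed as direct")
```

The count is n × (n − 1) for n languages, which is why each added official language increases the number of pairs sharply.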

MT@EC description

• Delivery:
  - web user interface (human to machine)
  - web services (machine to machine)
• Special features:
  - User interface in 24 languages
  - Source document format/formatting maintained [not for pdf]
  - Specific output formats for translation: tmx and xliff
  - Translation can also be returned by email
  - Can translate multiple documents to multiple languages
  - Indication of quality for language pairs (using BLEU scores)
  - Feedback mechanism (using EU Survey)
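For readers unfamiliar with BLEU, the idea behind the quality indication can be sketched as follows. This is a simplified, single-reference, sentence-level version written for illustration only. The slides do not describe how MT@EC computes its scores, real evaluations are normally run at corpus level with dedicated tools, and the function names here are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) multiplied by a brevity penalty. No smoothing, so any
    n-gram order without a match yields 0.0."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngram_counts(cand, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))
    # Penalise candidates that are shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the committee adopted the report on public procurement"
print(simple_bleu(reference, reference))
print(simple_bleu("the committee adopted the report on procurement", reference))
```

A score close to 1 means the output shares most of its word sequences with the reference translation. It says nothing directly about whether a given output is fit for a particular purpose.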

MT@EC security

• Secure hosting in the EC data centre
• Access through ECAS (EC Authentication Service)
• Secure document transfers:
  - over sTESTA*, a very secure private network between public administrations in the EU, separate from the internet
  - over the internet (through a secure https connection)

* You can check if your organisation has access to sTESTA on: https://portal.testa.eu/jetspeed/portal/homepage/about.psml.

MT@EC is already available for…

… the staff of European institutions and bodies: Commission, Parliament, Council, Court of Justice, Court of Auditors, Economic and Social Committee, Committee of the Regions, European Central Bank, European Investment Bank etc.
… online services funded or supported by the EU
… real-life trial and pilot projects with public administrations in the EU Member States
… collaboration projects with EMT* universities

* European Master's in Translation

Online services connected to MT@EC in production

Service | Description | URL
IMI | Internal Market Information System | http://ec.europa.eu/internal_market/imi-net/index_en.html
SOLVIT | An online problem-solving network concerning misapplication of Internal Market law by public authorities | http://ec.europa.eu/solvit/
nLex | A common gateway to National Law | http://eur-lex.europa.eu/n-lex/

Online services connected to MT@EC in test

Service | Description | URL
e-Justice | The future electronic one-stop-shop in the area of justice | http://e-justice.europa.eu/
ODR | Platform to facilitate the resolution of consumer disputes out-of-court (Alternative Dispute Resolution) | http://ec.europa.eu/consumers/redress_cons/adr_en.htm
CircaBC | Communication and Information Resource Centre for Administrations, Businesses and Citizens | https://circabc.europa.eu/
EU Survey | Tool for creating multilingual online surveys | http://ec.europa.eu/eusurvey/

Online services to be connected to MT@EC in preparation

Service | Description | URL
TED | TED (Tenders Electronic Daily) is the online version of the 'Supplement to the Official Journal of the European Union', dedicated to European public procurement | http://ted.europa.eu/
Joinup | An open collaborative platform supporting interoperability in Europe | https://joinup.ec.europa.eu/

Online services interested in using MT@EC: discussions initiated (indicative list)

Service | Description | URL
EURES | The European employment services network (European Job Mobility portal) | https://ec.europa.eu/eures/
EQF | The portal supporting the implementation of the European Qualifications Framework for lifelong learning | http://ec.europa.eu/eqf/home_en.htm
ESCO | The multilingual classification of European Skills, Competences, Qualifications and Occupations, which identifies and categorises skills and competences, qualifications and occupations in all 22 European languages and supports EURES and other similar portals | https://ec.europa.eu/esco/
EPALE | The European Portal for Adult Learning | http://ec.europa.eu/epale

MT@EC for Public Administrations

Context: MT@EC "Pilot operation" phase until Q4/2014 (ISA)

Objective: Develop and test in real-life conditions methods and structures for the most efficient use of MT@EC by different beneficiaries (including public administrations, PAs); normal operation of service.

Conditions
• PAs participate on a voluntary basis.
• No cost for PAs other than use of internal resources.
• No commitment by DGT on use of service after the end of the pilot.

Output
• Service delivery models (including pricing)
• Operational support structure and methods

MT@EC for Public Administrations

• Free real-life trial: staff members can have direct access to the standard MT@EC service [upon request by the individual PA staff member]
• Customisation pilot: the organisation can participate in a customisation pilot project, where DGT can also build specific engines with the organisation's own data [Administrative Agreement between PA and DGT needed, to be signed by end of June 2015]

Customisation pilots for PA

• Pilot A: Connect a PA information system to the standard MT@EC service.
• Pilot B: DGT builds custom engines with PA data, available through MT@EC to all
• Pilot C: DGT builds custom engines with PA data, available through MT@EC only to the PA
• Pilot D: DGT builds custom engines with PA data, for the PA to run in PA premises
• Pilot E: DGT assists the PA to build its own custom engines to run in PA premises

If you are interested, email [email protected]

Ongoing pilots

Country | Name of administration | Type | Pilot
Finland | Prime Minister's Office | Central translation service | C
Germany | Bundessprachenamt | Translation service of the Armed Forces | E
Greece | Hellenic Quality Assurance and Accreditation Agency for Higher Education | Education administration | A

Discussions were held with more PAs but did not lead to signature of agreements on pilots, usually because:
• there was no need for custom engines
• the necessary data were not enough or could not be shared
• resources could not be made available for the work to be performed on the PA side.

Special types of "pilots"
• Networks (Association des Conseils d’État et Cours administratives suprêmes de l'UE, Réseau des Présidents des Cours suprêmes judiciaires de l'UE, Legivoc project)
• New languages (Norwegian)

Staff access to MT@EC


• Get an individual ECAS user name and password (self-registration) using your work email address. [go to https://webgate.ec.europa.eu/cas/eim/external/register.cgi and follow the instructions]

• Send an email to [email protected] asking for the activation of access to the service.

• DGT will activate your access and inform you by email.

Users - total

Country | Registered | Using
Austria | 3 | 3
Belgium | 5 | 3
Bulgaria | 1 | 1
Croatia | 0 | 0
Cyprus* | 77 | 46
Czech Republic* | 25 | 15
Denmark | 0 | 0
Estonia | 3 | 3
Finland | 2 | 2
France* | 21 | 15
Germany* | 30 | 28
Greece* | 37 | 23
Hungary | 1 | 0
Ireland | 0 | 0
Italy | 2 | 1
Latvia | 0 | 0
Lithuania | 1 | 1
Luxembourg | 3 | 2
Malta | 0 | 0
Netherlands | 8 | 8
Poland | 0 | 0
Portugal* | 7 | 5
Romania | 9 | 7
Slovakia* | 86 | 39
Slovenia* | 13 | 7
Spain* | 9 | 7
Sweden* | 3 | 3
UK | 1 | 1
TOTAL | 347 | 220 (63.4%)

* Countries where national events were organised

Requests per user | Share of users
Only one | 32%
2 to 9 | 54%
10 or more | 14%
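The totals in the table can be re-derived from the per-country rows. The Python sketch below does this and also computes what share of registrations come from the starred countries, where national events were organised. The data structure and variable names are illustrative, and the second figure is a derived reading of the table, not a number given on the slide.

```python
# (registered, using, national_event_organised) per country, from the table.
USERS = {
    "Austria": (3, 3, False), "Belgium": (5, 3, False),
    "Bulgaria": (1, 1, False), "Croatia": (0, 0, False),
    "Cyprus": (77, 46, True), "Czech Republic": (25, 15, True),
    "Denmark": (0, 0, False), "Estonia": (3, 3, False),
    "Finland": (2, 2, False), "France": (21, 15, True),
    "Germany": (30, 28, True), "Greece": (37, 23, True),
    "Hungary": (1, 0, False), "Ireland": (0, 0, False),
    "Italy": (2, 1, False), "Latvia": (0, 0, False),
    "Lithuania": (1, 1, False), "Luxembourg": (3, 2, False),
    "Malta": (0, 0, False), "Netherlands": (8, 8, False),
    "Poland": (0, 0, False), "Portugal": (7, 5, True),
    "Romania": (9, 7, False), "Slovakia": (86, 39, True),
    "Slovenia": (13, 7, True), "Spain": (9, 7, True),
    "Sweden": (3, 3, True), "UK": (1, 1, False),
}

total_registered = sum(reg for reg, _, _ in USERS.values())
total_using = sum(use for _, use, _ in USERS.values())
usage_rate = 100 * total_using / total_registered

# Registrations from countries where a national event was organised.
event_registered = sum(reg for reg, _, event in USERS.values() if event)
event_share = 100 * event_registered / total_registered

print(f"Registered: {total_registered}, using: {total_using} ({usage_rate:.1f}%)")
print(f"Registered in countries with national events: {event_share:.1f}%")
```

The first line reproduces the totals on the slide. The second shows that most registrations came from countries where a national event took place.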

Top 40 users

Domain | Requests
Economy and finance | 674
Agriculture | 218
Foreign affairs | 92
European affairs | 61
Health | 61
Modernisation | 55
Education | 48
Local government | 48
Transport | 37
Telecom | 20
Statistical authority | 14
Employment | 12
Interior | 11
Justice | 11
Police | 11

Country | Requests
Germany | 633
Slovakia | 313
France | 156
Greece | 125
Cyprus | 75
Portugal | 22
Finland | 15
Spain | 14
Slovenia | 12
Bulgaria | 10
Czech Republic | 10
Lithuania | 10

Implementation

• Usually individuals ask for their own translations.
• In some cases a translation service centralises requests (for example through a functional mailbox).
• No guidelines on feedback or evaluation were imposed by DGT. Quality is "fit for purpose" (compliance with user requirements). A feedback function is available in MT@EC.
• Translation to/from non-EU languages is very important in several cases.
• Translators will not use MT unless it is integrated in their translation workflow so that they can post-edit easily.
• The original is sometimes hand-written or "confidential".

Feedback

• Different depending on whether it comes from translators or other users
• Little understanding of statistical MT technology and its constraints
• Several problems were pointed out:
  - document formats and formatting
  - national names and acronyms
  - non-translation of "common" words
  - omission of words
  - consistency
  - syntax, grammar etc.

Hint: Do not draw general conclusions from a test on only one document. Usefulness depends greatly on factors such as type of document, quality of original, domain and language pair.

Intermediate conclusions (1)

On the pilots
• In most cases the generic engines were sufficient.
• It is difficult to find data of sufficient quality and quantity for building engines, and ownership and confidentiality are an issue.
• Lack of clarity on the status of the service after the end of the pilot discouraged investment on the side of PAs.
• Translation services asked for guidelines for evaluation and structured feedback.
• Information to technicians should be provided in their own language.
• Need more clarity on the scope of "public administration".

Intermediate conclusions (2)

On the service
• Do not need too much security: sTESTA to internet https
• The interface should be multilingual
• A tool for translators and other users: different attitudes.
• Use depends on "fitness for purpose" and not on some general quality of language

On communication
• Difficult to find the right network to promote (used ISA, EUPAN, COTSOES, DGT Field Offices in MS etc.)
• Promotion in national events in the language of the country (even in videoconference) worked best.

MT@EC for EMT universities

• Free use for teaching or research.
• Mutually beneficial project-based cooperation.
• The teacher/researcher may ask for access to see what it looks like and check whether it is relevant for his/her work.
• If interested, s/he sends a short project description (title, duration, objectives, approach, expected volume of requests) and a list of further persons who need access.
• At the end of the project s/he informs DGT of the outcome of the project or study, as well as any other feedback considered useful to improve the service and its use.

Status: On 30.11.2014 we had 103 registered users, of which 75 are students, from 21 universities in 12 countries (11 EU MS and CH), of which 9 have communicated a research/teaching plan.

What next?

From MT@EC... to the CEF automated translation platform

CEF.AT will:
• build on the existing MT@EC service
• put emphasis on secure, quality, customisable MT

MT@EC outline
[Diagram labels: Users and Services; Customised interfaces; DISPATCHER managing MT requests; MT engines by language, subject…; MT data (language resources specific for each MT engine); Language resources built around Euramis; DATA; MODELLING; ENGINES HUB; USER FEEDBACK; DATA HUB; Generic MT and piloting customisation.]

CEF.AT platform outline
[Diagram labels: DSIs; DISPATCHER managing MT requests; MT engines by language, domain…; Engines factory; Language resources (multilingual corpora, monolingual corpora, NLP tools, other); From data to engines; Collect and clear.]
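The dispatcher in both outlines is described only as "managing MT requests" for engines organised by language and subject or domain. One way to picture that role is the Python sketch below, which routes a request to a domain-specific engine when one is registered and otherwise falls back to a generic engine for the language pair. Everything in it (class name, method names, the fallback rule, the stub engines) is a hypothetical illustration and not a description of the actual MT@EC or CEF.AT implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

# An engine is keyed by (source language, target language, domain);
# domain None stands for the generic engine of that language pair.
EngineKey = Tuple[str, str, Optional[str]]
Engine = Callable[[str], str]


@dataclass
class Dispatcher:
    """Hypothetical dispatcher: picks an engine for each MT request."""
    engines: Dict[EngineKey, Engine] = field(default_factory=dict)

    def register(self, source: str, target: str, engine: Engine,
                 domain: Optional[str] = None) -> None:
        self.engines[(source, target, domain)] = engine

    def route(self, source: str, target: str,
              domain: Optional[str] = None) -> EngineKey:
        # Prefer a custom engine for the domain, then the generic one.
        for key in ((source, target, domain), (source, target, None)):
            if key in self.engines:
                return key
        raise LookupError(f"no engine for {source}->{target}")

    def translate(self, text: str, source: str, target: str,
                  domain: Optional[str] = None) -> str:
        return self.engines[self.route(source, target, domain)](text)


# Stub engines that only tag the text, standing in for real MT engines.
dispatcher = Dispatcher()
dispatcher.register("en", "fr", lambda t: f"[generic en-fr] {t}")
dispatcher.register("en", "fr", lambda t: f"[legal en-fr] {t}", domain="legal")

print(dispatcher.translate("public procurement", "en", "fr", domain="legal"))
print(dispatcher.translate("public procurement", "en", "fr", domain="health"))
```

The fallback to a generic engine mirrors the observation from the pilots that the generic engines were sufficient in most cases, with custom engines as an addition where data and need exist.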

The service: SECURE (and performing), CUSTOMISABLE, QUALITY

Real-life trial and customisation pilots for Public Administrations

- There is still time for your organisation to participate in a pilot (sign the agreement by end of June 2015).
- Any staff member of a public administration can ask for access at any time.
- Access will be free of charge until further notice.
- Service delivery models (including pricing) will be developed only under the Connecting Europe Facility.
- Lessons learned from the pilots will be used for developing the operational support structure and methods for the CEF.

Useful links

• DGT MT page on europa.eu: http://ec.europa.eu/dgs/translation/translationresources/machine_translation/index_en.htm
• ISA page on action 2.8 Machine translation: http://ec.europa.eu/isa/actions/02-interoperability-architecture/2-8action_en.htm
  Includes:
  - The ISA Work programme 2010-2014 for MT@EC
  - Presentations for public administrations
  - and more…
• CEF work programme for 2014, where section 3.1.7 is on the CEF.AT platform: https://ec.europa.eu/digital-agenda/sites/digital-agenda/files/WP2014%20-%20official%20published.pdf
• Language technologies (CEF, H2020,…): http://ec.europa.eu/digital-agenda/language-technologies
• Language technology resources (DGT-TM, EuroVoc,…): http://ec.europa.eu/jrc/en/language-technologies