- - Nordterm 2007 - -
Bergen, Norway
Specialised translation and terminology
Koen Kerremans
Centrum voor Vaktaal en Communicatie, Erasmushogeschool Brussel
http://cvc.ehb.be
Part 1: “Terminography for translators: methodology”
Purpose
1. To show some steps terminographers go through in order to develop specialised dictionaries
2. To raise awareness concerning the specific problems that may arise during the compilation of such dictionaries
3. To present a method in terminology description, Termontography, which supports the development of ontologically-underpinned terminological dictionaries
Terminology → specialised dictionary
• Within the present scope:
Terminology / Terminography
• the study and the field of activity concerned with the collection, the description and the presentation of terms (Sager 1990:2). Terms are related to subject-field communication (e.g. technical writing, technical documentation).
• “the practical task of producing dictionaries of lexical items that are specific to specialised domains of knowledge” (Meyer 2001:279).
Specialised dictionary
• results from the process of creating, storing, processing, recording, reusing, etc. specialised information and knowledge
Preliminary remarks (1/3)
• Within the present scope:
User of the specialised dictionary?
• Translator
Requirements of this specific user
• Content of the dictionary?
• Format of the dictionary?
Preliminary remarks (2/3)
• Data gathering for lexical analysis may be based on:
introspection
elicitation of data
observation of non-elicited language use
= text-oriented approach
Preliminary remarks (3/3)
• Lots of texts are currently available in electronic formats
• It becomes possible to ‘process’ these texts using specific software tools
‘Terminotics’
Terminology → specialised dictionary
1. Corpus compilation
2. Term identification
3. Information extraction
4. Analysis and synthesis
5. Encoding
6. Organisation
7. Management
Specialised dictionary
1. Corpus compilation
• = searching and categorising texts considered relevant for terminological analysis
• Problem: representativeness
Scientific specialised discourse
Scientific official discourse
Scientific pedagogical or didactic discourse
Scientific semi-popularised discourse
Scientific popularised discourse
(e.g. Laurian 1983; Meyer and Mackintosh 1996; Pearson 1998)
At least 2 languages!
1. Corpus compilation
• Tools (examples):
Search engine
• “an information retrieval system designed to help find information stored on a computer system, such as on the World Wide Web” (http://en.wikipedia.org/wiki/Search_engines).
Web crawler
• “a program or automated script which browses the World Wide Web in a methodical, automated manner” (http://en.wikipedia.org/wiki/Web_crawler).
Text aligner
• a tool that organises “different language versions of a text in order to be able to identify equivalent terms, phrases, or expressions” (http://portal.bibliotekivest.no/terminology.htm).
2. Term identification
• = extracting terms from texts that have been gathered during the corpus compilation phase
• What is a term?
“A semantically charged linear structure, which names an abstract or concrete reality studied [in] a special-subject field” (Collet 2004:109).
A lexical unit that has a special meaning depending on the thematic context.
2. Term identification
TOPIC: early retirement
When the eligibility criteria for early retirement were tightened, early retirees began being granted the status of older unemployed. Standard unemployment benefits are higher for unemployed persons over the age of 50 who have been unemployed for a year but have spent 20 years in work. Until very recently, those in the “older unemployed” category were exempt from the ‘actively seeking work’ rule, which suggested that it was virtually impossible to find work again after the age of 50.
Since summer 2002, however, this exemption for the older unemployed is gradually being phased out. It is also the case that early retirement arrangements have become opaque and inequitable. The range of measures is now so wide that there has clearly been some duplication. They include early retirement on a half-time basis and career break measures, now replaced by the time-credit scheme.
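A rough sketch of how candidate terms might be pulled from a passage like the one above: count recurring two-word sequences that contain no function words. The stopword list, the bigram size and the frequency threshold are illustrative assumptions, not part of the method presented here, and a purely frequency-based pass of this kind is closer to keyword extraction than to terminological analysis.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption; a real project would use a fuller one).
STOPWORDS = {
    "the", "of", "for", "a", "an", "and", "is", "it", "that", "this", "were",
    "when", "have", "been", "being", "to", "in", "on", "by", "also", "since",
    "however", "out",
}

# Three sentences taken from the sample passage on early retirement.
SAMPLE = (
    "When the eligibility criteria for early retirement were tightened, "
    "early retirees began being granted the status of older unemployed. "
    "Since summer 2002, however, this exemption for the older unemployed "
    "is gradually being phased out. "
    "It is also the case that early retirement arrangements have become "
    "opaque and inequitable."
)

def candidate_terms(text, n=2, min_count=2):
    """Return (n-gram, count) pairs for n-grams without stopwords that recur."""
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    grams = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        if not set(gram) & STOPWORDS:
            grams[" ".join(gram)] += 1
    return [(g, c) for g, c in grams.most_common() if c >= min_count]

for term, count in candidate_terms(SAMPLE):
    print(count, term)
```

The sketch ignores sentence boundaries and knows nothing about the domain or the dictionary user, so its output is only a list of candidates for a terminographer to review.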
2. Term identification
• Automatic term extraction ≠ automatic keyword extraction!
Knowledge of the language
Knowledge of the world (the domain)
Knowledge of the (dictionary) user profile
2. Term identification
• Our solution in application-oriented terminology projects:
1. Set up a categorisation framework
2. Map terminology to the framework
2. Term identification
• A categorisation framework:
= an ontologically-underpinned framework of (meta)categories and (meta)relations which is used to extract and organise multilingual terminology
• Advantages:
Helps us to establish extraction criteria as to what terms in text are or should be (cf. ‘15th day of the month following that in which the chargeable event took place’)
Facilitates the process of aligning multilingual terminology
2. Term identification
hyperonym of
transactions not allowing the supplier to deduct VAT
transactions occurring outside the territory of the VAT legislation at stake
transactions occurring outside the scope of VAT
transactions allowing the supplier to deduct VAT
transactions for which no VAT is required
hyponym of
Dutch (Belgium):
vrijstelling
niet onderworpen aan BTW
…
French (Belgium):
exemption
…
English (UK):
exemption
zero-rated
outside the scope of VAT
…
English (Ireland):
exemption
zero-rated
…
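A minimal sketch of what "mapping terminology to a categorisation framework" could look like as a data structure. The category label and the terms are taken from the slide above; attaching all of them to one category, the locale codes and the function names are assumptions made for illustration only.

```python
from collections import defaultdict

# category -> locale -> list of terms
framework = defaultdict(lambda: defaultdict(list))

def map_term(category, locale, term):
    """Attach a term used in a given locale to a category of the framework."""
    terms = framework[category][locale]
    if term not in terms:
        terms.append(term)

def equivalents(term, source_locale, target_locale):
    """Return target-locale terms that share a category with the source term."""
    found = []
    for locales in framework.values():
        if term in locales.get(source_locale, []):
            found.extend(locales.get(target_locale, []))
    return found

# Assumption: every listed term is filed under this single category.
CATEGORY = "transactions for which no VAT is required"
for locale, term in [
    ("nl-BE", "vrijstelling"),
    ("nl-BE", "niet onderworpen aan BTW"),
    ("fr-BE", "exemption"),
    ("en-GB", "exemption"),
    ("en-GB", "zero-rated"),
    ("en-GB", "outside the scope of VAT"),
    ("en-IE", "exemption"),
    ("en-IE", "zero-rated"),
]:
    map_term(CATEGORY, locale, term)

print(equivalents("vrijstelling", "nl-BE", "en-GB"))
```

Because terms from different languages and legal systems hang on the same category, aligning them becomes a lookup rather than a word-by-word comparison. A finer-grained framework would separate the subcategories shown in the diagram.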
2. Term identification
• The idea of mapping terminology to a categorisation framework is adopted in the Termontography approach
2. Term identification
Knowledge analysis phase (1)
Information gathering phase (2)
Search phase (3)
TSR + categorisation framework
(mono- or multilingual) domain-specific corpus
Domain experts
Refinement phase (4)
first version of termontological database
(mono- or multilingual) termontological database
Verification phase (5)
Validation phase (6)
??
Dictionary
2. Term identification
• Termontography is a terminological approach in which one structures terminological information, retrieved from a corpus of texts, according to a framework of domain-specific knowledge.
3. Information extraction
• = adding ‘supplementary information’ to each term
• Dictionaries should be designed for special user groups in response to specific needs (cf. ‘Knowledge analysis phase’ in Termontography)
• What supplementary information do translators require?
Synonyms? Translation equivalents? Part-of-speech tags? Examples? Contexts? Collocations? Domain specifications? Definitions? (→ what type of definition?)
3. Information extraction
• Techniques for finding out user requirements include, among others:
Surveys
Experimental research & model building
3. Information extraction
• Surveys:
Asking people what they use dictionaries for and how
Not very reliable
3. Information extraction
• Experimental research:
Look-up behaviour of subjects
Error analysis
3. Information extraction
Model building (based on translation process)
(Agirre et al. 2001)
3. Information extraction
• Translators need insight into at least three different types of context:
linguistic context of a translation unit
cultural (situational) context
cognitive (ontological) context
• A translator with access to terminological knowledge resources that provide information on these different types of context is likely to produce high-quality translations
3. Information extraction
• On the whole, translation dictionaries and traditional multilingual terminological resources do not provide sufficient information for the translator
• Multilingual terminology management must widen its scope towards knowledge management and representation (Meyer 1992, Dancette 1997, Temmerman 2000, 2003, 2005):
providing a cognitive structure in order to improve the understanding of the specialised domain
providing extralinguistic / encyclopaedic information in order to improve the understanding of terms and categories in the specialised domain (of source and target language)
3. Information extraction
Dancette, J. & C. Réthoré (2000). Dictionnaire Analytique de la Distribution. Analytical Dictionary of Retailing. Les Presses de l’Université de Montréal
Users: translators translating texts on ‘retailing’ from English into French
Lay-out
3. Information extraction
• Aims:
to maximally stimulate the creativity of the translator by offering ontologically enriched information on the subject, in the French language (target language for the translator)
to optimise understanding by stimulating the semantic network in the brain of the translator
3. Information extraction
• The dictionary user is introduced to the meaning of the term in several textual modules formulated in French:
définition
précisions sémantiques
relations internotionnelles
compléments d’information
informations linguistiques
contextes
exemples
• Cross-references are marked typographically: entries that are covered in another article are printed in bold for French and in small capitals for English. Related terms are in bold for French and in italics for English.
3. Information extraction
Example: ‘label’
Définition: Document d’identification du produit qui lui est apposé ou y est attaché et qui en décrit les caractéristiques (nature, prix, provenance, marque, etc.).
3. Information extraction
Example: ‘label’
Précisions sémantiques: Depuis les années 1970, l’étiquette comprend généralement un code-barre (BAR CODE). Le code-barre contient des informations telles que la description et le prix du produit, qui seront lues à l’aide d’un lecteur optique (OPTICAL READER).
3. Information extraction
Example: ‘label’
Relations internotionnelles: Le terme anglais TAG désigne une étiquette que l’on peut facilement enlever, ce qui n’est pas le cas de label. Ne pas confondre l’anglais LABEL avec son homonyme label, qui a le sens de marque (BRAND), comme dans le terme PRIVATE LABEL (marque de distributeur).
3. Information extraction
Example: ‘label’
Compléments d’information: Les producteurs ont l’obligation, en vertu de la Loi sur la protection du consommateur (Consumer Protection Act), de répertorier sur l’étiquette tous les ingrédients contenus dans le produit alimentaire.
3. Information extraction
Example: ‘label’
Informations linguistiques:
étiqueter: to ticket
étiqueteuse: labeler, label machine
3. Information extraction
Example: ‘label’
Contextes:
But it wasn’t until 1900 that [he] put the first Polar label on a bottle of cool, naturally purified water taken directly from one of these springs on his property. http://www.water.com/polar/index.html (30-3-99)
Dans ce but, la réglementation mise au point par les organismes de la CEE et par l’administration française prévoit sur chaque étiquette la présence d’un certain nombre de mentions obligatoires, en fonction de la catégorie du vin. http://www.vin.champagne.com/etiq.htm (30-3-99)
3. Information extraction
• Challenge: how to arrive at specialised dictionaries offering ontologically-enriched information?
analysis of Knowledge Rich Contexts (Meyer 2001)
3. Information extraction
• ‘Knowledge Rich Contexts’ (Meyer 2001:281):
“a context indicating at least one item of domain knowledge that could be useful for conceptual analysis. In other words, the context should indicate at least one conceptual characteristic, whether it be an attribute or relation.”
can be used to derive synonyms and translation equivalents
3. Information extraction
KWIC concordancer
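A KWIC (keyword in context) concordancer lists every occurrence of a search word together with the text to its left and right, so that contexts can be scanned quickly. The following is a minimal sketch of the idea; the function name, the fixed character window and the bracket notation are assumptions for illustration and do not reproduce any particular tool.

```python
import re

def kwic(text, keyword, width=30):
    """Return one line per occurrence: left context, [keyword], right context."""
    lines = []
    pattern = re.compile(r"\b%s\b" % re.escape(keyword), re.IGNORECASE)
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

corpus = (
    "Compost contains nutrients, nitrogen, potassium and phosphorus. "
    "Compost is perhaps best defined as organic material assembled "
    "for fast decomposition."
)
for line in kwic(corpus, "compost"):
    print(line)
```

Sorting the resulting lines by the words to the right or left of the keyword is a common next step, since it groups recurring patterns together.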
3. Information extraction
• Certain contextual markers in KRCs may indicate specific conceptual relations.
Compost: a ready-to-use soil enricher that looks and feels like dark, crumbly soil.
Compost contains nutrients, nitrogen, potassium and phosphorus.
Compost is perhaps best defined as organic material assembled for fast decomposition.
Compost, a dark, nutrient-rich soil conditioner, consists of a small amount of soil along with decomposed or partially decomposed plant residues.
-> meronymy
-> purpose
-> hyperonymy
-> attribute
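The link between contextual markers and conceptual relations can be sketched with a few regular expressions. The pairing of each marker with a relation below is an assumption read from the compost examples, not a list given in the source, and real marker inventories are much larger and language-specific.

```python
import re

# Hypothetical marker inventory: (pattern, relation it may signal).
MARKERS = [
    (re.compile(r"\b(contains|consists of)\b", re.IGNORECASE), "meronymy"),
    (re.compile(r"\b(assembled|used|designed) for\b", re.IGNORECASE), "purpose"),
    (re.compile(r"\b(is|are)\b.*\bdefined as\b", re.IGNORECASE), "hyperonymy"),
    (re.compile(r"\blooks and feels like\b", re.IGNORECASE), "attribute"),
]

def relations_in(context):
    """Return the relations whose markers occur in the given context."""
    return [label for pattern, label in MARKERS if pattern.search(context)]

contexts = [
    "Compost: a ready-to-use soil enricher that looks and feels like dark, crumbly soil.",
    "Compost contains nutrients, nitrogen, potassium and phosphorus.",
    "Compost is perhaps best defined as organic material assembled for fast decomposition.",
]
for context in contexts:
    print(relations_in(context), "<-", context)
```

A context that matches at least one marker is a candidate Knowledge Rich Context; whether the relation really holds still has to be checked by the terminographer.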
3. Information extraction
• Synonyms and translation equivalents are identified based on a comparison between KRCs:
co-occurrence or substitution tests
feature analysis
3. Information extraction
FEATURE ANALYSIS
• Superstore
Assortment:
• Food: very wide assortment
• Non-food: very wide assortment (household products, clothing, kitchen utensils, gardening tools, etc.)
Area:
• 2300 to 4600 m²
• Hypermarché
Assortment:
• Food: very wide assortment
• Non-food: very wide assortment (household products, clothing, kitchen utensils, gardening tools + electronic appliances, furniture, etc.)
Area:
• Up to 24,000 m²
• Supermarché
Assortment:
• Food: very wide assortment
• Non-food: fairly wide assortment (household products, clothing, etc.)
Area:
• 400 to 2500 m²
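Feature analysis can be sketched as a comparison of feature values between candidate equivalents. The values below are transcribed from the slide; the feature names, the tuple encoding of the area (with None where no lower bound is given) and the helper functions are assumptions for illustration.

```python
# Feature values per term, as read from the slide.
FEATURES = {
    "superstore": {"food": "very wide", "non_food": "very wide", "area_m2": (2300, 4600)},
    "hypermarché": {"food": "very wide", "non_food": "very wide", "area_m2": (None, 24000)},
    "supermarché": {"food": "very wide", "non_food": "fairly wide", "area_m2": (400, 2500)},
}

def shared_features(a, b):
    """Names of the features on which two terms have identical values."""
    return sorted(k for k, v in FEATURES[a].items() if FEATURES[b].get(k) == v)

def differing_features(a, b):
    """Names of the features on which two terms differ."""
    return sorted(k for k, v in FEATURES[a].items() if FEATURES[b].get(k) != v)

for candidate in ("hypermarché", "supermarché"):
    print(candidate,
          "shared:", shared_features("superstore", candidate),
          "differing:", differing_features("superstore", candidate))
```

In this encoding neither French term matches "superstore" on every feature, which is the kind of partial equivalence a translator needs to be told about.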
3. Information extraction
Category: “an event on which VAT has to be paid”
Domain: VAT law
English-UK: chargeable event
VAT will be due on the date the invoice is issued
English-Ireland: chargeable event
VAT is due no later than the 15th day of the month following the month in which the supply takes place
French: fait générateur
VAT is due at the moment the goods are supplied
Summary
• Steps terminographers have to go through in order to develop specialised dictionaries for translators:
Requirements of translators (→ knowledge about the linguistic, situational and cognitive contexts)
• Problems discussed:
Representativeness of the corpus
Term identification (→ categorisation frameworks?)
Terminology structuring (variation)
Other problems?
4. Analysis and synthesis (definitions)
5. Encoding (précisions sémantiques vs. relations internotionnelles vs. compléments d’information)
6. Organisation (tree structure, hyperlinks, ‘traditional’ term records)
7. Management (dictionary up-to-date?)
Part 2: “Towards ‘intelligent’ dictionaries for translators”
Purpose
• Which information sources do translators use during the translation of a given text sample?
• How do we arrive at ‘intelligent’ dictionaries?
Possibilities?
Technology?
Translation sample
• Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area. The White Nile brought highly mineralized silt which would be eroded from Abyssinian Alps 1500 miles inland in Central Africa. The silt from the Blue Nile was heavy with humus from the jungle and swampy sources. Not only did the flood bring silt, the soil would be soft and easy to plow. They would plant and harvest in early spring and then allow the fields to lay until July when the floods would come again. Based on: http://historylink101.com/lessons/farm-city/egypt1.htm
Resources?
Some resources
Resources
Explanatory dictionaries
Translation dictionaries
Specialised dictionaries
Translation engines
Translation forums
Encyclopedia
Picture dictionaries
Combinatory dictionaries
Synonym dictionaries
…
Translation dictionaries
http://www.wordreference.com/
http://www.ectaco.com/English-Multilanguage-Dictionary/
http://www.allwords.com/
http://www1.cs.columbia.edu/~radev/dictionary/
http://www.foreignword.com/Tools/dictsrch.htm
http://www.langtolang.com/
http://users.otenet.gr/~vamvakos/multilingual.htm
http://www.tritrans.net/…
Translation dictionaries
http://www.freedict.com/onldict/onldict.php
http://www.tritrans.info/
Explanatory dictionaries
Ref.: http://dictionary.cambridge.org/define.asp?key=51745&dict=CALD
Contexts
"While the computer linked these reduced northern hemisphere temperatures to Laki, it also connected the dots to a weak monsoon – the seasonal winds that bring the annual rains to southern Asia and northern Africa. The unusual cold in the North lessened the temperature contrast between the land and the oceans, upon which the monsoon winds rely for their development and strength. "
http://www.physorg.com/news83338494.html
Contexts
"It may be that only with the additional water provided as a result of the intensifying monsoon that the upstream Nile was able to erode its way through the Nubian Swell and continue north to the Mediterranean Sea. "
http://www.utdallas.edu/geosciences/remsens/Nile/geology.html
Encyclopedia
http://en.wikipedia.org/wiki/Monsoon
http://no.wikipedia.org/wiki/Monsun
Example:
Translation forums
http://www.oasisllc.com/transtrad/forum.htm
http://www.translatorsbase.com/Forum/Forums/
http://www.english-spanish-translator.org/translation-issues/
http://disc.server.com/Indices/6657.html
http://members3.boardhost.com/translate2/
http://www.all-translations.com/forum/index.php
http://www.foreignword.com/Forum/default.asp
http://tech.groups.yahoo.com/group/sptranslators/
…
Translation engines
• Sample: Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.
http://www.freetranslation.com/
Specialised dictionaries
• Sample: Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.
GEOGRAPHY
HYDROLOGY
…
Specialised dictionaries
• Monsoon: A regional scale wind system that predictably change direction with the passing of the seasons. Monsoon winds blow from land to sea in the winter, and from sea to land in the summer. Summer monsoons are often accompanied with precipitation.
• Flood: Inundation of a land surface that is not normally submerged by water from quick change in the level of a water body like a lake, stream, or ocean.
• Silt: Mineral particle with a size between 0.004 and 0.06 millimeters in diameter. Also see clay and sand.
http://www.physicalgeography.net/physgeoglos/
Conclusion first part
• Translation is a complex process. Translators need to consider both intra- and extratextual factors
• There are a lot of resources (freely available) on the Internet that translators can use for their own translation projects.
• Disadvantage: these resources need to be consulted one-by-one
• Consequence: there is a need for a more ‘intelligent dictionary’
The intelligent dictionary
… requirements:
It should be possible to combine results from existing electronic resources and to present the relevant information to translators
The dictionary should be context-sensitive. Translation segments should be automatically linked to information in the available knowledge resources.
• (the dictionary may be able to suggest a translation, based on ‘intelligent reasoning’)
The intelligent dictionary
• Each year the "Gift of the Nile" would be a flood brought on by the monsoon.
3. Information extraction
Example: IATE (Inter Active Terminology for Europe)
The Semantic Web
• “[…] an extension of the current web in which information is given well-defined meaning”
(Tim Berners-Lee, James Hendler, Ora Lassila (2001). “The Semantic Web”. Scientific American)
• “[…] provides a common framework that allows [smart] data to be shared and reused […]”
(W3C, http://www.w3.org/2001/sw)
Opportunities of a “semantic web”
• For instance:
Improvement of information retrieval: e.g. ‘general interest in MT’ as a query will no longer lead to websites of MT companies that present their products
Software agents will be able to detect the pieces of information they need to make hotel bookings via the Semantic Web
Question-answer machines will be able to formulate better answers on the basis of the user’s question (cf. http://www.answerbus.com)
…
An example…
The content of webpages?
XML:
-…a beautiful hotel in <European country>France</European country>…
-…each room has a <minibar>AMS</minibar>, a <television>Phillips</television>,…
-…for one night, you pay <price> € 50</price>.
XML:
-…a beautiful hotel in <country>France</country>…
-…each room has a <minibar>AMS</minibar>, a <TV>Phillips</TV>,…
-…for one night, you pay <cost> € 50</cost>.
Resource Description Framework (RDF):
A standard framework for describing the meaning of such tags. With RDF-based descriptions it can be made explicit that in the examples above the following tags carry the same meaning:
“European country” and “country”
“minibar” and “minibar”
“television” and “TV”
“price” and “cost”
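The effect of declaring tags equivalent can be sketched without any RDF tooling: a plain lookup table stands in for the shared vocabulary, and two differently tagged snippets reduce to the same set of facts. The table, the regular expression and the function name are assumptions for illustration; this is not RDF syntax, and real deployments would express such equivalences in an RDF-based vocabulary.

```python
import re

# Hypothetical equivalence table: tag name -> shared concept label.
SAME_MEANING = {
    "European country": "country", "country": "country",
    "minibar": "minibar",
    "television": "television", "TV": "television",
    "price": "price", "cost": "price",
}

# Matches <tag>value</tag>; the slide's tag names may contain spaces,
# so a regular expression is used instead of an XML parser.
TAGGED = re.compile(r"<([^<>/]+)>(.*?)</\1>")

def concepts(snippet):
    """Map every tagged value in the snippet to its shared concept label."""
    return {SAME_MEANING.get(tag, tag): value.strip()
            for tag, value in TAGGED.findall(snippet)}

site_a = ("a beautiful hotel in <European country>France</European country>, "
          "for one night, you pay <price> € 50</price>.")
site_b = ("a beautiful hotel in <country>France</country>, "
          "for one night, you pay <cost> € 50</cost>.")

print(concepts(site_a))
print(concepts(site_b))
```

Once both pages resolve to the same concept labels, a software agent can compare or combine them without knowing which tag names each site chose.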
The Semantic Web
• Dream or reality?
2000: the development took off in the US with the DARPA Agent Markup Language (DAML)
2001: In Europe, researchers set up the ‘Ontoweb thematic network’ in order to federate the research activities
Research activities for building the Semantic Web were central to the ‘knowledge technologies’ area of the EU 6th framework programme
In fact, research activities can be found world-wide
An example…
• http://www.inreallife.be/Articles/BELbruxEuropeenne.php
Context-sensitivity
Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.
Financiële instelling (‘financial institution’)
Reserve (‘reserve’)
Oever (‘river edge’)
Rij (‘row’)
Context
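One simple way to make a look-up context-sensitive is to score each candidate translation by how many of its typical cue words occur around the word being translated. The four Dutch candidates come from the slide; the cue words, the scoring and the function name are assumptions for illustration only.

```python
import re

# Candidate Dutch translations of "bank", each with hypothetical cue words.
SENSES = {
    "financiële instelling": {"money", "account", "loan", "deposit"},
    "reserve": {"blood", "data", "stock", "supply"},
    "oever": {"river", "flood", "floods", "silt", "water", "nile"},
    "rij": {"row", "switches", "seats", "lights"},
}

def pick_sense(context):
    """Return the candidate whose cue words overlap most with the context."""
    words = set(re.findall(r"[a-z]+", context.lower()))
    scores = {sense: len(cues & words) for sense, cues in SENSES.items()}
    # On a tie, max() keeps the first candidate in SENSES.
    return max(scores, key=scores.get)

segment = ('Each year the "Gift of the Nile" would be a flood brought on by '
           "the monsoon. These floods brought only a thin layer of silt, "
           "dropped on the banks.")
print(pick_sense(segment))
```

Hand-written cue lists do not scale; the ontology-based approach on the following slides replaces them with relations between concepts.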
Schematically visualised
[Diagram: ‘word_A’ in the text is linked to concept_A, which is linked to ‘word_A’ in Resource A, ‘word_B’ in Resource B and ‘word_C’ in Resource C]
Problem: Is the right concept activated?
Technology?
• Ontology:
A formal knowledge repository of concepts and relations with the possibility to derive new facts from given knowledge
Example (diagram): Nile, river, bank
is instance of / has instance
has property / is property of
Technology?
Example (diagram): Nile, river, bank
is instance of / has instance
has property / is property of
Nijl (Nl), Nile (En)
rivier (Nl), river (En), rivière (Fr)
oever (Nl), bank (En)
Egypt was in the best position to develop a great civilization. Each year the "Gift of the Nile" would be a flood brought on by the monsoon. These floods brought only a thin layer of silt, dropped on the banks, from both a jungle area and also a mountainous area.
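The ontology example can be sketched as a handful of triples with multilingual labels. Deriving the inverse of each stated relation is a very small instance of "deriving new facts from given knowledge". The direction of the arrows, the two-step search and all function names are assumptions; the concepts, relations and labels are those shown on the slide.

```python
# Stated facts (assumed direction: Nile is an instance of river; river has property bank).
FACTS = {
    ("Nile", "is instance of", "river"),
    ("river", "has property", "bank"),
}
INVERSE = {
    "is instance of": "has instance",
    "has instance": "is instance of",
    "has property": "is property of",
    "is property of": "has property",
}
LABELS = {
    "Nile": {"Nl": "Nijl", "En": "Nile"},
    "river": {"Nl": "rivier", "En": "river", "Fr": "rivière"},
    "bank": {"Nl": "oever", "En": "bank"},
}

def derive(facts):
    """Add the inverse of every stated fact."""
    return set(facts) | {(o, INVERSE[p], s) for s, p, o in facts}

def related(concept, facts, hops=2):
    """Concepts reachable from `concept` in at most `hops` relation steps."""
    seen, frontier = set(), {concept}
    for _ in range(hops):
        frontier = {o for s, p, o in facts if s in frontier} - seen
        seen |= frontier
    return seen - {concept}

def translate(word, source, target, context):
    """Translate `word` only if its concept is related to a concept in the context."""
    facts = derive(FACTS)
    for concept, labels in LABELS.items():
        if labels.get(source) == word and related(concept, facts) & set(context):
            return labels.get(target)
    return None

print(translate("bank", "En", "Nl", ["Nile"]))
```

Because "bank" is reached from "Nile" through "river", the river-edge reading is selected when the Nile is mentioned in the segment, and no translation is proposed when the context contains no related concept.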
Conclusion
• Thanks to the new technology, dictionaries will become much more intelligent:
Context-sensitive search possibilities
Flexibility (customisation according to user profiles)
Dynamic (management in time)
Interactive (self-learning)
• Dictionaries could become intelligent translation engines (that consider the context in which a translation segment occurs)
To finish…
An ontology-based application: http://www.20q.net/index.html