NLP Related Activities in Thailand
Virach Sornlertlamvanich
Information Research and Development Division
National Electronics and Computer Technology Center
Thailand
27 August 2002, AFNLP/COLING2002, Taipei, Taiwan
SNLP-O-COCOSDA 2002, Hua-Hin, Thailand
May 9-11, 2002
✔ About 100 participants
✔ Invited talks:
• 'Information Retrieval and Modeling of Tonal Features of Speech', Prof. H. Fujisaki, U. of Tokyo
• 'Speech Synthesis for Tonal Languages', Prof. Fangxin Chen, IBM China Lab.
• 'Natural Language Understanding and Action Control', Prof. Takenobu Tokunaga, TIT
• 'Cross-Language Projection of Linguistic Knowledge', Prof. David Yarowsky, Johns Hopkins U.
✔ Presentations:
• 57 oral presentations
- 28 regular papers
- 17 short papers
- 8 COCOSDA papers
- 4 student papers
✔ Submission countries:
• 23 from Thailand
• 14 from Japan
• 5 from China
• 3 each from Korea and India
• 2 from Taiwan
• 1 each from Malaysia, Indonesia and Guam
• 4 student papers from Thailand
✔ Types of papers:
• 11 papers in IR/IE
• 7 papers in pattern recognition
• 6 papers in NLP application
• 4 papers in language resources
• 3 papers in morphology
• 2 papers in syntax
• 13 papers in speech processing
• 8 papers in O-COCOSDA
• 4 papers in student session
✔ SNLP
• 1993 Chulalongkorn U. and NECTEC
• 1995 Kasetsart U. and NECTEC
• 1997 AIT and NECTEC
• 2000 King Mongkut's U. of Technology (Thonburi) and NECTEC
• 2002 Sirindhorn International Institute of Technology and NECTEC
Survey on Research and Development of Machine Translation in Asian Countries
Merlin Beach Resort, Phuket, Thailand
May 13-14, 2002
Participation
Participants: 11 countries and 1 region
Countries: India, Indonesia, Japan, Korea, Lao PDR, Malaysia, Myanmar, Philippines, Singapore, Thailand, Vietnam
Region: Hong Kong
Total participants: 50 (overseas 24, local 26)
- R&D group (39)
- Supporting, policy and planning group (11)
Objectives:
1) To update technology status of machine translation in Asia.
2) To exchange research and development experience in the field.
3) To establish collaboration for developing cross-language web navigation in Asia.
4) To establish activities for technology transfer from experienced countries to inexperienced countries.
5) To develop human resources in the field.
Activities
- Keynote speech: 'AAMT Activities and General Trends of MT' by Dr. Hitoshi Iida
- 17 papers by participants
* MT Research Techniques: spoken language translation, semantic annotation, research on particular cases
* MT Research Status in the Countries
* Digitalization Research and Infrastructure Status
* Problems in R&D
- Roundtable Discussion
* Collaboration within Asia
* Standardization for languages within this region
* Financial support problem
* Possibility of joining the existing working bodies
MT status in Asia
Basic components:
- Standardization (Character code and locale): Lao PDR, Myanmar, Philippines, Vietnam
Working on Regional MT:
- Philippines and India (official lang. - dialects)
- Malaysia, Indonesia and Brunei (share resources for Malay-English MT development)
Experience on English and mother language MT:
- Hong Kong, Indonesia, Japan, Korea, Malaysia, Singapore, Thailand
Expected Collaboration in Asia
Establish Asian chapters for Intra-Asian Collaboration.
Construct Help-desk operation for standardization.
Establish a Working group or Liaison Secretariat for Language resources.
Establish a Working group or Liaison Secretariat for MT.
Make standardization among Asian languages
Share language resources
Exchange information for research & application
What should we do?
Establish Asian chapters for Intra-Asian Collaboration
in ISO; Language resource, code, document
Install BBS and mailing group
Provide tutorial programs and application projects
Promote the contribution from each country
What should we do?
Construct Help-desk operation for standardization
Standardized documents/activities (ISO, MPEG-7)
Code standardization (Unicode, etc.)
Establish a Working group or Liaison Secretariat for Language resources.
What should we do?
Coding description
Basic descriptors and mechanisms for language resources
Representation schemes
Multilingual text representation
Lexical databases
Workflow of Language Resource Management
What should we do?
Establish a Working group or Liaison Secretariat for MT
Machine Translation (may be under AAMT)
Language Processing (General/ Infrastructure)
Verification for standardization
Copyright
Operation personnel/ fund cooperation
Three Levels of Collaboration
Standardization: Font, Character code, I/O method, Print, Locale
Language Resources and Processing Tools: Dictionary, Corpus, NLP generic tools
Cross-Language R&D and Services: Machine Translation, Search engine, Information Retrieval/Extraction
Resource Sharing
Cross-Language Technology
Open Source Software
Cross Language Technology
[Figure: diagram of per-language Language Processing modules (Chinese, Japanese, French, Korean, Myanmar, Vietnam, Indonesia, Thai, ...), MT components, and e-Content Dictionaries; a later build of the slide adds English Language Processing and its e-Content Dictionary.]
Application over the Cross Language Tech.
Cross Language Technology
Visualization, Presentation, Extraction, Retrieval, Summarization, MT, Mining
E-services
[Figure: diagram of per-language Language Processing modules (English, Chinese, Japanese, French, Korean, Myanmar, Vietnam, Indonesia, Thai, ...), MT components, and e-Content Dictionaries forming the Cross Language Technology layer.]