ATKS (Arabic Toolkit Service)

Ramzy Hassan (ramzyhassan44@gmail.com)

Upload: faculty-of-computing-and-information-system

Post on 15-Feb-2017

Page 1: Atks (Arabic Toolkit services)

Ramzy Hassan

ATKS: Arabic Toolkit Service

Outline

• Introduction
• ATKS Modules
• Link ATKS with C#
• Example
• Comparison

Introduction

• The Arabic Toolkit Service (ATKS) offers a set of APIs for basic processing of written Arabic.

• The Toolkit is designed to help the Arabic developer by providing high-quality Arabic NLP APIs.

• ATKS exposes its APIs as SOAP web services. They cover the basic language-processing operations through the components listed below.

ATKS Modules

Modules

• Colloquial Converter (works)
• Diacritizer (works)
• Named Entity Recognition (works)
• Parser (works)
• POS Tagging (works)
• Sarf (works)
• Speller (does not work)
• Transliterator (does not work)

Modules (Cont.)

• Colloquial Converter

• The Colloquial Converter translates Egyptian colloquial text into the equivalent Modern Standard Arabic text and returns rich mapping information along with it.

• Diacritizer

• The automatic Diacritizer component performs vowel restoration on input Arabic text.
• The main objective of the Diacritizer is to insert both the missing vowels (diacritics) of the stem and the missing vowel for the case ending.
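To make the task concrete: in Unicode, Arabic diacritics are combining marks attached to the base letters, and the Diacritizer's job is to put them back. The sketch below goes in the opposite direction and strips them, only to show what is being restored. It uses the Python standard library and is not ATKS code; the sample word is an illustrative choice.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop every combining mark (fatha, damma, kasra, sukun, shadda, ...)."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


# "kataba" (he wrote), fully vowelled: kaf+fatha, ta+fatha, ba+fatha
diacritized = "\u0643\u064E\u062A\u064E\u0628\u064E"

# The undiacritized form is what a user normally types and what the
# Diacritizer would receive as input.
bare = strip_diacritics(diacritized)
```

Restoring the marks is the hard direction, because the same bare string can correspond to several vowelled words and the right one depends on context.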

• Named Entity Recognition

• The Named Entity Recognizer (NER) detects and classifies named entities in Arabic text.
• It classifies them into three categories: persons, locations, and organizations.
• It also provides the character index at which each named entity is located in the original text.
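A minimal sketch of what such a result could look like, assuming a record with the entity text, its category, and its character index. The `NamedEntity` class and the `locate` helper are hypothetical names for illustration, not the ATKS response type, and the entities are located by hand rather than by a recognizer.

```python
from dataclasses import dataclass


@dataclass
class NamedEntity:
    """Hypothetical result record; not the actual ATKS type."""
    text: str
    category: str  # "person", "location", or "organization"
    start: int     # character index in the original text


def locate(sentence: str, surface: str, category: str) -> NamedEntity:
    """Build a record for a hand-picked entity by finding it in the sentence."""
    start = sentence.find(surface)
    if start == -1:
        raise ValueError(f"{surface!r} not found in sentence")
    return NamedEntity(surface, category, start)


sentence = "زار أحمد القاهرة"  # "Ahmed visited Cairo"
entities = [
    locate(sentence, "أحمد", "person"),
    locate(sentence, "القاهرة", "location"),
]

for e in entities:
    # The index lets the caller map the entity back to the original text.
    print(e.category, e.start, sentence[e.start:e.start + len(e.text)])
```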

• Parser

• The Parser determines the grammatical structure of Arabic sentences, such as which groups of words combine to form phrases and which words are the subject or the object of a verb.

• The Parser relies heavily on the Arabic POS Tagger to identify the correct part of speech for each token in an input Arabic sentence, and the Arabic Named-Entity Recognizer to identify named entities in the input sentence after it has been corrected using the Arabic Auto-Corrector.

• POS Tagging

• The Part of Speech (POS) Tagger is responsible for identifying the correct part of speech for each token of any given Arabic sentence.

• The POS Tagger relies heavily on the Morphological Analyzer to extract the relevant morpho-syntactic features for the input words.

• The POS Tagger also relies on the Auto-Corrector to correct input text.

• Sarf

• Sarf provides automatic morphological analysis of Arabic words.
• It provides all possible morphological analyses for any given input Arabic word.
• Each analysis consists of the diacritized word and the morphological breakdown of the analysis in terms of prefixes, stem, and suffixes. The stem is further decomposed into its root and morphological pattern.

• Moreover, each analysis carries the part of speech and a set of morpho-syntactic features such as gender, number, transitivity, verb voice, and verb mood.
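The fields listed above can be pictured as one record per analysis. This is a sketch under assumptions: the `MorphAnalysis` class and its field names are hypothetical, and the sample values are filled in by hand for one reading of one word, not returned by Sarf.

```python
from dataclasses import dataclass, field


@dataclass
class MorphAnalysis:
    """Hypothetical container mirroring the fields described in the slides."""
    diacritized: str
    prefixes: list
    stem: str
    suffixes: list
    root: str
    pattern: str
    pos: str
    features: dict = field(default_factory=dict)

    def segments(self) -> list:
        """Prefixes, stem, and suffixes in surface order."""
        return [*self.prefixes, self.stem, *self.suffixes]


# Hand-written analysis of "والكاتب" ("and the writer").
analysis = MorphAnalysis(
    diacritized="وَالْكَاتِب",
    prefixes=["و", "ال"],  # conjunction + definite article
    stem="كاتب",
    suffixes=[],
    root="كتب",
    pattern="فاعل",
    pos="noun",
    features={"gender": "masculine", "number": "singular"},
)

# A real analyzer would return a list of such records, one per possible reading.
print("+".join(analysis.segments()))
```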

• Speller

• The Speller detects and corrects misspelled words in Arabic text and is designed for Modern Standard Arabic.

• The Speller APIs also enable auto-correction of common Arabic mistakes, that is, frequent orthographic errors.

• The main objective of the Speller is to enhance the quality of written Arabic text, hence improving the accuracy of the various Arabic text-processing components.

• Transliterator

• Transliteration is the conversion of text from one script to another while preserving the same pronunciation.

• The Transliterator converts named entities, such as person and city names, from English to Arabic and vice versa. It also converts Romanized Arabic text to native Arabic script.

Link ATKS with C#

Steps

• Sign up on this website: http://research.microsoft.com/en-US/projects/atks/default.aspx
• If you already have a Microsoft account, you only need to sign in and then complete the registration details for using the ATKS tool.
• After registration, they send you a verification email to activate your account.
• After 1-2 days they review your application and send another email containing the AppID, which you will use in your application later.
• Then open Visual Studio and create a new C# Windows or console application.
• Go to Solution Explorer on the right side of the Visual Studio window, click on the References menu, and choose the second option (Add Service Reference).
• A pop-up window appears. Paste the link of the module you want, give it a name, and press OK.
• In your project code, include the namespace of the service so that you can use its public classes.
• After that, write your own code and debug it. Success: I got a result.
• Now try the other modules.
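Behind the Add Service Reference step, Visual Studio generates a proxy class that builds and sends SOAP messages for you. As a language-neutral look at what such a request contains, here is a sketch that only builds a SOAP 1.1 envelope in Python and sends nothing. The service namespace, the `Diacritize` operation, and the `appId` and `text` parameter names are all assumptions for illustration; the real names come from the WSDL of the module you registered for.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace; the real one is defined by the module's WSDL.
SERVICE_NS = "http://example.org/atks"


def build_envelope(operation: str, params: dict) -> str:
    """Return a SOAP 1.1 request envelope as a string (no network call)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(op, f"{{{SERVICE_NS}}}{name}").text = value
    return ET.tostring(envelope, encoding="unicode")


# Operation and parameter names are placeholders, as is the AppID.
request_xml = build_envelope(
    "Diacritize",
    {"appId": "YOUR-APP-ID", "text": "\u0643\u062A\u0628"},
)
print(request_xml)
```

In the C# project none of this is written by hand: the generated client class exposes each operation as a method, and the AppID received by email is passed as an argument.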

Example

Comparison


Aspect                | ATKS                                       | MADAMIRA                                      | NLTK
Simplicity            | Simple                                     | Hard (because of its standard dealing)        | Average
Programming languages | C#                                         | Java                                          | Python
Accessibility         | Web service only                           | Stand-alone version and client-server version | Downloadable Python module
Arabic support        | Guaranteed                                 | Guaranteed                                    | Limited
Adjustability         | Not available (you can only send feedback) | Not available                                 | Available (you can add your own grammar)

Contacts

• Facebook.com/ramzy.hassan35
• Twitter.com/Rezoo_N1

Questions?