the named entity recognition (ner)2

Upload: arabicnlpimamu2013

Post on 26-May-2015

TRANSCRIPT

Page 1: The named entity recognition (ner)2

The Named Entity Recognition (NER)

• Al-Shehri, Aisha
• Almutairi, Shaikhah
• Alswelim, Haya

KINGDOM OF SAUDI ARABIA
Ministry of Higher Education
Al-Imam Muhammad Ibn Saud Islamic University
College of Computer and Information Sciences

Page 2: The named entity recognition (ner)2

Abstract

Named Entity Recognition is an important part of many natural language processing tasks.

There are different types of named entities, such as people, locations, and organizations.

Page 3: The named entity recognition (ner)2

Introduction

• Named Entity Recognition is the identification and classification of named entities within open-domain text.

• The task of named entity recognition was defined as three subtasks:

• ENAMEX: proper names (persons, organizations, and locations).
• TIMEX: temporal expressions.
• NUMEX: numeric expressions.
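The three subtasks can be illustrated with inline markup. The following is a minimal sketch, assuming MUC-style SGML tags and a small, hypothetical set of entity types per subtask; the function names `subtask_for` and `annotate` are illustrative and do not come from the slides.

```python
# Assumed mapping from each subtask tag to the entity types it covers.
SUBTASKS = {
    "ENAMEX": {"PERSON", "ORGANIZATION", "LOCATION"},
    "TIMEX": {"DATE", "TIME"},
    "NUMEX": {"MONEY", "PERCENT"},
}


def subtask_for(entity_type):
    """Return the subtask tag (ENAMEX, TIMEX, or NUMEX) for an entity type."""
    for tag, types in SUBTASKS.items():
        if entity_type in types:
            return tag
    raise ValueError(f"unknown entity type: {entity_type}")


def annotate(text, spans):
    """Wrap each (start, end, entity_type) span in a MUC-style tag.

    Spans are character offsets and are assumed not to overlap.
    """
    pieces = []
    pos = 0
    for start, end, entity_type in sorted(spans):
        tag = subtask_for(entity_type)
        pieces.append(text[pos:start])
        pieces.append(f'<{tag} TYPE="{entity_type}">{text[start:end]}</{tag}>')
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces)
```

Calling `annotate` with one span of each kind shows how a single sentence can carry ENAMEX, TIMEX, and NUMEX annotations side by side.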

Page 4: The named entity recognition (ner)2

• We present an attempt at recognizing and extracting the most important proper-name entity, the person name, for the Arabic language (PERA).

Components of an Arabic Full Name

An Arabic full name is divided into five main categories, Ibn Auda (2003):

1. An ism (pronounced IZM).
2. A kunya (pronounced COON-yah).
3. A nasab (pronounced NAH-sahb).
4. A laqab (pronounced LAH-kahb).
5. A nisba (pronounced NISS-bah).
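The five categories can be pictured as fields of one record. The sketch below is only an illustration on transliterated names and rests on simplifying assumptions that are not in the slides: a kunya is introduced by "Abu" or "Umm", a nasab by "ibn", "bin", or "bint", a nisba is an "al-" word ending in "i", and any other "al-" word is treated as a laqab. Real names are far less regular, and `ArabicName` and `parse_name` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArabicName:
    ism: Optional[str] = None      # given name
    kunya: Optional[str] = None    # "father/mother of ..." honorific
    nasab: List[str] = field(default_factory=list)  # patronymic chain
    laqab: Optional[str] = None    # epithet
    nisba: Optional[str] = None    # attribution (origin, tribe, profession)


KUNYA_MARKERS = {"abu", "umm"}          # assumed markers
NASAB_MARKERS = {"ibn", "bin", "bint"}  # assumed markers


def parse_name(full_name):
    """Assign the tokens of a transliterated name to the five categories."""
    tokens = full_name.split()
    name = ArabicName()
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        low = tok.lower()
        has_next = i + 1 < len(tokens)
        if low in KUNYA_MARKERS and has_next:
            name.kunya = f"{tok} {tokens[i + 1]}"
            i += 2
        elif low in NASAB_MARKERS and has_next:
            name.nasab.append(f"{tok} {tokens[i + 1]}")
            i += 2
        elif low.startswith("al-"):
            if low.endswith("i"):
                name.nisba = tok
            else:
                name.laqab = tok
            i += 1
        else:
            # The first unmarked token is taken as the ism; later ones are ignored.
            if name.ism is None:
                name.ism = tok
            i += 1
    return name
```

For instance, `parse_name("Abu Karim Muhammad al-Jamil ibn Nidal al-Filistini")` fills all five fields under these assumptions.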

Page 5: The named entity recognition (ner)2

Methodology

1- Parallel corpora.
   a- Reliability
   b- Representativeness
2- Previously developed tools for other languages.
   a- Person names
   b- Location names (geographical locations and toponyms)
   c- Organizations (political or administrative entities)
   d- Position (job titles)
   e- Acronyms

Page 6: The named entity recognition (ner)2

Challenges

• 1- Unlike many other languages, Arabic has no capital letters or other specific orthographic signal that marks proper names.

• 2- The same Arabic word can have different meanings.

• 3- Ambiguity.

Page 7: The named entity recognition (ner)2

Ambiguous example

Ambiguous example (English translation)    Incorrect    Correct
15th of Ramadan Al Karim 2005              Person       Date
Saudi Aramco                               Location     Company

Page 8: The named entity recognition (ner)2

Features

Machine-learning features:

• Word-Length

• Noun-Flag

• Speech-Tag

• Type-Current

• Type-Left

• Type-Right
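The slide gives only the feature names, so the sketch below assumes their meaning from those names: Word-Length is the token length, Noun-Flag and Speech-Tag come from a part-of-speech tagger, and Type-Current, Type-Left, and Type-Right are the entity types that a rule-based component assigned to the current token and its neighbours. The `Token` class, the "N" prefix for noun tags, and the "NONE" padding value are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Token:
    text: str
    pos_tag: str    # part-of-speech tag from an external tagger (assumed)
    rule_type: str  # entity type from the rule-based component, or "NONE"


def extract_features(tokens: List[Token], i: int) -> Dict[str, Union[int, bool, str]]:
    """Build the six-feature vector for the token at position i."""
    tok = tokens[i]
    left = tokens[i - 1].rule_type if i > 0 else "NONE"
    right = tokens[i + 1].rule_type if i + 1 < len(tokens) else "NONE"
    return {
        "word_length": len(tok.text),
        # Assumes a tagset in which every noun tag starts with "N".
        "noun_flag": tok.pos_tag.startswith("N"),
        "speech_tag": tok.pos_tag,
        "type_current": tok.rule_type,
        "type_left": left,
        "type_right": right,
    }
```

Each dictionary can then be passed to a classifier after one-hot encoding the string-valued features.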

Page 9: The named entity recognition (ner)2

System Architecture and Implementation

• Architecture of the NERA System:

Page 10: The named entity recognition (ner)2

System Architecture and Implementation

• Gazetteers.

• Grammar.

• Filter.

Page 11: The named entity recognition (ner)2

System Architecture and Implementation

1) Gazetteers:

The gazetteer contains lists of known named entities.

White list: The white list plays the role of fixed, static dictionaries of various NEs.
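A white list can be sketched as a static dictionary that is matched against the token stream. The example below is an assumption about how such a lookup might work, not the NERA implementation: entries are keyed by token tuples, and the longest entry starting at each position wins. The entries themselves are illustrative English placeholders.

```python
# Hypothetical white list: token tuple -> entity type.
WHITE_LIST = {
    ("Saudi", "Aramco"): "ORGANIZATION",
    ("Saudi", "Arabia"): "LOCATION",
    ("Asia",): "LOCATION",
}
MAX_ENTRY_LEN = max(len(entry) for entry in WHITE_LIST)


def gazetteer_lookup(tokens):
    """Return (start, end, entity_type) for every white-list hit.

    At each position the longest matching entry is preferred, and
    matching resumes after the end of that entry.
    """
    matches = []
    i = 0
    while i < len(tokens):
        for n in range(min(MAX_ENTRY_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in WHITE_LIST:
                matches.append((i, i + n, WHITE_LIST[key]))
                i += n
                break
        else:
            # No entry starts at this token; move on.
            i += 1
    return matches
```

Because the dictionary is fixed, the lookup only recognizes names it already knows, which is why it is combined with grammar rules.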

Page 12: The named entity recognition (ner)2

System Architecture and Implementation

2) Grammar: The grammar performs recognition and extraction of Arabic named entities from the input text based on derived rules.

The following are examples of indicators used within rules:

• Job title: الدكتورة (the doctor), أستاذ العلوم (professor of the sciences).

• Person title: السيد (Mr.), السيدة (Mrs.).
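An indicator-based rule can be approximated with a regular expression. This is a minimal sketch under strong assumptions: a title indicator followed by exactly one Arabic word is taken as a person name, whereas the derived rules of an actual grammar would be richer. The names `TITLE_INDICATORS`, `PERSON_RULE`, and `apply_person_rule` are hypothetical.

```python
import re

# Indicator words taken from the examples above. The longer title is listed
# first so that "السيدة" is not cut short by the alternative "السيد".
TITLE_INDICATORS = ["السيدة", "السيد", "الدكتورة"]

# One run of basic Arabic letters (U+0621 to U+064A).
ARABIC_WORD = r"[\u0621-\u064A]+"

# The title must start a token, and is followed by whitespace and one word.
PERSON_RULE = re.compile(
    r"(?<!\S)(?P<title>" + "|".join(TITLE_INDICATORS) + r")\s+(?P<name>" + ARABIC_WORD + r")"
)


def apply_person_rule(text):
    """Return (title, name, "PERSON") for each title followed by a word."""
    return [
        (m.group("title"), m.group("name"), "PERSON")
        for m in PERSON_RULE.finditer(text)
    ]
```

A single-word capture keeps the rule conservative; extending it to multi-word names would require the name-component knowledge described earlier.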

Page 13: The named entity recognition (ner)2

System Architecture and Implementation

3) Filter: Filter rules help in dealing with recognition ambiguity between named entities.

A filtration mechanism is used that serves two different purposes: revision of the NE extractor results, and disambiguation of matches returned by different NE extractors.
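The two purposes of the filter can be sketched as two small functions. The slides do not say how either step is carried out, so both strategies here are assumptions: revision drops matches whose text appears in a hypothetical reject list, and disambiguation keeps the longest of any overlapping matches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    etype: str   # entity type proposed by an extractor
    text: str    # matched surface string


def revise(matches, reject_list):
    """Revision step: drop matches whose text is on the reject list."""
    return [m for m in matches if m.text not in reject_list]


def disambiguate(matches):
    """Disambiguation step: among overlapping matches keep the longest.

    Ties are broken by input order, an arbitrary choice for this sketch.
    """
    chosen = []
    by_length = sorted(matches, key=lambda m: (-(m.end - m.start), m.start))
    for m in by_length:
        if all(m.end <= c.start or m.start >= c.end for c in chosen):
            chosen.append(m)
    return sorted(chosen, key=lambda m: m.start)
```

Under these assumptions, a "Location" match on "Saudi" loses to the longer "Saudi Aramco" match from the organization extractor.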

Page 14: The named entity recognition (ner)2

Example: Typographic variation

Typographic variation                     Entity type    English translation
Two dots removed from taa marbouta        Location       Saudi Arabia
Madda dropped from the aleph              Location       Asia
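One common way to cope with these two variations is to normalize both spellings to the same form before matching. The sketch below is an assumption about how that could be done and is not described in the slides: taa marbouta is mapped to haa (the same letter shape without the two dots), and aleph with madda is mapped to bare aleph.

```python
import unicodedata

TAA_MARBOUTA = "\u0629"  # ة
HAA = "\u0647"           # ه (taa marbouta written without its two dots)
ALEF_MADDA = "\u0622"    # آ
ALEF = "\u0627"          # ا

_VARIANT_TABLE = str.maketrans({TAA_MARBOUTA: HAA, ALEF_MADDA: ALEF})


def normalize_variants(word):
    """Map both typographic variants of a word to one canonical spelling."""
    # NFC composes "aleph + combining madda" into the single code point
    # U+0622, so both encodings of the letter are handled by the table.
    composed = unicodedata.normalize("NFC", word)
    return composed.translate(_VARIANT_TABLE)
```

Applying `normalize_variants` to gazetteer entries and to input tokens alike lets either spelling of "Saudi Arabia" or "Asia" match the same entry.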

Page 15: The named entity recognition (ner)2

The Experiment

Page 16: The named entity recognition (ner)2

Results

Page 17: The named entity recognition (ner)2

Conclusion

• 1- In the majority of cases we tried to follow more general criteria, applicable to both English-Arabic and French-Arabic transliteration.

• 2- This work is part of a new system for Arabic NER, with several activities still ongoing.

Page 18: The named entity recognition (ner)2

References

• Sherief Abdallah, Khaled Shaalan, and Muhammad Shoaib, Integrating Rule-Based System with Classification for Arabic Named Entity Recognition, 2012.

• Yassine Benajiba, Mona Diab, and Paolo Rosso, Using Language Independent and Language Specific Features to Enhance Arabic Named Entity Recognition, 2009.

• Yassine Benajiba, Mona Diab, and Paolo Rosso, Arabic Named Entity Recognition: An SVM-Based Approach, 2009.

• Doaa Samy, Antonio Moreno, and José Mª Guirao, A Proposal for an Arabic Named Entity Tagger Leveraging a Parallel Corpus, 2005.

• Khaled Shaalan and Hafsa Raza, Person Name Entity Recognition for Arabic, 2009.