
DSpace Institution

DSpace Repository http://dspace.org

Computer Science thesis

2021-03-17

CONTEXT-BASED SEMANTIC ROLE LABELER FOR AMHARIC SENTENCES USING DEEP LEARNING

ALEMU, BELAY

http://ir.bdu.edu.et/handle/123456789/12394

Downloaded from DSpace Repository, DSpace Institution's institutional repository


BAHIR DAR UNIVERSITY

BAHIR DAR INSTITUTE OF TECHNOLOGY

SCHOOL OF RESEARCH AND POSTGRADUATE STUDIES

FACULTY OF COMPUTING

CONTEXT-BASED SEMANTIC ROLE LABELER FOR AMHARIC SENTENCES USING DEEP LEARNING

MSC THESIS

BY

ALEMU BELAY TESSEMA

March 17, 2021

BAHIR DAR, ETHIOPIA


CONTEXT-BASED SEMANTIC ROLE LABELER FOR AMHARIC SENTENCES USING DEEP LEARNING

BY

ALEMU BELAY TESSEMA

A thesis submitted to the School of Graduate Studies of Bahir Dar Institute of Technology, BDU, in partial fulfillment of the requirements for the degree of Masters in Information Technology in the Faculty of Computing

Advisor Name: Tesfa Tegegne (PhD)

March 17, 2021

Bahir Dar, Ethiopia


© 2021

ALEMU BELAY TESSEMA

All Rights Reserved


ACKNOWLEDGEMENT

First and foremost, I would like to thank God for making everything complete in its time and for giving me the strength to complete this thesis.

Secondly, I would like to express my deepest gratitude to my advisor Tesfa Tegegne (PhD) for his generous and fruitful advice, his willingness to help me at all times, his timely and critical evaluation of my thesis, and his constructive comments at each step of the work.

Last but not least, I would like to thank my family, Mr. Misgan Belay and Mekdes Belay, and my friend Bedru Yimam for their encouragement, inspiration and day-to-day support in undertaking this endeavor from beginning to end.

Alemu Belay


LIST OF ABBREVIATIONS

AI Artificial Intelligence

Bi-LSTM Bidirectional Long Short-Term Memory

CRF Conditional Random Field

IB1 Instance Based 1

IE Information Extraction

IR Information Retrieval

LOO-CV Leave One Out Cross Validation

LSTM Long Short-Term Memory

MBL Memory Based Learning

ME Maximum Entropy

MLP Multilayer Perceptron

MT Machine Translation

NLP Natural Language Processing

NMT Neural Machine Translation

POS tagging Part of Speech tagging

QA Question Answering

ReLU Rectified Linear Unit

RNN Recurrent Neural Network

SMO Sequential Minimal Optimization algorithm

SMT Social Media Text

SRL Semantic Role Labeling

SVM Support Vector Machine

TTS Text to Speech


Table of Contents

ACKNOWLEDGEMENT
LIST OF ABBREVIATIONS
ABSTRACT
CHAPTER ONE: INTRODUCTION
1.1. Background
1.2. Statement of the Problem
1.3. Objectives of the Study
1.3.1. General Objective
1.3.2. Specific Objectives
1.4. Scope and Limitation
1.5. Significance of the Study
1.6. Methodology
1.6.1. Problem Identification and Motivation
1.6.2. Objectives for a Solution
1.6.3. Design and Development
1.6.4. Demonstration
1.6.5. Communication
1.6.6. Evaluation Metrics
1.7. Design and Development Tools
1.7.1. Python
1.7.2. TensorFlow
1.8. Organization of the Thesis
CHAPTER TWO: LITERATURE REVIEW
2.1. Natural Language Processing
2.2. Semantic Role
2.2.1. Common List of Semantic Roles
2.2.2. Semantic Role Labeling Challenges
2.3. Lexical Resources for SRL
2.3.1. PropBank
2.3.2. FrameNet
2.4. Classification Methods for SRL
2.4.1. Maximum Entropy
2.4.2. Support Vector Machines
2.4.3. Conditional Random Field (CRF)
2.5. Deep Learning
2.5.1. Recurrent Neural Network (RNN)
2.5.2. Long Short-Term Memory (LSTM)
2.5.3. Multi-Layer Perceptron (MLP)
2.6. Amharic Language
2.6.1. Amharic Sentences
CHAPTER THREE: RELATED WORKS
3.1. Introduction
3.2. Semantic Role Labeling for English and European Languages
3.3. Semantic Role Labeling for Chinese Language
3.4. Semantic Role Labeling for Arabic Language
3.5. Semantic Role Labeling for Amharic Language
CHAPTER FOUR: SYSTEM DESIGN AND IMPLEMENTATION
4.1. Introduction
4.2. Proposed System Architecture
4.2.1. Preprocessing Phase
4.2.2. Training Phase
4.2.3. Testing/Semantic Role Labeling Phase
4.3. The Proposed Network Model
CHAPTER FIVE: RESULTS AND DISCUSSION
5.1. Introduction
5.1.1. Dataset Collection
5.1.2. Dataset Preparation
5.2. Experiment/Implementation
5.2.1. Hyperparameter Tuning
5.2.2. Proposed Model Training and Validation Accuracy
5.2.3. Proposed Model Training and Validation Loss
5.3. Experiments
5.3.1. Evaluation
5.4. Discussion of Results
CHAPTER SIX: CONCLUSION AND RECOMMENDATION
6.1. Conclusion
6.2. Contribution of the Study
6.3. Recommendation
REFERENCES
APPENDIXES


LIST OF TABLES

Table 2.1 List of arguments in PropBank and their description
Table 2.2 List of annotated adjuncts in PropBank with their explanations
Table 4.1 Sample output of the preprocessing phase
Table 4.2 Sample normalized verbs in the dataset
Table 4.3 Sample Amharic sentence tagged by the online Amharic tagger module
Table 4.4 The forward LSTM and backward LSTM result for the sentence "ሳሙኤል ምሳሩን በሞረድ ሳለ"
Table 4.5 A work of the MLP scoring function on the sentence "ዳንኤል ቢላዋ ሳለ"
Table 4.6 Sample labelled Amharic sentence generated by the model
Table 5.1 Collected sample Amharic sentences with their domains
Table 5.2 Amount of data collected from different social media platforms
Table 5.3 Predicate-argument relation
Table 5.4 Training, testing and validation dataset description
Table 5.5 List of hyperparameters used in the model with their description
Table 5.6 Comparison of role labeler performance with and without context of multi-sense predicates
Table 5.7 Average precision, recall and F-score result for testing the model
Table 5.8 Individual role label performance on predicate-sense-based annotated data
Table 5.9 Testing result of the semantic role labeler model


LIST OF FIGURES

Figure 4.1 Proposed system architecture
Figure 4.2 Concatenation of argument and POS tag embedding
Figure 4.3 Visualization of the MLP classifier model used in semantic role labeling
Figure 4.4 Proposed network model
Figure 5.1 Proposed model training and validation accuracy rate
Figure 5.2 Proposed model training and validation loss rate
Figure 5.3 Role labeling performance with predicate-sense-based annotated data
Figure 5.4 Sample semantic role labels assigned by the model with the testing data


ABSTRACT

Currently, scholars are conducting research on different natural language processing tasks, such as machine translation, question answering, information extraction and text summarization, for languages like Arabic, English and Chinese. Nowadays, SRL has become a hot research issue and one of the main focus areas, since it is a crucial sentence-level semantic task that specifies the role of each argument in a given text and serves as an input for other NLP tasks. Unfortunately, previous researchers have not focused on the semantic relationships between the constituents and the predicate of Amharic sentences. To fill this gap, we have developed a context-based semantic role labeler model for the Amharic language using a deep learning approach called Bidirectional Long Short-Term Memory networks, considering the different senses of predicates in simple Amharic sentences during annotation of the dataset. The data were collected from different social media platforms and student Amharic textbooks and annotated semantically based on the PropBank annotation guidelines, with the help of linguistic experts from Wollo University. From these datasets, we identified 40 predicates that have more than one contextual meaning; each of them was annotated depending on its sense, and different role labels were assigned to the multi-sense predicate data for training and testing the model. MLP classifiers were used to classify each argument into its associated role label based on the score predicted by a biaffine attentional scorer. The proposed model achieved 95.9% training accuracy and 84.9% testing accuracy.


CHAPTER ONE: INTRODUCTION

1.1. Background

Natural language processing (NLP) is a branch of artificial intelligence (AI) that deals with the interaction between computers and humans using natural language and helps computers to understand, interpret and manipulate human language [1].

NLP is important for machines to communicate with humans in their own language. Using NLP to create a seamless and interactive interface between humans and machines will continue to be a top priority for today's and tomorrow's increasingly cognitive applications [3]. When creating an interactive interface between humans and machines and resolving natural language ambiguity, semantic analysis and the identification of argument roles are necessary to capture the meaning of the contents of a text [3, 4].

Semantic analysis is the process of understanding the meaning of words in natural language [4]; it starts by reading all of the words in the content to capture the real meaning of the text. It identifies the text elements and assigns them to their logical and grammatical roles.

NLP tasks such as text classification and categorization, textual entailment, semantic parsing, question answering (QA), speech and character recognition, and machine translation (MT) require semantic role labeling to derive meaning from human languages using different deep neural network approaches [1, 2], since lower-level tasks such as parsing pay very little attention to the semantics of the texts in a sentence [4].

Based on the above issue, different scholars across the world have been motivated to develop Semantic Role Labeling (SRL) models for different languages that pay greater attention to the semantic relations between the major constituents and the predicate of a sentence.

Semantic role labeling is an important technique for performing semantic role analysis in these NLP applications. Natural language processing applications such as question answering [5], textual entailment [6], machine translation [6], text summarization [7] and information extraction [8] have used semantic role labeling techniques as an intermediate task to govern the semantic relationship between predicates and their constituents, and have achieved good performance results.

In NLP, semantic role labeling, also called shallow semantic parsing [9], is the process of assigning labels to words or phrases in a sentence that indicate their semantic role in the sentence. It is a crucial sentence-level semantic task that specifies who did what to whom, and how, when and where, in a given sentence [1, 9]. For example, the following Amharic sentence can be labeled as አበበ [ARG0-AGT] ቦርሳውን [ARG1-PAT] ለአልማዝ [ARG2-BEN] ሰጠ [REL].
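To make the labeling concrete, the output above can be viewed as a sequence of (token, role) pairs. The following minimal Python sketch is our own illustration of that form, not code from this thesis:

# Illustrative only: the labeled example sentence as (token, role) pairs,
# one possible form in which an SRL model's output can be stored.
labeled_sentence = [
    ("አበበ", "ARG0-AGT"),    # agent: the giver
    ("ቦርሳውን", "ARG1-PAT"),  # patient: the thing given
    ("ለአልማዝ", "ARG2-BEN"),  # beneficiary: the recipient
    ("ሰጠ", "REL"),           # the predicate (verb)
]
for token, role in labeled_sentence:
    print(token, role)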

When assigning role labels to the arguments in a sentence, considering the sense of the predicate in different domains is important for assigning the role labels properly, because different predicates have different senses and can take a unique set of semantic roles for their arguments [15]. This affects the performance of role assignment in semantic role labeling tasks because the semantic roles relate to the predicate sense. Therefore, the sense of predicates has to be considered, and polysemous verbs should be disambiguated so that they take an appropriate set of semantic roles for their arguments depending on the predicate sense [24].

Let us consider the sense of the predicate "አለፈ" in the following two simple Amharic sentences: (1) የሞላ አባት ትናንት ህይወቱ አለፈ and (2) የወንጀለኞቹ የፍርድ ቤት ቀጠሮ ትናንት አለፈ.

In these two simple Amharic sentences, the predicate "አለፈ" has two senses, each of which takes different semantic roles. In the first sentence, the predicate "አለፈ" has the sense of "ሞተ፣ አረፈ፣ ከዚህ አለም ተለየ" (died). Therefore, the verb/predicate takes argument roles such as who died, when they died, and the cause of death. In the second sentence, however, the meaning of the predicate "አለፈ" differs from the first and it takes different argument roles, i.e., the thing that passes and the time it passes. Therefore, a different set of roles must be considered based on the sense of the predicate.


For this reason, we were motivated to develop a context-based semantic role labeler model for simple Amharic sentences using Bidirectional Long Short-Term Memory (Bi-LSTM) networks. The focus of this study is identifying polysemous verbs in our collected dataset, assigning semantically annotated role labels to the arguments associated with each predicate according to its contextual meaning, mixing the multi-sense predicate dataset with the single-sense predicate dataset during training, and showing the role that predicate (verb) sense disambiguation, a sub-component of word sense disambiguation, plays in the semantic role labeling task on simple Amharic sentences.

1.2. Statement of the Problem

Semantic role labeling is an important step towards meaning understanding in natural language processing; it identifies and classifies the arguments associated with the predicates of a sentence [5]. Several works in the area of semantic role labeling consider the sense of predicates during labeling, use different deep neural networks for different languages, and achieve better performance results than other state-of-the-art models [15]. However, it is impossible to use these proposed methods for the Amharic language directly, because Amharic is by nature a highly inflectional, morphologically rich language and uses a different way of semantic role extraction [17].

Considering the above problem, [18] developed a semantic role labeler model for simple Amharic sentences using a conventional machine learning approach called Memory Based Learning. They developed a general architecture for the semantic role labeler, implemented feature extraction algorithms to extract 551 instances from 240 simple Amharic sentences, and obtained an F1-score of 82.51% with the default parameters of IB1 (i.e., number of nearest neighbors, distance metric, class voting weight and feature weighting metric) and 89.29% with optimized parameters. Different predicates have different senses in different Amharic sentences. For example, in the two sentences "አማረ ትልቅ የስጦታ ስዕል ሳለ" and "አማረ ጉንፋን ስለያዘው ክፉኛ ሳለ", the predicate "ሳለ" has different senses and takes different role arguments.

However, the work of E. Yirga and her colleagues [18] did not consider the sense of the predicate in simple Amharic sentence structures taken from different domains. For example, the following simple Amharic sentences contain the same predicate "ሳለ" with different contextual meanings, each of which takes different semantic roles to represent the associated arguments found in each sentence.

So, based on their context, the sentences can be labelled as follows:

አማረ [ARG0-AGT] ትልቅ ስዕል [ARG1-BEN] ሳለ [REL] and
አማረ [ARG0-AGT] ጉንፋን ስለያዘው [ARGM-CAU] ክፉኛ [ARGM-MNR] ሳለ [REL]

The above two sentences are taken from two different domains, i.e., entertainment and sport respectively. For this reason, in the labeled sentences above, if we ask the question "who draws?" (ማን ሳለ?), the argument label [ARG0-AGT] provides the answer; if we ask "what was drawn?" (ምን ተሳለ?), the argument label [ARG1-BEN] provides the answer; whereas if we ask "why did Amare cough?" (አማረ ለምን ሳለ?), the argument label [ARGM-CAU] provides the answer. Therefore, in these sentences the predicate "ሳለ" has two different senses, one in each domain, and the arguments in each sentence take different role labels. However, the work of E. Yirga and her colleagues did not consider such predicate cases in different domains; rather, they considered only a single-sense predicate dataset.

E. Yirga [18] uses a Memory Based Learning technique. In Memory Based Learning, the learning component is memory-based: all training examples are stored in memory, and concepts are classified based on their similarity to previously seen concepts in memory [20].

Memory Based Learning is a shallow machine learning technique, so the model must be trained with manually extracted, task-specific features [22]. This handcrafted feature extraction consumes more time and causes feature imbalance among the concepts [21].


The study in [18] also uses the k-Nearest Neighbor (k-NN) algorithm, a supervised machine learning algorithm used for both classification and regression problems [36]. However, k-NN is a lazy learner (i.e., as the size of the data increases it does not learn anything from the training data; it simply uses the training data itself for classification) and it operates only on numeric data [36].

In this study, we have developed a context-based semantic role labeling model for simple Amharic sentence structures, considering the different senses of predicates during dataset preparation, using a deep learning technique called Bidirectional Long Short-Term Memory networks. This solves the problem of the Memory Based Learning technique stated above by extracting task-specific features automatically. In addition, we have used a deep learning classifier, a multilayer perceptron classifier, to overcome the limitations of the k-Nearest Neighbor (k-NN) algorithm; it works on top of a biaffine attentional scoring function and can classify any type of data independent of the size of the dataset.
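As a rough illustration of how a biaffine attentional scorer can feed such a classifier, the following NumPy sketch scores one candidate argument against the predicate for each role label. All names, shapes and the random parameters are illustrative assumptions, not the thesis implementation:

import numpy as np

# Toy biaffine attentional scorer: given the Bi-LSTM state of a candidate
# argument (h_arg) and of the predicate (h_pred), produce one score per
# role label; a softmax/MLP layer can then pick the highest-scoring label.
rng = np.random.default_rng(0)
d, n_roles = 8, 5                      # hidden size, number of role labels

U = rng.normal(size=(n_roles, d, d))   # bilinear term: one d x d map per role
W = rng.normal(size=(n_roles, 2 * d))  # linear term over the concatenation
b = rng.normal(size=n_roles)           # per-role bias

def biaffine_scores(h_arg, h_pred):
    bilinear = np.einsum("i,rij,j->r", h_arg, U, h_pred)
    linear = W @ np.concatenate([h_arg, h_pred])
    return bilinear + linear + b

h_arg, h_pred = rng.normal(size=d), rng.normal(size=d)
scores = biaffine_scores(h_arg, h_pred)
print("predicted role index:", int(np.argmax(scores)))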

Bidirectional Long Short-Term Memory networks are a special kind of Recurrent Neural Network with the capability of learning long-term dependencies [12]; they contain a cell state that helps avoid vanishing/exploding gradient problems [21] during backpropagation.

To this end, this study attempts to answer the following research questions.

RQ 1: To what extent does a deep Bi-LSTM classifier improve semantic role labeling performance on Amharic sentences?

RQ 2: How does predicate sense disambiguation affect semantic role labeling on simple Amharic sentences?

1.3. Objectives of the Study

1.3.1. General Objective

The general objective of this thesis work is to design a context-based semantic role labeler for simple Amharic sentences, considering different senses of predicates, using deep learning.


1.3.2. Specific Objectives

In order to achieve the general objective, we have set the following specific objectives:

➢ To review literature in the area of SRL and verb sense disambiguation.
➢ To identify the importance of predicate sense consideration for the semantic role labeling task on simple Amharic sentences.
➢ To select an appropriate deep learning classifier for our model.
➢ To develop an architecture for the semantic role labeling system.
➢ To prepare a semantically annotated Amharic corpus for training and testing the system.
➢ To design a semantic role labeler network model.
➢ To evaluate the performance of the system using different performance evaluation metrics.

1.4. Scope and Limitation

In this study, we developed a context-based semantic role labeling model for simple Amharic sentences, considering the different senses of predicates. The model does not work for languages other than Amharic and considers only simple Amharic sentences. This work covers identifying the predicate and its associated arguments in a given simple Amharic sentence and assigning to each argument a label that indicates its role with respect to the predicate. Our model uses a normalized-predicate, POS-tagged dataset to assign the associated role labels to each token.

1.5. Significance of the Study

This research work benefits other high-level NLP tasks, because understanding the semantic content and the roles of arguments in a given text is an intermediate task in developing such natural language processing applications. In particular, the result of this study can serve as an input for researchers working on machine translation, information extraction and text summarization.


Text Summarization: Text summarization is a technique for automatically producing short and important summaries from a huge document based on the user's needs. Text summarization uses the SRL result as an input for estimating and identifying in which semantic relations the entities participate, and for estimating sentence similarity by knowing which entities participating in which semantic relations are contained in two sentences [24]. Therefore, predicates and heads of roles are important for summarizing the contents of each document.

Information Extraction: Information extraction refers to automatically extracting structured information from unstructured machine-readable documents [8]. Information extraction applies SRL for generalization purposes in template systems. Therefore, semantic role labeling techniques are used to construct useful rules for information extraction.

Machine Translation: Machine translation is the process of translating text from a source language into a target language. Before performing translation, it is important to understand the grammatical structure (word order) of the language; SRL is therefore important for identifying the structure of the sentence in different languages. For example, to translate from (1) Abebe [AGENT/S] kicked [REL/V] the ball [ARG1/O] to (2) አበበ [AGENT/S] ኳሷን [ARG1/O] መታት [REL/V], understanding the sentence formation and word order of each language is very important.

1.6. Methodology

This is an empirical research work in which we followed the design science research methodology. Design science research follows a basic deductive logic of discovery: an unsolved problem is taken up, and justificatory knowledge, or a kernel theory, that helps in solving the problem is sought. The design science research methodology process generally follows six steps [19]. These steps and their descriptions are given below.

1.6.1. Problem identification and motivation

In this step, we define the research problem that allows us to develop an effective model

that can provide a solution. We also define and justify the significance portion of the

research, the motivational factors for doing the research and identifying necessary

resources for this study include knowledge of the state of the problem.

1.6.2. Objectives for a solution

In this step, we have explained about the objectives of the study that are inferred from the

problem identification step and we have also tried to review various resources related to

our research to know the state of the problem and their solutions.

1.6.3. Design and development

In this step, the artifactual solution is created and determines the artifact’s desired

functionality. We used Pytorch (using TensorFlow as a backend) to design the model.

visual studio code is used for writing required source codes. In addition to this, we have

collected Simple Amharic sentences from different sources such as student textbooks,

Walta newspapers, and Addis Admas Newspaper with the help of linguistic students.

1.6.4. Demonstration

In this step, the developed artifact is demonstrated by simulating how the developed model

is labeled their roles within the given new Amharic sentences. We have used the windows

10 operating system and scientific Python development environment to implement the

model.

1.6.5. Communication

In this section, we present about the problem and its importance, the artifact model, and its

effectiveness to researches and other relevant related information are communicated to

relevant audiences when appropriate.


1.6.6. Evaluation Metrics

After we proposed the new system, we have evaluated its performance using different

evaluation metrics such as accuracy, precision and recall in order to check whether the

proposed model achieved the promising result and solved the statement of problem

properly or not.
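For reference, the following sketch shows how these per-label metrics can be computed from gold and predicted role sequences; it is a minimal illustration of the metrics named above, not the evaluation script used in this work:

from collections import Counter

# Toy per-label precision/recall/F1 over aligned gold and predicted labels.
def per_label_scores(gold, pred):
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1      # correct label
        else:
            fp[p] += 1      # predicted p where it was not gold
            fn[g] += 1      # missed the gold label g
    scores = {}
    for label in set(gold) | set(pred):
        prec = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        rec = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        scores[label] = (prec, rec, f1)
    return scores

gold = ["ARG0-AGT", "ARG1-PAT", "REL", "ARGM-TMP"]
pred = ["ARG0-AGT", "ARGM-MNR", "REL", "ARGM-TMP"]
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(accuracy, per_label_scores(gold, pred))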

1.7. Design and Development Tools

1.7.1. Python

We used the Python programming language to develop the model. Python is a general-purpose, high-level programming language used for text and image processing. It is open source and freely available, extensible, portable and easy to use. For the purposes of our study, we used the latest version of Python at the time (i.e., Python 3.7) with its important features.

1.7.2. TensorFlow

TensorFlow is an open-source Python-based library created by the Google Brain team and used for numerical computation and large-scale machine learning. TensorFlow bundles together a slew of machine learning and deep learning models and algorithms and makes them usable by way of a common metaphor. It is an open-source artificial intelligence library that uses data flow graphs to build models and allows developers to create large neural network models, mainly for classification, meaning understanding, prediction and creation.

For our study, we used the latest version of TensorFlow at the time (i.e., TensorFlow 2.1.0) to build a sequential deep learning model and classify the input data.
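As an illustration of the kind of sequential model this setup supports, the sketch below builds a small Bi-LSTM sequence tagger with the Keras API bundled in TensorFlow 2.x. The vocabulary, label and dimension sizes are assumed placeholder values, not the configuration reported later in this thesis:

import tensorflow as tf

# Assumed placeholder sizes, not the thesis hyperparameters.
VOCAB_SIZE, EMBED_DIM, HIDDEN, N_ROLES, MAX_LEN = 5000, 100, 128, 20, 30

model = tf.keras.Sequential([
    # Map token ids to dense vectors.
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM, input_length=MAX_LEN),
    # Read the sentence forwards and backwards; concatenate both states.
    tf.keras.layers.Bidirectional(
        tf.keras.layers.LSTM(HIDDEN, return_sequences=True)),
    # Predict one role label per token.
    tf.keras.layers.TimeDistributed(
        tf.keras.layers.Dense(N_ROLES, activation="softmax")),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()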


1.8. Organization of the Thesis

This section presents an overview of the remaining chapters on this thesis work. The rest

of this thesis is organized as follows.

In Chapter Two, we have reviewed literature on the area of semantic role labeling,

classification methods used for SRL and available lexical resources for SRL are presented.

In addition to this, we have explained a detailed description of deep learning architecture

and different neural network classifiers such as Multi-Layered Perceptron.

In Chapter three, related works done on semantic role labeling using different approaches

(Artificial Neural network, Memory Based Learning, Neural network classifier i.e., MLP

etc.).

The fourth chapter describes the overall architecture of the proposed Context-Based

semantic role label for Amharic text. Detail description about phases of the proposed model

(preprocessing, training and semantic role labeling phase) and tasks done in each phase.

Chapter five presents evaluation of the proposed Context-Based semantic role labeler for

Amharic text for simple Amharic sentences. In addition to this detail description about

semantically annotated dataset preparation and training and evaluating the proposed

network model.

Chapter Six shows the major research work findings and the conclusion about the problem

statement. In addition to this, it outlined the major gaps this research work does not cover

and show for next researchers as a recommendation.


CHAPTER TWO: LITERATURE REVIEW

2.1. Natural Language Processing

Natural language processing is a field of computer science and human language technology that deals with the interaction between computers and humans using natural language [1]. It allows computers to analyze natural language and convert it into a useful form of data representation, and it helps us understand human language.

NLP basically focuses on making computers perform important and interesting tasks in human languages, such as text-to-speech or speech-to-text conversion, natural language understanding and machine translation, so that machines (computers) can communicate with humans in their own language [3].

To generalize the nature, application and behavior of NLP, understanding the meaning of the language is a preliminary task. For instance, the levels of linguistic analysis require understanding the following knowledge of language:

➢ Phonology: concerns how words are related to the sounds of the speaker that realize them.

➢ Syntax: concerns putting a sentence into the correct form and the structural role each word plays in the sentence.

➢ Semantics: studies the meaningful formation of sentences, the context-independent meaning of words, and how these meanings are combined.

➢ Pragmatics: focuses on how sentences are used in different situations and how use affects the meaning of the sentence.

➢ Disambiguation: concerns resolving the ambiguities that occur at the different levels of language.

➢ Discourse: studies how the immediately preceding sentences affect the interpretation of the next sentence.

Therefore, natural language processing focuses on the above levels of language analysis to process natural language. Our study focuses on the context-based semantic annotation of simple Amharic sentences, particularly semantic role labeling, which is important for resolving ambiguities in natural language understanding [29], for considering the senses of predicates in different domains, and for identifying the associated arguments and assigning the appropriate role to each of them.

2.2. Semantic Role

Semantic roles, also called thematic roles, characterize the semantic relationships between syntactic constituents (arguments) and a predicate [30]. They are abstract models of the role of an argument expressed by the predicate; they identify the role of a verbal argument in the event expressed by the verb, usually an agent, a patient or an experiencer [18]. Semantic roles describe the relation between a predicate, typically a verb, and its arguments, whereas semantic role labeling extracts these relations in sentences and assigns roles to them.

According to [32], semantic roles can be expressed at three different levels of generality:

➢ Verb-specific semantic roles: roles associated with a specific verb in a sentence, e.g., runner, seller, cutter, broker, etc.

➢ Thematic relations: roles that express generalizations across the verb-specific roles, e.g., agent, instrument, experiencer, theme, patient.

➢ Generalized semantic roles: roles that express generalizations across thematic relations using two arguments, i.e., Actor and Undergoer. Actor expresses a generalization across doers of the action, such as agent, experiencer and instrument, while Undergoer expresses a generalization subsuming patient, theme, recipient and other roles.

Semantic role labeling is an NLP task that aims to capture and represent the participants and circumstances of events expressed in human languages; these are revealed by providing answers to questions such as who did what to whom, where, when and how, through the assignment of different semantic roles to the constituents of the sentence [31]. SRL approaches are usually considered intermediary techniques for extracting meaning from text that play an important role in natural language understanding. The SRL task involves identifying the constituents of each target predicate (argument identification) and classifying the arguments (argument classification) before assigning an associated role label to them. However, in order to perform argument identification and classification accurately, the SRL task first requires finding the target predicate in the given sentence and then assigning a certain sense number to it. The input information contains several levels of annotation in addition to the role labeling information, i.e., parse trees, POS tags and named entities [32].
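Schematically, the steps just described can be arranged as the following pipeline. The function names are placeholders for whatever components implement each step, not an existing API:

# Schematic SRL pipeline: predicate identification, sense assignment,
# argument identification, then argument classification. Each callable
# is a placeholder component supplied by the system.
def srl_pipeline(sentence, find_predicates, assign_sense,
                 identify_arguments, classify_argument):
    labeled = []
    for predicate in find_predicates(sentence):
        sense = assign_sense(predicate, sentence)      # e.g. a sense number
        for argument in identify_arguments(predicate, sentence):
            role = classify_argument(argument, predicate, sense, sentence)
            labeled.append((predicate, sense, argument, role))
    return labeled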

2.2.1. Common List of Semantic Roles

According to their semantic roles, arguments are grouped into two major types: (i) necessary arguments, which represent the central participants in an action, such as agent, patient and instrument; and (ii) optional arguments (adjuncts), which are optional for an event but capable of providing more information about the action, including manner, location, cause, time, etc.

Below is a list of the major thematic roles usually considered, based on [5].

Agent: participants usually realized as the subject of a sentence; the verb specifies the doer or initiator of the action as deliberately performing it.

Example: [መላኩ] Agent እባቡን በዱላ ገደለው

Patient: the entity affected by the action; it undergoes the action and shows a certain change of state from its normal situation.

Example: ግንበኞቹ [ድንጋዩን] Patient ፈለጡት

Experiencer: the entity that receives sensory or emotional input.

Example: የጤና መድን ድርጅት [አቅመ ደካሞችን] Experiencer ረዳ

Theme: the entity that moves or undergoes the action but does not show any state change; sometimes used interchangeably with patient.

Example: ጊፍት ሪል እስቴት [8 አፓርታሞችን] Theme ገነባ

Instrument: the entity used to perform the stated action.

Example: ካሳ በሩን [በቁልፍ] Instrument ከፈተ

Manner: represents the way in which an action is carried out.

Example: በብራዚል የኮሮና ቫይረስ ተጠቂወች ቁጥር [በእጥፍ] Manner ጨመረ

Location: the thematic role expressing where the action occurs or the place where the action is performed.

Example: ጥራቱን የጠበቀ 250 ካራት ወርቅ [በደቡብ አፍሪካ] Location ተገኘ

Direction or Goal: where the performed action goes towards.

Example: የአሜሪካ ጎብኝዎች [ወደ ላሊበላ] Direction ተጓዙ

Source: the thematic role expressing where the action originated.

Example: ደረጀ [ከድሬዳዋ] Source ወደ ባህር ዳር መጣ

Purpose: the aim or goal for which an action is performed.

Example: አሜሪካ በኢትዮጵያ [የኩፍኝ በሽታን ለመከላከል] Purpose አራት ሚሊየን ዶላር ድጋፍ አደረገች

Time: the time at which the action occurred.

Example: ጠቅላይ ሚኒስትሩ ለይፋዊ የስራ ጉብኝት [ትናንት ማታ] Time ፈረንሳይ ገቡ

Beneficiary: the entity which benefits from the action.

Example: ትምህርት ሚኒስቴር [ለ2000 አቅመ ደካማ ወላጅ ልጆች] Beneficiary የትምህርት ቁሳቁስ ሰጠ

Cause: represents because of what something happened or the action occurred in the first place.

Example: [በህገወጥ ስደት] Cause የ30 ኢትዮጵያዊያን ስደተኞች ህይዎት አለፈ


2.2.2. Semantic Role Labeling Challenges

Efficient semantic role assignment is useful for developing many NLP applications. However, problems arise during role assignment. From the structural point of view, the input sentence is text enriched with morpho-syntactic information while the output is a sequence of labeled arguments, so it is difficult to map from the input structure of a sentence to the output structure; this causes sequential segmenting and labeling problems [34].

The other challenge is the syntactic variation of sentences: a single sentence can be expressed in syntactically different structures, all of which may have the same semantic interpretation or representation. It is also very difficult to decide on a standard set or number of roles and to produce a formal definition for roles like AGENT, TIME, SOURCE or INSTRUMENT [35].

2.3. Lexical Resources for SRL

In this section we review important and applicable lexical resources used for semantic role labeling, developed by different scholars: FrameNet and PropBank.

2.3.1. PropBank

PropBank is a corpus built on top of the Penn Treebank structure that provides predicate-argument annotation for the entire Penn Treebank, creating a corpus of text annotated with information about basic semantic propositions. Each verb instance in the treebank is annotated in PropBank with information about the location of the verb and the location and identity of its arguments [42, 45].

It consists of over 1M annotated words of Wall Street Journal text with existing gold-standard parse trees and syntactically annotated sentences with predicate-argument pairs, providing consistent argument labels across different syntactic realizations of the same verb.


In addition to semantic role annotation, PropBank annotation requires the choice of a sense id for each predicate and aims to provide consistent argument labels across the different syntactic realizations of the same verb, as shown in the following sentences:

1. [ARG0 Jemal] broke [ARG1 the glass]
2. [ARG1 The glass] broke

The arguments of the same verb "broke" are labeled as numbered arguments, i.e., ARG0, ARG1, ARG2, etc.

Secondly, PropBank annotation involves assigning functional tags to all modifiers of verbs, such as direction (DIR), manner (MNR), source (SRC) and temporal (TMP), which are capable of providing additional information about the arguments of a sentence [42].

For example: Melaku came from Addis Ababa to Gondar by plane yesterday.

Arg0: Melaku
Rel: came
ArgM-SRC: from Addis Ababa
ArgM-DES: to Gondar
ArgM-INS: by plane
ArgM-TMP: yesterday
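One simple way to hold such an annotated instance in code is a predicate-keyed mapping, as in this illustrative sketch (the field names are our own, not a PropBank file format):

# Illustrative only: the example annotation above as a small data structure.
annotation = {
    "rel": "came",
    "args": {
        "ARG0": "Melaku",
        "ARGM-SRC": "from Addis Ababa",
        "ARGM-DES": "to Gondar",
        "ARGM-INS": "by plane",
        "ARGM-TMP": "yesterday",
    },
}
print(annotation["rel"], annotation["args"]["ARG0"])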

The PropBank project takes a practical approach to semantic representation, adding a layer of predicate-argument information, or semantic role labels, to the syntactic structures of the Penn Treebank [6, 43]. It contains sentences annotated with proto-roles and verb-specific semantic roles. In PropBank, the roles (Arg0 to ArgN) are specific to each individual verb, to avoid having to agree on a universal set; that is, Arg0 is basically the "agent" and Arg1 basically the "patient".


According to [42, 45], arguments labeled from ARG0 to ARG4 are called core arguments. These numbered labels represent very general kinds of semantic roles. The following table, taken from [42], shows the list of numbered arguments in PropBank and their descriptions.

Table 2.1 List of arguments in PropBank and their description

No  Label  Description
1   ARG0   Agent, Experiencer, Causer
2   ARG1   Patient, Theme
3   ARG2   Beneficiary, Instrument, Attribute, End state
4   ARG3   Starting point, Instrument, Attribute
5   ARG4   Ending point

In addition to the verb-specific numbered arguments, PropBank defines several more general roles applicable to any verb by assigning functional tags to all modifiers of verbs. These adjuncts (circumstantial objects) can appear in any verb's frame to provide additional information about the arguments found in a given sentence and are marked as ARGMs (modifiers). Based on [42], there are 12 secondary tags for ARGMs used in the Proposition Bank, i.e., DIR, LOC, MNR, TMP, EXT, REC, PRD, PRP, DIS, ADV, MOD and NEG. The following table shows the list of PropBank annotated adjuncts (modifier arguments) and their descriptions, as given in [45].

Table 2.2 List of annotated adjuncts in PropBank with their explanations

Modifier Type  Description
DIR            Direction
TMP            Time
LOC            Location
MNR            Manner
EXT            Extent
CAU            Cause
PUR            Purpose
MOD            Modal verb
NEG            Negation marker
REC            Reciprocal
ADV            General-purpose adverbs
DIS            Discourse connectives

The PropBank arguments have their own way of interpretation: numbered arguments are interpreted in a predicate-specific manner, whereas the ARGMs have a global interpretation. As shown in [6], the general procedure for designing PropBank is based on creating frame sets for each verb and then using them as guidelines during annotation.

PropBank development consists of two basic processes: framing and annotation.

Framing

Framing is the first process in PropBank and refers to the creation of the frame files. It is the collection of frameset entries for each verb, and it begins with the examination of a sample of sentences from the corpus containing the verb under consideration.

In PropBank, frame files provide a verb-specific description of all possible semantic roles by identifying the predicate and its possible arguments.

For example, the frame set for the verb 'ሰበረ' (break) is:

Roles:

ARG0: breaker
ARG1: thing broken
ARG2: instrument

Example (transitive, active):

መልካሙ የህንፃውን መስኮት በመዶሻ ሰበረ (Melkamu broke the building's window with a hammer)

• ARG0: መልካሙ
• REL: ሰበረ
• ARG1: የህንፃውን መስኮት
• ARGM-INS: በመዶሻ

Some verbs have different contextual senses, so it is impossible to provide the same set of semantic roles for every sense of a verb. In such cases, the frame files distinguish the two or more verb senses and define specific argument labels for each frame set. For example, in the following two sentences, the verb "አከበረ" takes different arguments:

1. ሚኪያስ (ARG0-AGT) የፋሲካ በዓልን (ARG1-BEN) በጎንደር (ARGM-LOC) አከበረ
2. ሚኪያስ (ARG0-AGT) አለቃውን (ARG1-TEM) አከበረ
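A frame file of this kind can be sketched as a small data structure that keeps one role set per sense. The structure and the English sense glosses below are illustrative assumptions, not the PropBank frame-file format:

# Illustrative multi-sense frame file for the verb "አከበረ" described above.
frame_file = {
    "lemma": "አከበረ",
    "framesets": [
        {"sense": "celebrate",
         "roles": {"ARG0-AGT": "celebrant",
                   "ARG1-BEN": "occasion celebrated",
                   "ARGM-LOC": "place of celebration"}},
        {"sense": "respect",
         "roles": {"ARG0-AGT": "one who respects",
                   "ARG1-TEM": "one who is respected"}},
    ],
}
for frameset in frame_file["framesets"]:
    print(frameset["sense"], "->", sorted(frameset["roles"]))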

Annotation

Choosing ARG0 versus ARG1

In many cases, choosing an argument label for a given sentence is simple and straightforward given the verb-specific definition of the label in the frame files [42, 45]. In some cases, however, it is more difficult and ambiguous whether an argument should be annotated as Arg0 or Arg1. The annotator must then decide between these labels based on the following explanations of what generally characterizes Arg0 and Arg1. The Arg0 label is assigned to arguments understood as agents, causers or experiencers, whereas the Arg1 label is usually assigned to the patient argument, i.e., the argument which undergoes the change of state or is affected by the action.

In addition, Arg0 arguments correspond to the subjects of transitive verbs and of a class of intransitive verbs. For example:

➢ አበበ (ARG0) ፈተናዉን አለፈ
➢ አበበ (ARG0) መጣ

Internal arguments (labeled ARG1) are the objects of transitive verbs and the subjects of intransitive verbs called unaccusatives. For example:


➢ ኪሩቤል ሌባውን(ARG1) ያዘው

➢ ሌባው (ARG1) ተያዘ

Semantically, external arguments have Proto-Agent properties such as [6, 45]:

➢ volitional involvement in the event or state
➢ causing an event or change of state in another participant
➢ movement relative to the position of another participant

Internal arguments, by contrast, have Proto-Patient properties: they undergo a change of state, are causally affected by another participant, and are stationary relative to the movement of another participant.
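These property lists suggest a simple counting rule for the Arg0/Arg1 decision, sketched below as a toy illustration of the guideline rather than an actual annotation tool:

# Toy sketch: choose Arg0 vs Arg1 from proto-role properties of an argument.
def choose_core_label(props):
    proto_agent = {"volitional", "causes_change", "moves_relative_to_other"}
    proto_patient = {"changes_state", "causally_affected", "stationary"}
    agent_score = len(props & proto_agent)       # count Proto-Agent properties
    patient_score = len(props & proto_patient)   # count Proto-Patient properties
    return "ARG0" if agent_score >= patient_score else "ARG1"

print(choose_core_label({"volitional", "causes_change"}))   # -> ARG0
print(choose_core_label({"changes_state", "stationary"}))   # -> ARG1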

Annotation Modifiers

PropBank uses the following annotation modifiers.

Comitatives (COM)

Comitative modifiers indicate who an action was done with. This can include people or organizations (entities that have characteristics of prototypical agents: animacy, volition) but excludes objects, which would be considered instrumental modifiers [42, 45].

E.g. መላኩ ከ አያቱ ጋር ምሳውን በላ

• ARG0: መላኩ

• REL: በላ

• ARG1: ምሳውን

• ARGM-COM: ከ አያቱ ጋር

Locatives (LOC)

Locative modifiers indicate where some action takes place [42]. The notion of a locative is not restricted to physical locations; it also represents abstract locations.


• ARG0: መላኩ

• ARGM-COM: ከ አያቱ ጋር

• ARGM- LOC: በ ካፒታል ሆቴል

• ARG1: ምሳውን

• REL: በላ

Destination (DES)

The destination modifier indicates the final resting place or destination of motion. However, if there is no clear path being followed, a 'location' marker should be used instead [42].

E.g. መስፍን ወደ ጎንደር ሄደ

• ARG1: መስፍን

• ARGM-DES: ወደ ጎንደር

• REL: ሄደ

Extent (EXT)

ARGM-EXT indicates the amount of change occurring from an action. It is used mostly for the following [42, 45]:

➢ numerical adjuncts like '(raised prices) by 15%',
➢ quantifiers such as 'a lot', and
➢ comparatives such as '(he raised prices) more than she did'.

E.g. የጃፓን አመታዊ ገቢ ከባለፈው አመት በ 15% አደገ

• ARG0: የጃፓን አመታዊ ገቢ

• REL: አደገ

• ARGM-EXT: በ 15%

• ARGM- TMP: ከባለፈው አመት

Manner (MNR)

Manner adverbs specify how an action is performed. Manner tags should be used when an

adverb could be an answer to a question starting with ‘how?’ [42].


E.g. የመስፍን አባት ክፉኛ ታመመ

• ARG1: የመስፍን አባት

• ARGM-MNR: ክፉኛ

• REL: ታመመ

Modals (MOD)

Modals are will, may, can, must, shall, might, should, could and would, which are consistently labeled 'MOD' in the Treebank. These are among the few elements that are selected and tagged directly on the modal word itself, as opposed to selecting a higher node that contains the lexical item [42, 45].

Temporal (TMP)

The temporal modifier shows when an action took place, such as 'in 1987', 'last Wednesday', 'soon' or 'immediately'. Also included in this category are adverbs of frequency (e.g., often, always, sometimes), adverbs of duration (for a year/in a year) and of order (e.g., first) [42].

E.g. መስፍን ትናንት ከወንድሙ ጋር ወደ ጎንደር ሄደ

• ARG1: መስፍን

• ARGM-TMP: ትናንት

• ARGM-COM: ከወንድሙ ጋር

• ARGM-DES: ወደ ጎንደር

• REL: ሄደ

Purpose Clauses (PRP)

Purpose clauses are used to show the motivation for some action. Clauses beginning with

‘in order to’ and ‘so that’ are canonical purpose clauses [42].

E.g. የጤና ሚኒስቴር ለፖሊዮ ክትባት ሁለት ሚሊየን ብር ድጋፍ አደረገ

• ARG0: የጤና ሚኒስቴር
• ARGM-PRP: ለፖሊዮ ክትባት
• ARG1-TEM: ሁለት ሚሊየን ብር
• ARG2: ድጋፍ
• REL: አደረገ

Cause Clauses (CAU)

Similar to purpose clauses, cause clauses indicate the reason for an action. Clauses beginning with 'because' or 'due to' are canonical cause clauses. Questions starting with 'why', which are always characterized by a trace linking back to the question word, are always treated as cause. However, in these question phrases it can often be difficult or impossible to determine whether the 'why' truly represents purpose or cause. Thus, as a general rule, if the annotator cannot determine whether an argument is more appropriately purpose or cause, cause is the default choice [42, 45].

E.g. የናይጀሪያው ፕሬዚዳንት በገንዘበ ማጭበርበር ተከሰሱ

• ARG1: የናይጀሪያው ፕሬዚዳንት

• ARGM- CAU: በገንዘበ ማጭበርበር

• REL: ተከሰሱ

Negation (NEG)

Negation is an important notation for PropBank annotation; therefore, all markers which indicate negation should be marked as NEG. Most of the time, negation in Amharic sentences is indicated using the prefixes "አይ", "አል" and "አት", as in አይመጣም, አትመጣም and አልበላም. These are identified using a morphological analyzer.
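A first approximation of this marking step can be sketched with a prefix pattern over the three negation prefixes above; a real morphological analyzer would of course do full segmentation rather than prefix matching:

import re

# Toy sketch: flag verbs that start with an Amharic negation prefix.
NEG_PREFIX = re.compile(r"^(አይ|አል|አት)")

for verb in ["አይመጣም", "አትመጣም", "አልበላም", "መጣ"]:
    print(verb, "-> NEG" if NEG_PREFIX.match(verb) else "->")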

Both the FrameNet and PropBank resources are used to specify what counts as a predicate, to define the set of roles used in the task, and to provide training and test sets. Recall that the difference between these two models of semantic roles is that FrameNet employs many frame-specific frame elements as roles, while PropBank uses a smaller number of numbered argument labels that can be interpreted as verb-specific labels, along with the more general ARGM labels.


2.3.2. FrameNet

FrameNet is a lexical database of English, available in both human- and machine-readable formats, that describes English words using frame semantics; sentence arguments are defined in terms of frames rather than verbs [43].

FrameNet is an electronic resource and a framework for explicit description of the lexical

semantics of words. The key concept in the FrameNet method of annotation is a semantic

frame [44]. A semantic frame can be described as a representation of an object, event or

situation in which each frame has its own set of roles. For example, the roles defined for

the frame research are field, question, researcher and topic.

In the FrameNet dataset, the sentences are arranged in a hierarchical order with each frame

referring to a concept. Frames at the higher level refer to a more generic concept while

frames at the lower level refer to more specific concepts [44].

According to [43], three levels of semantic roles can be distinguished in FrameNet:

Core elements: these elements are conceptually necessary for the frame and are roughly equivalent to syntactically obligatory roles that must be instantiated. Core elements also distinguish one frame from another.

Peripheral elements: these frame elements are not central to the frame, but provide additional information about the event, such as cause, time and place. They are similar to modifiers.

Extra-thematic elements: these are neither specific to the frame nor standard modifiers, but describe the frame with respect to a broader context.

Every frame has invoking predicates attached to it; Figure 2.1 [43] shows the structure of frames in the FrameNet lexicon. The invoking predicates are the verbs and some nouns that invoke the concept referred to by the frame they are attached to, and sentences containing these predicates have constituents that play the roles given by the frame elements of the invoked frame. For example, in '[Judge she] blames [Evaluee the government] [Reason for failing to do enough to help]', the predicate 'blames' invokes the judgment frame, and the other constituents in the sentence play the invoked semantic roles: 'she' plays the role Judge, 'the government' plays the role Evaluee, and 'for failing to do enough to help' plays the role Reason.

Figure 2. 1 Sample Domain and Frames from the FrameNet Lexicon

2.4. Classification Methods for SRL

2.4.1. Maximum Entropy

Maximum entropy is a probabilistic classifier belonging to the class of exponential models. It works on the principle of maximum entropy: among all the models that fit the training data, it selects the one with the largest entropy.

Figure 2. 2 Maximum Entropy Classifier


The maximum entropy classifier is a discriminative classifier commonly used in natural language processing, sentiment analysis, speech and information retrieval problems [38]. For SRL, the maximum entropy classifier is trained to identify and classify the predicates' semantic arguments jointly, and among embedded (overlapping) constituents it keeps only those with the largest probability.
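For illustration, the following is a minimal sketch of a maximum entropy classifier, implemented as multinomial logistic regression with scikit-learn. The phrases, role labels and character n-gram features here are illustrative assumptions, not the features used in any of the systems discussed.

# A minimal maximum entropy (multinomial logistic regression) sketch.
# The toy phrases, roles and character n-gram features are hypothetical.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

phrases = ["Meseret", "yesterday", "to Gondar", "went"]
roles = ["ARG0-AGT", "ARGM-TMP", "ARGM-DES", "REL"]

vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(1, 3))
X = vectorizer.fit_transform(phrases)

# LogisticRegression fits a maximum entropy model over the role labels.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, roles)
print(clf.predict(vectorizer.transform(["to Bahir Dar"])))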

2.4.2. Support Vector Machines

Support Vector Machine (SVM) is a supervised machine learning algorithm which can be used for both classification and regression problems [39]. An SVM is a binary classifier that works on the maximum-margin principle: it finds the hyperplane (in two dimensions, a line) that best separates the two classes, based on the individual observations closest to the boundary. This hyperplane then divides the dataset into two classes [46].

Figure 2. 3 Support vector Machine Classifier

The basic advantage of SVM algorithms is that they have high generalization performance independent of the dimension of the feature vectors, and they can learn with a combination of multiple features by using the polynomial kernel function [39]. Due to this, SVM classifiers are applicable to different natural language processing tasks, such as document classification, semantic role labeling and named entity recognition, and have achieved good performance results [46]. In SVM classifiers, support vectors are the data points that lie very close to the hyperplane; they are considered the critical elements of the dataset. Such classifiers can produce better accuracy and are efficient at identifying a subset of the training points [27].

However, SVM classification and regression algorithms require a clean dataset to work well: the classifier becomes less effective on noisier datasets that contain overlapping classes, and training takes longer on larger datasets [39].
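As a brief illustration, the sketch below trains a scikit-learn SVM with the polynomial kernel mentioned above; the toy feature vectors and role labels are assumptions made only for this example.

# A minimal SVM sketch with a polynomial kernel; features and labels
# are illustrative placeholders, not real SRL features.
import numpy as np
from sklearn.svm import SVC

# Toy feature vectors for sentence constituents (e.g., position, POS id).
X = np.array([[0, 1], [1, 2], [2, 3], [3, 4]])
y = ["ARG0-AGT", "ARGM-TMP", "ARG1-BEN", "REL"]

# The polynomial kernel lets the SVM learn over combinations of features.
clf = SVC(kernel="poly", degree=2, C=1.0)
clf.fit(X, y)
print(clf.predict([[1, 2]]))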

2.4.3. Conditional Random Fields (CRF)

CRFs are discriminative models with an undirected graphical structure, belonging to the general class of graphical models aimed at structured learning problems such as sequence, graph and tree labeling, which makes them appropriate for labeling natural language data [28]. Note that, despite the similar abbreviation, the conditional random field is distinct from the random forest, a flexible supervised ensemble algorithm applicable to both classification and regression that builds multiple decision trees and merges them to obtain a more accurate and stable prediction [41].

The random forest classifier has been successfully applied to a variety of natural language processing tasks involving the labeling or parsing of sequential data, such as POS tagging, shallow parsing and semantic understanding, and named entity recognition, as well as to computer vision [41]. As a tree grows, instead of searching for the most important features, the random forest adds additional randomness to the model while searching for the best split, and it avoids the overfitting problems that occur in machine learning by creating enough trees in the forest [28, 41].


Figure 2. 4 Conditional Random Field Classifier

However, as the number of decision trees increases, the random forest algorithm becomes too slow and ineffective for real-time prediction. It is fast to train but slow to predict from what it has learned: predicting accurately requires more trees, and more trees make the model slower [41].

2.5. Deep Learning

Deep learning uses self-taught learning and layers of neural network algorithms to construct and decipher higher-level information at successive layers from raw input data, using many hidden layers and powerful computational resources [13].

Deep neural networks are designed as components of larger machine learning applications involving algorithms for reinforcement learning, classification and regression; they recognize patterns in data based on an early understanding of how the human brain functions [22].


Figure 2. 5 A Deep Learning Architecture

Deep learning architectures are applied to natural language processing, semantic role labeling, audio recognition, computer vision and semantic analysis [13]. A deep neural network architecture is built from an input layer, two or more hidden layers and an output layer, composed of many perceptrons that are connected in different ways and that operate on different activation functions.

Nowadays, different scholars apply neural network approaches to different areas of natural language processing to obtain better performance and to overcome the limitation of traditional (shallow) machine learning approaches, which need hand-crafted features to train their models [22, 23, 28].

A neural network model with more hidden layers (a deeper model) can produce higher training and testing accuracy. It has the capability of extracting important features from the given data for learning, and of predicting when new unseen data comes. However, the model requires a large dataset to extract important features automatically and to achieve a good result [13].

2.5.1. Recurrent Neural Network (RNN)

A recurrent neural network is a powerful ANN approach that creates recurring connections to itself in order to process sequential data [25]. It is typically trained with a stochastic gradient descent (SGD) optimizer along with the backpropagation algorithm. Recurrent neural networks work by creating a recurring connection in each layer of the network and processing the input sequentially. This allows the network to learn the effect of the previous input x(t-1) along with the current input x(t) while predicting the output at time t [28].

RNN inputs and outputs are dependent on each other, and the hidden layer preserves sequential information from previous steps. This means the output from an earlier step is used as an input to the next step, reusing the same weights for prediction, and the layers are joined to create a single recurrent layer [28].

RNNs use the Long Short-Term Memory and Bidirectional Long Short-Term Memory algorithms to train the model without the vanishing gradient problem, in which weights and contents are lost during backpropagation [21]. A bidirectional recurrent neural network puts two independent RNNs together, which allows the network to have both backward and forward information about the sequence at every time step.
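The recurrence described above can be illustrated with a minimal NumPy sketch of a single recurrent step, where the hidden state at time t depends on the current input x(t) and the previous hidden state h(t-1); all dimensions and weights below are arbitrary illustrative choices.

# A minimal sketch of one recurrent step with NumPy; dimensions are toy values.
import numpy as np

input_dim, hidden_dim = 4, 3
rng = np.random.default_rng(0)
W_x = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden (recurrent) weights
b = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    # The recurrent term W_h @ h_prev carries the effect of previous inputs
    # forward in time; tanh keeps the hidden state bounded.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(5, input_dim)):  # a toy sequence of 5 steps
    h = rnn_step(x_t, h)
print(h)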

2.5.2. Long-Short Term Memory (LSTM)

LSTM networks are a special kind of recurrent neural network that is able to hold data in its memory for a long period of time in order to learn long-term dependencies [12]. They introduce a new state called the cell state and have a Constant Error Carousel (CEC), which allows the error to propagate back without vanishing [21].

The LSTM algorithm is suited to natural language understanding, semantic role labeling, document classification and prediction based on time-series data, reading the data sequentially. This enables data scientists to create deep neural network models using large stacked networks and to handle complex sequence problems in machine learning more efficiently [9, 25].


Figure 2. 6 Visualization of Long-Short Term Memory Networks

A Bidirectional Long Short-Term Memory (Bi-LSTM) model manages inputs with two separate LSTM states, i.e., the forward and backward LSTMs. The forward LSTM is a regular sequence that starts from the beginning of the sentence, while the backward LSTM starts from the end of the input sentence and handles the input sequence in the backward direction.

A bidirectional LSTM processes the data in both directions with two separate hidden layers, which are then fed to the same output layer; it computes the forward hidden sequence and the backward hidden sequence [2].
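A minimal Keras sketch of such a bidirectional LSTM layer is shown below, assuming the Keras/TensorFlow environment used later in this thesis; the sequence length and feature sizes are illustrative only.

# A minimal bidirectional LSTM sketch in Keras; shapes are toy values.
import numpy as np
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(4, 8)),  # 4 time steps, 8 features per step
    # Two LSTMs read the sequence forward and backward; their outputs are
    # concatenated at every time step (output size 2 * 16 = 32).
    layers.Bidirectional(layers.LSTM(16, return_sequences=True)),
])
print(model(np.zeros((1, 4, 8), dtype="float32")).shape)  # (1, 4, 32)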


Figure 2. 7 Working of Bidirectional Long Short-Term Memory Networks

2.5.3. Multi-Layer Perceptron (MLP)

The multilayer perceptron is one of the most common types of deep neural network (DNN) and is the quintessential deep learning model for classification tasks [49]. A multilayer perceptron classifier consists of multiple layers of nodes, in which each layer is fully connected to the next layer in the network. These are the input layer, one or more hidden (intermediate) layers and the output layer.

The input layer consists of neurons that accept the input values; its nodes represent the input data, and their output is the same as the input predictors. The output layer makes a decision or prediction about the input. Typically, the number of hidden layers found between the input and output layers ranges from one to many; they serve as the central computation layers holding the functions that map the input of a node to its output [49].

The output layer returns the result to the user environment. According to [49], when an MLP is the last component in a network, the dimension of its output matches the number of classes; often, a softmax function is applied to the output to form a probability distribution over the classes.

An MLP consists of fully connected layers [49]. In a fully connected layer, the parameters of each unit are independent of the rest of the units in the layer; that is, each unit possesses a unique set of weights. Each input vector is associated with a label, or ground truth, defining its class, which is given with the data. The output of the network gives a class score, or prediction, for each input.

To measure the performance of the classifier, a loss function is defined [50]. The loss will be high if the predicted class does not correspond to the true class, and low otherwise. MLPs have single input and output layers but may have multiple hidden layers in between.

Figure 2. 8 Multilayer Perceptron Architecture

MLP algorithms pass the input data through the input layer by taking the dot product of the input with the weights that exist between the input layer and the hidden layer. This dot product yields an input value for the hidden layer. In the hidden layers, MLPs apply activation functions, such as the rectified linear unit (ReLU), sigmoid or tanh, at each of their computed layers. Once the calculated output at a hidden layer has been pushed through the activation function, it is passed to the next layer in the MLP by taking the dot product with the corresponding weights, and this continues until the output layer is reached.


At the output layer, the calculations are either used by the backpropagation algorithm that corresponds to the activation function selected for the MLP (in the case of training), or a decision is made based on the output (in the case of testing) [49].

In a multilayer perceptron, we apply a non-linear activation function in the hidden and output layers to map the inputs, weighted by the neurons and with a bias added, to the output; in other words, it is a mapping of the weighted inputs to the output. We also use a learning algorithm called backpropagation, which continuously adjusts the weights of the connections after each round of processing. The adjustment is based on the error in the output; in other words, the system learns from mistakes. The process continues until the cost of the error is as low as possible.
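For illustration, the following minimal Keras sketch builds such a multilayer perceptron with ReLU hidden layers and a softmax output trained by backpropagation; the layer sizes, number of classes and toy data are assumptions made only for this example.

# A minimal MLP sketch in Keras: ReLU hidden layers, softmax output,
# trained by backpropagation on toy data.
import numpy as np
from tensorflow.keras import layers, models

num_features, num_classes = 20, 5
mlp = models.Sequential([
    layers.Input(shape=(num_features,)),
    layers.Dense(64, activation="relu"),              # hidden layer 1
    layers.Dense(64, activation="relu"),              # hidden layer 2
    layers.Dense(num_classes, activation="softmax"),  # class probabilities
])
# Cross-entropy loss is high when the predicted distribution disagrees
# with the true class, as described above.
mlp.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])

X = np.random.rand(32, num_features)
y = np.random.randint(0, num_classes, size=32)
mlp.fit(X, y, epochs=1, verbose=0)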

2.6. Amharic Language

Amharic, spoken mainly in the central highlands of Ethiopia, is the country's most widely spoken Semitic language and its second most widely spoken language after Oromo; it is used as the official working language of Ethiopia [17]. It is written from left to right and follows its own Subject-Object-Verb (SOV) word order, which differs from the sentence structure of English.

Amharic uses a script that originated from the Ge'ez alphabet, with 33 basic characters, each having 7 forms for the consonant-vowel combinations. Unlike Arabic and Hebrew, Amharic is written from left to right in this semi-syllabic system.


Figure 2. 9 Amharic Writing Script (source: https://www.amharicmachine.com/default/alphabet, accessed 9/25/2020)

Semantics is the study of the meaning of words, phrases and sentences, drawing the exact (dictionary) meaning from the text. In semantic analysis, the focus is always on what the words conventionally mean, rather than on what a speaker might want the words to mean on a particular occasion. This technical approach to meaning emphasizes the objective and the general and avoids the subjective and the local. Linguistic semantics deals with the conventional meaning conveyed by the use of the words and sentences of a language [14].


2.6.1. Amharic Sentences

A sentence is the basic unit of language: a grammatically complete idea, formed from words or phrases put together to express a complete thought. A phrase is a basic building block of a sentence [52]; it is a structure constructed from one or more words of the language, a syntactic structure wider than a word and smaller than a sentence. Phrases can be constructed from a head word alone or from a head word together with other constituents.

In Amharic, phrases are categorized into five categories, namely noun phrase, verb phrase,

adjectival phrase, adverbial phrase and prepositional phrase [14, 51].

A noun phrase is a syntactic unit in which the head is a noun or a pronoun. For example, in the noun phrase ነጭ እርግብ ("white dove"), ነጭ ("white") is an adjective modifying the noun እርግብ ("dove"). A verb phrase is composed of a verb as head and constituents such as complements, modifiers and specifiers. For example, in the verb phrase ወደ አሜሪካ ሄደ ("he went to America"), ወደ አሜሪካ ("to America") is a prepositional phrase modifying the verb ሄደ ("went") with respect to place.

In the Amharic language [51], the adjectival phrase is similar to the noun phrase and verb phrase: it can be composed of an adjective as head and constituents such as complements, modifiers and specifiers. For example, in the adjectival phrase በጣም ትልቅ ("very large"), በጣም ("very") is a modifier modifying the head adjective ትልቅ ("large").

Based on the number of predicates they contain [17], sentences are classified as simple, compound or complex.

Simple sentences

Simple sentences contain only one main predicate in their structure. For example:

Example: አበበ ምሳውን በላ (Abebe ate his lunch)

A simple sentence may also describe the state of being of the subject or an action that takes place in the sentence. Example: አበበ መምህር ነው (Abebe is a teacher); this sentence describes


the present state of being of Abebe. We call the above sentences simple Amharic sentences

because they contain only one predicate.

Simple sentences are classified into four kinds called declarative sentences, interrogative

sentences, imperative sentences and negative sentences [51].

Declarative sentence

Declarative sentences are used to convey ideas and feelings that the speaker has about

things, happenings, feelings, etc., that could be physical, mental, real or imaginary.

Example: ዳዊት መምህር ሆነ/Dawit became a teacher

Interrogative sentence

An interrogative sentence asks a question about the subject or about the action the verb specifies.

Example: አስቴር መቼ መጣች? /When did Aster come?

Negative sentence

Negative sentences simply negate a declarative statement made about something.

Example: ቢኒያም ትምህርት ቤት አልሄደም/Binyam did not go to school

Imperative sentence

Simple imperative sentences convey instructions; their subject is mostly a second-person pronoun that is usually not stated explicitly but implied by the suffix on the verb.

Example: ዝም በል/Shut up!

Compound Sentences

Compound sentences consist of two or more independent clauses in their sentence structure [17, 52]. A coordinating conjunction (for, and, nor, but, or, yet, so) often links the two independent clauses and is preceded by a comma. For example, the following is a compound Amharic sentence.


Example: አማረ ቁርሱን በልቶ ወደ ትምህርት ቤት ሄደ (Amare went to school after eating his breakfast) (Compound)

Complex Sentences

Complex sentences are formed from complex noun phrases, complex verb phrases or both [51, 52]. In other words, a complex sentence can have a complex NP and a simple VP, a simple NP and a complex VP, or both a complex NP and a complex VP. Complex NPs contain at least one embedded sentence, which can be a complement or another type of phrase; complex VPs contain at least one embedded sentence or more than one verb.

A complex sentence contains one independent clause and one or more dependent clauses, and includes at least one subordinating conjunction.

Example: መሳይ ውድድሩን አሸንፏል፣ ነገር ግን ሽልማት አልተሰጠውም (Mesay won the competition, but he was not given a reward) (Complex)

However, for the purposes of this study we have used only simple Amharic sentence types, including sentences that have ambiguous predicates (verbs) and their associated roles.


CHAPTER THREE: RELATED WORKS

3.1. Introduction

In this chapter, we review work done in the area of semantic role labeling by different scholars using different neural network classification approaches for languages such as English, Chinese, Arabic and Amharic. In particular, we discuss semantic role labeling work using recurrent neural networks, specifically the LSTM and Bi-LSTM approaches, and the multilayer perceptron classifier.

3.2. Semantic Role Labeling for English and European Languages

Most previously proposed SRL systems relied on predefined features and the syntactic structure of sentences; building an end-to-end SRL learning system without using parsing was initially less successful and provided low accuracy results [28]. Due to this, researchers were motivated to develop sequential neural network SRL models that integrate techniques such as automatic feature selection, parsing and POS tags to achieve better accuracy [13, 28].

In [28], the authors proposed end-to-end learning of semantic role labeling using recurrent neural networks. They used a deep bidirectional long short-term memory model as an end-to-end system, taking only the original text as input, without predefined features or any intermediate syntactic tags. Their network processes the input features with 8 bidirectional LSTM layers.

The latent variables of the model implicitly capture the syntactic structure of a sentence. Additionally, the authors place a conditional random field (CRF) model on top of the network layers for tag sequence prediction. Their model achieves an F1 score of 81.07 on the CoNLL-2005 shared task and 81.27 on the CoNLL-2012 shared task.

In [13], the authors proposed deep semantic role labeling with self-attention. They developed a simple and effective deep attentional neural network architecture for SRL to handle the structural-information and long-range dependency problems of end-to-end SRL with recurrent neural networks (RNNs). Their network is based on self-attention, which can directly capture the relationship between two tokens regardless of their distance.

In addition, the authors implemented a position embedding technique to include positional encoding in the attention mechanism and distinguish the position of each input word. They conducted experiments on the two commonly used datasets from the CoNLL-2005 and CoNLL-2012 shared tasks and achieved F1 scores of 83.4 and 82.7, respectively.

A study [33] presented selectively connected self-attention for semantic role labeling, to address the long training times that recent deep neural network models require over large datasets. The authors proposed a novel deep neural network model for semantic role labeling based on stacked attentive representations, providing selective connections among the attentive representations to capture the hierarchical structure of language.

According to their experimental results, their model performed better than the state-of-the-art studies, reduced training time by 62 percent, and achieved F1 scores of 86.6 and 83.6 on the CoNLL-2005 and CoNLL-2012 shared tasks, respectively.

3.3. Semantic Role Labeling for Chinese Language

In [29], Wang proposed a semantic role labeling system for Chinese that uses neither syntactic parsing nor POS tagging; instead, the task is divided into two main steps, clustering and labeling. Clustering groups similar sentences together, partially replacing syntactic parsing, whereas in the labeling step an ANN is fed a number of cluster features together with the chunks of a sentence and labels them with the associated semantic roles.

For the experiments, the author manually annotated more than one thousand Chinese clausal sentences, including imperatives, assertions and queries, grouping the syntactic elements into four categories: predicate verb, argument, particle and marker. The model achieved an accuracy of 83.8%, and the experimental results show the effectiveness of clustering.

A study [26] proposed the first automatic semantic role labeling system for Chinese verbs, using a pre-release version of the Chinese Proposition Bank. The authors used a maximum entropy classifier from the Mallet toolkit, with a tunable Gaussian prior to minimize overfitting. They reported results on two experiments: in the first, they obtained an F1 score of 92.7% using the hand-crafted parses in the treebank, and in the second, an F1 score of 93.9% using a fully automatic Chinese parser that integrates word segmentation, POS tagging and parsing.

In their experiments, the authors adapted features described in recent work on English semantic role labeling to Chinese, and focused on how verb classes can be induced from the "frame files" of the Penn Proposition Bank and used as features.

3.4. Semantic Role Labeling for Arabic Language

In [46], the authors proposed the first semantic role labeling system for Modern Standard Arabic, based on a supervised machine learning model that uses support vector machine (SVM) technology and standard features. They adopted an SRL model that uses SVMs to implement a two-step classification approach, i.e., boundary detection and argument classification. They trained and tested the model using the pilot Arabic PropBank data released with the SemEval 2007 data and obtained an F1 score of 94.06 on argument boundary detection and 81.43 on the complete semantic role labeling task using gold parse trees.

A study [27] developed semantic role labeling systems for Arabic using kernel methods capable of exploiting many aspects of the language's rich morphology. The proposed system is based on a supervised model that uses SVM technology for argument boundary detection and argument classification. The authors trained and tested their model on the pilot Arabic PropBank data released as part of the SemEval 2007 data, running the experiments with the SVM-Light-TK toolkit, which yields an F1 score of 82.17%.

3.5. Semantic Role Labeling for Amharic Language

The authors of [18] developed an automatic semantic role labeler for Amharic text using memory-based learning (MBL), proposing its general architecture in 2017. They developed and implemented feature extraction algorithms to extract 551 instances from 240 simple Amharic sentences and achieved an accuracy of 82.51% with default parameter values (i.e., number of nearest neighbors, distance metrics, class voting weights and feature weighting metrics) and 89.29% with optimized parameters of the MBL algorithm.

Different predicates have different senses in different Amharic sentences. For example, the two sentences "አማረ ቢላዋውን በሞረድ ሳለ" and "አማረ ጉንፋን ስለያዘው ክፉኛ ሳለ" are taken from two different domains; in them, the predicate "ሳለ" has different contextual meanings and takes different semantic roles. However, E. Yirga and her colleagues' work did not consider multiple senses of a predicate in simple Amharic sentences; rather, they considered only semantic role labeling of simple Amharic sentences whose predicates have a single sense. For example, the following simple Amharic sentences are labeled as follows:

1. አማረ [ARG0-AGT] ቢላዋ [ARG1-BEN] በሞረድ [ARGM-INS] ሳለ [REL] and

2. አማረ [ARG0-AGT] ጉንፋን ስለያዘው [ARGM-CAU] ክፉኛ [ARGM-MNR] ሳለ [REL]

In the first sentence above, the labeled arguments [ARG0-AGT], [ARG1-BEN] and [ARGM-INS] answer the questions "who sharpens?", "what is sharpened?" and "what tool is used to sharpen?", respectively, whereas in the second sentence the arguments [ARG0-AGT], [ARGM-CAU] and [ARGM-MNR] answer the questions "who coughs?", "why did Amare cough?" and "in what way did Amare cough?", respectively. Therefore, in these sentences the predicate "ሳለ" has different contextual meanings, and


each takes different role arguments. However, the previous work did not consider predicates with multiple contextual meanings. To address this problem, this study considers the contextual meaning of predicates during dataset collection, and we annotated the arguments of each sentence with respect to the sense of its predicate, based on the PropBank annotation guidelines provided by [42], before feeding the data to our model.

E. Yirga [18] used the memory-based learning (MBL) technique. MBL is a direct descendant of the classical k-nearest neighbor (k-NN) approach; however, it is sensitive to the chosen features and algorithm parameters. MBL is a shallow machine learning algorithm, so it requires manually extracted, task-specific features to train the model [21, 22]. This hand-crafted feature extraction consumes more time and causes feature imbalance.

The study [18] also used the k-NN algorithm, a supervised machine learning algorithm used for both classification and regression problems [36]. However, k-NN is a lazy learner (i.e., as the size of the data increases it does not learn anything from the training data; it simply uses the training data itself for classification) and works only with numeric feature representations [36].


CHAPTER FOUR: SYSTEM DESIGN AND IMPLEMENTATION

4.1. Introduction

In this chapter, we discuss the overall architecture of the proposed context-based semantic role labeler model for Amharic text. First, we illustrate the overall architecture of the proposed model, and then we describe the individual components of the proposed system architecture.

4.2. Proposed System Architecture

Figure 4. 1 Proposed System Architecture


The proposed system architecture consists of three main phases, preprocessing, training and testing (semantic role labeling), with different tasks inside each phase. In data preprocessing, we prepare the collected simple Amharic sentences in a suitable format. The architecture then uses embedding techniques to represent the preprocessed Amharic sentence as dense vectors, a Bi-LSTM network for sentence encoding to handle the sequence information of the input sentence in the forward and backward directions, and a multilayer perceptron (MLP) neural network classifier to generate the score of the role labels for each argument.

The following subsections give a detailed description of each phase of the proposed system architecture.

4.2.1. Preprocessing Phase

The preprocessing phase makes our dataset suitable for the neural network model for further processing. Text preprocessing is a preliminary step in natural language processing that transforms text into a machine-understandable format, allowing machine learning algorithms to work more easily [40]. For this study, the preprocessing stage was done manually and consists of normalization and POS tagging for each token (argument) of the input Amharic sentences.

As shown in Table 4.1 below, the final result of this phase is a normalized Amharic sentence together with its POS tags.

Table 4. 1 Sample output of the Preprocessing Phase

No | Sentence | Normalized | POS Tagged
1 | ቀነኒሳ በቀለ የማራቶን ሪከርድ ሰበረ | ቀነኒሳ በቀለ የማራቶን ሪከርድ ሰበረ | ቀነኒሳ NOUN በቀለ NOUN የማራቶን NOUNP ሪከርድ ADV ሰበረ VERB
2 | ዳንኤል እንጨት ሠበረ | ዳንኤል እንጨት ሰበረ | ዳንኤል NOUN እንጨት NOUN ሰበረ VERB


4.2.1.1. Normalization

Normalization is a process of converting a list of words to a more uniform sequence of

words to improve text matching [40]. It is a technique used as part of data preparation for

machine learning to change the values of numeric columns in the dataset to a common scale,

without distorting differences in the ranges of values. This technique helps prepare text for

later processing by transforming the words to a standard format. For example, In Amharic

language “ሠ” and “ሰ”, “ሀ” and “ሐ” converting all words to a single representation will

simplify the searching process.

In this study, we normalized only the list of predicates found in our collected dataset to a common representation, to simplify the handling of predicate senses during the semantic annotation of the dataset. Our proposed network model requires a normalized dataset, especially for predicates, in order to assign the correct role label to each argument in a sentence, since it uses the sense of the predicate as one feature during semantic role labeling. Therefore, predicates with the same sense but different written forms are represented by a common form, as shown in Table 4.2 below, which reduces the overall gradient oscillation time when representing each feature during training and minimizes the challenges faced by our neural network classifier.

Table 4. 2 Sample Normalized Verbs in the Dataset

Word Form 1 | Word Form 2 | Normalized
ሰበረ | ሠበረ | ሰበረ
ሰራ | ሠራ | ሰራ
አገኘ | ዐገኘ | አገኘ
ሄደ | ሔደ | ሄደ

In neural network models [40], different words have different features, representations and ranges of values, so during training the gradients can oscillate back and forth for a long time to find the global representation of each unique feature. To overcome this problem, we normalize our dataset and make sure the different features take on similar ranges of values before annotating the data and feeding it to the model, so that gradient descent converges more quickly for each feature.
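A minimal sketch of this character-level normalization is shown below; the mapping covers only the example characters discussed here and in Table 4.2 and is not a complete inventory of Amharic homophone characters.

# A minimal sketch of character-level normalization; the mapping is a
# partial, illustrative table based on the examples above.
NORMALIZATION_MAP = {
    "ሠ": "ሰ",  # e.g. ሠበረ -> ሰበረ
    "ሐ": "ሀ",
    "ሔ": "ሄ",  # e.g. ሔደ -> ሄደ
    "ዐ": "አ",  # e.g. ዐገኘ -> አገኘ
}

def normalize(word: str) -> str:
    # Replace every occurrence of a variant character with its canonical form.
    return "".join(NORMALIZATION_MAP.get(ch, ch) for ch in word)

assert normalize("ሠበረ") == "ሰበረ"
assert normalize("ሔደ") == "ሄደ"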


4.2.1.2. POS tagging

POS tagging is also called grammatical tagging or word category disambiguation, the

process of assigning an associated part of a speech tag to a word in a corpus based on its

context and definition [37]. POS tagging can be used as an intermediate step for higher

level NLP tasks such as parsing, Text to Speech (TTS), Information Retrieval (IR), shallow

parsing, Information Extraction (IE) semantic analysis, machine translation.

This process allows us to correctly classify the semantic roles of the sentence constituents (arguments) and the predicates in a given sentence. In POS tagging, the given Amharic sentence is tokenized into a sequence of words/phrases, where each set of words/phrases represents an argument of the sentence. To tag each sentence, in addition to the online Amharic tagger module, we used two linguistic experts from the Amharic department of Wollo University to assign the part-of-speech tag of each individual word in a sentence. In the tagged sentence, tokens with the "VERB" POS tag represent the predicates. In our collected dataset, each sentence has only one main verb (predicate), since this study focuses only on the simple Amharic sentence structure. For example, "አየለ ትናንት ቤት ገዛ" is a simple Amharic sentence taken from our collected dataset, in which the predicate is "ገዛ". The sentence looks like the following after applying the HaBiT online Amharic POS tagger module.

Table 4. 3 Sample Amharic Sentences Tagged by the Online Amharic Tagger Module

ID | Words/Tokens | POS Tag
1 | አየለ | NOUN
2 | ትናንት | ADV
3 | ቤት | NOUN
4 | ገዛ | VERB

1 | ደርበው | NOUN
2 | ስራውን | NDet
3 | ለቀቀ | VERB


4.2.2. Training Phase

The training phase of the system architecture contains two main components: the Bi-LSTM encoder and the biaffine attentional scorer. The Bi-LSTM encoder takes the word embeddings of the given sentence as input and generates a dense vector for each word, while the biaffine attentional scorer takes the hidden vectors of a given word pair as input and predicts a label score vector for that pair.

4.2.2.1. Annotated Amharic Sentences

As explained in the preprocessing phase, after collecting simple Amharic sentences from different social media platforms, we prepared semantically annotated Amharic sentences based on the PropBank annotation guidelines developed by [42]; the annotation contains the tokens of each sentence, the POS of each token and the head-dependent role of each argument. The head-dependent role represents the predicate "ID" and the predicate's relationship with each argument in a sentence. We used these annotated Amharic sentences to train our neural network model. In the annotated dataset, the arguments/tokens, POS tags and head/predicate of each argument are listed in a CoNLL file format, separated by whitespace, and a newline separates one sentence from the next; the proposed model uses this empty line to identify sentence boundaries in the annotated dataset during training. A minimal reader for this format is sketched below.
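The following sketch reads this whitespace-separated, CoNLL-style format; the column order follows Table 5.3 (ID, token, POS tag, predicate ID, role), and the function name and file handling are illustrative assumptions.

# A minimal reader for the CoNLL-style annotation format described above:
# one token per line, whitespace-separated columns, blank line between sentences.
def read_conll(path):
    sentences, current = [], []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:                 # blank line -> sentence boundary
                if current:
                    sentences.append(current)
                    current = []
                continue
            idx, word, pos, head, role = line.split()
            current.append({"id": int(idx), "word": word, "pos": pos,
                            "head": int(head), "role": role})
    if current:                          # last sentence if no trailing blank line
        sentences.append(current)
    return sentences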

4.2.2.2. Embedding Layer

The embedding layer is used in text processing to encode the input data, representing each token by a unique integer and mapping it to a dense vector. The embedding layer is initialized with random weights and learns an embedding for every word in the training dataset [21]. The embedding technique takes three parameters: the input dimension, the output dimension and the input length. The input dimension represents the size of the vocabulary in the text data, the output dimension represents the size of the output vectors of the embedding layer (the vector space in which the words are embedded), and the input length is the length of the input sequences.


In our case, we have two embedding representations: word embedding and POS embedding. Word embedding represents each argument in a sentence as a dense vector; for it, we initialized the three respective parameters with 100, 200 and 30 as the input dimension, output dimension and input length. POS embedding likewise represents each POS tag in a sentence as a dense vector; for it, we used 100, 200 and 28 as the input dimension, output dimension and input length.

Finally, these vectors are concatenated and passed to the next layer. We used the following technique to concatenate the result vectors of the word/argument and POS embeddings.

Figure 4. 2 Concatenation of Argument and POS tag embedding

The word representation of our model is the concatenation of a randomly initialized word/argument embedding e(a) and a randomly initialized part-of-speech tag embedding e(pos). So the final word representation is given by e = e(a) ⊕ e(pos), where ⊕ is the concatenation operator.
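A minimal Keras sketch of these two embedding lookups and their concatenation e = e(a) ⊕ e(pos) is given below. It uses the input and output dimensions reported above and, as a simplifying assumption, the same input length (30) for both the word and POS inputs so the two sequences can be concatenated position by position.

# A minimal sketch of the word and POS embedding lookups and their
# concatenation; the shared input length of 30 is a simplifying assumption.
from tensorflow.keras import layers, models

words = layers.Input(shape=(30,), name="arguments")  # word/argument ids
tags = layers.Input(shape=(30,), name="pos_tags")    # POS tag ids

# Randomly initialized embeddings, learned during training.
e_a = layers.Embedding(input_dim=100, output_dim=200)(words)
e_pos = layers.Embedding(input_dim=100, output_dim=200)(tags)

e = layers.Concatenate()([e_a, e_pos])   # final token representation
embedder = models.Model([words, tags], e)
print(embedder.output_shape)             # (None, 30, 400)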

4.2.2.3. Bi-LSTM Encoder

Bi-LSTM is a special kind of RNN that is powerful for sequence prediction problems because it stores past information and processes the sequence of input data in both the forward and backward directions [2].

In semantic role labeling, predicates in a sentence are treated as roots of a graph, whereas arguments are treated as nodes that depend on the root node (the predicate). Therefore, in this study we use a Bi-LSTM sentence encoder to handle the positional information of words in a sentence and to consider the context of the text


from left to right and from right to left in a sequence. In addition to encoding positional information, the Bi-LSTM network captures a sentence-aware representation of the given input sequence and reduces the need for feature engineering.

For example, consider how the Bi-LSTM works on the following Amharic sentence taken from our dataset: the input sequence x = (ሳሙኤል ምሳሩን በሞረድ ሳለ). The forward LSTM handles the sequence information from left to right, from position 1 to n, where n is the number of arguments/tokens in the sentence (here n = 4). In the given sentence, the forward LSTM processes the sequence starting from the word "ሳሙኤል" up to the final word "ሳለ", learning the positions of the words in the forward direction until the end token of the sentence, and then passes the sequence information to the output layer.

The backward LSTM handles the sentence information in the reverse direction, i.e., from n to 1: it processes the sequence starting from the token "ሳለ" back to the token "ሳሙኤል", likewise handling the positions of the words in the reverse direction.

The Bi-LSTM network then combines the results of the forward LSTM and the backward LSTM to produce the final context-aware output.

Table 4. 4 The Forward LSTM and Backward LSTM Context for the sentence "ሳሙኤል ምሳሩን በሞረድ ሳለ"

Word | Forward LSTM | Backward LSTM
ሳሙኤል | ምሳሩን, በሞረድ, ሳለ | ----
ምሳሩን | በሞረድ, ሳለ | ሳሙኤል
በሞረድ | ሳለ | ምሳሩን, ሳሙኤል
ሳለ | ---- | በሞረድ, ምሳሩን, ሳሙኤል

4.2.2.4. Multi-Layer Perceptron (MLP) Classifier

A multilayer perceptron (MLP) is an artificial neural network model that maps the given input data onto a set of appropriate outputs. It consists of at least three layers of nodes: an input layer, a hidden layer and an output layer. Except for the input nodes, the other


layers use a nonlinear activation function. MLP needs a combination of backpropagation

and gradient descent for training [49].

In this study, we use the role MLP classifier to predict the score of the roles between the arguments and the predicate in a sentence. To determine the role scores, the role MLP input layer takes two vectors: a word of the sentence (an argument) and the predicate of the sentence. The role MLP classifier then takes the predicate word as head and the possible arguments in the sentence as dependents. The inputs are pushed forward through the MLP by taking the dot product of the input with the weights between the input layer and the hidden layer; this dot product yields the input value for the hidden layer. The MLP applies rectified linear unit (ReLU) activation functions to generate the outputs at each computed layer. Once the calculated output at a hidden layer has been pushed through the activation function, it is passed to the next layer by taking the dot product with the corresponding weights, and this continues until the output layer is reached. At the output layer, the scores of the predicate word with the possible arguments in the sentence are generated.

Figure 4. 3 Visualize MLP classifier model used in semantic role labeling


Activation Function

In a neural network, the activation function is responsible for transforming the summed, weighted input of a node into the node's activation (output) for that input, mapping the resulting values into a range such as 0 to 1 or -1 to 1 [50]. Activation functions are a crucial component of deep learning: they determine the output of a deep learning model, its accuracy and the computational efficiency of training, and they can make or break a large-scale neural network.

ReLU is a non-linear activation function used in multilayer and deep neural networks [50]. It accelerates the training of deep neural networks compared to traditional activation functions, since the derivative of ReLU is 1 for any positive input. Because this derivative is constant, deep networks do not need additional time for computing error terms during training, and ReLU helps mitigate the vanishing gradient problem.

For this study, we used the rectified linear unit (ReLU) activation function, since it is computationally efficient, makes the network model converge quickly, and is differentiable, which allows backpropagation.

To illustrate how our model calculates the role scores between the predicate and the possible arguments in a sentence, consider the sentence S = "ዳንኤል ቢላዋ ሳለ", rooted at an artificial node "ROOT". In this sentence, "ሳለ" is the predicate and there are two arguments, "ዳንኤል" and "ቢላዋ". The input layer takes the words/arguments and their part-of-speech (POS) tags and feeds them to the embedding layer, which generates vector representations of the input words and POS tags. These vectors are concatenated and used as input to the LSTM layers (the forward and backward LSTM layers). The outputs of the LSTM layers are concatenated to produce the input of the role MLP layers, which predict the role scores between the predicate and the possible arguments in the sentence.


Finally, the role MLP classifier generates scores between the arguments and the predicate word in a sentence with respect to each semantic role label.

Table 4. 5 The MLP Scoring Function on the sentence "ዳንኤል ቢላዋ ሳለ"

Argument | Predicate | Semantic Role | Role MLP Classifier
ዳንኤል | ሳለ | ARG0-AGT | S(ዳንኤል, ሳለ, ARG0-AGT)
ዳንኤል | ሳለ | ARG1-BEN | S(ዳንኤል, ሳለ, ARG1-BEN)
ዳንኤል | ሳለ | REL | S(ዳንኤል, ሳለ, REL)
ቢላዋ | ሳለ | ARG0-AGT | S(ቢላዋ, ሳለ, ARG0-AGT)
ቢላዋ | ሳለ | ARG1-BEN | S(ቢላዋ, ሳለ, ARG1-BEN)
ቢላዋ | ሳለ | REL | S(ቢላዋ, ሳለ, REL)

As Table 4.5 shows, the MLP scoring function calculates, for each argument and the given predicate in a sentence, a score for each possible semantic role; based on these scores, the role MLP classifier labels a given argument by selecting the role with the maximum score. For example, let the score S(ዳንኤል, ሳለ, ARG0-AGT) be 5 and the score S(ዳንኤል, ሳለ, ARG1-BEN) be 4; the role MLP classifier then selects ARG0-AGT for (ዳንኤል, ሳለ), the maximum of the MLP scoring function. A sketch of such a scorer follows.
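The following minimal Keras sketch shows such a role-scoring MLP: the Bi-LSTM vectors of an argument and of the predicate are concatenated and passed through a ReLU layer to produce one score per role label. The hidden sizes and the small label inventory are illustrative assumptions.

# A minimal role-scoring MLP sketch; dimensions and labels are toy values.
import tensorflow as tf
from tensorflow.keras import layers, models

ROLE_LABELS = ["ARG0-AGT", "ARG1-BEN", "ARGM-TMP", "REL"]  # illustrative
hidden_dim = 32                               # size of one Bi-LSTM output vector

arg_vec = layers.Input(shape=(hidden_dim,))   # argument representation
pred_vec = layers.Input(shape=(hidden_dim,))  # predicate representation

pair = layers.Concatenate()([arg_vec, pred_vec])  # (argument, predicate) pair
h = layers.Dense(64, activation="relu")(pair)     # hidden MLP layer
scores = layers.Dense(len(ROLE_LABELS))(h)        # one score per role label

role_scorer = models.Model([arg_vec, pred_vec], scores)
out = role_scorer([tf.zeros((1, hidden_dim)), tf.zeros((1, hidden_dim))])
print(out.shape)  # (1, 4): one score per role for this pair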

4.2.3. Testing/Semantic Role Labeling Phase

The testing (semantic role labeling) phase of the proposed system architecture uses the network parameters of the trained model obtained from the training phase to predict the role label of each argument in a given sentence and to generate a semantically annotated Amharic sentence.

4.2.3.1. Predict the Score of Each Role Label

In semantic role labeling, the neural network described previously is used only to predict the scores of arguments and relations based on the content of the words and POS tags. In this stage, the trained model is loaded to compute a role-label score for each (predicate, argument) pair in the sentence. After the scores of all (predicate, argument) combinations in a given sentence have been computed, we apply a maximum-selection algorithm to choose, for each pair, the semantic role label with the highest score.

4.2.3.2. Select the Maximum-Score Role Label

In this step, we apply the maximum-score selection algorithm to select the highest-scoring semantic role label between the predicate and each argument in a sentence, building a semantically labeled Amharic sentence. The algorithm takes the incoming role-label scores for each word in a sentence as input, then greedily selects the highest-scoring role label and decides on that basis, repeating for each word in the sentence (see the sketch below).
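A minimal sketch of this greedy selection step, assuming one row of role-label scores per argument, is as follows.

# A minimal greedy maximum-score selection sketch: the role label with the
# highest score wins for each (argument, predicate) pair.
import numpy as np

def select_roles(score_matrix, labels):
    # score_matrix: one row of label scores per argument in the sentence.
    return [labels[int(np.argmax(row))] for row in score_matrix]

labels = ["ARG0-AGT", "ARG1-BEN", "REL"]
scores = np.array([[5.0, 4.0, 0.1],    # e.g. scores for (ዳንኤል, ሳለ)
                   [1.0, 3.5, 0.2]])   # e.g. scores for (ቢላዋ, ሳለ)
print(select_roles(scores, labels))    # ['ARG0-AGT', 'ARG1-BEN']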

4.2.3.3. Generate Labeled Amharic Sentences

This section shows semantically labeled Amharic sentences generated by our neural network model, based on the scores of the argument and role-label pairs predicted by the proposed network model. The following table shows the result generated for the sentence "መልካሙ አምስት ፊልሞችን ደረሰ".

Table 4. 6 Sample Labeled Amharic Sentence Generated by the Model

No | Tokens/Arguments | POS Tag | Predicate | Role Label
1 | መልካሙ | NOUN | 4 | ARG0-AGT
2 | አምስት | NUMCR | 4 | ARG1-BEN
3 | ፊልሞችን | NDet | 4 | ARG1-BEN
4 | ደረሰ | VERB | 0 | REL


4.3. The Proposed Network Model

Figure 4. 4 Proposed Network model

The proposed network model takes as input a sequence of tokens (each token representing an argument of the given Amharic sentence) and their associated POS tags, and performs word and tag embedding, each within its own embedding dimension, i.e., 100 embedding dimensions for words and 30 for their associated POS tags. These two embedding results are concatenated before being fed to the Bi-LSTM encoder. The Bi-LSTM encoder takes the sequence of word and POS tag embeddings of the given sentence from the embedding layer as input and generates a dense vector for each word as output.

The multilayer perceptron layer maps the input data from the Bi-LSTM encoder onto a set of appropriate outputs. It uses a biaffine attentional scorer, which takes the hidden vectors of a given word pair as input and predicts a label score vector. For this study, we used the role MLP classifier, which predicts the score of the roles between the arguments and the predicate in a sentence and generates the role-label scores as output.


CHAPTER FIVE: RESULTS AND DISCUSSION

5.1. Introduction

In this chapter, the experimental environment, the tools and algorithms used, the evaluation techniques and the results obtained from the experiments are presented in detail. In addition, we describe the data we collected and the dataset preparation techniques applied to transform it from raw format into the neural-network-readable (CoNLL) format, since this format is suitable for our neural network model.

5.1.1. Dataset Collection

We collected 2000 non-domain-specific simple Amharic sentences from different social media platforms and student textbooks, including sentences that contain predicates with more than one contextual meaning. We then applied data preparation steps such as normalization and POS tagging to the sentences before feeding them to the neural network model.

Table 5. 1 Collected Sample Amharic Sentences with their Domains

No | Collected Sentences | Domain
1 | ሞላ ትናንት ህይወቱ አለፈ | Health
2 | የኢትዮጵያ ብሄራዊ ቡድን ለአፍሪካ ዋንጫ አለፈ | Sport
3 | መስፍን አዲስ ጫማ ገዛ | Science and Technology
4 | አማረ ጉንፋን ስለያዘው ክፉኛ ሳለ | Health
5 | ሰለሞን ልቦለድ ደረሰ | Entertainment
6 | ኢትዮጵያዊው አትሌት በለንደን ኦሎምፒክ የወርቅ ሜዳሊያ አገኘ | Sport
7 | አማረ ትልቅ ስዕል ሳለ | Entertainment
8 | የጀርመኑ ጠፈር ተመራማሪ አዲስ ፕላኔት አገኘ | Science and Technology


The following table shows the amount of data we collected from different social media platforms and student textbooks in each domain.

Table 5. 2 Amount of Data Collected from Different Sources

No | Source | Health | Sport | Science & Technology | Entertainment | Sum
1 | Walta News | 300 | 200 | 200 | 200 | 900
2 | Amhara Mass Media | 200 | 300 | 200 | 200 | 900
3 | Student Textbooks | 200 | | | | 200
 | Total | | | | | 2000

5.1.2. Dataset Preparation

This section describes the data used for training and testing the proposed network model and the main data preparation methods. After collecting the data, we performed preprocessing (normalization and POS tagging), identified the predicates with more than one contextual meaning and their corresponding arguments, and then semantically annotated the data based on the PropBank annotation guidelines defined by [42].

As explained above, the role of semantic role labeling is to determine the arguments semantically related to the predicate, answering the questions "who did what to whom", "when" and "where". The predicates are the main verbs of the sentence, each taking different arguments, and each individual argument of a sentence is assigned a role label depending on the predicate (main verb). For this study, we used the predicate as "head" or "root" and each associated argument as a dependent, and based on the predicate we assigned to each argument an "ID" that identifies the predicate of the sentence.


For example, the sentence "አየለ ትናንት ቤት ገዛ" looks like the following after applying the POS tagging technique and assigning an "ID" to each argument.

Table 5. 3 Predicate-Argument Relation

ID | Words/Tokens | POS Tag | Predicate | Role
1 | አየለ | NOUN | 4 | ARG0-AGT
2 | ትናንት | ADV | 4 | ARGM-TMP
3 | ቤት | NOUN | 4 | ARG1-BEN
4 | ገዛ | VERB | 0 | REL

The ID, word/token, POS tag and role of each token are separated by whitespace, as in "አየለ NOUN 4 ARG0-AGT", and two different sentences are separated by a newline.

5.2. Experiment/Implementation

We used the Python 3.7 programming language installed on a Windows environment to develop the model, and the experiments were conducted on a prototype developed with the Keras package (with TensorFlow as backend) on an Intel Core i5-6200 CPU with 4 GB of RAM.

For the experiments, we used 70% of the dataset for training and split the remaining 30% into two equal parts for testing and validation, i.e., 300 sentences for testing and 300 for validation, given the size of the dataset. Accordingly, we selected 1400 sentences for training, 300 for testing and 300 for validation throughout the experiments, allocating 70% of the dataset for training and the remaining 30% for testing and validating the model. A sketch of this split is shown below.

collected dataset, we have a total of 40 ambiguous verbs or predicates that can take different

sets of role labels to their associated arguments according to their senses. In our case, 30

of the predicates have two different interpretations and also 10 of the predicates have three

contextual meanings in a sentence in which each of them takes a different set of role labels

to represent their arguments. During our data annotation process, we have considered two

up to three senses of each ambiguous verb. So, we have a total of 80 sentences that are

Page 73: CONTEXT-BASED SEMANTIC ROLE LABELER FOR AMHARIC …

59 | P a g e

annotated based on the sense of their predicates. In addition to the number of single sense

simple Amharic sentences, we have used 50 of the multi-sense predicate sentences for

training the model and the rest 20 for testing as well as 10 of them for validation. However,

E. Yirga and her colleague’s dataset is 200 simple Amharic sentences which is collected

simply from student Amharic and which does not consider the sense of polysemy verbs

during data annotation (semantic role label assignment) and impossible to get the data. So,

this makes it difficult to use their dataset to our work since disambiguating polysemy verbs

during data annotation is one of the tasks done in this research work.

Table 5.4 Training, Testing and Validation Dataset Description

No | Dataset Section | Number of Sentences
1 | Training Dataset | 1400
2 | Testing Dataset | 300
3 | Validation Dataset | 300
 | Total | 2000
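The following minimal sketch shows one way to produce this 70/15/15 split, reusing the hypothetical read_sentences helper from Section 5.1.2; the file name and the random seed are illustrative assumptions, not details from the thesis.

```python
import random

sentences = read_sentences("amharic_srl.conll")  # hypothetical dataset file
random.seed(42)                                  # assumed seed, for reproducibility
random.shuffle(sentences)

n = len(sentences)                               # 2000 sentences in this study
train = sentences[: int(0.70 * n)]               # 1400 sentences for training
test = sentences[int(0.70 * n): int(0.85 * n)]   # 300 sentences for testing
valid = sentences[int(0.85 * n):]                # 300 sentences for validation
```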

5.2.1. Hyperparameter Tuning

While building the LSTM model, hyperparameters such as dropout, learning rate and batch size have to be set properly to obtain accurate predictions from the proposed network model. A study by [21] suggests that finding the optimal hyperparameters for a model yields higher accuracy and reduces the risk of overfitting. For our study, we identified the most important hyperparameters, with the optimal values that fit our network model and achieved the best accuracy, as presented in Table 5.5 below. During training, we tuned each hyperparameter one at a time and selected the value that achieved good training and testing accuracy.

The learning rate is a tuning parameter of the optimization algorithm that determines the step size at each iteration, controlling how strongly the model responds to the estimated error. Choosing a suitable learning rate is therefore important when training deep neural network models: too large a value causes the model to converge too quickly to a suboptimal solution, whereas too small a value makes the training process very long and prone to getting stuck.

We trained the Bi-LSTM model with different dropout and learning rate values. With dropout = 0.33 and learning rate = 0.001, the proposed model achieved 92.5% training and 79.9% testing accuracy, whereas with dropout = 0.5 and the default learning rate of 0.0001, it achieved 94.5% training and 83.8% testing accuracy. Based on these results, we selected a Bi-LSTM dropout of 0.5 and the default Adam learning rate of 0.0001. In other words, to improve the performance of the deep Bi-LSTM semantic role labeling classifier, we increased the dropout of the Bi-LSTM network, kept the optimizer's (Adam's) default learning rate, and decreased the dropout of the MLP classifier to 0.01 to reduce misclassified role labels on unseen data.

Table 5.5 List of Hyperparameters Used in the Model with their Description

No | Parameter | Value | Description
1 | Epochs | 50 | The number of passes of the whole training data through the network; one epoch is one such pass. The model is trained for 50 cycles over the training data.
2 | Learning rate | 0.001 | Controls how much the model changes in response to the estimated error each time the weights are updated.
3 | Embedding dropout | 0.3 | Dropout randomly selects cells in a layer, according to the chosen probability, and sets their outputs to 0 to reduce overfitting. We experimented with dropout values between 0 and 1 and kept those giving the best results.
  | Bi-LSTM dropout | 0.5 |
  | MLP dropout | 0.01 |
4 | Bi-LSTM layers | 3 | The number of layers in the Bi-LSTM network encoder.
5 | Word embedding dimension | 100 | The dimension of the vector representing each token: 100 for arguments and 30 for their associated POS tags.
  | Tag embedding dimension | 30 |
6 | Batch size | 4 | The number of training sentences used in one iteration to estimate the error gradient for the learning algorithm.
7 | Activation function | ReLU | Activation functions determine the outputs of a deep learning model and affect its accuracy; we used ReLU since it is computationally efficient for backpropagation.
8 | Optimizer | Adam | Optimizers adjust the attributes of the neural network, such as the weights, to reduce the loss; we used Adam since it is computationally efficient and requires little memory.
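To make the configuration in Table 5.5 concrete, the following is a minimal Keras sketch of an embedding, Bi-LSTM and MLP architecture using these hyperparameter values. It is an illustration under assumptions, not the exact thesis code: the vocabulary sizes, maximum sentence length and LSTM hidden size (128) are hypothetical, since they are not specified in the table.

```python
import tensorflow as tf
from tensorflow.keras import layers

MAX_LEN, WORD_VOCAB, POS_VOCAB, NUM_LABELS = 20, 5000, 17, 22  # assumed sizes

words = layers.Input(shape=(MAX_LEN,), name="words")
tags = layers.Input(shape=(MAX_LEN,), name="pos_tags")

w_emb = layers.Embedding(WORD_VOCAB, 100)(words)   # 100-dim word embeddings
t_emb = layers.Embedding(POS_VOCAB, 30)(tags)      # 30-dim POS-tag embeddings
x = layers.Dropout(0.3)(layers.Concatenate()([w_emb, t_emb]))  # embedding dropout

for _ in range(3):                                 # 3 Bi-LSTM encoder layers
    x = layers.Bidirectional(
        layers.LSTM(128, return_sequences=True, dropout=0.5))(x)  # Bi-LSTM dropout

x = layers.Dropout(0.01)(layers.Dense(128, activation="relu")(x))  # MLP with ReLU
outputs = layers.Dense(NUM_LABELS, activation="softmax")(x)        # role per token

model = tf.keras.Model([words, tags], outputs)
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit([X_words, X_tags], y_roles, batch_size=4, epochs=50, ...)
```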


5.2.2. Proposed Model Training and Validation Accuracy

As depicted in Figure 5.1 below, the proposed model achieved 95.5% training accuracy and 81.2% validation accuracy without the multi-sense predicate annotations in the dataset. In this case, there are 19 classes (sets of role labels) for classifying each argument in a sentence.

Figure 5.1 Proposed Model Training and Validation Accuracy

As shown in Figure 5.1 above, the training and validation accuracy curves follow the same trend, both increasing from epoch 2 up to epoch 10. This suggests that both the training and validation sets contain relatively easy examples and little complex data, so the model can readily learn the important features from the input, is not strongly challenged by the validation data, and generalizes easily to unseen data in this phase.

Looking at the overall performance, the model achieves a very good training accuracy, but the validation accuracy is not as good. From epoch 10 up to epoch 30 the training and validation curves run neck and neck, whereas from epoch 30 up to epoch 45 the training curve rises much faster than the validation curve. This means the model is very good at feature learning but does not obtain enough information to generalize to unseen data, indicating that the training set contains easier examples than the validation set: because of the complexity of the validation data, the model fails to generalize for some features.

Generally, from epoch 0 to epoch 50 both the training and validation curves ultimately increase, which means the model extracts the important features from the input data as the amount of data grows, and can readily make decisions on new data based on what it learned during the training phase.

5.2.3. Proposed Model Training and Validation Loss

Training loss is the error on the training set, whereas validation loss is the error obtained after running the validation set through the trained network. The training and validation loss of the proposed model is shown in Figure 5.2 below.

Figure 5.2 Proposed Model Training and Validation Loss


The aim is to make the validation loss as low as possible. Most researchers explain that the appearance of a small amount of overfitting in a network model is nearly always present and does not prevent the model from making accurate predictions on unseen data.

As depicted in Figure 5.2 above, both the validation and training losses drop as the number of epochs increases, indicating that the network model fits the given data better and better. However, due to the nature of the dataset, at a certain point the training loss continues to drop while the validation loss begins to rise, which indicates some overfitting in the model.

5.3. Experiments

In this study, we performed an additional experiment to show the effect of sense-aware annotation on the performance of the semantic role labeler. The experiment examined the usefulness of context-based data annotation, i.e., preparing semantically annotated data for each sense of a predicate. The proposed neural network model takes a sequence of words/arguments and the corresponding sequence of POS tags as input, and assigns each argument a label indicating its role with respect to the main verb (predicate) of the sentence.

We were therefore motivated to consider the context of predicates when assigning a semantic role label to each argument. We considered each sense of the predicates in our dataset, prepared data for each sense, and then ran an experiment with these datasets to evaluate model performance. For this experiment, we again used 70% of the dataset for training and the remaining 30% for testing and validation, with the same hyperparameter values as in Table 5.5, except that the context-based annotation introduced two additional modifier role labels, "ARGM-CAU" and "ARGM-EXT", which appear only in this experiment. In addition, 50 sentences with ambiguous words were used for training, 20 for validation and 10 for testing the model.


Experiment I: Evaluating the Role of Predicate-Sense-Based Data Annotation on Semantic Role Labeling Performance

The proposed model takes a sequence of words/arguments and predicates, determines the relationship between each argument and the predicate using the MLP scoring function, and uses the resulting predicate-argument scores to label the given sentence. During role labeling, the classifier assigns each argument the role label with the maximum predicate-argument score.
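The scorer can be illustrated with a minimal biaffine attention sketch in the spirit of [48]. The layer below computes, for every argument position and every role label, a bilinear term between the argument and predicate representations plus a linear term; all dimensions and names are illustrative assumptions, not the exact thesis implementation.

```python
import tensorflow as tf

class BiaffineScorer(tf.keras.layers.Layer):
    """score(arg, pred)[label] = arg^T U[label] pred + W [arg; pred] + b."""

    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        # One bilinear weight matrix per role label.
        self.U = self.add_weight(shape=(num_labels, hidden_dim, hidden_dim),
                                 initializer="glorot_uniform", name="U")
        self.linear = tf.keras.layers.Dense(num_labels)  # W and bias b

    def call(self, arg_repr, pred_repr):
        # arg_repr: (batch, seq_len, hidden); pred_repr: (batch, hidden)
        bilinear = tf.einsum("bsh,lhk,bk->bsl", arg_repr, self.U, pred_repr)
        pred_tiled = tf.repeat(pred_repr[:, None, :],
                               tf.shape(arg_repr)[1], axis=1)
        linear = self.linear(tf.concat([arg_repr, pred_tiled], axis=-1))
        return bilinear + linear  # (batch, seq_len, num_labels) role scores
```

The role assigned to each argument is then the argmax over the label axis of these scores.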

As shown in Table 5.6 below, the proposed model achieved 95.5% training and 81.2% validation accuracy on the dataset annotated without regard to predicate senses, whereas it achieved 95.9% training and 83.8% validation accuracy on the dataset annotated according to predicate senses. Based on this result, we conclude that context-based data annotation depending on the senses of predicates, combined with suitable hyperparameter values such as the learning rate and dropout, increases the performance of the proposed Bi-LSTM network model on semantic role labeling.

Table 5.6 Comparison of Role Labeler Performance with and without Context of Multi-Sense Predicates

Evaluation | Without Predicate Sense Dataset | With Predicate Sense Dataset
Training Accuracy | 95.5% | 95.9%
Validation Accuracy | 81.2% | 83.8%
Testing Accuracy | 82.6% | 84.9%

Some overfitting appeared in the network model. To address this, researchers recommend various overfitting reduction techniques, such as increasing the dropout of the neural network or training the model with a much larger dataset. In our study, we increased the Bi-LSTM dropout from 0.3 to 0.5 to mitigate the overfitting.


Figure 5.3 Role Labeling Performance with Predicate-Sense-Based Annotated Data

As shown in Figure 5.3 above, the experimental results demonstrate the effectiveness of considering predicates across different domains and of predicate-sense-based data annotation for semantic role labeling, which also reduces overfitting of the neural network model. Based on this, we suggest that annotating the dataset according to predicate sense and using optimal hyperparameter values yields better training and testing accuracy for semantic role labeling tasks.

5.3.1. Evaluation

To evaluate the performance of the model, we split the dataset into training, testing and validation sets. From the 2000 sentences, we used 1400 for training, 300 for testing and 300 for validating the MLP classifier, i.e., 70% of the dataset for training and the remaining 30% for testing and validation; this splitting rule is adequate when the training and testing sets contain enough examples to yield reliable accuracy estimates. The classifier is tested on unseen data, and its classification performance is evaluated by comparing the role labels manually assigned to each argument of the sentences in the validation set with the role labels (classes) correctly recognized by the proposed model. For this purpose, we used a validation set containing 200 simple Amharic sentences.

5.3.1.1. Evaluation Metrics

To evaluate the performance of the proposed network model, we used the precision, recall and F1-score evaluation metrics. These metrics are computed from the numbers of true positives, true negatives, false positives and false negatives, i.e., by counting the labels (classes) correctly and incorrectly recognized by the classifier. These terms can be described as follows:

True positive (TP): refers to the number of classes or labels in which we predicted YES

and the actual output was also YES.

False positive (FP): refers to the number of classes or labels in which we predicted YES

and the actual output was NO.

True negative (TN): refers to the number of classes or labels in which we predicted NO

and the actual output was NO.

False negative (FN): refers to the number of classes or labels in which we predicted NO

and the actual output was YES.

Precision: the number of correctly predicted positive results divided by the total number of positive predictions made by the classifier.

Precision = TP / (TP + FP)

Recall: the ratio of correctly predicted positive observations to all observations whose actual class is yes.

Recall = TP / (TP + FN)


F1-score: also known as the F-measure, used to measure the test accuracy of the model. It is the harmonic mean of precision and recall and ranges between 0 and 1. It reflects both how precise the classifier is (how many instances it classifies correctly) and how robust it is (that it does not miss a significant number of instances).

F1-Score = 2 × (Precision × Recall) / (Precision + Recall)
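As a minimal sketch, these per-class metrics and their macro averages can be computed with scikit-learn; the gold and predicted label lists below are hypothetical examples, not data from this study.

```python
from sklearn.metrics import precision_recall_fscore_support

gold = ["ARG0-AGT", "ARGM-TMP", "ARG1-BEN", "REL"]  # manually assigned roles
pred = ["ARG0-AGT", "ARGM-TMP", "ARG1-PAT", "REL"]  # roles predicted by a model

precision, recall, f1, _ = precision_recall_fscore_support(
    gold, pred, average="macro", zero_division=0)
print(f"Precision={precision:.2%}  Recall={recall:.2%}  F1={f1:.2%}")
```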

On the testing dataset, the model achieved the average precision, recall and F1-score values depicted in Table 5.7 below.

Table 5.7 Average Precision, Recall, and F1-Score on the Testing Dataset

Dataset | Accuracy | Precision | Recall | F1-Score
Testing Dataset | 84.9% | 82.56% | 80.85% | 81.72%

In addition to the average values, the precision, recall and F1-score of each individual class are shown in Table 5.8 below.

Table 5.8 Individual Role Label Performance on the Predicate-Sense-Based Annotated Data


Table 5.9 Testing Result of the Semantic Role Labeler Model

Total number of sentences | Total number of arguments | Arguments manually assigned | Arguments the system correctly determined | Testing result (%)
300 | 1530 | 1530 | 1300 | 84.9%

As shown in Table 5.9 above, we used 300 simple Amharic sentences, containing a total of 1530 arguments, for testing the model. The dataset contains 21 role labels (unique classes), and each of the 1530 arguments was manually assigned a role label during data annotation. Of these 1530 arguments, the proposed system assigned the correct role label to 1300, i.e., the model achieves 1300/1530 ≈ 84.9% testing accuracy on unseen data.

The system also assigned role labels to the remaining 230 arguments based on the information obtained during the training phase, but these labels do not match the actual values assigned during data annotation. The 84.9% argument role labeling accuracy is thus obtained by matching the role manually assigned to each argument during data annotation against the role assigned to it by the proposed model.


Figure 5.4 Sample Semantic Role Labels Assigned by the Model on the Testing Data

As depicted in Figure 5.4 above, some arguments are classified incorrectly by the model. These errors arise from the nature of the dataset, since we used a manual data annotation technique, and from the high similarity between some pairs of classes.

Furthermore, Amharic is a morphologically rich language that carries morpho-syntactic information and contains many ambiguous words; this makes the dense vector representation of words challenging for the model, so some arguments are misclassified. Such representation challenges could be reduced by language-expert-based data collection and preparation, by integrating a morphological analyzer, and by reducing the number of ambiguous words in the dataset.

Another cause of classification errors on unseen data is the sequential segmenting and labeling problem [34], which arises from the difficulty of deciding on a standard set (and number) of roles and producing formal definitions for them in the presence of the syntactic variation of sentences.


5.4. Discussion of Results

In this study, we designed and developed a context-based semantic role labeler model for the Amharic language, which uses a multilayer perceptron classifier to predict the scores of predicate-argument relationships in simple Amharic sentences, and we used the trained neural network model to assign role labels to the arguments of new Amharic sentences. The model contains a neural embedding layer that takes a sequence of words and their associated POS tags and creates dense vector representations of the given text. A Bi-LSTM encoder is then applied to the embedding output to capture the sequence information in both the forward and backward directions. The output of the Bi-LSTM encoder is passed to an MLP classifier, which uses a biaffine attentional scoring technique to predict the score of each (argument, role label) pair in the sentence; the classifier then selects the maximum-scoring label for each argument. We conducted experiments to examine the performance of the proposed model on predicate-sense-based annotated simple Amharic sentences. The experimental results show that the proposed model achieves 95.9% training and 84.9% testing accuracy, and that it performs well on the predicate-sense-based annotated dataset with the selected optimal hyperparameter values, such as the dropout, learning rate and batch size.

We have not compared the performance of the model with other semantic role labeler models due to the absence of pretrained language models for the Amharic language. We have also not compared it with the work of E. Yirga and her colleagues, because their dataset is not available from any online data source; moreover, since we used dependency-based data annotation, the data format, data size and classification methods differ.


CHAPTER SIX: CONCLUSION AND RECOMMENDATION

6.1. Conclusion

Semantic role labeling is the process of identifying the constituents and predicates of a sentence and assigning each of them a label that expresses its semantic role in the sentence. During role label assignment, considering the context of the predicate is necessary to annotate each argument correctly.

In this study, we developed a context-based semantic role labeler model for simple Amharic sentences using a deep neural network, with predicate-sense-based data annotation. Simple Amharic sentences were collected from different social media platforms; predicates with more than one sense were identified; and the dataset was semantically annotated according to the PropBank annotation guidelines for each predicate, depending on its sense, in order to show the effect of sense on SRL tasks. We used a Bi-LSTM encoder and an MLP neural network classifier with a biaffine attentional scoring technique to predict the scores of the labels. The proposed model achieves 95.9% training and 84.9% testing accuracy on the predicate-sense-based annotated dataset.

6.2. Contributions of the Study

This study provides the following scientific contributions:

✓ A context-based semantic role labeler for the Amharic language, based on dependency-based SRL for simple Amharic sentences.
✓ A semantically annotated dataset of 2000 Amharic sentences that can serve as a resource for other researchers.
✓ A set of 40 ambiguous verbs with more than 80 predicate senses in total, together with their associated role labels, in the simple Amharic sentence dataset. Previous researchers did not consider these senses; including them and testing their effect on the SRL task increased the training and testing accuracy by 0.4% and 2.6%, respectively.
✓ A demonstration of the effectiveness of deep Bi-LSTM networks and verb senses for semantic role labeling tasks.


6.3. Recommendations

✓ During dataset annotation in this research, we considered verb sense as a feature for semantic role labeling, targeting simple Amharic sentences only. Since Amharic has several other kinds of sentences, applying this feature to all kinds of Amharic sentences is expected to improve the performance of the SRL task and is left as future work.
✓ For this study, we used a corpus semantically annotated by language experts to train the role labeler model, which is time-consuming and tedious when preparing a large dataset. In the future, incorporating automatic POS tagging and semantic role annotation modules into the proposed solution would minimize the manual effort.
✓ In deep learning, a large corpus with a balanced class distribution strongly affects classification performance; indeed, one of the basic limitations of deep learning is its need for a large corpus. Our study used only 2000 sentences with an imbalanced class distribution, so in the future the proposed solution should be evaluated on a larger and well-balanced corpus.


REFERENCES

[1] A. Lopez, “Natural language understanding on semantic role labeling,” School of

Informatics, University of Edinburgh, Mar. 27, 2018.

[2] J. Cai, S. He, Z. Li and H. Zhao, "A Full End-to-End Semantic Role Labeler, Syntax-agnostic Over Syntax-aware," Proceedings of the 27th International Conference on Computational Linguistics, pages 2753–2765, Santa Fe, New Mexico, USA, Aug. 20-26, 2018.

[3] K. Jaideep, “Five applications of natural language processing for businesses,” Ju. 28, 2019.

[Online]: Available: https://www.upgrad.com/blog/5-applications-of-natural-language-

processing-for-businesses/ [Accessed 25 Oct. 2019].

[4] Dr. P. Merlo, “Semantic roles in natural language processing and in linguistic theory,”

Universit´e de Gen`eve, departement de Linguistique, Oct. 9, 2009.

[5] D. Shen, and M. Lapata, "Using semantic roles to improve question answering," In

EMNLP-CoNLL, pp. 12-21. Ju. 2007.

[6] M. Palmer, D. Gildea and P. Kingsbury, “The proposition bank: an annotated corpus of

semantic roles,” association for computational linguistics, 11 Ju., 2005.

[7] M. Mohamed and M. Oussalah, “SRL-ESA-TextSum: A text summarization approach based

on semantic role labeling and explicit semantic analysis,” Information in processing and

management at the university of Birmingham, pages 1356–1372, retrieved 20 Feb. 2019;

accepted 12 Ap. 2019.

[8] J. Christensen, Mausam, S. Soderland and O. Etzioni, “Semantic role labeling for open

information extraction,” in proceedings of the NAACL HLT 2010 first international

workshop on formalisms and methodology for learning by reading, pages 52–60, Los

Angeles, California, Ju. 2010.


[9] L. He, K. Lee, M. Lewis and L. Zettlemoyer, “Deep semantic role labeling: what works and

what’s next,” proceedings of the 55th annual meeting of the association for computational

linguistics, pages 473–483 Vancouver, Canada, Ju. 30 – Au. 4, 2017.

[10] L. He, M. Lewis and L. Zettlemoyer, “Question-Answer driven semantic role labeling using

natural language to annotate natural language,” University of Washington Seattle, WA.

[11] R. Cai and M. Lapata, “Syntax-aware semantic role labeling without parsing,” transactions

of the association for computational linguistics, vol. 7, pp. 343–356, 6 Ju., 2019.

[12] F. Qian, L. Sha, B. Chang, L. Liu and M. Zhang, “Syntax aware LSTM model for semantic

role labeling,” in proceedings of the 2nd workshop on structured prediction for natural

language processing, pages 27–32 Copenhagen, Denmark, Sep. 7–11, 2017.

[13] Z. Tan, M. Wang, J. Xie, Y. Chen and X. Shi, “Deep semantic role labeling with self-

attention,” association for the advancement of artificial intelligence, Xiamen University,

China, Dec. 5, 2018.

[14] ጌታሁን አማረ “ዘመናዊ የአማርኛ ሰዋሰው በቀላል አቀራረብ”, Addis Ababa, 1989 (EC).

[15] D. Alfano, R. Abbruzzese and D. Cappetta, “Neural semantic role labeling using verb sense

disambiguation,” 2019.

[16] T. Pham, X. Pham and P. Le-Hong, “Building a semantic role labelling system for

Vietnamese,” 11 May 2017.

[17] ባ. ይማም, የአማርኛ ሰዋሰው የተሻሻለ ሁለተኛ ዕትም, አዲስ አበባ, ጥቅምት 2001.

[18] E. Yirga, “Semantic role labeler for Amharic text using memory-based learning,” (MSC

thesis) Addis Ababa University, Ju. 2017.

[19] K. Peffers, T. Tuunanen, M. A. Rothenberger and S. Chatterjee, “A Design Science

Research Methodology for Information Systems Research,” Published in Journal of

Management Information Systems, Volume 24 Issue 3, pp. 45-78, Winter 2007-8.


[20] W. Daelemans and A. van den Bosch, “Memory-Based Learning,” in A draft chapter for

the Blackwell Computational Linguistics and Natural Language Processing Handbook,

University of Antwerp, Alex Clark, Chris Fox and Shalom Lapping, 2009.

[21] W. Ahmed and M. Bahador, “The accuracy of the LSTM model for predicting the S&P

500 index and the difference between prediction and back testing,” School of Electrical

Engineering and Computer Science, Stockholm, Sweden, Ju. 4, 2018.

[22] A. Shelmanov and D. Devyatkin, “Semantic role labeling with neural networks for texts in

Russian,” Proceedings of the International Conference “Dialogue 2017”, Moscow, May

31—Ju. 3, 2017.

[23] M. Roth and M. Lapata, “Neural semantic role labeling with dependency path embeddings,”

School of Informatics, University of Edinburgh, Ju. 18, 2016.

[24] P. Moreda and M. Palomar, “The role of verb sense disambiguation in semantic role

labeling,” natural language processing research group, University of Alicante, 2006.

[25] T. Li and B. Chang, “Semantic role labeling using recursive neural network,” Collaborative

Innovation Center for Language Ability, Xuzhou 221009 China, 2015.

[26] N. Xue, M. Palmer, “Automatic semantic role labeling for Chinese verbs,” University of

Pennsylvania, PA 19104, USA.

[27] M. Diab, A. Moschitti and D. Pighin, “Semantic Role Labeling Systems for Arabic using

Kernel Methods,” Proceedings of Association for Computational Linguistics, pages 798–

806, Columbus, USA, Ju. 2008.

[28] J. Zhou and W. Xu., “End-to-end learning of semantic role labeling using recurrent neural

networks,” in proceedings of the 53rd annual meeting of the association for computational

linguistics and the 7th international joint conference on natural language processing,

pages 1127–1137, Beijing, China, Ju. 26-31, 2015.


[29] K. Wang, “Automatic Semantic Role Labeling for Chinese,” 2010 IEEE/WIC/ACM

International Conference on Web Intelligence and Intelligent Agent Technology, Dalian

University of Technology, China, 2010.

[30] A. Ghalibaf, S. Rahati and A. Estaji, “Shallow Semantic Parsing of Persian Sentences,”

23rd Pacific Asia Conference on Language, Information and Computation, pages 150–

159, 2009.

[31] D. Henrique et al., “Using recurrent neural networks for semantic role labeling in

Portuguese,” 01 Oct. 2019.

[32] D. Jurafsky and J. H. Martin, "Speech and Language Processing: An Introduction to

Natural Language Processing, Computational Linguistics, and Speech Recognition," Book

Review, University of Colorado, Boulder, 2015.

[33] J. Park, “Selectively connected self-attentions for semantic role labeling,” Department of

Computer Science and Engineering, Incheon National University; South Korea, 8 Mar.

2019.

[34] L. Marquez, “Exploring challenges in Semantic Role Labeling,” TALP Research Center

Technical University of Catalonia, Moscow, Russia, May 28, 2013.

[35] T. Samardzic, "Semantic Roles in Natural Language Processing and in Linguistic Theory,"

Unpublished PhD Dissertation, Thesis de lic., Universidad de Ginebra, 2009.

[36] G. Stevens, “Automatic semantic role labeling in a Dutch corpus,” Master thesis in

Universiteit Utrecht, Sep. 2006.

[37] D. Godayal and S. Malhotra, “An introduction to part-of-speech tagging and the Hidden

Markov Model,” 8 Ju., 2018


[38] T. Liu, W. Che, S. Li, Y. Hu and H. Liu, “Semantic Role Labeling System using Maximum

Entropy Classifier,” Proceedings of the 9th Conference on Computational Natural

Language Learning (CoNLL), pages 189–192, Ann Arbor, Ju., 2005.

[39] T. Mitsumori, M. Murata, Y. Fukuda, K. Doi and H. Doi “Semantic Role Labeling Using

Support Vector Machines,” Graduate School of Information Science, Nara Institute of

Science and Technology.

[40] J. Camacho-Collados and M. Pilehvar, “On the Role of Text Preprocessing in Neural

Network Architectures: An Evaluation Study on Text Categorization and Sentiment

Analysis,” proceedings of the 2018 EMNLP Workshop Blackbox NLP: Analyzing and

Interpreting Neural Networks for NLP, pages 40–46, Brussels, Belgium, Nov. 1, 2018.

[41] T. Cohn and P. Blunsom “Semantic Role Labelling with Tree Conditional Random Fields,”

proceedings of the 9th Conference on Computational Natural Language Learning

(CoNLL), pages 169–172, University of Melbourne, Australia, Ju. 2005,

[42] C. Bonial, O. Babko-Malaya, J. D. Choi, J. Hwang and M. Palmer, “PropBank annotation

guidelines,” Center for Computational Language and Education Research Institute of

Cognitive Science, University of Colorado at Boulder, Dec. 23, 2010.

[43] R. Johansson and P. Nugues, “A FrameNet-based Semantic Role Labeler for Swedish,”

Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions, pages 436–

443, Sydney, Jul 2006.

[44] Fillmore, Charles J., R. Lee-Goldman, and R. Rhodes. "The FrameNet construction," Sign-

based construction grammar (2012): 309-372.

[45] C. Bonial, J. Hwang, J. Bonn, K. Conger, O. Babko-Malaya and M. Palmer, “English

PropBank annotation guidelines,” Center for Computational Language and Education

Research, Institute of Cognitive Science, University of Colorado at Boulder, Nov. 14,

2012.


[46] M. Diab, A. Moschitti and D. Pighin, “CUNIT: A Semantic Role Labeling System for

Modern Standard Arabic,”

[47] D. Marcheggiani, A. Frolov and I. Titov, "A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling," Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 411–420, Vancouver, Canada, Aug. 3–4, 2017.

[48] T. Dozat, P. Qi and C. Manning, "Stanford's Graph-based Neural Dependency Parser at the CoNLL 2017 Shared Task," 2017.

[49] Lisa M. Belue, “an investigation of Multilayer perceptrons for classification,” MSC thesis

at the Air force Institute of technology, Air University, Captain, USAF, Mar., 1992.

[50] A. Panigrahi, A. Shetty and N. Goyal, “Effect of activation functions on the training of

overparametrized neural nets,” Published as a conference paper at ICLR, Mar. 14, 2020.

[51] A. Gebremariam, “Amharic-to-Tigrigna Machine Translation Using Hybrid Approach,”

Unpublished Master Thesis, College of Natural Science, Addis Ababa University, Oct. 7,

2017.

[52] I. Putrayasa and D. Ramendra, “The Type of Sentence in The Essays of Grade VI

Elementary School Students in Bali Province: A Syntactic Study,” 4th PRASASTI

International Conference on Recent Linguistics Research, volume 166, 2018.


APPENDICES

Appendix A: List of Part-of-Speech Tags Used in this System and their Description

NO | POS Tag | Description
1 | NOUN | Noun
2 | VERB | Verb
3 | NOUNP | Noun Phrase
4 | NDet | Noun Determinant
5 | ADJ | Adjective
6 | ADP | Adposition
7 | ADV | Adverb
8 | ADJP | Adjectival Phrase
9 | NUMP | Numeric Phrase
10 | NUMCR | Cardinal Number
11 | NUMOR | Ordinal Number
12 | PROPN | Proper Noun
13 | ADVP | Adverb Phrase
14 | CCONJ | Coordinating Conjunction
15 | ADPP | Adposition Phrase
16 | VP | Verb Phrase


Appendix E: List of PropBank Semantic Roles Used in the System

NO | Semantic Role | Description
1 | ARG0-AGT | Agent
2 | ARG0-EXP | Experiencer
3 | ARG1-PAT | Patient
4 | ARG1-DES | Direction or Goal
5 | ARG1-BEN | Beneficiary
6 | ARG1-SRC | Source
7 | ARG1-EXP | Experiencer
8 | ARG2-INS | Instrument
9 | ARG2-EXP | Experiencer
10 | ARG2-BEN | Beneficiary
11 | ARG1-TEM | Theme
12 | ARGM-LOC | Location
13 | ARGM-DES | Destination
14 | ARGM-PUR | Purpose
15 | ARGM-CAU | Cause
16 | ARGM-INS | Instrument
17 | ARGM-COM | Comitative
18 | ARGM-MNR | Manner
19 | ARGM-SRC | Source
20 | ARGM-TMP | Time
21 | ARGM-EXT | Extent (added during predicate sense disambiguation)
22 | REL | Predicate or Relative (optional class)