Classifying Glyphs

Autumn 2010
Tiril Anette Langfeldt Rødland
Supervisor: Keith L. Downing
Abstract
This dissertation investigates the classification capabilities of artificial neural networks. The goal is to generalize over the features of a writing system, and thus classify the writing system of a previously unseen glyph.
Evolution and learning were both used to tune the weights of a network, in order to investigate which method is better suited for creating this kind of generalization and classification network. The difference was remarkable: the evolved network displayed no ability whatsoever to classify the glyphs, whereas the trained network managed to clearly separate the writing systems, resulting in correct classifications.
It can thus be concluded that artificial neural networks are able to correctly classify the writing system of a glyph, and that learning by backpropagation is by far the better method for creating such networks.
Preface
This dissertation is the pre-study project for the master's dissertation. It was written at the Department of Computer and Information Science, NTNU, during the autumn semester of 2010.
The original plan was to find an Egyptological dissertation subject, due to my long-lasting interest in the field. I appreciate the assistance from Vidar Edland, Norwegian Egyptology Society, in the search for a suitable subject. However, the search came up empty. Inspired by Dehaene (2009), the focus shifted from Egyptology to hieroglyphs, and then other writing systems joined in. Classification seemed to be a natural choice of task, with regard to different writing systems and neural networks. This revealed an interesting subject on which seemingly no previous research had been done. I like to think it still is a relevant problem. At least it can be used to show off the generalization abilities of artificial neural networks.
I wish to thank my supervisor Professor Keith L. Downing at the Department of Computer and Information Science, Norwegian University of Science and Technology, for his invaluable assistance.
Special thanks are given to John A. Bullinaria, for his immense help in debugging, and for sharing his infinite wisdom regarding neural networks.
Gratitude is given to Arne Dag Fidjestøl and Aslak Raanes in the System Administration Group, Department of Computer and Information Science, for giving me access to hardware on which I could run the evolution.
I am also very grateful to Cyrus Harmon, Edi Weitz, David Lichteblau, and all the other Common Lisp programmers out there, whose libraries saved me from months of extra work.
Trondheim, December 2010
Contents

Chapter 1: Introduction
Chapter 2: Background
2.1 Writing Systems
2.2 Classification of Glyphs
2.3 Artificial Neural Network
2.3.1 The XOR Example
2.4 Training ANNs
2.4.1 Training ANNs Using Supervised Learning
2.4.2 Backpropagation
2.4.3 Fast Backpropagation
2.4.4 Tricks for Multi-Class Classification Problems
2.5 Evolving ANNs
2.5.1 Evolving Neural Networks
Chapter 3: Methodology
3.1 Glyphs and Writing Systems
3.2 Neural Network Design
3.3 Learning
3.4 Evolution
Chapter 4: Results and Analysis
4.1 Learning
4.1.1 Pair Classification
4.1.2 Full Classification
4.1.3 Writing System Classification Quality
4.2 Evolution
Chapter 5: Discussion
5.1 Learning or Evolution?
5.2 Writing Systems
5.3 Future Work
References
B.1.1 Training
B.1.2 Testing
B.2 Full Classification
B.2.1 Learning
B.2.2 Evolution
Appendix C: Data Set Distribution
Appendix D: Pixel Comparison
List of Figures
2.1 A simple ANN
2.2 Activation function
2.3 XOR ANN
2.4 Backpropagation ANN
2.5 The evolution cycle
3.1 ANN: From Unicode to output values
4.1 Learning: full classification
4.2 Correlation
4.3 Evolution: full classification
A.1 Coptic
A.2 Runic
A.3 Hebrew
A.4 Arabic
A.5 Hangul jamo
A.6 Hiragana
A.7 Thai
A.8 Canadian Aboriginal syllabics
A.9 Han unification
A.10 Egyptian hieroglyphs
C.1 Approach 1
C.2 Approach 2
C.3 Approach 3
List of Tables

2.1 XOR truth table
2.2 Evolutionary methods
3.1 Writing systems
3.2 Evolution parameters
3.3 The gene
4.1 Learning: pair classification
4.2 Learning: pair classification probabilities
C.1 Number of glyphs
C.2 Comparison of combinations of relative set size
D.1 Glyph similarity
D.2 Glyph similarity
D.3 Pixel comparison using Richard's curve (128)
D.4 Pixel comparison using Richard's curve (200)
D.5 Pixel comparison using Richard's curve (250)
D.6 Pixel comparison using difference ≤ 40
D.7 Pixel comparison using difference ≤ 20
D.8 Pixel comparison using difference ≤ 10
D.9 Pixel comparison using difference ≤ 5
List of Abbreviations

XOR: Exclusive or
CHAPTER 1

Introduction
33 000 years ago, the first Homo sapiens discovered that they could draw recognizable pictures on the wall of the Chauvet cave in southern France (Dehaene, 2009). There were drawings of animals, as well as nonfigurative shapes such as dots, lines and curves. Pictography appeared in ancient Sumer and Egypt, creating easy-to-read writings.
The art of writing mutated as it spread throughout the cultures. Even though pictograms are relatively easy to read, they are, due to their many details, very slow to write. They were thus simplified in different ways, and the symbols were tied to different sounds. For example, the Chinese character for horse is a simplification of a drawing of a horse, and the letter A was originally an ox (Dehaene, 2009, Figure 4.3). This is visible in the Greek letter α (alpha), as aleph means just that. Similarly, m symbolizes waves (mem), k symbolizes a hand with outstretched fingers (kaf), and R symbolizes a head (res) (Dehaene, 2009).
Even though both the Chinese character for horse and the letter A originally came from large, four-legged animals, the resulting glyphs are very different. For each step of simplification, some features were extracted for use in the simplified glyph. Different civilizations focused on different features. The simplification also depended on the writing tools available; pencils and paper enable glyphs of many different shapes, while glyphs carved in stone ought to consist of straight lines.
Today, 33 000 years after the first drawings in a cave, we have numerous writing systems, in different stages of evolution. The human brain seems to be rather good at separating these different writing systems, depending on their relative placement on the family tree. It might be difficult to separate the Latin, Greek and Cyrillic alphabets from each other if one is familiar with none of them. However, it is trivial to separate any of them from Egyptian hieroglyphs. After having learned a few Egyptian hieroglyphs, it is easy to recognize that another glyph drawn in the same style belongs to this writing system.
Even though the glyphs within one writing system differ from each other, they are also very similar. The same basic shapes tend to be reused for different glyphs within a writing system (Dehaene, 2009, Figure 4.1), thus creating a subconscious expectation of how glyphs from that writing system will look.
When humans separate between writing systems in this way, it is due to the neural network of our brain. We are able to generalize common features over a set of glyphs, and classify unseen glyphs based on that. Since the natural neural network is able to perform this separation, an artificial neural network ought to be able to as well.
We already know that artificial neural networks are able to recognize letters (Bishop, 1996). This dissertation will investigate whether an artificial neural network can generalize over the patterns and shapes common for the different writing systems, and thus find the writing system of an unseen glyph.
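The networks in question take a rendered glyph as input and produce one activation per candidate writing system. The dissertation's implementation was written in Common Lisp; as an illustration only, the shape of such a classifier can be sketched in Python. All names and dimensions below are hypothetical (the actual input representation and network design are given in Chapter 3), and the weights are untrained, so the prediction here is meaningless:

```python
import math
import random

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class TinyNet:
    """A fully connected feedforward net with one hidden layer."""
    def __init__(self, n_in, n_hidden, n_out):
        self.w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_out)]

    def forward(self, x):
        # Hidden activations, then output activations, one sigmoid unit each.
        h = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in self.w1]
        return [sigmoid(sum(w * v for w, v in zip(row, h))) for row in self.w2]

# Each glyph is presented as a flattened pixel bitmap; the output unit with
# the highest activation names the predicted writing system.
SYSTEMS = ["Latin", "Greek", "Hebrew"]   # hypothetical subset of systems
net = TinyNet(n_in=16, n_hidden=4, n_out=len(SYSTEMS))
bitmap = [0.0] * 16                      # a blank 4x4 stand-in for real pixels
scores = net.forward(bitmap)
predicted = SYSTEMS[scores.index(max(scores))]
```

Training then consists of adjusting `w1` and `w2`, by backpropagation or by evolution, so that the correct output unit wins for glyphs of each system.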
It was evolution that gave humans a brain capable of reading. However, no one can read without having learnt it first. The main question of the dissertation is thus whether learning or evolution of the neural network will give the best result; that is, which method yields the better solution with respect to both success rate and execution time, and how easily that solution can be attained.
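The evolutionary alternative to gradient-based learning treats the weight vector as a genome and improves it by mutation and selection. As a minimal sketch only, the following (1+1) mutate-and-select loop tunes two weights on a toy fitting task; the dissertation's actual genetic algorithm, gene encoding, and fitness function are described in Chapter 3 and differ from this:

```python
import random

random.seed(1)

# Toy target: find weights w so that w[0]*x[0] + w[1]*x[1] matches t
# for each (x, t) sample. The optimum here is w = [1, -1].
samples = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 0.0)]

def error(w):
    """Sum of squared errors over the samples (lower is fitter)."""
    return sum((w[0] * x[0] + w[1] * x[1] - t) ** 2 for x, t in samples)

# A minimal (1+1) evolutionary loop: mutate the parent weight vector with
# Gaussian noise and keep the child only if it is at least as fit.
w = [random.uniform(-1, 1), random.uniform(-1, 1)]
for _ in range(2000):
    child = [wi + random.gauss(0, 0.1) for wi in w]
    if error(child) <= error(w):
        w = child
```

Backpropagation instead computes the error gradient directly and steps downhill; the dissertation compares how well each strategy scales up to full classification networks.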
The dissertation is structured as follows. This introductory chapter introduces the problem, and writing systems in general. Chapter 2 demonstrates the concepts of classification, evolution, and artificial neural networks, including common methods and algorithms. Writing systems are also discussed in more detail. Chapter 3 presents the methodology used, focusing on differences from the standard approach given in Chapter 2. Chapter 4 presents the experiments and their results, including a low-level analysis. A more general discussion can be found in Chapter 5, in addition to future work.
Additional information can be found in the appendices. Appendix A includes Unicode charts from the writing systems used, and Appendix B contains the full results from the experiments. The optimal data set distribution is researched in Appendix C, while a comparison of the writing systems based on pixel values can be found in Appendix D.
CHAPTER 2
Background
This chapter introduces the concepts used in this dissertation. First there is a brief overview of writing systems, followed by an introduction to artificial neural networks.
2.1 Writing Systems
A writing system is the set of glyphs used to represent the writing of a specific human language (FOLDOC, 1998b). In this case, a glyph is an image used to visually represent the characters of the language (FOLDOC, 1998a).
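The glyphs in this dissertation are drawn from Unicode (see the charts in Appendix A), where every code point already records the writing system it belongs to in its character name. As an illustration only, that association can be read off with Python's standard unicodedata module; this naive name lookup is of course not the classification task studied here, which works on the glyph images themselves:

```python
import unicodedata

def writing_system(ch):
    """Return the script prefix of a character's official Unicode name.

    Purely illustrative: it exposes the writing-system label Unicode
    assigns to each code point, rather than inferring it from pixels.
    """
    return unicodedata.name(ch).split(" ")[0]

print(writing_system("a"))   # LATIN
print(writing_system("α"))   # GREEK
print(writing_system("א"))   # HEBREW
```

The interesting question is whether a network can recover this label from the rendered glyph alone, without access to the code point.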
There are many different classes of writing systems, depending mainly on what each glyph represents. Below is the classification used by The Unicode Consortium (2009, Ch. 6), as well as a few examples.
Alphabets are writing systems consisting of letters, where each letter can be either a consonant or a vowel. The two have the same status as letters, giving a homogeneous collection of glyphs.
The Latin alphabet is the most well-known example, as it is used to write thousands of different languages. Another…