8/9/2019 Recognition of Printed Bangla Document from Textual Image Using Multi-Layer Perceptron (MLP) Neural Network
http://slidepdf.com/reader/full/recognition-of-printed-bangla-document-from-textual-image-using-multi-layer 1/6
Recognition of Printed Bangla Document from Textual Image Using Multi-Layer
Perceptron (MLP) Neural Network
Md. Musfique Anwar, Nasrin Sultana Shume, P. K. M. Moniruzzaman and Md. Al-Amin Bhuiyan
Dept. of Computer Science & Engineering, Jahangirnagar University, Bangladesh
Email: [email protected], [email protected], [email protected], [email protected]
Abstract
This paper focuses on the segmentation of printed Bangla characters
for efficient recognition. Segmentation is an important step in
character recognition because it allows the system to classify the
characters more accurately and quickly. The system takes the scanned
image file of the printed document as its input. A structural feature
extraction method is used to extract the features: each individual
Bangla character is converted to an N × M feature matrix. The feature
matrix is fed to a Multi-Layer Perceptron (MLP) neural network, which
is trained with the back-propagation algorithm on a set of input
patterns and so learns to classify the characters. The effectiveness
of the system has been tested with several printed documents, and the
success rate in all cases is over 90%.
Keywords:
Character segmentation, Character recognition,
Feature extraction, Multi-Layer Perceptron (MLP),
etc.
1. Introduction
Optical character recognition [1] is one of the attractive fields of
image processing [2]. A character recognition technique associates a
symbolic identity with the image of a character. A lot of research on
Bangla character recognition has been done over the last few years. In
the modern approach, adaptive tools are applied to pattern recognition
systems. The Artificial Neural Network (ANN) is the most popular
adaptive tool used for character recognition [3]. Most applications
use feed-forward ANNs with numerous variants of the classical
backpropagation algorithm and other training algorithms. The scope of
this research is not only individual character recognition: it
attempts to retrieve a complete paragraph from its optical image
created by a scanner. In this paper we propose a way to recognize
printed Bangla documents from textual images, using a multilayer
perceptron with the backpropagation algorithm for individual character
recognition.
2. Bangla Character Set
The character is the fundamental unit for writing and reading a
language. Character recognition is the process of classifying an input
character according to predefined character classes. Each language in
the world has its own character set, and the Bangla language has a
character set with 49 characters, 10 digits, punctuation marks and
other symbols.
Bangla letters are formed in two-dimensional space, mostly from
horizontal, vertical and arc strokes [4]. The Bangla characters are
classified into two categories as follows:
i) Shorborno: ‘Shorborno’ are like the vowels of the English
language. There are eleven ‘Shorborno’ characters. The first six
characters have a full matra, the 7th has a half matra and the last
four have no matra.
ii) Banjonborno: ‘Banjonborno’ are like the consonants. There are 39
‘Banjonborno’ in the Bangla alphabet. Here we are concerned only with
the characters.
Bangla scripts are moderately complex patterns. Each word in Bangla
script is composed of several characters joined by a horizontal line
(called ‘Matra’ or head-line) at the top. The concept of upper and
lower case (as in English) is absent here. There are many composite
characters, called “Jukto barna”, as shown in Fig. 1. There are about
253 compound characters composed of 2, 3, or 4 consonants (i.e.
Banjonborno) [5]. There are some other types of characters used in the
Bangla dictionary, called suffix-prefix characters, as shown in Fig. 2.
(a) Shorborno
(b) Banjonborno
(c) Bangla numerals
(d) A few Bangla composite characters
Fig. 1 Some Bangla mainstream characters used for image recognition.
Fig. 2 Suffix-prefix determiner characters
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 8, No. 1, April 2010
254 http://sites.google.com/site/ijcsis/ISSN 1947-5500
3. The System Overview
The main phase of the character recognition system is the
segmentation of text into characters, so that the computer can
classify the characters within a paragraph as a human can identify
them. The overall method of the implemented system is illustrated in
Fig. 3.
Fig. 3 Overall diagram of the recognition system
3.1 Data Acquisition
The input images are acquired from documents containing printed
Bangla text, using a scanner as the input device. Scanned images are
then stored as image files (.JPEG). Pre-processing is required to turn
the raw image data into a usable format [6], because the scanned image
is not always in a suitable form. The image is then passed on for
boundary detection.
3.2 Boundary Detection
We scan from the upper left and from the bottom right of the image to
find the processing area of the printed text document. Each scan is
halted as soon as it meets a single black (text) pixel.
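The boundary detection step can be sketched as follows. This is a minimal illustration rather than the implemented system: it assumes the page has already been binarized into a `boolean[][]` (true = black pixel), and for simplicity it visits every pixel instead of stopping two separate scans at the first black pixel. The class name `TextBoundary` is hypothetical.

```java
/** Bounding box of the text area in a binarized page (true = black pixel). */
final class TextBoundary {
    /** Returns {top, left, bottom, right} (inclusive), or null if the page is blank. */
    static int[] find(boolean[][] page) {
        int top = -1, bottom = -1, left = Integer.MAX_VALUE, right = -1;
        for (int r = 0; r < page.length; r++) {
            for (int c = 0; c < page[r].length; c++) {
                if (!page[r][c]) continue;
                if (top < 0) top = r;        // first row holding a black pixel
                bottom = r;                  // last such row seen so far
                left = Math.min(left, c);
                right = Math.max(right, c);
            }
        }
        return top < 0 ? null : new int[] {top, left, bottom, right};
    }
}
```

The returned rectangle is the processing area that the later segmentation steps work inside.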
3.3 Segmentation
In this phase the text is partitioned into its elementary entities,
i.e. characters. First the system detects the region of each text line
of the paragraph. Then the text lines are segmented into words and the
words are divided into characters.
3.3.1 Text Line Detection
Text line detection is performed by scanning the image row by row and
recording the number of black pixels in each row. The line boundaries
are then detected from this array of per-row counts. In our experiment
we found that the number of black pixels in a blank row of the image
varies from 0 to 10, whereas the count in rows where text is present
is much larger. As a general rule, more than two blank rows are
present between two text lines. In this way we detect the boundary of
a text line.
The upper boundary of a line is the first row in which more black
pixels are found. After finding the upper boundary, the system
continues scanning until it reaches a row whose next row has no black
pixels; that row is the lower boundary of the text line. There are
about 8 to 10 blank rows between two text lines.
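A minimal sketch of this row-projection idea is given below. It assumes a binarized `boolean[][]` image (true = black pixel) and treats any row with more than `blankMax` black pixels as text; the value 10 suggested by the observation above would be one choice. The class and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Text line detection from the per-row count of black pixels. */
final class TextLineDetector {
    /** Number of black pixels in each row. */
    static int[] rowProfile(boolean[][] img) {
        int[] counts = new int[img.length];
        for (int r = 0; r < img.length; r++)
            for (boolean px : img[r]) if (px) counts[r]++;
        return counts;
    }

    /** Returns {top, bottom} row ranges (inclusive) whose counts exceed blankMax. */
    static List<int[]> detect(boolean[][] img, int blankMax) {
        int[] counts = rowProfile(img);
        List<int[]> lines = new ArrayList<>();
        int top = -1;                               // -1 means "not inside a line"
        for (int r = 0; r < counts.length; r++) {
            boolean text = counts[r] > blankMax;
            if (text && top < 0) top = r;           // upper boundary
            if (!text && top >= 0) {                // previous row was the lower boundary
                lines.add(new int[] {top, r - 1});
                top = -1;
            }
        }
        if (top >= 0) lines.add(new int[] {top, counts.length - 1});
        return lines;
    }
}
```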
3.3.2 Word Segmentation
Normally, in a Bangla word there is no spacing between characters
because of the Matra ( ⎯⎯ ). We first have to detect the Matra of a
text line. The Matra line is the row in which the number of black
pixels is the maximum [1, 7]. After detecting a line, the system scans
the image vertically from the upper boundary of the line and counts
the number of black pixels in each column. The start position of a
word is the first column in which black pixels are found. The system
continues scanning until it reaches a column whose next column has no
black pixels; that column is the end position of the word. There are
about 4 to 6 blank columns between two words.
3.3.3 Character Segmentation
To separate the characters in a word, the system scans vertically
from the start position of the word, which is also the start position
of the first character of the word. After finding the start position
of a character, it continues scanning until it reaches a column whose
next column has no black pixels; that column is the end position of
the character. Consecutive characters in a word are separated by 2 to
3 blank columns, as shown in Fig. 4.
Fig. 4 Character separation from below the Matra
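The word and character segmentation steps can both be sketched with one column-scanning routine. The sketch below is an illustration under stated assumptions, not the authors' code: the image is a rectangular binarized `boolean[][]`, words are split over the full line height with a minimum gap of about 4 blank columns, and characters are split inside a word using only the rows below the Matra with a smaller gap. The class `ColumnSegmenter` and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Column-wise splitting of a text line into words, and of a word into characters. */
final class ColumnSegmenter {
    /** Row in [top, bottom] with the most black pixels, taken as the Matra row. */
    static int matraRow(boolean[][] img, int top, int bottom) {
        int best = top, bestCount = -1;
        for (int r = top; r <= bottom; r++) {
            int count = 0;
            for (boolean px : img[r]) if (px) count++;
            if (count > bestCount) { bestCount = count; best = r; }
        }
        return best;
    }

    /**
     * Scans columns left..right over rows top..bottom and returns {start, end}
     * column ranges separated by at least minGap blank columns.
     */
    static List<int[]> split(boolean[][] img, int top, int bottom,
                             int left, int right, int minGap) {
        List<int[]> runs = new ArrayList<>();
        int start = -1, lastInk = -1;
        for (int c = left; c <= right; c++) {
            boolean ink = false;
            for (int r = top; r <= bottom && !ink; r++) ink = img[r][c];
            if (!ink) continue;                       // blank column
            if (start >= 0 && c - lastInk - 1 >= minGap) {
                runs.add(new int[] {start, lastInk}); // gap wide enough: close the run
                start = -1;
            }
            if (start < 0) start = c;
            lastInk = c;
        }
        if (start >= 0) runs.add(new int[] {start, lastInk});
        return runs;
    }
}
```

For a line spanning rows `top..bottom`, `split(img, top, bottom, 0, width - 1, 4)` would give the words, and `split(img, matraRow(img, top, bottom) + 1, bottom, wordStart, wordEnd, 2)` the characters of one word.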
3.4 Feature Extraction
Feature extraction is essential to effective character recognition
and eases the classification task. The maximum height and width of
Bangla characters (without compound characters) in the SutonnyMJ font
at font size 10 is 7 × 6, and the maximum is 9 × 12 for compound
characters. After determining the start and end position of a
character, the region of that character is converted to a 7 × 6
matrix, or a 9 × 12 matrix for compound characters, containing 0s and
1s, where 1 represents the presence of a character component and 0
its absence.
The boundaries of the characters are not all of equal size, i.e., the
extracted matrices are not of equal size. If a matrix is smaller or
larger than our standard size in height and width, we scale the
matrix; if the height is equal but the width is smaller, we pad the
matrix with 0s up to the standard size. The character matrix is then
fed to the neural network as the input to the recognition stage.
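The size normalization described above might look like the following sketch. The scaling method is not specified in the text, so nearest-neighbour resampling is assumed here; the class `FeatureMatrix` and its method names are hypothetical. The flattened vector has 42 entries for the 42-input network.

```java
/** Brings a 0/1 character matrix to a standard size and flattens it. */
final class FeatureMatrix {
    /** Nearest-neighbour resampling to rows x cols (an assumed scaling method). */
    static int[][] scale(int[][] glyph, int rows, int cols) {
        int[][] out = new int[rows][cols];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < cols; c++)
                out[r][c] = glyph[r * glyph.length / rows][c * glyph[0].length / cols];
        return out;
    }

    /** Pads each row on the right with 0 up to cols; the height is unchanged. */
    static int[][] padWidth(int[][] glyph, int cols) {
        int[][] out = new int[glyph.length][cols];
        for (int r = 0; r < glyph.length; r++)
            System.arraycopy(glyph[r], 0, out[r], 0, glyph[r].length);
        return out;
    }

    /** Pads when only the width is short, otherwise scales. */
    static int[][] normalize(int[][] glyph, int rows, int cols) {
        if (glyph.length == rows && glyph[0].length <= cols) return padWidth(glyph, cols);
        return scale(glyph, rows, cols);
    }

    /** Row-major feature vector for the network input layer. */
    static double[] flatten(int[][] m) {
        double[] v = new double[m.length * m[0].length];
        int n = 0;
        for (int[] row : m)
            for (int cell : row) v[n++] = cell;
        return v;
    }
}
```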
3.5 Recognition Engine and Classifier
In a back-propagation neural network, the learning algorithm has two
phases. First, a training input pattern (a Bengali character) is
presented to the network input layer. The network then propagates the
input pattern from layer to layer until the output pattern is
generated by the output layer. If this pattern is different from the
desired output, an error is calculated and then propagated backwards
through the network from the output layer to the input layer. The
weights are modified as the error is propagated.
A back-propagation neural network is determined by the connections
between neurons, the activation function used by the neurons, and the
learning algorithm that specifies the procedure for adjusting weights.
The network architecture for the back-propagation neural network is
shown in Fig. 5.
Fig. 5 Back-propagation neural network topology
A neuron determines its output by computing the net weighted input:
$$X = \sum_{i=1}^{n} x_i w_i - \theta \qquad (1)$$
where n is the number of inputs and θ is the threshold applied to the
neuron. Next, this input value is passed through the sigmoid
activation function:
$$Y^{\mathrm{sigmoid}} = \frac{1}{1 + e^{-X}} \qquad (2)$$
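As a small worked illustration of Eqs. (1) and (2), the sketch below computes the output of one neuron. The class and method names are hypothetical and are not taken from the neuralj package.

```java
/** Output of a single neuron: net weighted input followed by the sigmoid. */
final class Neuron {
    /** Eq. (1): X = sum_i x_i * w_i - theta. */
    static double netInput(double[] x, double[] w, double theta) {
        double sum = -theta;
        for (int i = 0; i < x.length; i++) sum += x[i] * w[i];
        return sum;
    }

    /** Eq. (2): Y = 1 / (1 + e^(-X)). */
    static double sigmoid(double netInput) {
        return 1.0 / (1.0 + Math.exp(-netInput));
    }
}
```

For example, inputs (1, 0, 1) with weights (0.5, 0.2, -0.5) and θ = 0 give X = 0, so the neuron outputs 0.5.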
To derive the back-propagation learning law, let us consider the
three-layer network shown in Fig. 5. The indices i, j, k here refer to
neurons in the input, hidden and output layers, respectively. The
symbol $w_{ij}$ denotes the weight for the connection between neuron i
in the input layer and neuron j in the hidden layer, and the symbol
$w_{jk}$ the weight between neuron j in the hidden layer and neuron k
in the output layer.
To propagate error signals, we start at the output layer and work
backward to the hidden layer. The error signal at the output of neuron
k at iteration t is defined by:
$$e_k(t) = Y_{d,k}(t) - Y_{a,k}(t) \qquad (3)$$
where t = 1, 2, 3, …, $Y_{d,k}(t)$ is the desired output of neuron k
at iteration t and $Y_{a,k}(t)$ is its actual output.
Neuron k, which is located in the output layer, is supplied with a
desired output of its own. Hence we may use a straightforward
procedure to update the weight $w_{jk}$:
$$w_{jk}(t+1) = w_{jk}(t) + \Delta w_{jk}(t) \qquad (4)$$
where $\Delta w_{jk}(t)$ is the weight correction, given by:
$$\Delta w_{jk}(t) = \alpha \times y_j(t) \times \delta_k(t) \qquad (5)$$
where α is the learning rate and $\delta_k(t)$ is the error gradient
at neuron k in the output layer at iteration t.
In order to calculate the weight correction for the hidden layer, we
can apply the same equation as for the output layer:
$$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) \qquad (6)$$
where $\Delta w_{ij}(t)$ is the weight correction, given by:
$$\Delta w_{ij}(t) = \alpha \times x_i(t) \times \delta_j(t) \qquad (7)$$
where $\delta_j(t)$ represents the error gradient at neuron j in the
hidden layer:
$$\delta_j(t) = y_j(t) \times [1 - y_j(t)] \times \sum_{k=1}^{l} \delta_k(t)\, w_{jk}(t) \qquad (8)$$
where l is the number of neurons in the output layer and
$$y_j(t) = \frac{1}{1 + e^{-X_j(t)}} \qquad (9)$$
$$X_j(t) = \sum_{i=1}^{n} x_i(t) \times w_{ij}(t) - \theta_j \qquad (10)$$
where n is the number of neurons in the input layer.
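The update rules of Eqs. (3) to (10) can be put together in a small three-layer network, sketched below. This is not the neuralj implementation used in the experiments. Three things are assumptions that the text does not state: the output-layer gradient is taken as δ_k = y_k (1 − y_k) e_k (the usual form for a sigmoid output), the thresholds are updated with the same gradients, and the weights start as small random values. The class `BackPropSketch` is hypothetical.

```java
import java.util.Random;

/** One-hidden-layer network trained one pattern at a time with back-propagation. */
final class BackPropSketch {
    final double[][] wIH;            // w_ij: input i -> hidden j
    final double[][] wHO;            // w_jk: hidden j -> output k
    final double[] thetaH, thetaO;   // thresholds of hidden and output neurons
    final double[] hidden, output;   // y_j and y_k of the last forward pass

    BackPropSketch(int nIn, int nHid, int nOut, long seed) {
        Random rnd = new Random(seed);
        wIH = new double[nIn][nHid];
        wHO = new double[nHid][nOut];
        thetaH = new double[nHid];
        thetaO = new double[nOut];
        hidden = new double[nHid];
        output = new double[nOut];
        // Assumed initialisation: uniform random weights in [-0.5, 0.5).
        for (double[] row : wIH) for (int j = 0; j < nHid; j++) row[j] = rnd.nextDouble() - 0.5;
        for (double[] row : wHO) for (int k = 0; k < nOut; k++) row[k] = rnd.nextDouble() - 0.5;
    }

    static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    /** Forward pass, Eqs. (9) and (10) applied to both layers. */
    double[] forward(double[] x) {
        for (int j = 0; j < hidden.length; j++) {
            double net = -thetaH[j];
            for (int i = 0; i < x.length; i++) net += x[i] * wIH[i][j];
            hidden[j] = sigmoid(net);
        }
        for (int k = 0; k < output.length; k++) {
            double net = -thetaO[k];
            for (int j = 0; j < hidden.length; j++) net += hidden[j] * wHO[j][k];
            output[k] = sigmoid(net);
        }
        return output;
    }

    /** One training iteration; returns the sum of squared errors before the update. */
    double train(double[] x, double[] desired, double alpha) {
        forward(x);
        double[] deltaO = new double[output.length];
        double sse = 0.0;
        for (int k = 0; k < output.length; k++) {
            double e = desired[k] - output[k];                 // Eq. (3)
            deltaO[k] = output[k] * (1.0 - output[k]) * e;     // assumed output gradient
            sse += e * e;
        }
        double[] deltaH = new double[hidden.length];
        for (int j = 0; j < hidden.length; j++) {
            double sum = 0.0;
            for (int k = 0; k < output.length; k++) sum += deltaO[k] * wHO[j][k];
            deltaH[j] = hidden[j] * (1.0 - hidden[j]) * sum;   // Eq. (8)
        }
        for (int j = 0; j < hidden.length; j++)
            for (int k = 0; k < output.length; k++)
                wHO[j][k] += alpha * hidden[j] * deltaO[k];    // Eqs. (4), (5)
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < hidden.length; j++)
                wIH[i][j] += alpha * x[i] * deltaH[j];         // Eqs. (6), (7)
        // Assumed threshold updates (a threshold acts as a weight on a fixed input of -1).
        for (int k = 0; k < output.length; k++) thetaO[k] -= alpha * deltaO[k];
        for (int j = 0; j < hidden.length; j++) thetaH[j] -= alpha * deltaH[j];
        return sse;
    }
}
```

Calling `train` repeatedly on the same pattern should make the returned error shrink, which is the behaviour the stopping rule on the desired error relies on.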
In our work, we use a backpropagation neural network consisting of 42
neurons in the input layer, 30 neurons in the hidden layer and one
output neuron in the output layer for character matrices of size
7 × 6. For character matrices of size 9 × 12, the backpropagation
neural network consists of 108 neurons in the input layer, 80 neurons
in the hidden layer and one output neuron in the output layer.
The system recognizes a character if the output of the network is
very close to the output of one of the characters, within a certain
acceptable tolerance. If the output is far from all the possible
outputs, the system cannot identify the character. This process
continues until the end of the text document. The entire operation of
the system can be easily understood from the flow-chart shown in
Fig. 6.
Fig. 6 Flow-chart of the recognition system
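The tolerance-based decision could be sketched as below. The text does not say how characters are coded on the single output neuron, so the array of per-character target values and the tolerance are purely illustrative assumptions, and the class name is hypothetical.

```java
/** Maps the single network output to a character index, or -1 if unrecognized. */
final class ToleranceClassifier {
    static int classify(double output, double[] targets, double tolerance) {
        int best = -1;
        double bestDist = Double.MAX_VALUE;
        for (int i = 0; i < targets.length; i++) {
            double dist = Math.abs(output - targets[i]);
            if (dist < bestDist) { bestDist = dist; best = i; }
        }
        // Accept the nearest target only if it lies within the tolerance.
        return bestDist <= tolerance ? best : -1;
    }
}
```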
4. Experimental Results
We used the bswing1_0_beta package for Bangla text output and the
neuralj-0.0.4 package to implement the backpropagation neural network
in Java. The number of neurons in the hidden layer is always set to
(3/4)th of the number of neurons in the input layer.
We use the ‘PatternSet’ class, which represents a set of patterns.
The function ‘addPattern(Pattern pattern)’ is used to add the required
patterns for all Bangla characters. The pattern for a Bangla character
looks like:
    pattern_set.addPattern(new Pattern(
        "0;0;1;0;1;0;0;0;0;" +
        "0;0;1;0;0;1;0;0;0;" +
        "0;0;1;0;0;0;1;0;0;" +
        "0;0;1;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;" +
        "0;0;0;0;0;0;0;0;0;",
        matrix_output_Str));
where 0;0;1;0;1 ………………….. 0;0;0; is the input vector and
‘matrix_output_Str’ is the output vector. We set the values of the
following fields of the ‘BackPropagation’ class as:
    Field               Value
    ‘desired_error’     0.001
    ‘maximum_epochs’    1000000000
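The semicolon-separated input vector shown in the pattern above can be produced from a 0/1 character matrix with a few lines of Java. The helper below is a hypothetical sketch and is not part of the neuralj package; it only builds the string in row-major order, with a semicolon after every value as in the example.

```java
/** Builds the "0;0;1;...;" input string from a 0/1 character matrix. */
final class PatternString {
    static String encode(int[][] matrix) {
        StringBuilder sb = new StringBuilder();
        for (int[] row : matrix)
            for (int cell : row)
                sb.append(cell).append(';');   // every value is followed by ';'
        return sb.toString();
    }
}
```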
Then the training of the backpropagation neural network starts. After
training, the system scans the Bangla paragraph image, tries to
recognize its characters, and displays the correctly recognized
characters as output. Fig. 7 shows a snapshot of the implemented
method. Results for different types of sentences are given in
Table 1.
[Fig. 6 flow-chart boxes: Start; input the image of the paragraph to be recognized; detect the boundary of the printed text document to perform the segmentation of characters; select the character matrix of size 7 × 6 or 9 × 12 (for compound characters); set index = 0, maximum_epochs = 1000000000; input the matrix to the ANN; calculate the output vector and error; error ≤ 0.001? (Yes: add the character to the output list; No: index = index + 1, or print “the character is unrecognized”); whole document recognized? (Yes: Stop; No: continue).]
Fig. 7 Sample output of the proposed system
Table 1: Success rate for experimental results
    Total no. of    Correctly recognized    Success
    characters      characters              rate (%)
    165             162                     98.18
    288             275                     95.49
    356             337                     94.66
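The success rates in Table 1 follow directly from the two count columns, as the small check below illustrates (the class name is hypothetical):

```java
/** Success rate in percent, rounded to two decimals. */
final class SuccessRate {
    static double percent(int correct, int total) {
        return Math.round(10000.0 * correct / total) / 100.0;
    }
}
```

For instance, 162 correctly recognized characters out of 165 give 98.18%.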
Fig. 8 Success Rate Graph of experimental results (series: total no. of characters, correctly recognized characters, success rate (%))
5. Conclusion
In this paper, we proposed a recognition system with emphasis on the
segmentation phase. The proposed system is capable of separating
Bangla letters and digits successfully from printed documents. It
recognizes the segmented characters using a backpropagation neural
network. The system sometimes fails to recognize composite
characters, so to improve performance the segmentation process can be
extended to deal with composite characters. In future, the proposed
recognition system may be further improved using a spell-checker.
References
[1] M. E. Hoque, M. J. H. Siddiqi, S. M. Kamruzzaman and M. S.
Chowdhury, “Efficient Method of Size Independent Printed Bangla
Paragraph Recognition Using ANN and Efficient Heuristics”,
Proceedings of International Conference on Computer and Information
Technology (ICCIT), Dhaka, Bangladesh, pp. 755-758 (2003).
[2] Rafael C. Gonzalez, Digital Image Processing, 2nd Edition,
Pearson Education publisher, New York, 2002.
[3] S. M. M. Rahman, S. M. Rahman and M. A. Rashid, “Kohonen Neural
Network in Character Recognition Applications”, Proceedings of
NCCIS, pp. 106-110 (1997).
[4] M. R. Bashar, M. A. F. M. R. Hasan, M. F. Khan, “Bangla Off-Line
Handwritten Size Independent Character Recognition Using Artificial
Neural Networks Based on Windowing Technique”, Proceedings of
International Conference on Computer and Information Technology
(ICCIT), Dhaka, Bangladesh, pp. 351-354 (2003).
[5] M. F. Zibran, A. Tanvir, R. Shammi and Md. Abdus Sattar,
“Computer Representation Of Bangla Characters And Sorting Of Bangla
Words”, Proceedings of International Conference on Computer and
Information Technology (ICCIT), Dhaka, Bangladesh, pp. 191-195
(2002).
[6] T. M. Ha and H. Bunke, “Off-line Handwritten Numerical
Recognition by Perturbation Method”, IEEE Transactions on Pattern
Analysis and Machine Intelligence, vol. 19, no. 5, pp. 535-539 (May
1997).
[7] M. A. Sattar, K. Mahmud, H. Arafat and A. F. M. Noor-Uz-Zaman,
“Segmenting Bangla Text for Optical Recognition”, Proceedings of
International Conference on Computer and Information Technology
(ICCIT), Dhaka, Bangladesh, pp. 283-286 (2003).
Md. Musfique Anwar completed his B.Sc (Engg.) in Computer Science and
Engineering from the Dept. of CSE, Jahangirnagar University,
Bangladesh in 2006. He is now a Lecturer in the Dept. of CSE,
Jahangirnagar University, Savar, Dhaka, Bangladesh. His research
interests include Artificial Intelligence, Neural Networks, Image
Processing, Pattern Recognition, Software Engineering and so on.

Nasrin Sultana Shume completed her B.Sc (Engg.) in Computer Science
and Engineering from the Dept. of CSE, Jahangirnagar University,
Bangladesh in 2006. She is now a Lecturer in the Dept. of CSE, Green
University of Bangladesh, Mirpur, Dhaka, Bangladesh. Her research
interests include Artificial Intelligence, Neural Networks, Image
Processing, Pattern Recognition, Database and so on.

P. K. M. Moniruzzaman received his B.Sc (Hons) in Electronics and
Computer Science and M.S. in Computer Science and Engineering from
the Dept. of CSE, Jahangirnagar University, Bangladesh. He
successfully completed his post-graduate project on Image Processing
under the supervision of Dr. Md. Al-Amin Bhuiyan. He is now working
as a Database Administrator in a renowned commercial bank in Dhaka,
Bangladesh. His main research interests include Natural Language
Processing, Artificial Intelligence, Data Mining and so on.

Md. Al-Amin Bhuiyan received his B.Sc (Hons) and M.Sc. in Applied
Physics and Electronics from the University of Dhaka, Dhaka,
Bangladesh in 1987 and 1988, respectively. He got the Dr. Eng. degree
in Electrical Engineering from Osaka City University, Japan, in 2001.
He has completed his Postdoctoral in Intelligent Systems at the
National Informatics Institute, Japan. He is now a Professor in the
Dept. of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh. His
main research interests include Image Face Recognition, Cognitive
Science, Image Processing, Computer Graphics, Pattern Recognition,
Neural Networks, Human-machine Interface, Artificial Intelligence,
Robotics and so on.