![Page 1: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/1.jpg)
On Mathematical
Expression Analysis in Arabic
Handwriting
Elena Smirnova and Stephen Watt
ORCCA, UWO,
Feb 2007
![Page 2: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/2.jpg)
Categories of Math Notations
• Writing direction
  – Math flows against text
  – Math is written in the same direction as text (right to left)
• Use of alphanumerics and math symbols
  – Variables
    • Use of Latin and Greek alphabet
    • Use of Arabic alphabet
  – Numerals
    • Use of Western Arabic notation for numbers
    • Use of Arabic-Indic or Eastern Arabic-Indic numbers
  – Math operators and function names
    • Western notation
    • Mirrored glyphs
    • Special Arabic glyphs
![Page 3: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/3.jpg)
Directions in Arabic Math
• Dual direction (Persian and Moroccan Styles)
<text 2> math <text 1>
• Single direction (Maghreb and Machrek Styles)
<text 2> math <text 1>
٠ < ۱+ ا ، ب( ا – ب)٢
![Page 4: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/4.jpg)
Numerals in Arabic Notations
• Western Arabic (Europe)
0 1 2 3 4 5 6 7 8 9
• Arabic-Indic (most Arabic countries)
٩ ٨ ٧ ٦ ٥ ٤ ٣ ٢ ١ ٠
• Eastern Arabic-Indic (Iran, Urdu)
۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹
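A recognizer that accepts all three numeral sets will usually want to normalize them before interpreting a number. The following is a minimal sketch, not part of the original slides: the function name `to_western_digits` is an assumption, and it relies only on Python's standard `unicodedata` module, which knows the decimal value of both Arabic-Indic (U+0660 to U+0669) and Eastern Arabic-Indic (U+06F0 to U+06F9) digits.

```python
import unicodedata


def to_western_digits(text: str) -> str:
    """Replace every Unicode decimal digit with its Western Arabic form.

    Non-digit characters (letters, operators, spaces) pass through unchanged.
    """
    out = []
    for ch in text:
        if ch.isdecimal():
            # unicodedata.decimal returns the integer value 0..9 of the digit
            out.append(str(unicodedata.decimal(ch)))
        else:
            out.append(ch)
    return "".join(out)


# Arabic-Indic four and two (escapes used to avoid bidi display issues)
print(to_western_digits("\u0664\u0662"))
# Eastern Arabic-Indic four, five, six
print(to_western_digits("\u06f4\u06f5\u06f6"))
```

Normalizing at this stage keeps later structure analysis independent of which numeral set the writer used.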
![Page 5: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/5.jpg)
Math Variables
• Latin and Greek alphabets
• Arabic alphabet
![Page 6: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/6.jpg)
Math Operators and Functions
• European Notation – Persian Style
• Mirrored glyphs – Maghreb “Western” style
• Arabic glyphs – Machrek “Eastern” style
![Page 7: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/7.jpg)
Typeset Arabic Math
• Related projects in rendering typeset math:
– DADTeX, a TeX environment supporting Arabic
– Dadzilla, a MathML browser supporting Arabic
– Arabic Unicode, with respect to directionality
![Page 8: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/8.jpg)
New Challenges in HWR
• Stroke segmentation in text fragments
• If Moroccan or Persian notation is used, the structure recognizer has to handle bidirectional input.
• In Maghreb notation, the recognizer has to handle mirrored glyphs.
• In Machrek notation, a special recognition technique is needed for handling ligatures.
![Page 9: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/9.jpg)
Influence on Expression Analysis
• Arabic notation affects methods not only for analyzing the structure, but also for interpreting the results of recognition
• Special attention must be paid to
  – Implicit directionality
  – Mirrored expressions
  – Special container glyphs
  – Stretched ligatures
![Page 10: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/10.jpg)
Implicit Directionality
Statement “A² > 0 if A > 0” written in Farsi
• The recognizer identified the glyphs as { A,>, اگر ,٠ , A, ٠, <,٢ }.
• In Persian notation, mathematical content flows from left to right, while the surrounding text flows from right to left.
• A naïve structure analyzer may translate this to
A > 0 if A² > 0 (A > اگر ٠ A ٠ < ٢ )
WRONG!!!
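The failure above comes from reading the whole line left to right. A minimal sketch of a direction-aware ordering follows, under stated assumptions: each glyph already carries a horizontal position and a math/text flag, and the superscript is represented as a single `^2` glyph. The `Glyph` class and `reading_order` function are hypothetical names for illustration, not part of the authors' framework.

```python
from dataclasses import dataclass
from itertools import groupby

# The Farsi word for "if" (alef, gaf, reh), written with escapes
IF_WORD = "\u0627\u06af\u0631"


@dataclass
class Glyph:
    label: str     # recognized symbol
    x: float       # horizontal position on the page (left edge)
    is_text: bool  # True for Arabic-script text, False for math


def reading_order(glyphs):
    """Return labels in logical reading order for dual-direction notation.

    Runs of math and text are visited right to left (the text direction).
    Inside a math run glyphs stay left to right; inside a text run they
    are reversed, because the text itself is written right to left.
    """
    by_x = sorted(glyphs, key=lambda g: g.x)
    runs = [list(run) for _, run in groupby(by_x, key=lambda g: g.is_text)]
    ordered = []
    for run in reversed(runs):
        if run[0].is_text:
            run = list(reversed(run))
        ordered.extend(g.label for g in run)
    return ordered


# Page layout, left to right:  A > 0   <if>   A ^2 > 0
page = [
    Glyph("A", 0, False), Glyph(">", 1, False), Glyph("0", 2, False),
    Glyph(IF_WORD, 3, True),
    Glyph("A", 4, False), Glyph("^2", 5, False),
    Glyph(">", 6, False), Glyph("0", 7, False),
]
print(reading_order(page))
```

With this ordering the rightmost math fragment is read first, so the statement comes out as "A^2 > 0 if A > 0" rather than the reversed implication.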
![Page 11: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/11.jpg)
Careful Mirroring
• Every asymmetric operator is assigned its mirrored glyph: “(” ↔ “)”, “>” ↔ “<”, etc.
٠< ۱+ ا ، ب( ا – ب)٢ a, b )a – b(2 + 1> 0
• Some mirrored glyphs have not merely the opposite meaning but a very different mathematical meaning
  – For example, the pair “\”, “/” is direction sensitive:
    • “A / B” means division in left-to-right notation
    • “B / A” means set subtraction in right-to-left notation
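The mirroring rule can be sketched as a lookup table applied while reversing the glyph sequence. This is an illustrative sketch only: the table contents, `to_logical_ltr`, and `slash_meaning` are assumptions introduced here, and a real recognizer would need a much larger table.

```python
# Shape seen on the page -> shape of its mirror image
MIRROR = {
    "(": ")", ")": "(",
    "<": ">", ">": "<",
    "[": "]", "]": "[",
    "{": "}", "}": "{",
    "/": "\\", "\\": "/",
}

# Meaning of each slash once it is expressed in left-to-right form
SLASH_MEANING = {"/": "division", "\\": "set difference"}


def to_logical_ltr(shapes_left_to_right):
    """Convert a fully mirrored right-to-left expression to LTR order.

    The input lists glyph shapes as they appear on the page from left to
    right. The sequence is reversed and every asymmetric glyph is swapped
    for its mirror image.
    """
    return [MIRROR.get(s, s) for s in reversed(shapes_left_to_right)]


def slash_meaning(shape, rtl):
    """Interpret a slash shape given the writing direction of the math."""
    logical = MIRROR[shape] if rtl else shape
    return SLASH_MEANING[logical]


# "(a - b) > 0" written right to left with mirrored glyphs
print(to_logical_ltr(["0", "<", "(", "b", "-", "a", ")"]))
print(slash_meaning("/", rtl=False), "|", slash_meaning("/", rtl=True))
```

The same "/" shape is therefore read as division in left-to-right math and as set difference in mirrored right-to-left math, which is why mirroring cannot be treated as a purely visual step.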
![Page 12: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/12.jpg)
New Container Glyphs
notation for “5!”
• The notation for factorial introduces one more case of a container symbol, in addition to the symbols for radical and long division.
• A new set of rules must be added to the structural analyzer, so that the layout of the expression “n!” is detected as nested rather than linear.
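One way to express "nested rather than linear" is a bounding-box containment test between the container glyph and its operand. The sketch below is an assumption made for illustration, not the rule set used by the authors: `Box`, `layout_relation`, and the tolerance parameter are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    left: float
    top: float
    right: float
    bottom: float  # y grows downward, so bottom >= top


def layout_relation(container: Box, operand: Box, tol: float = 0.0) -> str:
    """Classify the operand as 'nested' in the container glyph or 'linear'.

    The operand is nested when its box lies inside the container's box,
    allowing `tol` units of overshoot on every side for sloppy handwriting.
    """
    inside = (
        operand.left >= container.left - tol
        and operand.right <= container.right + tol
        and operand.top >= container.top - tol
        and operand.bottom <= container.bottom + tol
    )
    return "nested" if inside else "linear"


factorial_glyph = Box(0, 0, 10, 10)
print(layout_relation(factorial_glyph, Box(2, 3, 6, 9)))     # operand drawn inside
print(layout_relation(factorial_glyph, Box(12, 0, 15, 10)))  # operand drawn beside
```

The same test could in principle serve the radical and long-division containers mentioned above, with a per-glyph tolerance.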
![Page 13: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/13.jpg)
Advantages
• Stretched large operators make it possible to avoid ambiguities in structure recognition
Examples (figures comparing notations for N-ary summation, N-ary product, and limit in the Maghreb, Machrek, and Farsi styles)
![Page 14: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/14.jpg)
Context Assistance
• Extra challenge: many ambiguous math characters
  – ١ ("1") and ا ("ALEF");
  – ٠ ("0") and a dot;
  – ٥ ("5") and ه ("HEH") or the degree symbol "°".
• Ex:
• Suggested strategy for character disambiguation: use of Math Context Database for Arabic notations
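A context database of this kind could be queried roughly as follows. This is a hedged sketch: the slides do not describe the database format, so the bigram table, its counts, and the `disambiguate` function are all invented for illustration, and symbol names are spelled out as labels instead of Arabic characters.

```python
from collections import Counter

# Hypothetical counts of (previous symbol class, candidate symbol) pairs.
# The numbers are made up; a real table would be learned from a corpus.
CONTEXT_COUNTS = Counter({
    ("PLUS", "ONE"): 40, ("PLUS", "ALEF"): 25,
    ("DIGIT", "ZERO"): 30, ("DIGIT", "DOT"): 5,
    ("DIGIT", "FIVE"): 20, ("DIGIT", "DEGREE"): 8, ("DIGIT", "HEH"): 1,
})


def disambiguate(previous: str, candidates: list) -> str:
    """Pick the candidate seen most often after `previous`.

    Unseen pairs count as zero; ties keep the earliest candidate, so the
    shape recognizer's own ranking acts as the tie-breaker.
    """
    return max(candidates, key=lambda c: CONTEXT_COUNTS[(previous, c)])


# The shape recognizer cannot tell "5", HEH, and the degree sign apart
print(disambiguate("DIGIT", ["HEH", "DEGREE", "FIVE"]))
print(disambiguate("PLUS", ["ALEF", "ONE"]))
```

A bigram on the preceding symbol is the simplest possible context; richer layouts (superscript position, enclosing operator) would give stronger evidence.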
![Page 15: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/15.jpg)
Conclusions
• Recognition of Arabic handwritten math introduces new classes of problems, mainly dealing with
  – stroke segmentation
  – structure analysis in bidirectional notations.
• However, many methods developed for European-style math handwriting analysis are applicable to Arabic notations.
• Moreover, certain things are easier with Arabic notations:
  – clearer structural organization in the case of large delimiters
  – more explicit distinction between mathematical and text fragments (in bidirectional notations).
![Page 16: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/16.jpg)
Future Work
• Identifying a suitable source of Arabic training material for building the Context DB and for training the structure analyzer
• Merging our Mathink framework with existing recognizers for Arabic script (for text fragments)
• Enhancing character recognizers to handle very stretched glyphs
• Adding direction awareness to the structure analyzer
• Developing tools for automated notational profile detection
![Page 17: On Mathematical Expression Analysis in Arabic Handwriting Elena Smirnova and Stephen Watt ORCCA, UWO, Feb 2007](https://reader036.vdocuments.mx/reader036/viewer/2022082817/56649dac5503460f94a9b190/html5/thumbnails/17.jpg)
References
[1] Azzeddine Lazrek, Mustapha Eddahibi, Khalid Sami, Cadi Ayyad, Bruce R. Miller. Arabic mathematical notation. W3C Interest Group Note, January 2006. http://www.w3.org/TR/arabic-math/
[2] T. Sari and M. Sellami, Cursive Arabic Script Segmentation and Recognition System. International Journal of Computers and Applications, Vol. 27, 2005.
[3] Al-Emami, S. and Usher, M., On-Line Recognition of Handwritten Arabic Characters. Pattern Analysis and Machine Intelligence, IEEE Transactions (PAMI) Vol. 12, No. 7, 1990, pp. 704-710.