[ieee 8th international multitopic conference, 2004. proceedings of inmic 2004. - lahore, pakistan...



Proceedings of the IEEE 8th International Multitopic Conference (INMIC 2004), Lahore, Pakistan, Dec. 24-26, 2004.

Online Signature Recognition and Writer Identification by Spatial-Temporal Neural Processing

A. Rauf Baig and Masroor Hussain
FAST-National University of Computer & Emerging Sciences, Islamabad, Pakistan

rauf.baig@nu.edu.pk, masrorhussain@yahoo.com

Abstract

In this paper we attempt online person identification by signature recognition and also from the natural writing of a person. The basic interest is in the novelty of the technique and the methodology utilized. Our system is based on a newly developed Spatio-Temporal Artificial Neuron (STAN), which is well adapted for the recognition of spatio-temporal patterns. This neuron has the capability to process continuous asynchronous spatio-temporal data sequences and compares them with the help of Hermitian distance. The architecture of the systems developed for both of these person identification problems is identical. It is based on three modules: pre-processing, feature detection and classification. The second and third modules are based on neural architectures, which have STANs as their neurons. The architecture and training of weights of the second module is based on a spatio-temporal adaptation of the Kmeans algorithm and the third module is based on an adaptation of the RCE algorithm. The results obtained for both the applications are encouraging.

Keywords: STAN, online signature recognition, online writer identification.

1. Introduction

Recently a neuron model called STAN has been developed [11-14], which emulates some of the aspects of the biological neuron. Its main feature is its capability to simultaneously handle the spatial as well as the temporal position of an event in a given sequence of events. This feature makes it suitable for applications which process data in which temporal positioning is also important (e.g. lip-reading [2,3], and online character recognition [1]).

Online signature recognition and writer identification (the task of determining the author of a sample of handwriting) are both spatial-temporal problems (the temporal ordering of the data generated is available), and in this paper we investigate the application of STAN on these problems. This is a continuation of our ongoing efforts for the application of STAN to different spatio-temporal problems [1,2,3,6,7,13].

Both signature and handwriting samples can be converted into digital form with the help of a scanner (offline) or a digitizing tablet (online). In the on-line case the information about the pen-tip movement (generally its position, velocity, or acceleration as a function of time) is available. In the off-line case, only the completed writing is available as in an image. The online case deals with a spatio-temporal representation of the input, whereas the off-line case involves analysis of the spatio-luminance of an image.

Signature recognition is the most widely used method of person identification. There exist a variety of methods and it is the subject of on-going research. Some references and survey papers on this topic are [5,8,9,15].

Writer identification is possible because handwriting is a skill that is personal to individuals. Each writer's writing has a set of characteristics that is exclusive to that writer. These have many applications in forensic analysis. No effort to date (to the best of our knowledge) has been made towards online writer identification, but some work has been done for the offline case [4]. The reason for this may be that its practical use is not currently very obvious. However, as technology progresses and new, more sophisticated digitizer tablets become available, we might very well be routinely using them as an interface for filling forms, etc. At that stage we would certainly find it helpful to determine the author of a given sample of online handwriting.

In this paper, we first present the STAN model on which our system is based. Then we explain our system and give some experimental results. At the end, we make some concluding remarks.

2. Spatial-Temporal Coding & STAN

Consider a sequence of asynchronous events. An event is represented by an impulse $x$ whose spatial and temporal aspects are simultaneously taken into account by coding it in the complex domain. In polar coordinates $(\eta, \varphi)$ the magnitude gives the amplitude and the angle gives the temporal position (or age) of the impulse from a reference point.

$$x = \eta e^{-i\varphi}, \quad \text{where } \tan(\varphi) = \mu_T t, \quad \text{hence } x = \eta e^{-i \arctan(\mu_T t)}$$

When a new impulse is emitted on a given component, it is accumulated with the previous impulses. The amplitude is made to decrease with time. Hence any given event is forgotten in due course of time.

0-7803-8680-9/04/$20.00 ©2004 IEEE, INMIC 2004


$$x(t) = \eta \, e^{-\mu_S t} \, e^{-i \arctan(\mu_T t)}$$

where $\mu_S = \mu_T = 1/TW$. The temporal window (TW) depends on the application and represents the span of time inside which one wishes to identify impulse sequences. This ST coding thus dynamically maps the incoming asynchronous events into a continuously evolving vector X. The time corresponding to the arrival of the latest impulse is taken as the reference point. All previous impulses have their temporal position updated (represented by phase angle) with reference to the current time. Each component of the vector X is thus updated as soon as a new impulse is presented to the input.
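The ST coding described above can be illustrated with a minimal Python sketch. It assumes that accumulation on a component is a plain complex sum of the aged impulses and that every impulse is re-aged from its emission time whenever the vector is evaluated. The names `st_code` and `st_vector` are hypothetical and are not taken from the paper.

```python
import cmath
import math


def st_code(eta, age, tw):
    """Complex code of an impulse of amplitude eta emitted `age` time units ago."""
    mu = 1.0 / tw  # mu_S = mu_T = 1/TW, as stated in the text
    decay = math.exp(-mu * age)                   # amplitude fades with age
    phase = cmath.exp(-1j * math.atan(mu * age))  # phase encodes the age
    return eta * decay * phase


def st_vector(impulses, n_components, now, tw):
    """Build the ST vector X at time `now`.

    impulses: iterable of (component, eta, emission_time).
    Assumption: impulses on one component are accumulated by summation.
    """
    x = [0j] * n_components
    for component, eta, t_emit in impulses:
        if t_emit <= now:
            x[component] += st_code(eta, now - t_emit, tw)
    return x
```

For example, `st_vector([(0, 1.0, 0.0), (1, 2.0, 5.0)], 2, 5.0, 10.0)` takes the arrival of the latest impulse (time 5.0) as the reference point, so component 1 holds a fresh impulse while component 0 holds an impulse that has aged by 5 time units.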

Fig. 1: Update of input vector X at the arrival of each impulse and comparison with stored vector W. Output of an impulse if X and W are close enough.

The STANs are spiking neurons, which work with the ST coding (Fig. 1). The weight W of a STAN is a complex vector and it represents the sequence to be detected by it. The comparison between X and W can be done by means of a Hermitian product V in one type of STAN and in another type it can be done by Hermitian distance D. In their standard forms, for vectors with n components:

$$V = \sum_{j=1}^{n} x_j \bar{w}_j, \qquad D = \sqrt{\sum_{j=1}^{n} (x_j - w_j)\,\overline{(x_j - w_j)}}$$

where the bar denotes the complex conjugate.
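As a small illustration of the two comparison measures, the following Python sketch computes the standard Hermitian product and Hermitian distance between two complex vectors held as lists. The function names are hypothetical, and the sketch assumes the textbook definitions rather than any STAN-specific variant.

```python
import math


def hermitian_product(x, w):
    """Sum of x_j times the complex conjugate of w_j."""
    return sum(xj * wj.conjugate() for xj, wj in zip(x, w))


def hermitian_distance(x, w):
    """Square root of the summed squared moduli of the component differences."""
    return math.sqrt(sum(abs(xj - wj) ** 2 for xj, wj in zip(x, w)))
```

A distance-based STAN would then fire when `hermitian_distance(x, w)` falls below some threshold chosen for that neuron.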

3. System Architecture

A. Data Acquisition & Pre-processing

Displacement and its quantization in 8 directions

From the digitizer tablet we acquire the basic signals of position coordinates of the contact of the pen on the surface of the tablet (Fig. 2). Any written shape consists of a sequence of position coordinates. As our first and basic preprocessing tool, we calculate displacements from these position coordinates. This process overcomes the spatial translation problem associated with absolute position coordinates.

Fig. 2: The position coordinates obtained from the digitizer tablet are converted into displacements and then the displacements are processed and mapped into impulses.

The calculated displacement between two points has an attached direction. The direction is quantized by making it equivalent to its nearest basic direction. By basic directions we mean north, south, east, west, north-east, north-west, south-east and south-west.
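The displacement and quantization step can be sketched in Python as follows. This is an illustration under stated assumptions: the y axis is taken to point upwards (north), directions are labelled with short strings, and a zero displacement yields no direction. The names `displacements` and `quantize` are hypothetical.

```python
import math

# Sector k is centred on the angle k * 45 degrees, counter-clockwise from east.
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]


def displacements(points):
    """Differences between consecutive (x, y) pen positions."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]


def quantize(dx, dy):
    """Nearest of the eight basic directions, or None for a zero displacement."""
    if dx == 0 and dy == 0:
        return None
    angle = math.atan2(dy, dx)                 # in (-pi, pi]
    sector = round(angle / (math.pi / 4)) % 8  # nearest multiple of 45 degrees
    return DIRECTIONS[sector]
```

Because only differences between consecutive points are used, shifting the whole shape on the tablet leaves the output unchanged, which is the translation invariance mentioned above.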

[Figure: the eight impulse components, one per basic direction: North, South, West, East, North-East, South-West, North-West, South-East.]

Fig. 3: Each shape is composed of displacements, which are converted into impulses. Thus a shape is mapped into an 8 dimensional, time dependent, pattern of impulses.

Accumulation of displacement

Since our STANs are designed to detect sequences of asynchronous impulses, we first have to convert the quantized displacements into a sequence of impulses. Since there are eight directions for quantization, the sequence of impulses has eight components (Fig. 3).

We have put a threshold on the length of the accumulated impulse. If the accumulated length of the line crosses the threshold, an impulse of a magnitude equal to the size of the threshold is emitted and the accumulation starts again. If the accumulated length is below the threshold and the pen changes direction, then an impulse is emitted which has a magnitude equal to the accumulated length, and the accumulation starts again for the new direction. This threshold is calculated from the training database by taking the average length of lines in all directions (there is only one threshold for all directions).
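The emission rule can be sketched as below. The sketch makes two assumptions that the text does not settle: after a threshold crossing the remainder of the accumulated length is carried over rather than discarded, and any length still pending at the end of the input is left unemitted. The name `emit_impulses` and the tuple layout are hypothetical.

```python
def emit_impulses(segments, threshold):
    """Convert quantized displacements into impulses.

    segments: list of (direction, length, time) in writing order.
    Returns a list of (direction, magnitude, time) impulses.
    """
    impulses = []
    current, accumulated = None, 0.0
    for direction, length, t in segments:
        # Direction change below the threshold: emit what has been accumulated.
        if current is not None and direction != current and accumulated > 0:
            impulses.append((current, accumulated, t))
            accumulated = 0.0
        current = direction
        accumulated += length
        # Threshold crossed: emit impulses of threshold size (assumed carry-over).
        while accumulated >= threshold:
            impulses.append((current, threshold, t))
            accumulated -= threshold
    return impulses
```

With a threshold of 1.0, two eastward segments of lengths 0.5 and 0.75 produce one impulse of magnitude 1.0, and the remaining 0.25 is emitted as a smaller impulse as soon as the pen turns north.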

B. Detection of Primitives

Since the spatio-temporal shapes to be classified may have different lengths, we first break each shape into its constituent primitives or features (basic components or parts of a shape). For detection of primitives, we first create their representative prototypes. For this purpose, we convert the continuous, but asynchronous, flow of impulses from the first module into ST vectors (formed by spatio-temporal coding of impulses). Each complex domain vector thus created contains in itself the information about the current impulse as well as an (exponentially weighted) history of previous impulses. These vectors can be considered as points in the Hermitian space. These points are then grouped into clusters with the help of the Kmeans algorithm adapted for complex domain data (called ST-Kmeans) [2,3]. Each cluster center is the representative of a primitive of the shape present in the database. The optimum number of clusters for any given training database is determined experimentally by setting up the complete system and observing the final results for different numbers of clusters. The quality of clusters obtained by Kmeans can also be improved by the technique of hard-partitioning of data.
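To make the clustering step concrete, here is a minimal Python sketch of a k-means loop over complex vectors that uses the squared Hermitian distance for assignment and the component-wise mean for the update. It is a generic k-means written for complex data, not the authors' ST-Kmeans implementation; the deterministic initialisation from the first k vectors and the fixed iteration count are assumptions made for brevity.

```python
def hermitian_distance_sq(x, w):
    """Squared Hermitian distance between two complex vectors."""
    return sum(abs(a - b) ** 2 for a, b in zip(x, w))


def complex_kmeans(vectors, k, iterations=20):
    """Return k cluster centres for a list of equal-length complex vectors."""
    centers = [list(v) for v in vectors[:k]]  # assumed initialisation
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k),
                          key=lambda c: hermitian_distance_sq(v, centers[c]))
            clusters[nearest].append(v)
        for c, members in enumerate(clusters):
            if members:  # keep the old centre if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers
```

Each returned centre would play the role of a primitive prototype, that is, the weight vector of one STAN in the second module.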

If there are k primitives then a layer of k STANs is used for their detection (one reserved for each primitive). Each STAN has the prototype (weight) vector of its primitive.

During the utilization phase, each STAN updates its ST vector at the arrival of each impulse from the first module. This ST vector is compared with the prototype vector. If the two are close enough, an impulse of unit length is produced at the output of the respective STAN, thus indicating the detection of a primitive.

C. Detection of Signature/Words

The weights of the STANs and the architecture of the classification module follow the RCE algorithm [10] adapted to the complex domain [2,13]. According to this algorithm we can have several clusters (and thus several prototypes) for a given class. For our signature recognition problem, a class is a signature (or rather the person doing the signature); and for the writer identification problem, a class is a person writing a given word. The procedure of classification is the same as that for the second module. The impulses (each impulse indicating the presence of a primitive) arrive from the second module. Each STAN continuously converts these incoming impulses to an ST vector and compares the ST vector with its prototype vector (weight vector). If the two vectors are sufficiently close to each other, an output impulse is produced.

The prototype vector stored in each STAN is the representative of the shape (signature or word) to be recognized by that STAN. It is formed by ST coding of the correct sequence of impulses obtained from the second module (representing the correct succession of primitives) for a shape. The complete diagram of the three modules is shown in Fig. 4.
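The idea of several prototypes per class, each with its own region of influence, can be sketched with a generic RCE-style training loop over complex vectors. This is not the authors' ST-RCE. It assumes the usual RCE rules (add a prototype when a sample is not covered by its own class, shrink the radius of any wrong-class prototype that covers it), and the names `rce_train`, `rce_classify` and `max_radius` are hypothetical.

```python
import math


def hermitian_distance(x, w):
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, w)))


def rce_train(samples, max_radius, epochs=5):
    """samples: list of (complex_vector, label).

    Returns a list of prototypes, each stored as [center, radius, label].
    """
    prototypes = []
    for _ in range(epochs):
        changed = False
        for x, label in samples:
            covered = False
            for proto in prototypes:
                d = hermitian_distance(x, proto[0])
                if d < proto[1]:
                    if proto[2] == label:
                        covered = True
                    else:
                        proto[1] = d  # shrink so x falls just outside
                        changed = True
            if not covered:
                rivals = [hermitian_distance(x, p[0])
                          for p in prototypes if p[2] != label]
                prototypes.append([list(x), min([max_radius] + rivals), label])
                changed = True
        if not changed:
            break
    return prototypes


def rce_classify(x, prototypes):
    """Set of labels whose region of influence contains x (may be empty)."""
    return {p[2] for p in prototypes if hermitian_distance(x, p[0]) < p[1]}
```

In the terms of this section, each prototype centre and radius would become the weight vector and threshold of one STAN in the third module. An empty result set corresponds to a rejected input.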

Fig. 4: The complete system, comprising three modules.

D. Reset of System

This feature is used only for writer identification. It is the detection of silence between two words. This is determined as a function of the time of pen lift and the position of the next pen down. If the next pen down is a sufficient distance away from the pen lift in the horizontal direction, then it is assumed that the word has ended. This threshold is calculated on the basis of the average space found between words of the writer on whose writing the system has been trained. If the threshold is surpassed a trigger pulse is emitted, which resets the whole system, thus enabling it to detect the next word in a more efficient way (the residual influence of the old words due to ST coding is wiped off).
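A simplified sketch of the word-boundary test is given below. It considers only the horizontal gap between a pen lift and the next pen down and ignores the timing of the pen lift, which the text also mentions. The name `split_words`, the stroke representation and the use of the absolute gap are assumptions.

```python
def split_words(strokes, threshold):
    """Group strokes into words.

    strokes: list of strokes in writing order, each a non-empty list of (x, y).
    A new word starts when the horizontal gap between the last point of the
    previous stroke and the first point of the next one exceeds the threshold.
    """
    words, current = [], []
    for stroke in strokes:
        if current:
            lift_x = current[-1][-1][0]  # x of the previous pen lift
            down_x = stroke[0][0]        # x of this pen down
            if abs(down_x - lift_x) > threshold:
                words.append(current)    # here the system would be reset
                current = []
        current.append(stroke)
    if current:
        words.append(current)
    return words
```

In the full system, the point at which a word is closed is where the reset trigger would clear the ST vectors of every STAN.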

4. Experimentation

A. Signature Recognition

In our data collection 20 people each provide exactly 7 different signature samples (on a digitizer tablet), a total of 140 signature samples. For each person, 3 out of 7 signature samples are used for training our system and the rest are used for testing; so there are 60 samples for training and 80 samples for testing.

After the projection of each signature onto the eight directions, we calculate the threshold for the emission of an impulse by the first module (Section 3 A).


The weights of the second module (prototypes for primitives) are determined with the help of ST-Kmeans (Section 3 B). We conduct ten trials each for our training database using 25, 50, 75, 100, 125, 150, 175, 200 and 225 primitives. The results against each number of primitives are shown in Fig. 5. The graph indicates that 125 primitives give better and more stable results, with both high mean and median and comparatively low variance. Hence we select 125 primitives as standard for our signature recognition system's database.

The impulses obtained from the second module (representing the detection of primitives) have a unique pattern for each signature. These sequences of impulses are grouped together by the supervised algorithm of ST-RCE (Section 3 C). The cluster centers and their regions of influence obtained by the RCE algorithm are placed as the weights and thresholds of STANs of the third module.

[Figure: plot of results against the number of primitives; recovered legend entries: Median, Min.]

Fig. 5: The graph for deciding the number of primitives of the 2nd module.

Having fixed the number of primitives at 125, we now utilize 6 out of 7 signature samples of each person for training the system, and use the 7th sample for testing. Firstly, we train our system on samples numbered 1 to 6 and use the 7th sample for testing. In the second iteration, samples numbered 2 to 7 are used for training and the 1st sample for testing. In this way seven tests are completed. We achieve 92.28% mean and 92.50% median correct results.
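The seven-fold rotation can be written out as a short Python sketch. The order of the held-out sample after the first two iterations is not stated in the text, so continuing with samples 2, 3 and so on is an assumption, and `rotation_splits` is a hypothetical name.

```python
def rotation_splits(n_samples):
    """Yield (training_sample_numbers, held_out_sample_number) pairs.

    The last sample is held out first, then sample 1, then 2, and so on,
    so every sample is used for testing exactly once.
    """
    numbers = list(range(1, n_samples + 1))
    for held_out in [n_samples] + numbers[:-1]:
        train = [i for i in numbers if i != held_out]
        yield train, held_out
```

With `n_samples = 7` this yields seven splits, each training on six samples per person and testing on the remaining one.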

B. Writer Identification

We use a digitizing tablet to collect the samples of each writer. A text is provided to all writers to write for this purpose. The sample text is "The public was amazed to view the quickness and dexterity of the juggler". There are 13 words containing a total of 60 letters in our text, and it contains all the letters from a-z.

Fig. 6: The sample text written by a writer on the digitizing tablet.

Our database comprises samples from 5 different writers. There are 8 writing samples of each person, hence a total of 40 handwriting samples. For each writer, 3 samples are used for training our system and the rest are used for testing, so there are 15 samples for training and 25 samples for testing.

[Figure: the outputs of the 1st to 13th word classifiers feed a final Classify stage.]

Fig. 7: Method of decision for the writer identification classifier.

We segment our sample text into individual words and make the first classification at word level. The words are segmented with the help of the reset technique described in Section 3 D. Thus we build 13 different classification sub-systems based on the 13 individual words. We make the same graph for each sub-system as we did for the signature recognition system (Fig. 5) for determining the number of primitives of the 2nd module.

For testing purposes, we calculate the decisions of all of the 13 sub-systems, and classify the writing as that of the person who has the highest number of votes (classifications) from the sub-systems. We obtain 94% success in writer identification by this method.
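The voting rule can be sketched in a few lines of Python. The sketch assumes that a sub-system which recognizes no writer reports `None`, and that ties are broken in favour of the writer whose vote was seen first; neither detail is specified in the text. The name `identify_writer` is hypothetical.

```python
from collections import Counter


def identify_writer(word_decisions):
    """Return the writer with the most votes among the word-level decisions.

    word_decisions: one entry per word sub-system, a writer label or None.
    Returns None when no sub-system produced a decision.
    """
    votes = Counter(d for d in word_decisions if d is not None)
    if not votes:
        return None
    return votes.most_common(1)[0][0]
```

For instance, if the 13 word classifiers return "w1" seven times, "w2" four times and nothing twice, the sample is attributed to "w1".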

5. Conclusion

In this paper, the individuality of handwriting has been explored from two different dimensions: signature and natural handwriting. The analysis of both these aspects of handwriting is utilized for person identification. We tackle the on-line problem only.


The basic interest is in the novelty of our approach. The application of a relatively new type of neuron called Spatio-Temporal Artificial Neuron (STAN) has been explored.

We achieve approximately 92% and 91% results for signature recognition and writer identification respectively. These results show that the recent entry of STAN in the field of pattern classification and its integration into ordinary neural networks can turn out to be a good classifier option for data where the timing information of a signal is as relevant as its amplitude.

6. References

[1] A. R. Baig, "Spatio-Temporal Artificial Neurons Applied to Online Cursive Handwritten Character Recognition", ESANN Proceedings, Belgium, April 2004.

[2] A. R. Baig, "Une Approche Méthodologique de l'utilisation des STAN Appliquée à la Reconnaissance Visuelle de la Parole", PhD Thesis, Univ. of Rennes-I, France, April 2000.

[3] A. R. Baig, R. Seguier, and G. Vaucher, "A Spatio-Temporal Neural Network Applied to Visual Speech Recognition", ICANN Proceedings, UK, pp. 797-802, Sept. 1999.

[4] S. H. Cha, "Use of Distance Measure in Handwriting Analysis", PhD Thesis, Univ. of New York, Buffalo, 2001.

[5] F. Leclerc and R. Plamondon, "Automatic Signature Verification: The State of the Art 1989-1993", Intl. Jrnl. of Pattern Recognition and Artificial Intelligence, Vol. 8(3), 1994.

[6] N. Mozayyani and G. Vaucher, "A Spatio-Temporal Perceptron for On-line Handwritten Character Recognition", ICANN Proceedings, 1997.

[7] N. Mozayyani, A. R. Baig, and G. Vaucher, "A Fully-Neural Solution for Online Handwritten Character Recognition", IJCNN Proceedings, USA, 1998.

[8] V. S. Nalwa, "Automatic On-line Signature Verification", Proc. IEEE, Vol. 85, No. 2, pp. 215-240, 1997.

[9] R. Plamondon and S. Srihari, "On-line and Off-line Handwriting Recognition: A Comprehensive Survey", IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22(1), pp. 63-84, Jan. 2000.

[10] D. Reilly, L. Cooper, and C. Elbaum, "A Neural Model for Category Learning", Biological Cybernetics, Vol. 45, 1982.

[11] G. Vaucher, "An Algebra for Recognition of Spatio-Temporal Forms", Euro Symposium on Artificial Neural Networks, pp. 231-236, April 1997.

[12] G. Vaucher, "An Algebraic Interpretation of PSP Composition", BioSystems, Vol. 48(1-3), pp. 241-246, 1998.

[13] G. Vaucher, A. R. Baig, and R. Seguier, "A Set of Neural Tools for Human-Computer Interactions", Neural Computing & Applications, Springer-Verlag, Vol. 9, Issue 4, pp. 297-305, Dec. 2000.

[14] G. Vaucher, "A Complex-Valued Spiking Machine", ICANN Proceedings, 2003.

[15] Z. Yong, T. Tieniu, and W. Yunhong, "Biometric Personal Identification Based on Handwriting", ICPR Proceedings, 2000.
