the computer transcription of machine shorthand

Interfaces in Computing, 2 (1984) 147-165

THE COMPUTER TRANSCRIPTION OF MACHINE SHORTHAND

ALEXANDER W. BOOTH

Department of Computing and Mathematics, The Polytechnic, Wolverhampton (Gt. Britain)

(Received January 27, 1984)

Summary

In this paper, research undertaken at Leicester Polytechnic into the computer transcription of Palantype code (the British equivalent of Stenotype code) is described. The work has been investigated by the British Broadcasting Corporation for the automatic subtitling of television to benefit the deaf and hard of hearing. The fundamentals of machine shorthand are explained and the problems of computer transcription are identified. Techniques developed to overcome these problems are also described in outline. The research is continuing at The Polytechnic, Wolverhampton, and negotiations are taking place with H.M. Stationery Office to investigate the possibility of transcribing the proceedings of the House of Commons by computer.

1. Introduction

The computer transcription of machine shorthand code into English text has been a subject of research for more than 20 years. The emphasis during this period has been to develop systems which, by connecting shorthand machines to a computer, are able to output English words only fractions of a second after they have been spoken.

It is reasonably well known that mechanical shorthand machines can be used to record verbatim proceedings of conferences, committees, law courts etc. The British system for achieving this is known as Palantype [1] and the U.S. system is called Stenotype [2]. The output from the shorthand machine is a kind of phonetic code and is not readily intelligible to the uninitiated reader. For example, the word believe is represented in Palantype by P+LI.F+. At first glance there seems to be little correspondence between the two but, since in Palantype P+ stands for B, F+ for V and I. for the double E sound (ee), it can be seen that the Palantype code BLEEV is phonetically similar to the word believe.

After a recording session, an operator normally transcribes his/her own output into English words. Although this is not a difficult task, it is very time consuming, typically four to five times longer than the recording stage. Therefore, if an operator has been recording for several hours in the law courts, for example, it may be 2 or 3 days before transcripts of the proceedings are available.

0252-7308/84/$3.00 © Elsevier Sequoia/Printed in The Netherlands

Besides saving Palantype operators valuable time, computer transcription systems have several other applications: computer typesetting, transcription of the proceedings of the House of Commons and of the law courts and, the most recent application, the automatic generation of television subtitles to benefit deaf and hard-of-hearing viewers. In most of these applications the quality of English text has been impaired by a combination of the problems associated with the computer transcription of Palantype code and the goal of real-time transcription. Although the speed and efficiency of computers have increased dramatically over the last 20 years, the goal of real-time transcription has been difficult to achieve because of the errors and ambiguities present in speech itself and those introduced by the recording process. These problems have been examined in detail and in some cases "new" methods have been developed to attempt to overcome them.

These new methods for transcribing Palantype code into English text have not had their development hampered by attempts to work within the restrictions imposed by a real-time goal. They have been developed in the short term to reduce to a minimum the effect of errors and ambiguities, and in the long term to provide a framework for a system which incorporates several knowledge sources, in a similar way to a model of linguistic performance, to help to resolve errors and ambiguities in Palantype transcription systems.

2. Background

2.1. Machine shorthand

Machine shorthand is very different from conventional forms of handwritten shorthand and, although not as widely practised, is a direct alternative to them. Palantype is the British system of machine shorthand (equivalent to the U.S. Stenotype system) and is used for recording verbatim proceedings of conferences, committees, law courts etc. It employs a small portable machine with a keyboard of 29 keys. The keys are laid out in three zones (Fig. 1): 12 consonant keys on the left operated by the fingers of the left hand, 12 consonant keys on the right operated by the fingers of the right hand and five vowel keys in the centre of the keyboard operated by the two thumbs.

Fig. 1. Layout of the Palantype keyboard.

Any set of keys may be depressed simultaneously to form what is known as a chord, and each chord pressed is printed on a paper roll in such a way that the order of the keys across the keyboard is preserved, i.e. when the left-hand S is present in a chord it appears in a discrete position on the printed roll. In phonetic terms a Palantype chord usually corresponds fairly closely to the spoken syllable, e.g. the word capillary is represented by the chords CA, PIL and RI, and the word funnel by the chord FUNL.

It will have been noted that the keyboard does not contain a key for each letter of the alphabet. The way in which the missing consonants are represented in Palantype is by depressing two or three keys simultaneously (together with any other keys which might be in the chord). For example, to obtain a representation for B, D, G, V and Z, the + key is pressed simultaneously with P, T, C, F or S respectively. Figure 2 shows the form in which the output appears on a printed roll. In Fig. 2 the orthographic equivalent is given so that the reader may gain some insight into the coding conventions used in Palantype. A description of the major coding conventions has been given by Booth [3].
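The key combinations mentioned so far can be illustrated with a short sketch. It covers only the conventions stated in the text (the + key with P, T, C, F or S, and I. for the double E sound); the function name and the flat string notation are assumptions made for the example and are not part of any Palantype system.

```python
# Combinations named in the text: "+" pressed with P, T, C, F or S gives
# B, D, G, V or Z, and "I." stands for the double E sound.
COMBINATIONS = {"P+": "B", "T+": "D", "C+": "G", "F+": "V", "S+": "Z", "I.": "EE"}


def expand_combinations(keys: str) -> str:
    """Replace each two-key combination in a chord string by the letters it stands for."""
    out = []
    i = 0
    while i < len(keys):
        pair = keys[i:i + 2]
        if pair in COMBINATIONS:
            out.append(COMBINATIONS[pair])
            i += 2
        else:
            out.append(keys[i])
            i += 1
    return "".join(out)


print(expand_combinations("P+LI.F+"))  # BLEEV
```

This reproduces the believe example from Section 1; the full set of coding conventions is considerably larger.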

2.2. Manual transcription

The output from the Palantype machine is not readily intelligible to the uninitiated reader (as with hand-written shorthand) and is taken away at the end of a recording session to be transcribed into plain language (Fig. 3). Operators have no difficulty in transcribing shorthand code into English text because human transcribers can draw (usually subconsciously) on many knowledge sources, e.g. articulatory sources, phonetics, lexicography, syntax and semantics, to help them to resolve errors and ambiguities present in speech itself and those introduced by the recording process. One disadvantage of the manual system is that a transcript of several hours' proceedings may not be ready for several days, during which time the operator who is performing the transcription is not available for further recording.

2.3. Computer transcription

There have been many different kinds of computer transcription systems. In those applications where the goal has been real-time transcription, e.g. visual display for deaf people, there has been little or no processing time available for syntax or context analysis. This is because most of the available processing time is spent transforming the Palantype code into English text. In real-time systems an aim has been to make the recording and transcription stages concurrent (Fig. 4).

Fig. 2. Palantype roll with orthographic equivalent. The roll is headed by the order of keys, SCPTH+MFRNLJOEAUI.NLCMFRPT+SH, and shows the chords for "Two roads diverged in a yellow wood and sorry I could not travel both".

The limited amount of syntax and context checking which can be done in real time has been described by Szanser [4] and by Booth [3].

In those applications where the goal has not been real-time transcription, e.g. transcripts of proceedings, an emphasis has been put on post-transcription editing facilities (Fig. 5) rather than on automatic correcting routines.

An example of a computer transcription system which attempted to output English text in real time is the one developed at Leicester Polytechnic [5]. The work that went into developing the first Leicester Polytechnic system became the foundation for the present author's research, and therefore it is worth describing briefly the organization of the transcription system at Leicester Polytechnic.

Fig. 3. Manual transcription system: (a) recording stage, in which the operator in recording mode turns the speaker's words into Palantype code; (b) transcription stage, in which the operator in transcription mode turns the Palantype code into English words.

Fig. 4. Concurrent recording and transcription stages: VDU, visual display unit.

2.4. The transcription system at Leicester Polytechnic

At Leicester Polytechnic an English dictionary of over 75 000 words and their Palantype equivalents is stored on disk. This dictionary was provided by the National Physical Laboratory (NPL) and was used by them for an earlier project [6]. Table 1 shows the first few words of the dictionary with their Palantype equivalents. It should be noted that in Table 1 the symbols to the left of a solidus (/) denote those keys operated by the left hand, and the symbols to the right of the solidus denote those keys operated by the right hand.

Fig. 5. Separate recording and transcription stages: VDU, visual display unit.

TABLE 1

The first few English words and their Palantype equivalents in the dictionary used at Leicester Polytechnic

English words    Corresponding Palantype chords

A                /A
Aback            /A P+/AC
Abacus           /A P+/A C/US
Abandon          /A P+/AN T+/N
Abandons         /A P+/AN T+/NS

An electronic interface between a Palantype machine and a Burroughs 6700 computer was constructed so that chords pressed on the Palantype keyboard could be processed by programs on the computer. A Burroughs 6700, which is a large computer, i.e. 1.25 Mbytes of electronic memory and 250 Mbytes of hard disk store, was used only because it was convenient. However, the transcription system is not complex and was small enough to be implemented in another version on a microcomputer with a Palantype machine modified to be ergonomically acceptable to the British Broadcasting Corporation (BBC) [7]. The transcription program was capable of transcribing Palantype chords into English words and displaying them on a visual display unit at speeds which were adequate to keep up with fast speech and the speediest of shorthand machine operators. Figure 6 shows a block diagram of the first Leicester Polytechnic system, and Fig. 7 is a typical example of the output produced by that system. It is clear from examining the text in Fig. 7 that the Leicester Polytechnic system had several shortcomings; some of these were due to the problems associated with the transcription of shorthand code to English text. In Section 3 these problems are identified and in Section 4 the present author's attempts at resolving them are described in outline.

Fig. 6. The transcription system at Leicester Polytechnic. (Input: Palantype keyboard, producing Palantype chords. Transcription: computer, holding lists of Palantype chords, the transcription program and the English dictionary. Output: visual display unit.)

Many people ask about subtitling on TV and I get bags of letters abat that sup ject, why not more of them. But a broadcastors sometimes argue that the general public dont want to see subtitles on their programmes. Apart from that, it takes a very long time to subtitle a programme. But the BBC has een working with a couple of universities developing anew way of subtitling to overcome these objections. Now its [illegible] development, and all tho its still experimental and we mustn't raise false hops because therefore we can't show it now, it Is [illegible] the programme. It would [illegible] to a computer, and that fond produce [illegible] instantly un the screen. It could be used fr live programmes withat a great deal of preparation. To cope with the other proplea, subtitles would not have to appear on every p tis screen, own li of those who wanted to see them. So that fond remove the jenrl objection to subtitles. That its possibility bl because of another BBC invention cold see facs which you will be able to see den strayted in next weeks programme which was also recorded in nineteen seven five p for these xpear merits on subtling began.

Fig. 7. Example of the output from the Leicester Polytechnic system.

3. The problems of transcribing Palantype code into English text

Although many hurdles have been overcome to facilitate the transcription of Palantype code into English text by computer, some problems still remain.

3.1. Operator keying mistakes

The major factor affecting the quality of English text produced by Palantype transcription systems is the number of keying mistakes made by the machine operator. Recording verbatim means that the operator has sometimes to work at speeds in excess of 200 words per minute. It has been shown [6] that, as the recording rate increases, so does the operator's error rate. The transcription program at Leicester Polytechnic, on encountering erroneous chords which failed to match English words, resorted to subjecting these chords to a simple pseudo-phonetic translation procedure, so that for example the chord P+RI. would be output as BREA. This procedure has been described by Booth [3].

3.2. Word boundaries

Transcribing machine shorthand code into English text is made more difficult by the fact that there is no means of indicating in a string of chords where one word ends and the next one begins, i.e. there is no explicit end-of-word marker to guide the transcription process. Although the transcription program which produced the output in Fig. 7 produces output consisting wholly of English words from error-free input almost all the time, examples can be contrived which would cause it to fail. An algorithm which always produces output consisting wholly of English words from error-free input has an important implication for research into error correction, and this is as follows. When the algorithm outputs any group of letters other than English words, it is known that there is at least one error in the input string. It is only when the presence of an error has been detected that procedures can be invoked to correct it. Therefore, some research has been done on developing an algorithm which always produces output consisting of English words from error-free input.

The word boundary problem can be viewed as two separate problems. The first has already been mentioned, that of producing output consisting wholly of English words from error-free input. The second problem arises because, even when the output consists wholly of English words, the words have not necessarily been partitioned at the correct word boundaries. For example, the chords /A N/EU S/I T/IN can be transcribed equally well in three different ways: a new sitting; anew sitting; a newsy tin.
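The two halves of the problem can be made concrete with a small sketch that lists every partition of a chord string into dictionary words. The toy dictionary holds only the six entries needed for the example above, and the exhaustive recursive search is an assumption chosen for clarity; it is not the algorithm used at Leicester Polytechnic.

```python
from typing import Dict, List, Tuple

# Toy dictionary: chord sequences -> English word (entries assumed for illustration).
DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("/A",): "a",
    ("N/EU",): "new",
    ("/A", "N/EU"): "anew",
    ("S/I", "T/IN"): "sitting",
    ("N/EU", "S/I"): "newsy",
    ("T/IN",): "tin",
}
MAX_CHORDS = max(len(k) for k in DICTIONARY)


def segmentations(chords: List[str]) -> List[List[str]]:
    """Return every way of splitting the chord list into dictionary words."""
    if not chords:
        return [[]]
    results = []
    for n in range(1, min(MAX_CHORDS, len(chords)) + 1):
        word = DICTIONARY.get(tuple(chords[:n]))
        if word is not None:
            for rest in segmentations(chords[n:]):
                results.append([word] + rest)
    return results


for words in segmentations(["/A", "N/EU", "S/I", "T/IN"]):
    print(" ".join(words))
```

An empty result corresponds to the first problem (no output consisting wholly of English words, so an error must be present), and more than one result corresponds to the second (several valid partitions).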

3.3. Homophonic ambiguities

As Palantype is a phonetic shorthand, words which sound the same are Palantyped identically. For example, the words for, four and fore are represented in Palantype by the single chord F/OR. Of the 75 000 words in the dictionary used by the Leicester Polytechnic system, almost 1000 are one of a pair or triad of homonyms. In the Leicester Polytechnic system the designers were faced with two options: outputting all the alternative homonyms separated by a solidus (/), or outputting the word that was felt to be most commonly occurring in everyday spoken English. Let us consider the alternative sentences "The for/four/fore ladies sat down for/four/fore tea" or "The for ladies sat down for tea". The latter option was chosen at Leicester Polytechnic, but it is clear that the resulting text is not satisfactory.
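The two output policies can be compared with a minimal sketch. The homophone table and the assumption that the first word listed is the most common one are illustrative only; they are not the system's actual data.

```python
# Homophone sets keyed by chord; the first word in each list is assumed to be
# the most common one in everyday speech (an illustrative ranking only).
HOMOPHONES = {"F/OR": ["for", "four", "fore"]}


def render(items, show_all):
    """Render a mix of plain words and homophonic chords under one of the two policies."""
    out = []
    for item in items:
        if item in HOMOPHONES:
            choices = HOMOPHONES[item]
            out.append("/".join(choices) if show_all else choices[0])
        else:
            out.append(item)
    return " ".join(out)


sentence = ["The", "F/OR", "ladies", "sat", "down", "F/OR", "tea"]
print(render(sentence, show_all=True))   # The for/four/fore ladies sat down for/four/fore tea
print(render(sentence, show_all=False))  # The for ladies sat down for tea
```

Neither policy uses context, which is why neither sentence reads correctly.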

3.4. Keyboard design

One cause of operator keying mistakes might be the design of the keyboard, both ergonomically and physically. It has been shown [6] that the design of the Palantype keyboard is ergonomically poor since, for example, the weakest fingers (commonly the third and fourth of the left hand) have the most work to do. No effort was made in the present author's research to improve this aspect of the transcription system.

With an emphasis on preventing errors rather than on correcting them, Downton et al. [8] have had some success by improving the physical operation of the Palantype keyboard with the introduction of electronic components. Some work has also taken place at the BBC [9] in developing an electronic Palantype keyboard. No work of this nature has taken place in the present author's research.

3.5. Incomplete lexicon

No matter how large the dictionary becomes, the operator will always encounter words which are not in it. Consequently, when the chords which represent those words are pressed, the transcription program will fail to find a match for them and will therefore assume that there is an error in the input string, even though there may not be. Special program features to help to reduce the effect that such words have on the output text have been developed and will be described subsequently.

3.6. Error and ambiguity present in speech

As the Palantype operator is recording verbatim with little or no time for on-line editing, errors or ambiguities in speech are transferred through the transcription system to impair the quality of the output text. Inaccurate or hasty articulation can cause words or parts of words to sound quite different from what was intended. Also, a very common "error" in everyday spoken English is that sentences (and sometimes words) are left unfinished. These factors have been considered in the goal of defining a "grammar" particularly suited to parsing the output from Palantype transcription systems.


4. Resolving the problems

Having considered the problems associated with the transcription of Palantype code to English text, the present author's attempts at resolving them will now be described in outline.

4.1. The word boundary problem

There have been several different approaches to resolving the word boundary problem. Although it could be argued that one of these approaches is only an extension of another, they are mentioned separately here because, as they were developed in different institutions and/or at different times, some of the differences between them are significant.

The first approach, known as the longest match technique, was used by Galli [10] and Price [6] and was the basis of several algorithms to follow. The second approach, called two-chord look-ahead, was developed in an ad hoc fashion by Booth and Barnden [5] in an attempt to improve the output text by recognizing the shortcomings of the longest match and trying to eliminate them. The third approach, named n-word look-ahead, was developed as part of the present author's research and was inspired by the look-ahead concept of two-chord look-ahead. This concept appeared to be the key to producing output consisting wholly of English words from error-free input.

This last point is very important since the word boundary problem can be viewed as two separate problems and indeed was viewed as such for the major part of the present research. The first problem is how to produce output consisting wholly of English words from error-free input. The second problem arises because, even when the output consists wholly of English words, the words have not necessarily been partitioned at the correct word boundaries. For example, the chords /A N/EU S/I T/IN can be transcribed equally well in three different ways and still produce English words: a new sitting; anew sitting; a newsy tin.

When the two problems were considered together instead of separately, a partial solution to the word boundary problem was achieved in an algorithm called WORDFINDER. This has been described in detail by Booth [11], so we shall not deal with it here.

4.2. Incomplete lexicon

It was stated earlier that, no matter how large the dictionary becomes, the operator will inevitably encounter words which are not in it.

However, the fact that the dictionary can never contain all the obscure, foreign, made-up and swear words that the operator will inevitably encounter does not mean that it should not contain all those words that an operator is reasonably likely to encounter. At the moment, the system dictionary does not contain all the words that an operator is likely to encounter. A glaring omission is the possessive form of proper nouns. Words such as Britain's, London's, Churchill's etc. have not been included in the dictionary although they readily occur in everyday English. To overcome this problem, the system dictionary could be updated to include the possessive form of each proper noun. However, this would be a very significant increase in dictionary size and might have implications for tree-searching times etc. An alternative approach adopted both at the NPL [6] and in the present research is as follows. When the transcription program assumes a chord to be erroneous that ends in S (final), it deletes the S from the erroneous chord and performs a retranscription of the relevant part of the input string. If the retranscription is successful where the modified chord has been matched with a proper noun (since dictionary entries for proper nouns are marked to indicate that they start with a capital letter), then apostrophe s ('s) is added to the proper noun before it is output. If the retranscription is unsuccessful, then the S is added to the erroneous chord again and the normal correction procedure is invoked.

TABLE 2

Examples of alternative representations of words ending in y

English word    Dictionary representation    Alternative

Correctly       CO REC TLI                   CO RECT LI
Recently        RI. SEN TLI                  RI. SENT LI
History         HI STRI                      HIS TRI
Only            OE NLI                       OEN LI
Entirely        EN TAI RLI                   EN TAIR LI
Surely          SHOU RLI                     SHOUR LI
Simply          SIM PLI                      SIMP LI
Cheaply         CHEA PLI                     CHEAP LI
Quickly         CFI CLI                      CFIC LI
etc.
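The possessive rule can be sketched as follows. Retranscription is reduced here to a single dictionary lookup, and the chord spellings and the function name are invented for illustration; only the rule itself (strip the final S, retry, add 's if a proper noun is matched) comes from the text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical entries: chord sequence -> (word, is_proper_noun). The chord
# spellings are invented for illustration and are not taken from the dictionary.
DICTIONARY: Dict[Tuple[str, ...], Tuple[str, bool]] = {
    ("LUN", "T+N"): ("London", True),
    ("TA", "PL"): ("table", False),
}


def possessive_retry(chords: Tuple[str, ...]) -> Optional[str]:
    """Try the 's rule on a chord sequence whose last chord failed to match.

    Returns the possessive word on success, or None so that the caller can
    restore the S and fall back to the normal correction procedure.
    """
    last = chords[-1]
    if not last.endswith("S"):
        return None
    stripped = chords[:-1] + (last[:-1],)
    entry = DICTIONARY.get(stripped)
    if entry is None:
        return None
    word, is_proper = entry
    return word + "'s" if is_proper else None


print(possessive_retry(("LUN", "T+NS")))  # London's
```

Because the original chords are never modified in place, "adding the S again" on failure amounts to simply discarding the stripped copy.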

4.2.1. A preprocessor for alternative representations

Operators frequently have an alternative representation to those listed in the system dictionary. Notable examples of this are words ending in y (Table 2). To overcome this problem, the system dictionary could be updated to include two entries of all words ending in y. Again, this would increase the size of the dictionary significantly and so a different approach was adopted. When the transcription program assumes a chord to be erroneous that ends in I, it deletes the last consonant of the previous chord, adds it as the first consonant of the erroneous chord and performs a retranscription of the relevant part of the input string. If the retranscription is successful, then it is assumed that the chord has been corrected. If the retranscription is unsuccessful, then the chords are restored to what they were before this "special" correction attempt was made and the normal correction procedure is invoked.
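A minimal sketch of this consonant shift is given below, using dictionary representations from Table 2. The treatment of a trailing + as part of the preceding consonant, the refusal to shift vowel keys and the reduction of retranscription to a set lookup are all assumptions made for the example.

```python
from typing import List, Optional, Set, Tuple

# Dictionary representations from Table 2 (chords written without the solidus).
KNOWN: Set[Tuple[str, ...]] = {("SIM", "PLI"), ("CHEA", "PLI"), ("HI", "STRI"), ("OE", "NLI")}


def shift_last_consonant(chords: List[str], bad: int) -> Optional[List[str]]:
    """Move the last consonant of chord bad-1 to the front of chord bad.

    Returns the repaired chord list if it matches a dictionary entry, else
    None, in which case the caller keeps the original chords.
    """
    if bad == 0 or not chords[bad].endswith("I"):
        return None
    prev = chords[bad - 1]
    # Assumption: a trailing "+" belongs to the consonant before it (e.g. T+).
    cut = 2 if prev.endswith("+") else 1
    if len(prev) <= cut:
        return None
    moved = prev[-cut:]
    if moved[0] in "AEIOU.":  # assumption: vowel keys are not shifted
        return None
    repaired = chords[:bad - 1] + [prev[:-cut], moved + chords[bad]] + chords[bad + 1:]
    return repaired if tuple(repaired) in KNOWN else None


print(shift_last_consonant(["SIMP", "LI"], 1))  # ['SIM', 'PLI']
```

The input list is left untouched, so restoring the chords after an unsuccessful attempt needs no extra work.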

For alternative representations of words not ending in y a different solution was adopted. During the analysis of output texts it became apparent that there were between 20 and 30 words that were consistently represented by an alternative Palantype chord sequence. Some of these words for which operators had a legal and/or valid alternative representation are listed in Table 3. It can be seen that this could be quite a serious problem, especially if an alternative representation is for a word that occurs as often as the word and. The solution to this problem was to update the system dictionary with the alternative representations. However, there was no easy way of updating the system dictionary without running a program which produced the tree structures for all 75 000 words. As the execution of this program took such a large amount of computing resources (a few hours' processor time) an alternative method was searched for. The method adopted was to make the dictionary appear as though it had been expanded by having a preprocessor in front of the transcription program which continually checked for chord sequences which matched any of the words in the above list. If a match was found, the chord string was modified to the normal convention as in the dictionary. The alternative solution which involved developing a program to update the tree structures was not very attractive because it would have been a much more time-consuming task.

TABLE 3

More alternative representations

English word    Dictionary representation    Alternative

Responses       RE SPON SS                   RES PON SS
Interested      IN TRE STT+                  IN TRES TT+
Signal          SI C+NL                      SIC+ NL
System          SI STEM                      SIS TEM
Question        CFE STJUN                    CFES TJUN
And             AN+                          ANT+
Produced        PROE T+EU ST+                PRO T+EU ST+
Viewer          MFEU.R                       MFEUR
University      EUN FUR STI                  EUN MFUR STI
etc.
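The preprocessor idea can be sketched as a table-driven rewrite of the chord stream, here with four rows taken from Table 3. The function name and the longest-first matching order are assumptions; the text does not say how the original preprocessor searched for matches.

```python
from typing import Dict, List, Tuple

# Alternative chord sequence -> dictionary representation (rows from Table 3).
ALIASES: Dict[Tuple[str, ...], Tuple[str, ...]] = {
    ("RES", "PON", "SS"): ("RE", "SPON", "SS"),
    ("SIS", "TEM"): ("SI", "STEM"),
    ("SIC+", "NL"): ("SI", "C+NL"),
    ("ANT+",): ("AN+",),
}
LONGEST = max(len(k) for k in ALIASES)


def preprocess(chords: List[str]) -> List[str]:
    """Rewrite known alternative sequences to the dictionary convention."""
    out: List[str] = []
    i = 0
    while i < len(chords):
        for n in range(min(LONGEST, len(chords) - i), 0, -1):
            target = ALIASES.get(tuple(chords[i:i + n]))
            if target is not None:
                out.extend(target)
                i += n
                break
        else:
            out.append(chords[i])
            i += 1
    return out


print(preprocess(["TH", "SIS", "TEM", "ANT+"]))  # ['TH', 'SI', 'STEM', 'AN+']
```

A rewrite of this kind leaves the dictionary and its tree structures untouched, which is the attraction described above. One limitation of so simple a version is that it would also rewrite an alias that happens to span the boundary between two legitimate words.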

A further problem was the frequent omission of the H key in the chord TH. This causes the letter T to be output instead of the word the. As the word the occurs so frequently, a few mistakes of this kind can soon affect the readability of the output text. To overcome this problem the chord TH was substituted for T in all situations except where the chord which succeeded T was a valid successor to T. For example, in the string T T+/E, TH would not be substituted for T, as T+/E is a valid successor to T in the chord requirement for the word today. This was also taken care of by the preprocessor.
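The TH substitution rule lends itself to a short sketch. Only the today example (T T+/E) comes from the text; how the real preprocessor decided that a chord was a valid successor to T is not described, so the small successor set below is an assumed stand-in for a dictionary query.

```python
from typing import List, Set, Tuple

# Chord sequences of words that begin with the chord T. Only "today"
# (T T+/E) comes from the text; a real system would consult its dictionary.
WORDS_STARTING_WITH_T: Set[Tuple[str, ...]] = {("T", "T+/E")}
VALID_SUCCESSORS = {seq[1] for seq in WORDS_STARTING_WITH_T if len(seq) > 1}


def restore_th(chords: List[str]) -> List[str]:
    """Replace a lone T by TH unless the next chord can validly follow T."""
    out = []
    for i, chord in enumerate(chords):
        nxt = chords[i + 1] if i + 1 < len(chords) else None
        if chord == "T" and nxt not in VALID_SUCCESSORS:
            out.append("TH")
        else:
            out.append(chord)
    return out


print(restore_th(["T", "F/OR"]))   # ['TH', 'F/OR']
print(restore_th(["T", "T+/E"]))   # ['T', 'T+/E']
```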

4.3. Automatic error correction

4.3.1. Strategy

A strategy for error detection and correction was adopted as follows.

4.3.1.1. The analysis of operator errors. This involved running the transcription program with thousands of chords from five different operators and then analysing those chords which the program had tagged as erroneous. The main results can be summarized as follows.

(i) A breakdown of the total number of errors in terms of deletion (keys missing), insertion (extra keys) and substitution (exchanged keys) showed that 19.6% of the total were substitution errors, 44.5% of the total were deletion errors and 35.9% were insertion errors.

(ii) More errors were made with left-hand consonants than with vowels or right-hand consonants.

(iii) 90% of all analysed errors differed from the intended chord by only one key.

(iv) No erroneous chord was found to differ from its intended chord by more than three keys.

(v) 53% of all analysed errors occurred in "initial" chords (chords which can start a word). 39% of errors occurred in medial syllables and 8% of errors occurred in final syllables.

(vi) An analysis of the involvement of each key in erroneous chords as a percentage of the total number of errors showed that the + keys were the cause of most errors.
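The three error classes in result (i) can be told apart by comparing the keys of the intended chord with those of the chord actually pressed. The sketch below assumes the solidus notation of Table 1, with every symbol treated as one key and tagged by the side of the solidus on which it appears; the function names are illustrative.

```python
from typing import FrozenSet, Tuple

Key = Tuple[str, str]  # (side, symbol); side is "L" or "R" of the solidus


def keys(chord: str) -> FrozenSet[Key]:
    """Split a chord such as 'P+/AN' into its individual keys."""
    left, _, right = chord.partition("/")
    return frozenset([("L", k) for k in left] + [("R", k) for k in right])


def classify(intended: str, pressed: str) -> str:
    """Label an erroneous chord as a deletion, insertion or substitution error."""
    missing = keys(intended) - keys(pressed)
    extra = keys(pressed) - keys(intended)
    if missing and extra:
        return "substitution"
    if missing:
        return "deletion"
    if extra:
        return "insertion"
    return "none"


print(classify("TH", "T"))        # deletion: the H key was omitted
print(classify("P/AN", "P+/AN"))  # insertion: an extra + key
print(classify("F/OR", "F/AR"))   # substitution: A pressed instead of O
```

Tagging each key with its side keeps the left-hand and right-hand keys that share a symbol (for example N or +) distinct.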

4.3.1.2. The development of error correction tools. Before error correction procedures could be developed, certain tools were needed to make the procedures more effective.

(i) An error metric: the first tool developed was an error metric to determine how close in terms of key depressions a chord which failed to find a match was to the chords in the list against which it was being compared. This tool subsequently proved very successful in the process of identifying erroneous chords.

(ii) A "weighting" routine: this correction tool was designed to complement the error metric by taking a list of chords having the same key difference values and ordering them so that, of the chords in the list, the chord most probably intended by the operator was placed at the top. The criteria on which the list was ordered were based on information about insertion, deletion and substitution errors elicited from the analysis of operator errors.
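The two tools might be sketched as below. The exact definitions used in the research are not given here, so both are assumptions: the key difference counts an exchanged key as one change, and the weighting simply prefers candidates that imply the more common error type, using the percentages from the operator-error analysis. The example chords are invented.

```python
from typing import FrozenSet, List, Tuple

Key = Tuple[str, str]

# Shares of each error type reported in the operator-error analysis (percent).
ERROR_SHARE = {"deletion": 44.5, "insertion": 35.9, "substitution": 19.6}


def keys(chord: str) -> FrozenSet[Key]:
    left, _, right = chord.partition("/")
    return frozenset([("L", k) for k in left] + [("R", k) for k in right])


def key_difference(pressed: str, candidate: str) -> int:
    """Assumed metric: number of key changes needed to turn one chord into the other."""
    a, b = keys(pressed), keys(candidate)
    return max(len(a - b), len(b - a))


def error_type(pressed: str, candidate: str) -> str:
    missing = keys(candidate) - keys(pressed)   # keys the operator left out
    extra = keys(pressed) - keys(candidate)     # keys the operator added
    if missing and extra:
        return "substitution"
    return "deletion" if missing else "insertion"


def rank(pressed: str, candidates: List[str]) -> List[str]:
    """Order candidates by key difference, then by how common the implied error type is."""
    return sorted(
        candidates,
        key=lambda c: (key_difference(pressed, c), -ERROR_SHARE[error_type(pressed, c)]),
    )


print(rank("P/AN", ["P/AM", "P+/AN", "/AN", "P+/AMS"]))
# ['P+/AN', '/AN', 'P/AM', 'P+/AMS']
```

Candidates are assumed to differ from the pressed chord; a fuller weighting could also use results (ii), (v) and (vi) of the analysis.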

4.3.1.3. The identification of the source of error. As an error can occur at any point in the input string, methods for determining the actual chord or chords which are in error can be quite complex. Once the presence of an error has been detected, the actions taken to identify the source of error depend on the point in the transcription process at which the presence of an error became apparent, i.e. the actions taken are governed by the characteristics (e.g. initial, medial, final) and the number of chords processed since the last English word was output.

The points in the transcription process at which it becomes apparent that an error is present in the input string were classified and labelled. The conditions which cause these points to be reached were analysed and corresponding routines to identify the source of error were written. These have been described in detail by Booth [3].

4.3.1.4. The implementation of error correction procedures. Once the presence of an error in the input string has been detected, the identification process described in Section 4.3.1.3 is combined with a correction procedure in a routine to correct errors automatically. The first correction routine to be described uses knowledge about the chords processed since the last English word was output and up to the chord at which it became apparent that an error was present in the input string. For want of a better phrase, this was called the history routine. The second correction routine to be described not only uses historical knowledge but also looks ahead in the input string to obtain more information to help it to correct erroneous chords. This routine was called the history and future routine.

(i) The history routine: this correction routine functions in the following manner.

(1) It assumes one chord to be erroneous (depending on which error point it has been invoked from, as explained above) and submits the lists in which the erroneous chord was supposed to find a match to the error metric. The error metric produces two lists of chords, one having a key difference value of one and the other having a key difference value of two. The weighting routine is then invoked to order the lists so that the chord most probably intended by the operator is at the top of the lists.

(2) It then substitutes each chord in turn for the erroneous chord in its position in the input string (using first the list of chords with a key difference of one, and if necessary the list of chords with a key difference of two).

(3) The part of the input string that was currently involved in building words when the error became apparent is then retranscribed with a substituted chord in place of the erroneous chord.

(4) If the transcription is successful, i.e. English words are produced without any more errors being detected, then it is assumed that the substituted chord is correct. If the transcription is unsuccessful, then the process is repeated from (2) onwards, until all the chords in the ordered lists have been tried.

(5) If this still does not produce a successful transcription, then the process is repeated from (1) above until all possible sources of error have been tried.
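
Steps (1) to (5) can be sketched in code. The Python fragment below is an illustrative sketch only and is not the program described in this paper. It rests on several assumptions: a chord is written as a string in which every character counts as one key (a simplification, since Palantype keys such as P+ use more than one character); the key difference is assumed to count each substituted, missing or extra key as one; and the names key_difference, candidates, history_routine, suspects, expected_at, weight and transcribe are hypothetical, standing in for the error metric, the weighting routine, the error-point logic and the transcription program respectively.

```python
from typing import Callable, Iterable, List, Optional, Sequence, Tuple

Chord = str  # simplification: every character of the string counts as one key


def key_difference(a: Chord, b: Chord) -> int:
    """Assumed error metric: each substituted, missing or extra key costs one."""
    keys_a, keys_b = set(a), set(b)
    return max(len(keys_a - keys_b), len(keys_b - keys_a))


def candidates(erroneous: Chord,
               expected: Iterable[Chord],
               weight: Callable[[Chord], float]) -> List[Chord]:
    """Step (1): chords at key difference one, then two, each list ordered
    so that the chord with the highest weight comes first."""
    expected = list(expected)  # allow the caller to pass any iterable
    ordered: List[Chord] = []
    for distance in (1, 2):
        group = [c for c in expected if key_difference(erroneous, c) == distance]
        ordered.extend(sorted(group, key=weight, reverse=True))
    return ordered


def history_routine(
    chords: Sequence[Chord],
    suspects: Iterable[int],
    expected_at: Callable[[int], Iterable[Chord]],
    weight: Callable[[Chord], float],
    transcribe: Callable[[List[Chord]], Optional[List[str]]],
) -> Optional[Tuple[List[Chord], List[str]]]:
    """Return (corrected chords, words) for the first substitution that
    retranscribes without error, or None if every attempt fails."""
    for position in suspects:                          # step (5): next suspect
        for substitute in candidates(chords[position],
                                     expected_at(position), weight):  # step (1)
            trial = list(chords)
            trial[position] = substitute               # step (2)
            words = transcribe(trial)                  # step (3)
            if words is not None:                      # step (4)
                return trial, words
    return None
```

Here transcribe is expected to return the English words produced, or None as soon as a further error is detected, and suspects lists the positions of the chords to be assumed erroneous in the order in which they should be tried.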

This correction routine implemented on the computer improved the quality of the output text significantly (Figs. 8 and 9). However, in some cases the erroneous chord was not successfully identified; in other cases, even when it was successfully identified it was not corrected to the chord that the operator intended. In many cases the reason for this was because the correction routine could use only the string of chords up to the one at which

Fig. 8. Uncorrected output.

the error became apparent, to test whether it had identified and corrected an erroneous chord successfully. If the erroneous chord is the latest to be received, then this string is clearly inadequate as the correction routine is unable to see how the substituted chord will fare with chords which come after it. For example, let us consider that the transcription program has received the chords NOR MLI PFRO T+EU when it becomes apparent that there is an error present in the input string. NOR is an initial chord, MLI is a second-level successor to NOR and has a pointer to the English word normally. PFRO is not a third-level successor to NOR and MLI but is an initial chord (i.e. the start of the next word). T+EU is not a second-level successor to PFRO, and PFRO is not a word in its own right. There is no other initial chord between NOR and PFRO, and so the correction routine assumes T+EU to be erroneous. Attempts at correcting T+EU prove unsuccessful since it is very far in terms of key difference from any of the second-level successors to PFRO. The correction routine now assumes PFRO to be erroneous and finds that there is another initial chord only one key difference away, i.e. FRO. (the substitution error of P for .). T+EU is a second-level successor to FRO. and so the retranscription is successful. It appears as though the transcription is going to produce the phrase normally fraudulent .... However, although the error has been successfully identified, it

Fig. 9. Output produced by a transcription program incorporating preprocessing corrections and the history correction routine.

has not been successfully corrected. This is confirmed when the next chord SS in the input string is received and there is no such word as frauduses. Had the history routine been able to use SS to test whether FRO. was a successful correction, it would have found that it was not and indeed, as can be seen in Fig. 10, would have found that the intended chord was PROE (the substitution error of F for E) and that the intended text was the phrase normally produces ....

(ii) The history and future routine: this correction routine has the ability to look ahead in the input string to help it to test whether a chord has been successfully corrected. It does this by keeping several chords behind the operator. The history and future routine functions in exactly the same way as the history routine except that, when the correction routine is retranscribing the relevant part of the input string containing the substituted chord, it also transcribes several chords from the "future". This is a significant improvement on the history routine (Fig. 10).
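
The effect of the look-ahead can be illustrated with the NOR MLI PFRO T+EU example. The Python sketch below is illustrative only and is not the program described in this paper. The lexicon is a toy invented for the purpose: the chord LENT completing fraudulent is made up, and the names LEXICON, toy_transcribe and correct are hypothetical. The ordered candidate lists are supplied directly rather than derived from the error metric and weighting routine, and FRO. is simply assumed to be ranked ahead of PROE.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Toy lexicon: chord sequences mapped to English words (LENT is invented).
LEXICON = {
    ("NOR", "MLI"): "normally",
    ("FRO.", "T+EU", "LENT"): "fraudulent",
    ("PROE", "T+EU", "SS"): "produces",
}


def toy_transcribe(chords: Sequence[str]) -> Optional[List[str]]:
    """Return the words built so far, or None once a chord cannot continue
    any word in the toy lexicon. An unfinished word at the end is allowed."""
    words: List[str] = []
    partial: Tuple[str, ...] = ()
    for chord in chords:
        partial += (chord,)
        if partial in LEXICON:
            words.append(LEXICON[partial])
            partial = ()
        elif not any(key[:len(partial)] == partial for key in LEXICON):
            return None
    return words


def correct(
    chords: Sequence[str],
    future: Sequence[str],
    ordered_candidates: Sequence[Tuple[int, Sequence[str]]],
    transcribe: Callable[[Sequence[str]], Optional[List[str]]],
) -> Optional[Tuple[List[str], List[str]]]:
    """Try each candidate in turn; a substitution is accepted only if the
    history plus the look-ahead chords retranscribe without error."""
    for position, substitutes in ordered_candidates:
        for substitute in substitutes:
            trial = list(chords)
            trial[position] = substitute
            words = transcribe(trial + list(future))
            if words is not None:
                return trial, words
    return None


chords = ["NOR", "MLI", "PFRO", "T+EU"]
# T+EU (position 3) has no near candidates; PFRO (position 2) has two.
ordered = [(3, []), (2, ["FRO.", "PROE"])]

history_only = correct(chords, [], ordered, toy_transcribe)
with_future = correct(chords, ["SS"], ordered, toy_transcribe)
print(history_only)  # (['NOR', 'MLI', 'FRO.', 'T+EU'], ['normally'])
print(with_future)   # (['NOR', 'MLI', 'PROE', 'T+EU'], ['normally', 'produces'])
```

With an empty look-ahead the sketch behaves like the history routine and accepts FRO., because FRO. T+EU is still a plausible start of a word. With the single future chord SS that substitution is rejected and PROE is accepted instead.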

4.3.2. Results

The effect of the correction routines on the quality of the output text varied significantly depending on the number of errors in the input, how closely together the errors occurred and how "bad" the errors were. Overall,

Fig. 10. Output produced by a transcription program incorporating preprocessing corrections and the history and future correction routine.

the history correction routine corrected 34% of all errors and the history and future correction routine corrected 52% of all errors. Figure 8 shows the output from a transcription program incorporating no error correction whatsoever. Figure 9 shows the same output but after the preprocessing corrections and the history correction routine have been applied. Figure 10 shows the output after the preprocessing corrections and the history and future correction routine have been applied. Figures 9 and 10 are examples of output in which the Palantypist was performing rather better than average. Successful corrections are underlined with a full line and failures are underlined with a broken line.

4.4. A grammar and parser for the output from transcription systems

It was previously mentioned that human transcribers are able to draw on many knowledge sources to enable them to transcribe Palantype code into English text. One of these sources, syntax, is concerned with the goal of producing a consistent meaningful grammatical structure for a sentence.

Of the problems associated with the transcription of Palantype code to English text, there were three in which it was thought that syntax analysis might prove useful: homophonic ambiguities, keying errors and the incorrect joining of chords in word boundary reconstitution. For this reason a set of

syntactic rules (which here shall loosely be called the grammar) were defined and used as the input to a powerful translator writing system [12] which performed the constituent analysis (parsing). A description of the grammar and parser, the way that syntactic knowledge was incorporated into error correction procedures and the ambiguity resolution procedures will be published in a separate paper.

5. The future

Palantype is a phonetic shorthand and the code is essentially a phonetic description of continuous speech. Therefore, since the major part of the work has been done to transcribe from Palantype code to English text, a project has been undertaken at Leicester Polytechnic by Edmonds and Hashim to investigate the possibility of producing Palantype code (or a similar code) from a direct speech input system [13].

Projects of this complexity and Palantype transcription systems to produce very high quality documents will clearly require the organization of many knowledge sources. This area is an obvious candidate for an intelligent knowledge-based system.

6. Conclusions

The present author has attempted to summarize more than 3 years of research work. Fundamentals have been presented, problems identified and attempted solutions described in outline. Palantype is used by Jack Ashley, the deaf MP, and by deaf businessmen; it has many potential application areas including the subtitling of television and high quality document production from verbatim recordings.

For natural-language-understanding systems, Palantype will provide a practical learning vehicle, especially in the absence of high quality speech recognizers.

Acknowledgments

I would like to thank Professor Ernest Edmonds and Dr. Wyn Price for their invaluable advice and experience. I would also like to thank the Palantype Organization and the British Broadcasting Corporation.

References

1 I. Beard, The Palantype Method of Stenotyping, Palantype Organization, London, 1978.

2 Stenograph's Computer-compatible Touch Shorthand, Books 1, 2, Supplementary Lists, Stenograph, Skokie, IL.

3 A. W. Booth, The resolution of ambiguities and the correction of errors in the automatic transcription of Palantype, Ph.D. Thesis, Leicester Polytechnic, 1982.

4 A. J. Szanser, Linguistics in automatic machine shorthand transcription, Inc. Linguist, 8 (2) (1969) 30-33.

5 A. W. Booth and M. S. Barnden, Voice input to English text output, Int. J. Man Mach. Stud., 11 (1979) 681-691.

6 W. L. Price, Palantype transcription by computer -- a final report, NPL Comput. Sci., 45 (1971).

7 L. A. Thomas and W. R. Hawkins, The economic preparation of Teletext subtitles, Proc. IBC, (1980).

8 A. C. Downton, A. F. Newell and J. L. Arnott, Operator error performance and keyboard evaluation in Palantype machine shorthand, Appl. Ergon., (1980).

9 W. R. Hawkins and R. N. Robinson, The development and use of an electronic keyboard for television subtitling by Palantype, Int. J. Man Mach. Stud., 11 (1979) 701-710.

10 E. J. Galli, The Stenowriter -- a system for the lexical processing of Stenotyping, IRE Trans. Electron. Comput., 11 (2) (1962) 187-199.

11 A. W. Booth, WORDFINDER: a partial solution to the word boundary problem in shorthand transcription systems, to be published.

12 E. A. Edmonds and S. P. Guest, SYNICS -- FORTRAN subroutine package for translation, Rep. 6, 1978 (Man Computer Interaction Research Group, Leicester Polytechnic).

13 J. J. Cruzi, J. H. Connolly, E. A. Edmonds and A. A. Hashim, A feasibility study into a system for direct speech input to computers, Final Rep., 1980 (Leicester Polytechnic, Leicester).
