a formal specification of document processing

Pergamon Mathl. Comput. Modelling Vol. 25, No. 4, pp. 57-72, 1997

Copyright@1997 Elsevier Science Ltd Printed in Great Britain. All rights reserved

PII: SO8957177(97)00024-l 0895-7177197 $17.00 + 0.00

A Formal Specification of Document Processing

A. L. BROWN, JR. XSoft, 10200 Willow Creek Road, San Diego, CA 92131, U.S.A.

S. MANTHA AND T. WAKAYAMA Xerox Corporation, 800 Phillips Road, 128-293, Webster, NY 14580, U.S.A.

Abstract-we propose a computational model of structured documents and their processing based on preferential attribute grammar schemes and grammar coordinations. Our grammar-based model can be viewed as a specification of composable structure transformations. The main novel features are declarative specification of preferential constraints, and specification of structure transformations at the level of meta-data through coordination schemes. The preferential constraints may express constraints to guide computations as in dynamic programming, as well as constraints to controi declaratively the outcome of transformation. A coordination is essentially a partial substitution map from the vocabulary of a grammar to languages over the vocabulary of another grammar. Although our grammar-based coordination schemes are designed to capture various types of document processing (such as view processing and query processing), we focus on the document layout application in this work. Our first main result shows that when the coordination map satisfies the unijomnitg condition, the two grammars (of the layout coordination scheme) are syntactically coordinated in the sense that trees of the first grammar are always transformable to trees of the second grammar, while satisfying the constraints imposed by the coordination. We then show that the elementaql uniformity is a decidable property when the coordination is regular, thereby establishing a decidable class of coordinated grammar schemes.

Keywords-Document models, Attribute grammars, Preference logics, Document layout, Struc- ture transformation.

1. INTRODUCTION

For some time, we have been interested in developing a computational theory of documents [1,2]. Such a theoretical framework would serve as a specification language for some class of documents, while at the same time guaranteeing that the document structures so specified could be effectively

computed. It can be argued that typical document specification languages (which we shall call electronic markups or simply mark2Lps), together with the programs that interpret them, con-

stitute computational theories of documents. The problem with such markups is that the only explicit semantics provided is that of the accompanying interpreters. The document specification

language that we have in mind should have a declarative semantics of which any computational interpretation is but a happy consequence. Ultimately, we hope to use our abstract computa-

tional theory of documents as a basis for designing a concrete document description language based entirely on declarative specifications; cf., that:

l has a precise formal semantics; l separates the specifications of content, form, and logical structure;

We are grateful to our colleagues S. Marshall and J. Mayer. S. Marshall constructed the proof of Lemma 5.2 presented in this paper, which is much cleaner than our original induction-based proof. This work hae also benefited from our conversations with H. Blair, A. Nerode, and D. Wood on various occasions.

Typeset by d@-TBX

57

58 A. L. BROWN, JR. et al.

l is fully expressive of the constructs typical of traditional (procedural) document description languages that are mainly concerned with document processing applications (e.g.,

interchange, layout, and recognition).

The theoretical framework that we develop in this paper is used to more immediate ends; we provide an abstract architecture of documents within which every markup language known to us

can be fit. This architecture has a number of representational mechanisms.

l Extended context-free grammars and parse trees. Extended context-free grammars, i.e.,

context-free grammars that allow arbitrary regular expressions on the right-hand sides

of productions, are used to specify structural constraints. A source grammar specifies a

class of source documents. We also use such constraints (in a result grammar) to specify

the structure of the result of a particular document processing application (e.g., layout processing), as well as the structural correspondences between constituents of the source

document and that result, e.g., a coordination between logical structure and layout structure. Source parse trees describe concrete instances of structures compatible with an associated constraining source grammar. Other parse trees, compatible with other mentioned

grammars, will be induced by satisfying a result and coordination specifications. l Rule-based attribution and attributes. Attributes are associated with every grammar, and

attribute-value pairs are associated with the nodes of parse trees.

l Coordination and embedding. Coordination grammars constrain structural correspon-

dences between structural elements specified by source and result grammars, respectively.

An embedding is an instance of such a correspondence.

a Preference. Though the components mentioned above are suficient to specify any computationally effective result, many such results are best specified as optimizations according

to particular preference criteria. Such preference criteria are specified by appealing to a modal extension of definite clause logic programs.

The representational architecture described is shown to have certain completeness properties.

Prom completeness, it can be argued that any computationally effective notion of document can

be cast in our framework.

We shall borrow freely from the literature of mathematical logic [3], recursive functions [4],

computational logic [5,6], program schematology [7], formal languages [8], and attribute gram-

mars 191, specializing definitions and theorems to our present needs. The formal foundations of our representational formalism are presented in a self-contained fashion. We first give defini-

tions of trees, preference log&, and preferential theories before introducing preferential attribute grammar schemes. We then develop the notion of coordination and coordinated attribute grammar schemes. This is followed by an analysis of layout processing schemes and coordination, in which we spell out some complexity theoretic properties.

2. PRELIMINARIES

2.1. Trees

Let J* be the free monoid generated by the set of positive integers J. If dl,dz E J* and dz = dl - d for some d (possibly the identity element) in J’, dl is an ancestor of d2, or dg is a descendent of dl, and we write dl IA dz. Let d,dl, dp E J* and i,j E 3. If i < j, d. i. di is to the left of da j 1 dz, and we write d 1 i . dl <L d. j . dp. Let V be a nonempty set. A V-valued tree is a mapping t : D + V where D is a subset of ,7* satisfying:

o there exists do E D, called the root of t, which is an ancestor of every element in D; l ifdEDandde<Ad’IAd,d’ED;and l iffori,jEJ,d.jEDandi<j,thend.iED.

If t is a tree, we write dam(t) to denote its domain. Since we are mostly concerned with finite

trees, i.e., trees with finite domains, a tree will mean a finite tree unless otherwise stated. The

A Formal Specification 59

members of dam(t) are nodes oft. Note that the relations, <A and <L, form two partial orders

on the nodes of a tree; the ancestral order is reflexive while the left-to-right order is irreflexive.

A (finite) set of nodes of a tree which are linearly ordered with respect to <A is an A-chain of

the tree. Similarly, a (finite) set of nodes of a tree which are linearly ordered with respect to <L

is an L-chain of the tree. A chain of a tree is then either an A-chain or an L-chain of the tree.

Since chains are sets, we will sometimes use set terminology and notations for chains. A chain is

nontrivial if it has two or more elements. Note that a nontrivial A-chain is an L-antichain, and

a nontrivial L-chain is an A-antichain. Let CJ~ and 02 be chains of the same type. If ~1 C g2,

(~1 is a subchain of (~2, and u2 is an extension of ur. A chain of a tree is maximal in the tree

if it has no proper extensions in the tree. A chain C of a tree is connected if it has no proper

extensions in the tree which share the first and last elements with C. We may also write an

A-chain, dl <A . . . <A dk, as a sequence (dl,. . . , d k A, or simply as (dl, . ,dk) when clear from )

the context. Similarly, we may write an L-chain, dl <L . . <L dk, as a sequence (di, . . dk)L, or simply as (dl, . . , dk) when clear from the context.

Since L-chains have special importance in the subsequent sections; we will introduce some

additional notations on them. First, if t is a tree, L-chains(t) denotes the set of all L-chains of t. We extend <L to a relation on the nonempty L-chains of the tree as follows:

(6,. . .,dk) <L (el,...,e,), iff dk CL el.

Similarly, we extend the ancestral relation on the nodes of a tree to a relation on the L-chains of

the tree

&,...,dk) <A (el,...,em), iff for every ei there exists d3 such that d3 <A ei.

Note that the extended relations <A and <L on the L-chains of a tree form two partial orders;

the extended iA is reflexive, whereas the extended <L is irreflexive.

2.2. Logic of Feasible Preference PI

SYNTAX. We add to the language L, of a normal first-order modal logic [lO,ll]--equipped with

the modal operators 0 and O-a unary modal operator Pf. The rules of formation of the new

language (called CT,) include all the rules of L, in addition to

l if F is a formula, then PfF is a formula.

SEMANTICS. A Pi preference frame is an ordered triple of the form (W,R, d), where W is a

nonempty set of worlds, and 77, and 3 are binary relations over W x W such that 5 is a subrelation

of ‘R. A Pr preferential model (or structure) is a Pr preference frame with a valuation function V that determines the truth of atomic formulae at individual worlds. Assuming the usual valuation

of formulae with Boolean connectives at possible worlds, the semantics of the modal operators

are given by

AXIOMATICS. Pi is equipped with all the axioms and rules of propositional logic in addition to

Dfo OA H TO-IA, RK if I- (Al A.. . A A,) + A then k (OAl A.. . CIA,) -+ CIA, for n 1 0,

PI if I- (Al A . . . A A,) + A then k (Pf-A1 A.. . PflA,) + Pf-A, for n > 0,

PPS Cl-A ---) PfA, PIR PfA -+ TA, TOA--+A.


PIR is valid in the class of preference frames with an irrefZexive preference relation. T is valid in the class of preference frames with a reflexive feasibility relation. They are needed to show

the completeness of Pi with respect to the class of such preference models. Some other axioms

and rules that are valid in the class of all preference frames are

PI* Pf(A ---$ B) + PfB,

PA (PfAr\PfB) + Pf(Ar\B), PF Pfl,

PGN PfA --+ Pf (A A B),

PST Pf (A V B) -+ Pf (A).

Asymmetry of the preference relation is axiomatized by PAS.

PAS +((PfA A B) A O(A A PfB)).

PTR (Pf A A O(PFB A A)) -+ Pf B.

A preferential model is said to be supported iff, if for any two worlds w and v, if w 5 v, then

there exists a formula PfA such that w+ PfA and v+ A. PTR ensures transitivity in supported

preferential models.

In the rest of this paper, we shall consider the first-order version of Pi with fixed domains, terms as rigid designators, the quantificational rules of standard first-order logic, and the following

versions of the Barcan formulae.

BF VxOA -+ q IVxA,

PBF ‘dxPf A -+ Pf VxA, NPBF VxP, TA 4 Pf TVxA.

2.3. Horn Preferential Theories

A preference clause is the universal closure of a formula of the form,

AMk + Pf (A&) ;

k

where Lj and Mk are general literals [5], i.e., literals (possibly) adorned with 0 and 0. An arbiter

is a finite collection of preference clauses. Let {ri}ie~ be a finite collection of definite clause logic

programs [6]. Let ~1 be a definite clause logic program as well. Let A be an arbiter. A Horn preferential theory is given by

The set {ni}ie~ is the solution space of the theory. We assume that wi $Z nj, for i # j. Of

course, this is quite easy to arrange by introducing new dummy predicates that are unique to the respective programs. Let L be a language of a Horn preferential theory. A Herbrand preferential structure for L is a preferential structure M = (W, R, 5, U) such that U assigns Herbrand

interpretations for L to members of W. Given a Horn preferential theory, we would like to associate with it a minimal model as its

denotation. For this reason, we introduce an ordering on Herbrand preferential structures (for a fixed language). Let MI and Mz be Herbrand preferential structures. MI 5 MO iff

o there exists 4, a one-to-one mapping from Wi to WZ such that VW E WiV2(~$(w)) 2 Vi(w); l @(RI) 2 (R2 restricted to d(Wi x Wl)) (by a minor abuse of notation); and l if in the above condition, equality holds then, 52 restricted to $(Wi x Wi) 2 +(dr).

It is known that given a Horn preferential theory 7,


there is a unique minimal Herbrand preferential model MT of the theory [12,13]. Moreover, MI = (W, X,5, V) has the following properties:

l there is a bijection 4 between W and the solution space of 7 such that V(w) is the least Herbrand model of p U 4(w) for each w E W;

l R=WxW;and l 5 is the smallest subrelation of R that satisfies the arbiter.

3. PREFERENTIAL ATTRIBUTE GRAMMAR SCHEMES

An extended context-free grammar is a tuple (V, C, P, S) where

l V is a finite vocabulary; l C & V is a set of terminal symbols; l P is a finite set of productions of the form A + r where A E V - C and r is a regular

expression over V;

l S E V - C is the start symbol.

Let G = (V,C,P,S) b e an extended context-free grammar. Let A + T be a production of G. If CT is in the language of T, A + o is an instance of the production. A derivation tree of G is a V-valued tree t such that for all internal nodes d of t, if dl <L ... <L d, are all of the children of d, t(d) --+ t(dl) . . . t(d,) is an instance of a production of G. A derivation tree of G is an X-derivation tree of G if the label of its root node is the grammar symbol X of G. A derivation tree of G is upward complete if it is an S-derivation tree, and it is downward complete if every leaf node of it is labeled with a terminal symbol of G. A complete derivation tree of G is an upward and downward complete derivation tree of G. Let L be a language over V. Then ,C(G, L)

denotes the set of all sentences derivable in G from some string in L. Similarly, C(G, L) denotes the set of all sentential forms derivable from some string in L. For a regular expression T, L(T)

denotes the language associated with the expression. When context-free grammars are extended to allow regular expressions on the right-hand sides

of productions, the corresponding attribute specification languages also need to be extended.’ For instance, consider a layout grammar production (line) - [box]+. Assume that (line) and [box] have attributes length and width, respectively, and that the length of (line) is the sum of the width values of all [box] items. It may also be desirable to specify the position of each [box] given the position of (line) (see Example 3.1). In general, since a production of an extended context-free grammar has many instances, it is not immediately clear how to write such specifications given just productions. What is clear, however, is that once a specific instance of a production is given, the attribute specification for this production must be able to generate a set of conventional attribution sentences (functional or relational) for that instance. More generally, we view the attribute specification for a grammar as a mapping that takes a derivation tree of the grammar and generates a set of conventional attribute sentences for this tree. Thus, instead of discussing design details of a particular attribute specification syntax, we only elaborate on the common properties of such specifications, and then give an example to illustrate how a particular specification scheme may generate actual attribute sentences.

Given an extended context-free grammar G = (V, C, P, S), we first define an attribution ‘map TK of G: it maps a derivation tree of G to a finite set of definite Horn clauses over some underlying first-order language, which we refer to as the language of n. We assume that the language of T includes a designated monadic predicate symbol (guard), a collection of dyadic predicate symbols (attributes), a collection of fme function and predicate symbols, and the members of the monoid 3’ as part of its collection of constant symbols. We use the symbol g for the guard,

*Extended and ordinary context-free grammars have the same expressive power when their denotations are taken to be strings. When their denotations are taken to be derivation trees, the former are strictly more powerful than the latter.


Q and fl for attributes, and p and q for free predicate symbols, possibly with subscripts. The

attribution map r satisfies the following properties:

0 n(t) = U{7r(t’) ( t ’ is a height 1 subtree of t}, and l the clauses of I are of the following two forms:

- Qo(dko,tO) Ccrl(dkt,tl)A...Acr,(dkn,tn)Apl(~~)A...Ap,(~~), - g(do)+cu~(dj~,t~)A...Acr,(dj,,t,)Apl(u'i)A...Ap,(u~),

where do is the root of t’, dk,, and dji are children of do, and the ti and uj are terms that

may include logical variables.

An attribute grammar scheme is a pair G = (G, 7r) where G is an extended context-free grammar and r is an attribution map of G. Throughout, we assume a fixed but arbitrary definite clause

logic program pL, which interprets the free function and predicate symbols of the language of

r. We also assume the map n to be effectively computable. Let t’ be a height-one subtree of a

derivation tree t of G!, and d the root of t’. If r(t’) contains a guard clause, we say d is a guard

node. t is admissible in G (under the interpreter pux) if

for all guard nodes d of t.

EXAMPLE 3.1. (ATTRIBUTE SPECIFICATION)

(line) - [box]+ guard((line)) : - Zength((line), L) A L 5 maxlen. Zength((line), L) : - Ai width([box]i, Xi) A xi Xi = L. position([box]l,X) : - position((line),X).

&(position( [box]i+l,x+l): - position( [box]i, Zi) A width( [boxji, wi)A

x+1 = Zi + Wi).

Consider a tree t that corresponds to an instance (line) - [box] [box] [box] of the production above. The attribute specification given above generates the following set of clauses, where do is the root of the tree and dl, dz, and d3 are the children of do. do is an example of the guard node,

and for a sufficiently large maxlen, t is an admissible tree.

guard(dc) : - length(do, L) AL 5 maxlen.

Zength(do, L) : - width(dl, Xl) A width(dz, X2) A width(ds, X3)A

x(X1,X2,X3) = L. position(dl, X) : - position(do, X). position(d2, Y2) : - position(dl, 2,) A width(dl, WI) A Ys = 21 + Wi.

position(d3, Ys) : - position(dz,&) A width(d2, WZ) A Ys = 22 + W2.

Let G = (V,C,P,S) b e an extended context-free grammar. A preference domain of G is a set

of (possibly attributed) derivation trees of G. A preference space of G is an indexed collection of preference domains of G. A preference space is locally finite if every preference domain in it is

finite.

EXAMPLE 3.2. (PREFERENCE SPACE 1) Given an extended context-free grammar G = (V, C, P, S), a preference space {Pi}ien may be given as follows:

. R=C*;

e for each cr E C*, PO is the set of complete derivation trees spanning u in G.

Note that if G is finitely ambiguous, the preference space is locally finite.

EXAMPLE 3.3. (PREFERENCE SPACE 2) A preference space of the class discussed in Example 3.2 only compares complete trees. This is unsatisfactory when a grammar has highly ambiguous subgrammars (e.g., exponential): it is desirable to resolve such ambiguities locally through more refined preference specifications. This can be achieved by manipulating the preference space as follows:

l o=vxv*;


l each preference domain P(x,~) is a set of X-derivation trees spanning the sentential. form

0 E I/*.

Again, if G is finitely ambiguous, the preference space is always locally finite.

EXAMPLE 3.4. (PREFERENCE SPACE 3) The restriction that trees are comparable only when

they span the same sentential form, as in Example 3.3, may be too strong for some applica.tions. For instance, in first-fit line breaking layout [14], t wo lines should be comparable when one is an extension of the other. More specifically, given an attribute grammar having the production of

Example 3.1, a preference space {PV},,eo may be given as follows:

l R is the set of attributed [box] sequences having exactly one extra [box] append.ed to

maximal [box] sequences bounded by m&en,

l P, is the set of (downward complete) (line)-trees that span initial subsequences of cr.

Note that in order to avoid unnecessary comparisons, the preference space takes advantage of

the monotonicity property of the length attribute: once a nonadmissible (line)-tree is found,

its extensions are always nonadmissible because they have larger length values. Without this

monotonicity (e.g., [box] items with negative luidth values), the preference space given above

would fail to capture the intended first fit layout.

Given a preference domain, its arbiter has clauses of the form

where E and Y are meta-syntactic variables ranging over (the root nodes of) trees in the preference

domain. ri, Uj, rjk, and ~1 are terms, possibly including logical variables.

DEFINITION 3.1. A preferential attribute grammar (PAG) scheme G is a tuple given by

where

0 c= (V,C,P,S) . IS an extended context-free grammar;

l T is an attribution map of G;

l P is a locally finite preference space of G (with index set a);

l A, is an arbiter for the preference domain P, for each w E R.

DEFINITION 3.2. Let G = (c, K, P, {&},En) be a PAG scheme, and P, a preference domain

in P. Then, the preferential theory of G with the preference domain P, is given by

Thus, given an interpreter pa, we can define the denotation of G relative to ,ua as a map from

the preference space P to the set of minimal Herbrand preferential models of preferential th.eories with preference domains P,.

EXAMPLE 3.5. (PAG SCHEME: OPTIMAL LINE BREAKING WITH PREFERENCE) The following PAG scheme specifies optimal line breaking layout for paragraphs. For simplicity, we assume that various parameters such as the desired line length(s) are fixed and hard-wired inside free predicates such as compute-adj (which computes how much a given line should be stretched or shrunk).

(dot) - (line)+ cost((doc), C) : - /ji badness((line)i, Bi) A C& = C.


(line) - [box]+ guard( (line)) : - adjustment( (line), A) A (shrink-bound 5 A 5 stretch-bound).

length( (line), L) : - Ai width( [box]i, Xi) A Ci Xi = L. adjustment( (line), A) : - Zength( (line), L) A compute-adj( L, A). badness((line), B) : - adjustment((line), A) A compute-bad(A, B).

The index set R is the set of attributed [box] sequences. For each c E 0, the preference domain

is the set of complete admissible trees spanning 0. The arbiter A, for PO is given by

where Zl and Zz meta-syntactic variables ranging over (the root nodes of) trees in PO. In this ex-

ample, as IY increases, the corresponding preference domain increases exponentially. However, by

the monotonicity of the cost predicate, there is a certain dependency between preference domains.

Namely, given a preference domain PO, only the optimal trees in PC may be extended to optimal trees in subsequent preference domains. This is the key observation that enables the application

of dynamic programming, and consequently, the (drastic) reduction of the optimization search space, as in the optimal line breaking algorithm of Knuth and Plass [14]. It is worth noting that

this dynamic programming computation can be modeled in terms of chains of preference domains which are appropriately refined based on the above observation.

4. COORDINATED ATTRIBUTE GRAMMAR SCHEMES

4.1. Document Processing Schemes

Let Gi and Gz be PAG schemes with Gi = (V~,&,P~,&) and G:z = (V., EJ,P~,S~). Let X E VI. A coordination grammar of Ga for X is a PAG scheme G with G = (V, C, P, X) where

C c V, and V rl (VI U V2) = C U {X}. We assume that fZ(G:, {X}) # 0. A coordination of GJJ with Gi is a set of coordination grammars of Gz for symbols of Gi such that no two coordination

grammars in the set have the same start symbol. More generally, we often view a coordination as a partial substitution from VI to languages over Vz. If X is such a substitution map, and C is

a class of languages over Vz such that X(X) E C for all X in the domain of X, the coordination given by X is called a C-coordination. As in Section 3, we assume arbitrary but fixed interpreters

for Gi, Gz, and all coordination grammars. Let X be a coordination of Gz with Gi, and tl and tz trees admissible in Gi and Gz, re-

spectively. A tree embedding (or simply embedding) cp from tl to t2 is a mapping of the type

dom(ti) -+ L-chuins(tz). cp is said to respect X, or cp is a X-embedding, if and only if for all

nodes d of tl having a coordination grammar G for tl(d) in X, G admits (under the common interpreter) a complete tree spanning tz(cp(d)). A tree embedding from Gi to Gz is an embedding

from tl to tz for some trees tl and tz admissible in Gi and Gs, respectively. Let Q be a collection of tree embeddings from Gi to G2, and tl and t2 trees admissible in Gi and G2, respectively. tz is Q-coordinated with tl via X if and only if Q has a map of the type dom(ti) -+ L-chuins(tz) which

respects X. Gz is Q-coordinated with Gi via X if and only if for every complete tl admissible in Gi, G2 has a complete admissible tree t2 which is Q-coordinated with tl via X.

DEFINITION 4.1. A document processing scheme is a tuple r = (GI, Gs, X, @I) where

l G1 and Gz are PAG schemes; l X is a coordination of Gz with Gi; l Q is a collection of embeddings from Gi to Gz.

I’ is coordinated if Gz is Q-coordinated with G1 via X.

Given a document processing scheme P as above, we sometimes refer to Gi as the source grammar, and Gz as the result grammar. If a tree tz of the result grammar is coordinated with a tree tl of the source grammar, tl is called a source tree, or source document instance, and t2


is called a resvlt tree, or result document instance (see Figure 1). This source/result terminology

is borrowed from SGML (Standard Generalized Markup Language), where a notion similar to

coordination is also discussed [l&16]. Let tr be a tree admissible in Gr and spanning a string LT.

Let t2 be a tree in Gs which is Q-coordinated with tl via X. The set of trees tl, t2 and their coordination trees is called a I?-covering of u. t2 is a Gs-view of tr. For instance, if G2 is a layout

grammar, t2 is a layout vieul of tl.

source grammar

1

source document instance

(source tree)

comdination ---+

embedding -+

result grammar

1

result document instance

(result tree)

Figure 1. Document processing model.

Next, we state a specification completeness of our grammar-based coordination schemes. This, of course, follows immediately from the computationai completeness of definite clause logi,c pro-

grams [S].

PROPOSITION 4.1. Let VI and V, be grammar vocabularies where 5’1 and Sz are their start

symbols, and Cr and C2 are their terminal alphabets, respectively Let IV, CC&j denote the

collection of all Vr-valued trees (I+valued trees) having Sr (S’s) as its root label and symbols of

Cl (Ic2) as its leaf Jabefs. Suppose ‘R C 7~~ x ‘I& is recursive. Then there exists a document

processing scheme I’ = (Gl, Gz, X, 9) with Gr = (V~,~~,P~,S~} and (G2,C2,P2,&) such that (tr, t2) E R iff t2 is B-coordinated with tl via X.

PROOF. A desired I? can be given as follows:

l Pi = (A * Vi* 1 A E Vi - Ci}, i = 1,2; * Gr and Gs have a single attribute tree, which synthesizes tree structures from children to

their parents;

l X is a singleton containing the following grammar:

SI - s2 g(Sr) : - tree(&,Tl) Atree(S2,Tz) A ji(q,~),

where ‘I? is the free predicate interpreted by a definite clause logic program defining the recursive relation R;

l 9 is the collection of all embeddings from Gr to G2, containing, in particular, embeddings that map the root node of the tree in Gr to the root node of the tree in Gz. I

4.2. Layout Processing Schemes

DEFINITION 4.2. Let f = (GI, Gz,X, Q) be a document processing scheme with (?I =: (VI, Cl, PI, S1) and 62 = (V2, C2, P2, Sz), and tl and t2 complete trees admissible in Gr and G2, respectively. A tree embedding cp from tl to t2 is a layout map from tl to t2 if it satisfies the

following:

l iftlfd) E T=i, thent2(~(d)) f 6: and cp(d) is&-connoted, i.e., q?(d) can be made connected

by inserting c-nodes (terminal atomicity); e every non-c feaf dg of t2 is in the chain cp(dl) for some source node dl of tl;

l ifdl IA d2 in tl, cp(dl) <A cp(&) in h;

o ifdl <L dp in tl, cp(dl) <L q(dz) in tz.

A layout map from G1 to Gz is a layout map from tl to tz for some complete tl and tz adm~ssibie

in G1 and Gz, respe~t~ve~~


DEFINITION 4.3. A document processing schemer = (Gr, G2, X, Q) isa layout processing scheme if

l X is a full substitution;

0 X(Sr) = sz; l X(a) s C$ for all a E Cl; l 9 is the set of all layout maps from GI to Gz.

EXAMPLE 4.1. (LAYOUT PROCESSING SCHEME) Define a layout processing scheme P = (Gr ,

G2, X, 9) as follows.

Gr: Source Grammar

(dot) - (para)+.

(para) - (sen)+. (sen) --+ [word]+.

l\,(content( [word]i, Xi) : - lez-analyzer( [wordli, Xi)). (sen) - [word]*(quote)[word]*.

q-type((quote),X) : - usr-q-type((quote),X).

A\i(content([word]i, Xi) : - lez-analyzer([word]i, X,)). (quote) - [word]+.

&(content( [word]i, Xi) : - lez-analyzer([word]i, Xi)).

Gz: Result Grammar

(dot-lay) - (line)+ default-font( (d oc-lay), X) : - get-default-font(X).

l\,(defadt-font((line)i, Xi) : - defaz&font((doc-lay), X)). (line) -+ [box]+

guard((line)) : - adjustment((line), A) A (shrink-bow&IA < stretch-bound). length((line), L) : - /ji width([box]i, Xi) A xi Xi = L. ao!justment((line), A) : - Zength((line), L) A compute-adj(L, A). badness((line), B) : - adjustment((line), A) A compute-bad(A, B). /ji(defauZt-font([box]i,Xi) : - default-font((line),X)). A&idth([box]i, Xi) : - font([boxli, %)A

content([box]i, &) A compute-width(Yi, Z’i, X,)).

X: Coordination

G(doc) : Coordination Grammar for (dot) (dot) - (doe-lay)

G(para): Coordination Grammar for (para) (para) - (line)+

cost( (para), C) : - /ji badness( (lineji, &) A C& = C.

G(een): Coordination Grammar for (sen) (sen) - [box]+

Glquote): Coordination Grammar for (quote) (quote) - [box]+

Ai font([box]i,bold) : - q-type((quote),direct). Ai font([box]i, italic) : - q-type((quote), indirect).


Glwordl : Coordination Grammar for [word]

[word] - [box] font( [box], X) : - default-font( [box], X)A

lq-type( [word], direct) A Tq-type( [word], indirect). content( [box], X) : - content( [word], X).

Note that,

0 all free predicate symbols are overlined (except the equality predicate). Some free predicate symbols are interpreted by logic programs generated by other grammars; e.g., the q-type

predicate in the coordination grammar Glquote) will be interpreted by the logic program generated by the source grammar for a given source tree;

l although the result grammar is largely taken from Example 3.5, the computation of the

cost predicate is now done in the coordination grammar for (para) to reflect the new

situation that (para)-trees represent appropriate optimization units for line breaking;

l the font value is determined in a context-dependent manner; it depends on whether the corresponding [word] item appears inside a (quote) item, and when it does, whether its

quote type is direct or indirect;

l the clause in the coordination grammar Glwordl is not a definite clause; it has two occur- rences of negative literals. These literals are interpreted through negation-as-failure in the

usual way [6].

5. DECIDABILITY/UNDECIDABILITY OF SYNTACTIC COORDINATION

Let I = (Gi, Gz, X, Q) be a document processing scheme. Let tl and t2 be complete trees of Gi and Gz, respectively. Then t2 is syntactically coordinated with tl in I if t2 is coordinated with tl in F = (Gr,Gz,X,Q), where 2 = {G ] G E X}. S’ imilarly, G2 is syntactically coordinated tith

G1 in I’ if Gz is coordinated with Gi in f, and I is syntactically coordinated if l? is coordinated. Of course, even if l? is syntactically coordinated, it is possible that semantic constraints disallow

I to be coordinated. However, we view the semantic component of l? to be a more flexible part of

the specification than the syntactic part, and the semantic part is intended to be user-manipulable through user-definable free predicates. The syntactic component of the scheme, on the other hand, is, in general, inaccessible to the user: the user can only indirectly influence it by manipulating

semantic constraints. This is justified by our choice of extended context-free grammars, which can support extensive syntactic manipulations by permitting high degrees of ambiguity. In particular,

we would like to keep layout processing schemes to be at least syntactically coordinated: a situation where a valid source document can have no syntactic layout translations would be an unreasonable surprise to the user. In this section, we study decidability/undecidability properties of syntactic layout coordination. Throughout this section, a layout processing scheme means a scheme without semantic components.

PROPOSITION 5.1. Given a layout processing scheme, it is undecidable whether the scheme is

(syntactically) coordinated.

PROOF. The proof is via undecidability of inclusion in context-free languages. Let L1 and Lz

be arbitrary context-free languages generated by context-free grammars Gi = (VI, C, PI, ,571) and Gz = (Vz,C, Pz,Sz), respectively. Without loss of generality, we assume that C(Gi, {X}) is nonempty for all symbols X of Gr, and similarly for G 2. We also assume that no &-productions are present in Gr or Gz, again without loss of generality. We construct a layout processing scheme

I = (Gi, Gz, X, Q) where X is given by

l X(&j = (S2);

l X(X) = 13(G1, {X}) for all X E VI - {S1}.


We claim that r is coordinated iff L1 C Lz. Suppose L1 c Lz. Let tl a complete tree of G1. By the supposition, GP has a complete tree t2 spanning the same terminal string as tl. We define a

layout map ‘p from tl to t2 as follows:

l for the root do of tl, cp(do) is the L-chain consisting of the root of t2;

l if d is the lath leaf node (from the left) of tl, cp(d) is the L-chain consisting of the lath leaf

node of t2; l for any other node d of tl, let cp(d) = p(C) where C is the leaf chain of tl spanned by d.

It is straightforward to verify that cp is a layout map from tl to t2 respecting X. Thus, l? is

coordinated. The converse immediately follows from the definition of layout maps. I

Next, we consider under what circumstances layout processing schemes are (syntactically) coordinated. Given a substitution (and hence, a coordination), we extend it to a substitution

from languages to languages in the usual way.

DEFINITION 5.1. A layout processing scheme l? = (G1, G2, X, Q) is uniform if for every produc-

tion A - T of Gl, X(L(T)) c z(G2, X(A)).

Although the uniformity concept is useful in studying decidability issues of coordination, it was first formulated to localize the syntactic aspect of layout optimization (such as optimal line breaking) in the parsing and attribute evaluation setting. More specifically, the uniformity condi-

tion allows us to take a height-one subtree of the source tree at a time, compute its layout images (which are forests in general), and work upward in the piece-wise manner without considering

global syntactic constraints. In this section, we will show how the uniformity concept can be

used to characterize classes of layout processing schemes decidable for syntactic coordination.

We start with the following lemma, where we use tree notions such as L-chains and embeddings

extended to the case of forests (i.e., finite sequences of trees).

LEMMA 5.1. Let F = (G1,G2,X,Q) b e a uniform layout processing scheme. Then, for every

downward complete derivation tree tl of G 1, G2 has a downward complete derivation forest t2 which is Q-coordinated with tl via X such that the root of tl maps to the root chain of t2.

PROOF. Ignoring the semantic components, we let Gl= (VI, Cl, PI, Sl) and G2 = (Vz, C2, P2, S2). The proof is a straightforward induction on the height of tl (but it depends on subtle features

of document and layout processing schemes). If it is zero, tl is a tree consisting of a single node

with a terminal label, say a. Since X(a) # 8, we pick a string from X(a) and form an L-chain

labeled with the string. Since X(a) E C, , + the chain is trivially downward complete. Now given

a downward complete tree tl of Gl, let ~1, . . . , un be the subtrees of tl (rooted at the children of the root of tl). By induction, we let U: be the downward complete forest Q-coordinated with ui via X such that the root of ui maps to the root chain of u:, for i = 1,. . . , n. Let do be the root of tl, and di the root of ui for all i. Similarly, let Ci be the root chain of ui for all i. Let P=u’l(C1)... uL(C,). By the induction condition, we have,

P E X (tl ((4, . . . ,4x))).

By the uniformity condition, any string in X(tl((dl, . . . , d,))) is derivable from some string in X(tl(do)). In particular, p is derivable from some string, say cr, in X(tl(do)). We put together all ui’s from left to right, and extend the pasted forest upward to a to obtain the new forest t2.

A desired layout map from tl to t2 can be obtained by putting together the layout maps from ui to u: and extending the result by assigning the root node of tl to the root chain of t2. I

If tl in the lemma above is a complete tree, t2 must also be a complete tree because of the requirement X(Sl) = {SZ}. Th us, we obtain the following proposition.

PROPOSITION 5.2. If a layout processing scheme is uniform, it is (syntactically) coordinated.

Note that the converse does not hold.


PROPOSITION 5.3. Given a layout processing scheme, it is undecidable whether the scheme is unjform.

PROOF. The proof is also via undecidability of inclusion in context-free languages. Given context- free languages L1 and Lz, define a layout processing scheme r = (Gl, Ga, X, KP) as follows:

l the productions of G1 are {Sl - A,A - a}, l G2 has a single production S2 - Es, l X is given by

- WSI) = {Sz),

_ X(A) = L2 c C;,

- X(a) = L1 2 c;.

Observe that L1 C Lz iff I? is uniform. I

As these undecidability results suggest, one desirable situation in getting de~idability results is to have classes of languages in which inclusion is decidable. In fact, there are several subclasses of context-free languages in which inclusion is known to be decidable: (very) simple languages [17],

parenthesized languages [18], NTS(NonTermina1 Separation) languages [19,20]. However, these are all subclasses of deterministic context-free languages, and their (known) grammatical char- acterizations strongly disallow ambiguities, which is incompatible with our plan to use highly ambiguous grammars to loosely specify layout structures. One could use recursion-free extended context-free grammars (which, of course, generate regular sets) for layout grammars. But this seems uninterestingly restrictive. Fortunately, it is possible to weaken this restriction and allow som,e recursion in layout grammars.

Towards this end, we introduce the notion of ~n~~o~i~~ maps. Given a layout processing scheme l? = {Gl,Ga,X,@), where c1 = (&,X1, PI,&) and C% = (VZ,&, Pz, SZ), a map

is a uniformity map of r if it satisfies h(p) C EfG2, X(A)) for all productions p = A + T of GI. Let C be a family of languages over V2. We say that fi. is a C-~~~fo~~~~ mad if h(p) E C for all p E P. The following is a generalization of the previous definition of uniformity.

DEFINITION 5.2. Let r and C be as above, and ti a C-uniformity map of r. Then r‘ is C-uniform under ?i if for all productions p = A - T of GI, X(C(r)) 2 h(p).

As a special case, if E(p) = .i?(Gz,X(A)) for all productions p of G1, we get the original de~nition of uniformity.

PROPOSITION 5.4. Let C be a family of languages such that

l C is closed under regular operations; l the inclusion problem is decidable in C.

Then given a layout processing scheme r with a C-coordjnatio~ and a C-uniformjty map, it is decidable whether I’ is C-uniform under this uniformity map.

PROOF. Straightforward. 1

We now construct a class of C-uniformity maps that has the decidability property. Let G = (V, C, P, S} be an extended context-free grammar, and A E V. CY E V* is an elementary expansion of A in G if cy has an A-derivation tree in G in which every branch has at most one occurrence of a symbol for every symbol of G (i.e., roughly cx has a recursion-free derivation from A}. il f V’

is an elementary expansion of a: = Al, AZ,. . . , A, in G if p = &/32,. . . 1 /3, and each pi is an elementary expansion of Ai in G. Let L C V*. The elementary expansion, &,(G, L), of .1: in G is given by

&&G&f = u {P I P is an elementary expansion of cr in G). afL


Given a layout processing scheme I7, define a uniformity map fi such that ti(p) is the elementary expansion of X(A) in Gz for all productions p = A - T of G1. Note that since b(p) C

z(Gz, X(A)), th e map is well-defined. Such maps are called elementaq uniformity maps. We also say that r is elementarily uniform if it is uniform under its elementary uniformity map.

EXAMPLE 5.1. (ELEMENTARY UNIFORMITY) The layout processing scheme in Example 4.1 is elementarily uniform.

We now show that the elementary uniformity of I’ is decidable if its coordination is regular. Two lemmas follow first. Let cy, /? E V* for an alphabet V. We let o N p if and only if they have the same set of distinct symbols. Given L c V*, the relation N defines an equivalence relation on 1;. We write [cx]L for the equivalence class generated by (Y E L, i.e., [cy]~ = {p E L 1 a N ,O}.

LEMMA 5.2. If R is a regular Janguage over a finite aJphabet C, so is [a]~ for all (Y E R.

PROOF. Let N = (Q,~,~,~o,~) b e a deterministic finite state automata (DFSA) for the lan-

guage R, where

l Q is the finite set of states, l qo E Q is the initial state, l F C_ Q is the set of final states, and l 6 : Q x C - Q is the transition function.

The proof is by construction of a DFSA that simulates the machine M except that the new machine also tracks sets of distinct symbols occurring in the input string. This is possible since C, and hence its power set, is finite. Define a DFSA M’ = (Q’, C, 6’, 46, F’) as follows:

l Q’=Qx2”*,

l 4; = {qo,~), e F’=Fx~~*,

l b’((qi,S),a) = (qj,SU{a)) iff 5(qi,a) = qj for all SC t=*.

Note that for all (7 E C*, b(qo,a) - qk iff S’((qo,(b),o) = (qk, S) where S is the set of distinct symbols appearing in LT. Thus, M and M’ accepts the same language. For each S C C’, define the machine M&, the S-projection of M’, as follows:

M; = (Q', C, a’, d, F x {S)).

Then for any [ar]~, it is exactly the language accepted by the machine M& where S is the set of distinct symbols appearing in cy. I

LEMMA 5.3. Let G = (V, C, P, S) be an extended conte~-fry ~rarnm~, and A E V. Then, the

eJementary expansion of {A) in G is regular.

PROOF. The lemma is intuitive since elementary expansion disallows recursion and preserves regularity. The proof goes by induction on the cardinality of N = V - C. If N = {S}, P =

(S --+ v} for some regular expression u over V. By Lemma 5.2, the production can be rewritten S ---+ v1 1 572, where w1 is a regular expression denoting the set of all strings in ,C(V) not involving the symbol S, and 29 is a regular expression denoting the language .C(v) - C(vl). Thus, the elementary expansion of S in G is the set C(V,) U {S}, and hence, it is regular.

Now, let A E JV, and A --+ r E P. Without loss of generality, we can assume that this is the only production with A on left-hand side. As in the case above, by Lemma 5.2, we can rewrite the production A ---+ ‘I‘ as A - ~1 1 r2 where ~1 is a regular expression denoting the set of all strings in L(r) not involving A and r2 is a regular expression denoting the language C(T) - C(rl). Let &, J32, . . . , B, be all of the symbols appearing in the language of ~1. For each Bi, define a grammar as follows:

Ggi = (V - {A}, C, PA, &I,

where P‘4={X--+u1 IX -uE Pandu=ul IUS)--(A-r},


where ~1 is a regular expression denoting the set of all strings in the language of u not involving the symbol A, and 212 is a regular expression denoting the set L(U) - ~Z(UI). By induction, the elementary expansion of {Bi} in Gg, is regular for all &. Let 8 be a regular substitution that maps each Bi to the elementary expansion of {Bi} in Gg,. Then, the elementary expansion of {A} in G is given by

{A) u W(Q)),

which is regular, since regular sets are closed under regular substitutions. I

PROPOSITION 5.5. Let r = (G1, G2, X, Q) be a layout processing scheme with regular coordination, i.e., X(A) is regular for all grammar symbols A of G1. Then, the elementary uniformity of r is decidable.

PROOF. Let ti be the elementary uniformity map of I’. Since regular languages are closed under regular operations and the inclusion problem is decidable in regular languages, we obtain the proof by specializing Proposition 5.4. It remains to show that fi is regular, i.e., h(p) is regular for all productions p = A --+ r of G1. Let 0 be the substitution that takes a grammar symbol B of Gz and generates the elementary expansion of {B} in Ga. By Lemma 5.3, t!?(B) is regular for all grammar symbols B of G2. Observe that

Q(WA)) = Cele(G2, X(A)) = GJ).

Since regular sets are closed under regular substitutions, we conclude that h(p) is regular. u

In attempting to model particular document layouts using layout processing schemes, we have never encountered an actual layout Ystyle” where regular coordination did not yield a natural structural expression of the style in question. Thus, the previous lemma reinforces our intuitions as to what might constitute a natural style by guaranteeing that such things also happen to be computable.

6. CONCLUSIONS

As a basis for document representation the arbitrated, coordinated preferential attribute grammar schemes provide a unique combination of features: a completely developed formal semantics, the modular isolation of abstract source (logical) and result (physical) document structure (when specialized to the case of layout processing), the modular isolation of abstract source (result) document entities one from the other as governed by productions and the structured definitions of predicates via productions, the explicit availability of the specific source (logical), and result (layout) structures to the definitions of (document-related) attributes.

REFERENCES

1. A.L. Brown, Jr. and H.A. Blair, A logic grammar foundation for document representation and document layout,, In EP90 Proceedings of the International Conference on Electronic Publishing, Document Manip- ulation and !Qpography, (Edited by R.K. Furuta), pp. 47-64, Cambridge University Press, Cambridge, (1990).

2. A.L. Brown, Jr., T. Wakayama and H.A. Blair, A reconstruction of context-dependent document processing in SGML, In EPSO: Proceedings of the International Conference on Electronic Publishing, Document Manipulation, and Qpography, (Edited by C. Vanoirbeek and G. Coray), pp. l-25, Cambridge University Press, Cambridge, (1992).

3. J.R. Schoenfield, Mathematical Logic, Addison-Wesley, Reading, MA, (1967). 4. H. Rogers, Jr., Theory of Recursive Functions and Eflective Computability, McGraw-Hill, New York, (1967). 5. C.-L. Chang and C.-T. Lee, Symbolic Logic and Mechanical Theorem Proving, Academic Press, New York,

(1973). 6. J.W. Lloyd, Foundations of Logic Programming, Second edition, Springer-Verlag, Berlin, (1987). 7. S.A. Greibach, Theory of Program Structures: Schemes, Semantics, Vetijkation, Springer-Verlag, Berlin,

(1975). 8. M.A. Harrison, Introduction to Formal Language Theory, Addison-Wesley, Reading, MA, (1978).


9.

10. 11. 12.

13.

14.

15.

16. 17.

18.

19.

20.

P. Deransart, M. Jourdan and B. Lorho, Attribute Grammars: Definitions, Systems and Bibliography, Volume 323 of Lecture Notes in Computer Science, Springer-Verlag, New York, (1988). G.E. Hughes and M.J. Cresswell, An Introduction to Modal Logic, Methuen, London, (1968). G.E. Hughes and M.J. Cresswell, A Companion to Modal Logic, Methuen, London, (1984). A.L. Brown Jr., S. Mantha and T. Wakayama, Preference logics: Towards a unified approach to nonmone tonicity in deductive reasoning, Presented at the Second International Symposium on Artificial Intelligence and Mathematics, Ft. Lauderdale, FL, 1992. A.L. Brown Jr., S. Mantha and T. Wakayama, Preference logics and nonmonotonicity in logic programming, In Logic at Tver, International Conference on Logical Foundations of Computer Science, (Edited by A. Nerode), Springer-Verlag, Tver, Russia, (1992). D.E. Knuth and M.F. Plass, Breaking paragraphs into lines, Software--Practice and Experience 11, 1119- 1184 (1981). C.F. Goldfarb, Information processing-text and office systems-Standard Generalized Markup Language (SGML), Technical Report DIS 8879, International Standards Organization (ISO), Geneva, (1986). C.F. Goldfarb, The SGML Handbook, Oxford University Press, Oxford, (1990). A.J. Korenjak and J.E. Hopcroft, Simple deterministic languages, In Seventh Annual IEEE Symposium on Switching and Automata Theory, pp. 36-46, (1966). R. McNaughton, Parenthesis grammars, Journal of the Association for Computing Machinery 14 (3), 49&500 (1967). G. Senizergues, The equivalence and inclusion problems for NTS languages, Journal of Computer and System Sciences 31, 303-331 (1985). L. Bossson and G. Senizergues, NTS languages are deterministic and congruential, Journal of Computer and System Sciences 31, 332-342 (1985).

a formal specification of document processing

Documents