An edited version of this manuscript is published in English Text Construction 8.1: 21-64. DOI 10.1075/etc.8.1.02tiz

A corpus-based account of left-detached items in the recent history of English: Left Dislocation vs. Left Detached-sequences*

David Tizón-Couto
Universidade de Vigo
[email protected]

ABSTRACT

This paper investigates sequences featuring a coreferential link between a left-detached constituent and a resumptive in the following main clause [‘LDet-sequences’]. Their syntactic, semantic and textual behaviour is investigated in historically recent texts (since Modern English) where such detachments are not generally expected to replicate the behaviour that has been attested for contemporary spoken English Left Dislocation [‘LDis’] (cf. Geluykens 1993; Gregory and Michaelis 2001; Snider 2005; Netz et al. 2011). A typology of LDet-sequences is proposed in order to investigate them beyond their apparent structural similarities with spoken English LDis. Left-detached referents are found to typically hold a weak relationship with clause-grammar; however, some of the LDet-sequences attested also exhibit a certain degree of deviation from the less syntactic and freer, discoursal behaviour that characterises others. As far as their diachronic development is concerned, only LDet-sequences that closely resemble LDis illustrate the declining course reported in previous research for LDis (cf. Pérez-Guerra and Tizón-Couto 2009). Lastly, LDet-sequences are more frequent in drama and fiction; thus, the prediction that they are employed to recreate conversation (cf. Geluykens 1992) is provisionally borne out.

Keywords: Left Dislocation, detachments, history of English, information structure

*I am grateful to Lieven Vandelanotte for his valuable suggestions, help, encouragement and patience. My thanks are also due to two anonymous referees for their constructive criticism and comments. Any remaining obscurities are entirely my own responsibility. I am also grateful to the following institutions for generous financial support: the Spanish Ministry of Economy and Competitiveness and the European Regional Development Fund (grant no. FFI2013-44065-P), and the Autonomous Government of Galicia (grant no. GPC2014/060).



1. INTRODUCTION

Left detachments (as in That guy over there, I am sure I know him) belong in the spoken and informal varieties of the language. Unplanned discourse tends to feature more detachment constructions, which “are almost entirely absent in the formal planned register” (Givón 1979: 229) or which may seem “inappropriate in formal registers” (Lambrecht 1994: 182). In fact, English conversation appears to show the greatest use of Left Dislocation [LDis],1 while written English shows LDis usage mostly within pseudo conversations (Geluykens 1992: 99). LDis has been traditionally conceived of as a string consisting of a left-detached item (usually an NP; e.g. London in London, I love it) and an ensuing clause including a resumptive pronoun or phrase in a coreferential relationship with the former (e.g. it in London, I love it). Anaphoric coreference and semantic relatedness between the detached item and the resumptive pronoun trigger a link whose nature as either a discourse- or as a sentence-based phenomenon has produced much debate in the literature (Keenan-Ochs and Schieffelin 1976: 241; Vat 1997: 95; Pérez-Guerra and Tizón-Couto 2009: 31).

English written historical texts contain a variety of strategies that place syntactically disconnected items (at the clausal level) before the main proposition, in a manner that resembles LDis as characterised by generative grammarians (beginning with Ross 1967: 253; e.g. London, I love it),2 usually in contrast with Topicalisation (e.g. London I love). This paper proposes that it is possible to identify sequences within historical texts of recent periods (especially Modern English) that contain elements and features of what has been defined as LDis in the literature on contemporary spoken English. These detachment-headed sequences are, a priori, not expected to replicate the grammatical or discourse-organisational behaviour that has been attested for contemporary spoken English Left Dislocation [LDis] because, on the one hand, some of them do not fully match the grammatical features commonly used to define recent LDis and, on the other hand, LDet-sequences can be employed to achieve different rhetorical effects and discourse functions in the written medium. For instance, Shakespeare’s classic line in Hamlet (i.e. To be or not to be – that is the question) hardly represents informal or unplanned register, where referents must be negotiated in turn taking (cf. Ford et al. 2003). In fact, left-dislocated constituents are more frequent in formal written genres, where traces of orality are not to be expected unless deliberately imitated, than in letters or diaries (i.e. speech-like genres (cf. Culpeper and Kytö 2010)) after Late Middle English (cf. Pérez-Guerra 1999: 223; Pérez-Guerra and Tizón-Couto 2009: 40).

1 The most typically employed abbreviation in the previous literature on Left Dislocation is LD; however, LDis is employed here in order to distinguish Left Dislocation [LDis] from what is here termed Left Detachment sequences [LDet-sequences]: a range of sequences that more or less closely resemble LDis but that occur in writing and behave differently at several levels.

2 LDis was initially characterised by Ross (1967: 253) as the fronting of an NP from a clause into the left-most or sentence-initial position, external to a proposition which contains a pronominal copy anaphorically referring back to the fronted NP. Cf. Anagnostopoulou et al. (1997) for a collection of key papers on the syntactic analysis of LDis.

Many of the examples considered in this study show a fairly similar behaviour to spoken LDis; however, these examples must be contrasted with others which include additional features such as inner adjectival modification or a list shape (cf. Section 4) in order to ascertain where they stand in relation to the concept of LDis. Therefore, formal and functional issues such as length, semantics or information status must be considered in order to account for ‘LDet-sequences’ (where ‘LDet’ stands for left detachment) that do not fully comply with the traditional conception of the LDis construction as exemplified in London, I love it. Besides formal issues such as the syntactic function of the resumptive pronoun or the semantic relation between the left-detached constituent and the resumptive item, two particular information-structural features of detachments that have been previously researched in the literature dealing with the behaviour of spoken LDis are examined in depth. The ‘textual’ behaviour of LDet-sequences is measured in terms of information status and topicality or textual persistence (i.e. the stretch of text, measured in number of clauses, where the left-detached referent remains activated) of the left-detached referent.
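The notion of textual persistence can be pictured with a small sketch. It is not the procedure applied in this study (which is set out in Section 5); it only assumes, for illustration, that each clause following the LDet-sequence has been manually annotated with the set of referents it mentions. The function name, the `max_gap` tolerance and the sample annotation are all hypothetical.

```python
from typing import Sequence, Set


def persistence(clauses: Sequence[Set[str]], referent: str, max_gap: int = 0) -> int:
    """Count the clauses over which `referent` stays active.

    `clauses` holds, for each clause following the LDet-sequence, the set of
    referents mentioned in it (a hypothetical manual annotation).
    `max_gap` is an assumed tolerance: the number of consecutive clauses that
    may omit the referent before the chain counts as broken.
    """
    last_mention = 0  # 1-based position of the last clause mentioning the referent
    gap = 0
    for position, mentioned in enumerate(clauses, start=1):
        if referent in mentioned:
            last_mention = position
            gap = 0
        else:
            gap += 1
            if gap > max_gap:
                break
    return last_mention


# Hypothetical annotation of four clauses following a detached "notes"
following = [{"notes", "baggage"}, {"night"}, {"notes"}, {"weather"}]
print(persistence(following, "notes"))             # strict reading: no gap allowed
print(persistence(following, "notes", max_gap=1))  # one-clause gap tolerated
```

Whether a gap should be tolerated at all is a design choice; a strict reading (`max_gap=0`) treats the referent as deactivated as soon as one clause fails to mention it.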

The distribution of the attested LDet-sequences across historical periods and genres is also investigated in order to (a) consider the implications of the current results to previously suggested diachronic trends concerning LDis and (b) assess the general purpose of sequences that resemble LDis in written records. As far as the diachronic development of the construction is concerned, Pérez-Guerra and Tizón-Couto (2009) found that the decline in the use of the LDis construction from Late Middle English [LME] onwards is statistically significant. The low frequencies they report for LDis from Late Modern English [LModE] to Present Day English [PDE] (approximately 1 instance per 10,000 words) are in keeping with the process of syntactisation (cf. Pérez-Guerra 2005), in progress in these periods, according to which peripheral constituents that cannot be integrated into the syntactic structure of the clause are avoided in at least written (planned) linguistic production. Pérez-Guerra and Tizón-Couto (2009) show that the diachronic decrease of LDis also correlates with the general decline of other fronting strategies such as Topicalisation since LME.3 In fact, LDis is especially susceptible to advancing syntactisation and an unmarked first position since it involves elements that, by definition, cannot be accommodated within the syntactic structure of the clause (cf. Section 4).

The results suggest that LDet-sequences in written Modern English texts predominantly behave as LDis does in contemporary spoken English as regards several formal variables and textual features (i.e. information status and topicality); however, some restrictions must be placed on the aggregate results. More precisely, a typology of LDet-sequences is required in order to inquire beyond their apparent similarities with spoken English LDis. As a whole, LDet-sequences (comprising a left detachment and a coreferential resumptive item) link discourse reference continuity with internal clause syntax. Thus, they show clear traces of detachment from the clause (e.g. the left-detached item cannot usually replace the resumptive phrase, and the two elements may be partly linked semantically) and features that clearly set them apart from other items that take the first linear position in a clause such as Topicalisation or prototypical subjects. However, some of the types attested also evince a certain degree of resistance to the noticeably syntactically freer behaviour that characterises other types and, therefore, show a higher degree of similarity with spoken English LDis (cf. Section 6.1).

Since this study investigates four different attested LDet-sequences, the aggregate results on diachronic developments are not straightforwardly comparable to Pérez-Guerra and Tizón-Couto’s (2009) above-mentioned results on LDis alone, in which it was shown that the construction suffers a significant decline similar to that in other fronting strategies such as Topicalisation. However, the current results, especially those on the LDet-sequences that most closely resemble LDis, suggest similar declining trends. Furthermore, some recent findings on left-dislocated NPs in a different set of data (cf. Tizón-Couto, forthcoming) supplement the findings reported in Section 7 for historical development and genre. Concerning text-types, the current results show that LDet-sequences are far more frequent in drama and fiction than in any other genre. Thus, the prediction that they are employed as traits of orality in order to recreate conversation (cf. Geluykens 1992) is borne out.

3 In contrast, the data Pérez-Guerra and Tizón-Couto (2009) report on adverbial fronting do not show a parallel evolution: there seems to be a more convoluted and winding path influenced by the decrease of sentence-initial conjuncts (e.g. therefore, otherwise, notwithstanding, however, nevertheless, besides, else, etc.; cf. Pérez-Guerra 1999: 220) and the increase of sentence-initial (subject-oriented) disjuncts (e.g. perhaps, doubtless, etc.; cf. González Álvarez 2002: 287).

2. SCOPE OF THE STUDY AND DATA

This study leaves aside left-detached vocatives,4 loose appositions,5 the as-for construction6 and self-correction items7 and focuses on examples of left-detached constituents of two kinds:

4 Vocatives are set apart from LDis on the basis that they refer to directly accessible elements from the interactional context and can thus be omitted without resulting in ungrammaticality:

(i) Come, Dorothy! a maid of ten has got nothing to do with lovers. (Besant, Walter. 1887. Dorothy Foster)

5 There are two kinds of apposition described in the literature (cf. Burton-Roberts 1975, Acuña-Fariña 1999, Keizer 2005): close and loose apposition, which are exemplified in (i):

(i) a. Van Gogh the painter… [close apposition]
    b. Van Gogh, the painter, … [loose apposition]

Despite the apparent similarities between loose apposition and LDis, the two constructions are set apart by the fact that loose apposition (iii) must deploy a second pause and the fact that a resumptive pronoun may appear at a large remove in LDis (ii).

(ii) Mingus, I have heard that he is a nice fellow. [LDis]
(iii) He, Mingus, is a nice fellow. [loose apposition]

In addition, as pointed out by Acuña-Fariña (1996: 150), both members of an apposition must be combined in order to occupy the focus position in cleft sentences ((v) vs. (vii)), while such grouping is unlikely in LDis ((iv) vs. (vi)).

(iv) *It is Mingus, he that is a nice chap.
(v) It was Bill, the doctor, that was here with Jane.
(vi) *Bill_i, it was the doctor_i that was here with Jane.
(vii) Bill, it is he that is a nice chap.

Finally, in terms of information status, appositions seem to comply with the principle of end-weight, while left-dislocates violate it in order to allow the subsequent clause to respect it (cf. Meyer 1992: 123).

6 In Tizón-Couto (2012), an argument is provided for the exclusion of the as-for construction from the category of LDis on the basis of three criteria: first, such elements may occur without a copy in the core (i.e. the ensuing clause); second, they are possible in Dik’s (1997: 391) parenthetical position (cf. examples (i) to (iii) below), a fact that renders them closer to adverbials; and, third, they may co-occur with proper instances of LDis, a mutually exclusive construction in English (cf. example (iv)). In addition, at the functional level, the as-for construction seems to be a stronger marker of topichood than LDis at the clausal level that carries out the function of aspectual modifier/adjunct. However, left-dislocated constituents should not fulfil a function in the sentence they introduce (cf. Pérez-Guerra and Tizón-Couto 2009: 32).

(i) He doesn’t have a clue, as for History, who Hitler and Mao are.
(ii) He doesn’t have a clue, unfortunately, who Hitler and Mao are.
(iii) *He doesn’t have a clue, History, who Hitler and Mao are.
(iv) As for his honour, your brother, he will doubtless in some way achieve greatness, as his grandfather before […] (Besant, Walter. 1887. Dorothy Foster)

7 Self-correction should be set apart from LDis. Mere reformulation of a whole clause or VP, for example, does not trigger the same structural and functional effects as LDis, even when a coreferential relationship exists between an NP (the wisest of men in all ages… in (i)) in an incomplete clause (have not the wisest of men in all ages…) and a proform (they) in the ensuing reformulation (have they not had their Hobby-Horses…).

(i) Nay, if you come to that, Sir, have not the wisest of men in all ages, not excepting Solomon himself, — have they not had their Hobby-Horses; — their running horses, — their coins and their cockle-shells, their drums and their trumpets, their fiddles, their pallets, — their maggots and their butterflies? (Sterne, Laurence. 1759. Tristram Shandy)


(a) those that conform to the prototype of LDis (those which feature an NP as the left-detached item and a personal pronoun as the resumptive item: Your dad, he really likes football) and (b) those that are not as prototypical in terms of the grammatical categories of the two related items (e.g. one featuring a clause as a left-detached constituent and a noun phrase as a resumptive: The fact that she came, I am not to blame for it) but also seemingly resemble the arrangement of LDis (i.e. [left-detached constituent + ensuing clause containing a resumptive item]). In an attempt to facilitate comparisons with the relevant literature, the analysis has been limited to prototypical instances in terms of the syntactic function of the resumptive element or phrase. Thus, the investigation focuses on those examples in which the resumptive item fulfils the core syntactic functions of subject, object, complement of preposition or subject complement.

The data employed in this study have been drawn from four periods included in ARCHER. Version 3.1 of the corpus contains 1.7 million words, in the form of 1,037 texts sampled from seven 50-year historical periods covering early and late Modern to Present-Day English (1650-1990). Table 1 lists the periods included in this paper. For the purposes of the present study, these periods have been arranged into three groups, (1) early Modern English (EModE (1650-1749), 335,511 words), (2) late Modern English (LModE (1800-1849), 298,231 words) and (3) Present Day English (PDE (1900-1949), 176,769 words).

Periods researched and number of files from ARCHER    Number of words    Period labels in analysis
1650-1699 (89 files)                                  160,493            EModE (335,511 words)
1700-1749 (103 files)                                 175,018
1800-1849 (119 files)                                 298,231            LModE (298,231 words)
1900-1949 (98 files)                                  176,769            PDE (176,769 words)
TOTAL                                                 810,511            All periods (810,511 words)

Table 1. Periods and number of files from ARCHER searched in this study

Table 2 shows the overall distribution of LDet-sequences in the corpus. These were manually extracted.8

Period                 Frequency
EModE (1650-1749)      3.06 (n=103)
LModE (1800-1849)      4.15 (n=117)
PDE (1900-1949)        2.65 (n=45)
TOTAL                  3.26 (n=265)

Table 2. Overall distribution of LDet-sequences (normalised per 10,000 words)
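Normalisation per 10,000 words is plain proportional scaling: the raw count is divided by the number of words in the sub-corpus and multiplied by 10,000. A minimal sketch follows. The function name is hypothetical, and the word counts are taken from Table 1 on the assumption that they are the relevant denominators, so values computed this way need not coincide exactly with the figures printed in Table 2.

```python
def per_10k(raw_count: int, corpus_words: int) -> float:
    """Normalised frequency: occurrences per 10,000 words."""
    if corpus_words <= 0:
        raise ValueError("corpus_words must be positive")
    return raw_count / corpus_words * 10_000


# Raw counts (n) from Table 2; word counts assumed from Table 1.
periods = {
    "EModE (1650-1749)": (103, 335_511),
    "LModE (1800-1849)": (117, 298_231),
    "PDE (1900-1949)": (45, 176_769),
}

for label, (n, words) in periods.items():
    print(f"{label}: {per_10k(n, words):.2f} per 10,000 words (n={n})")
```

Normalising in this way is what allows sub-corpora of unequal size to be compared directly.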

8 Searching for two coreferential items in an LDet-sequence requires previous tagging of the corpus. ARCHER does not include a tag for left-detached or dislocated elements or resumptive items. To my knowledge, only the Penn/Helsinki corpora of Historical English offer machine-searchable tags for Left Dislocation [‘LFD’] and Resumptive [‘RSP’]. These corpora are the focus of ongoing research on left-detached elements in English (cf. Tizón-Couto, forthcoming). Without professional tagging of this kind, careful manual search becomes the only reliable tool to find examples of left peripheral constructions in written texts.

It is worth pointing out that the current study focuses on data from periods within the recent written history of English in order to test the extent to which contemporary analysis may account for the (syntactic, semantic and textual) behaviour of left-detachment constructions during those periods and, also, to provide a preliminary outline of both the historical development of detachment constructions and their role within different text types. Lastly, the small number of examples found in the PDE period may seem to hamper comparison; however, most previous studies focus on PDE spoken data and thus provide a fairly substantial body of evidence that overcomes this potential deficiency.

3. REVIEW OF THE LITERATURE: THE FUZZY CHARACTER OF LDIS

The following subsections briefly survey previous examinations of LDis from the syntactic, semantic, and information structural viewpoints. Some degree of terminological debate and disagreement can be found in the literature dealing with left detachments from each of these perspectives.

3.1 THE ISSUE OF SYNTACTIC BEHAVIOUR

English LDis corresponds to what, in contrastive linguistics, is generally labelled ‘Hanging Topic Left Dislocation’ (HTLD),9 i.e. a left-dislocated NP which is not grammatically linked to a copy via case-marking. This correspondence makes it difficult to draw the line distinguishing what may constitute LDis in English and what may not if one were to accept that phrasal categories other than NP can occupy the left-dislocate slot (as in In this cupboard, Steve put the beans there). Then, if the categorical features of the left-dislocate can go beyond the NP, English LDis, as a label, would arguably be very hard to define because it does not have a corresponding case-connected counterpart as in German (Grohmann 2000) and Dutch ‘Contrastive Left Dislocation’ (CLD) (De Vries 2007) or Romance ‘Clitic Left Dislocation’ (CLLD) (Villalba 2000).10 The only more syntactically connected counterpart to English LDis would be Topicalisation. For early generativists,11 Topicalisation resulted from wh-movement (cf. Chomsky 1977) and would, therefore, be blocked in question wh-preposing, for instance. On the other hand, LDis would not be blocked in this context:

(1a) *Those papers what did Joanne do with? [Topicalisation]
(1b) Those papers, what did Joanne do with them? [LDis]

9 HTLD is a left-detachment process by which one left-dislocated NP is linked to a clitic pronoun, a strong pronoun or an epithet (full NP) in the core clause. It shows “neither connectedness nor island-sensitivity”, its properties placing it “closer to anaphoric discourse relations than to syntactic ones” (Villalba 2000:103):

(i) Xoán, o que din e [que [eu son listo] e [el e parvo.]] [no island constraints] (Galician)
    ‘John, what that they-say is that I am smart and he is stupid’

10 Typically, ‘LD structures’ comprise CLLD, CLD and HTLD (cf. footnote 9 above). First, CLLD is a left-detachment phenomenon by means of which a left-dislocated element of any syntactic category, which can be iterated (and ordered freely in that case), is linked to a resumptive clitic in the core clause showing connectedness and island-sensitivity (cf. Villalba 2000: 44):

(i) O libro, compreino na librería Michelena. [clitic pronoun] (Galician)
    ‘The book, I-bought-it at bookstore Michelena’

Second, CLD is a left-detachment structure through which a left-dislocated element of any syntactic category, which cannot be iterated, is linked to a demonstrative (or d-word in the literature) or an NP in the core, showing connectedness and island sensitivity (cf. Grohmann 2000: 140):

(ii) Diesen Hund, den mag ich überhaupt nicht. [connectedness] (German)
     ‘This dog, [it] like I really not’

11 Ross (1973: 553), a clear representative of this approach, defines Topicalisation as “[a] process which is formally almost identical to Left Dislocation, with the exception that while [Left Dislocation] leaves behind a pronoun to mark the position in the sentence that the fronted NP used to occupy, the rule of Topicalization does not”.

Section 6 below offers a range of tests on the syntactic behaviour of LDet-sequences that confirm that they do not behave like Topicalisation in many respects since they show clear signs of syntactic detachment. For instance, the punctuation of the left-detached item (i.e. statement, exclamation or interrogation) does not typically match that of the ensuing clause, as it does in Topicalisation. However, this prediction for LDis and Topicalisation is well-known (cf. Anagnostopoulou et al. 1997) and not the main focus of this paper, which is more concerned with the syntactic behaviour of LDet-sequences in comparison with the behaviour reported in the literature for spoken English LDis. Even though some of the attested types of LDet-sequences seem to be more connected to the clause at certain levels (cf. Section 6.1), it is in fact hard to define LDis as a clause-based phenomenon in written historical texts; thus, a range of formal and functional qualifications are required (see Section 4). The syntactic features investigated are explained and justified in the methodology section (Section 5).

3.2 THE ISSUE OF ANAPHORIC (CO)REFERENCE

The task of drawing the boundaries of LDis is complicated by attested instances where (a) straightforward coreference does not hold between the left-dislocate and the resumptive item or (b) a non-canonical semantic link holds between them. Two main opposing interpretations can be found in the literature regarding the potential link between a left-detached item and its corresponding resumptive item in the main clause: although most authors recognise the possibility of non-strict coreference, others propose that such instances be ruled out of the LDis construction. Within the inclusive approach, Rodman ([1974] 1997: 39), for instance, considers non-anaphoric instances part of the LDis construction and offers example (2) below as illustrative of non-strict coreferentiality in LDis.

(2) (As to) noxious odours, our sheepdog farts after eating escargots.12

Examples cited by Gundel (1977: 67), Chomsky (1977: 81) and Barcelona Sánchez (1988) also align within the early (1970s-80s) inclusive interpretations. Conversely, Hirschbühler (1997: 58) does not regard instances such as (3) as an example of LDis in French, on the basis that although “the epithet is well-formed”, it may not serve as an anaphor for the noun phrase to which it is related.

(3) *Ma carte d’identité, j’ai perdu cette putain.
    ‘My identity card, I’ve lost that whore’

12 In numbered examples, italics have been employed to signal the left-detached constituent. Bold has been used to highlight the resumptive item.


Like Hirschbühler (1997: 4), both Lambrecht (1996) and Prince (1997) discard examples such as (4) below, which Van Riemsdijk (1997) terms ‘loose-aboutness LD[is]’. Geluykens (1992: 22) also notes that there is a ‘quasi-LD[is]’ where strict coreferentiality does not hold (cf. example (4)). Nonetheless, Geluykens’s (1993: 725) subsequent research maintains that in instances such as (5) there is an element in the core proposition (papers) which is semantically strongly linked to the left-detached item (conference), by virtue of belonging to the same frame or scenario.

(4) London, Trafalgar Square is nice.
(5) As for the conference, I like the papers.

Lastly, Prince (1997: 138) concludes that instances of LDis in which semantics or anaphora are not strict between the left-dislocated constituent and the resumptive phrase “seem functionally indistinguishable from canonical Left-Dislocations containing in situ personal pronouns, but further research is required”. This study is, in part, an attempt to answer this call for further research and thus sides with an inclusive point of view that acknowledges potential semantically weaker connections between the left-detached constituent and the resumptive item.13 Further details on this issue are provided in the methodology section (Section 5).

3.3 THE ISSUE OF INFORMATION STATUS

A great deal of work has been done on the information structural features of LDis in contemporary oral English (Montgomery 1982; Kies 1988; Geluykens 1992, 1993; Prince 1997, 1998; Gregory and Michaelis 2001; Snider 2005). This literature contains a variety of contrasting opinions and findings on the information status of left-dislocates. Keenan-Ochs and Schieffelin (1976: 242), for example, argue that speakers mostly use LDis to introduce discourse-new referents. In addition, for Reinhart (1981: 64), LDis is used to change the current topic and introduce a new one. Example (6), from my dataset, illustrates LDis employed to introduce a brand new referent:

(6) ENID: <(diffident, cautious)> Looking back at what?
    PRINCESS: At everything.
    ENID: This mystic side to you, Zena, is it something new?
    PRINCESS: No.
    ENID: Your late husband – did he know of it?
    PRINCESS <(lifting her shoulders slightly):> He may have guessed.
    ENID: Only ‘guessed’.
    (Firbank, Ronald. 1920. The Princess Zubaroff. Act 11, Scene 1)

13 Examples such as (5), in the main text, have not been included in the current study on the basis that the left-detached item is introduced by as for (cf. footnote 6), but not due to the less prototypical semantic link between conference and papers. It must be pointed out that left-detached items introduced by as for and showing a syntactic or semantic link with a resumptive pronoun or phrase were not found in the corpus researched.


Instances of LDis headed by NPs introducing new participants, i.e. “not mentioned previously but knowable or assumable by the hearer from outside the immediate verbal context”, account for 80.5% of Montgomery’s (1982: 426) data. Instances of LDis which introduce irrecoverable, topical referents account for 76.9% of Geluykens’s (1992: 137) data. Lastly, Snider (2005: 23) concludes that information status is a significant predictor of left dislocation in spoken English: in his data, hearer-known but discourse-new entities (termed ‘mediated’ entities) account for 66.9%, with the subtype of set-coded entities, referring to ‘a subset, superset, or member of a previously introduced set’ [46.3%], most likely to left-dislocate. Thus, most authors agree on the role of LDis as promoting the introduction of new referents in spoken discourse.

On the other hand, authors such as Barcelona Sánchez (1988) consider left-dislocates, in general, to be linked or given items. Example (7) below illustrates how LDis may be employed to highlight a given referent:

(7) FARAKER: Most annoying…you know I’ve lost all the notes I took at the other monasteries. They were in the baggage we lost in the rush up here.
    MRS FARAKER: Oh, that dreadful night… don’t talk about it, please.
    FARAKER: No, dear, I’m talking of the notes… my notes.
    (Fagan, James Bernard. 1922. The Wheel of Life)

Barcelona Sánchez (1988: 15) further asserts that ‘connection’ is most obvious when the ‘detached topic’ is headed by as-for, since this expression underlines the fact that the speaker is choosing one from a set of previous topics. In a similar vein, Gómez-González (2001: 293) determines that only 27.7% of the instances of LDis found in LIBMSEC [Lancaster IBM Spoken English Corpus] introduce new referents. The lower frequency of newly introduced LDis in Gómez-González’s data could be explained by the fact that she, like Barcelona Sánchez, also includes the as-for construction within LDis (cf. footnote 6).

The small number of studies focusing on written and historical texts report that left-dislocates tend to be given. In her study of LDis in Old English, Traugott (2007: 12) finds that only 18% of subject LDis and 32% of her object LDis have no antecedent, while the remainder are either inferrable by means of a poset14 relationship or show a total or partial identity relationship with a previous item in the text. Pérez-Guerra and Tizón-Couto (2009) show a similar trend in the early and late Modern English periods, namely that less than 40% of the left-dislocated segments are absolutely non-referring [i.e. irrecoverable]. The aggregate results of the current study also suggest that there is a tendency for left-detached constituents to be given. However, the data for each type (of LDet-sequence) indicates that some of them are more liable to be given but others most frequently detach new or inferrable items (cf. Section 6.2).

The interaction between information status and textual or topic continuity, which is measured here quantitatively (by number of clauses), is also explored in the results section. This interaction has not been frequently addressed in the literature on LDis. Measuring topicality is required in order to test whether LDet-sequences isolate constituents at the syntax-discourse interface, where they “set up referent chains which can transcend clausal boundaries, maintaining topic continuity as long as the speaker or writer wishes” (Downing and Locke 2006: 226). In other words, measuring topic persistence is a means to investigate whether LDet-sequences have the ability to enhance topicality at the paragraph level. The methods section specifies how this has been done (Section 5).

14 According to Prince (1998: 289), poset relations include, along with the usual set relations and the identity relations, relations like is-a-part-of (e.g. the relation between a song and a CD on which that song occurs) and is-a-subtype-of (e.g. the relation between Siberian huskies and dogs), but they do not include functional dependency relations. In essence, a poset involves relationships such as set/subset, part/whole, type/subtype, greater than/lesser than, and identity.

4. TYPES OF LDET-SEQUENCES

This study deals with sequences including a left-detached constituent and an ensuing clause that, in turn, contains a resumptive item. Thus, a left-detached item may initiate an LDet-sequence as long as there is a clear semantic or anaphoric relationship that holds between the left-detached referent and a parallel item (cf. Larsson 1979: 22). ‘LDet-sequences’ comprise:

a (hanging) constituent which (within one speaker-turn) holds a syntactic and/or semantic link with a proform or anaphoric phrase in the core of the clause to which it is attached, excepting vocatives, prototypical appositives, self-correction items and the as-for construction. The left-detached constituent must not be directly insertable in the core clause (i.e. without eliminating the resumptive element), cannot fulfil a function in the core clause (e.g. adjunct), is always followed by a suprasegmental pause (usually reflected by commas or semicolons in writing), and must exhibit an obligatory one-to-one anaphoric relation with the resumptive or copy.

Four LDet-sequences have been attested that conform to this definition: LDet proper [PRP], Summarising LDet [SUM], Attributive LDet [ATR] and Acknowledge LDet [ACK]. Table 3 outlines the different types of LDet-sequences.

Type                     Left-detached item                                        PAUSE   Ensuing clause/sentence
LDet proper [PRP]        NP[+deictic]i / XP[+deictic]i / Clause[+deictic]i                 [[pron/NP]i] Clause/Sentence
Summarising LDet [SUM]   Two or more NPs[+deictic]i, Two or more XPs[+deictic]i            [[pron/NP]i] Clause/Sentence
Attributive LDet [ATR]   NP[+deictic][+adjectival modification]i                           [[pron/NP]i] Clause/Sentence
Acknowledge LDet [ACK]   XP[+deictic][+echo]i                                              [[pron/NP/XP]i] Clause/Sentence

Table 3. Types of LDet-sequences


The first type, LDet proper [PRP], is typically headed by a noun phrase with a clear deictic and referential reading (cf. (8) and (9)).

(8) RACHAEL: Let me pass. I must, will go to my children. JACK: <(throwing up the purse)> And they may want a breakfast. RACHAEL: Villain! though you insult the wife, have pity on the mother <(crosses, he seizes her)> – let me go! JACK: Not now – I have gone too far. RACHAEL: Oh! you will not! Mercy! Martin! <(despairingly)> he comes not! JACK: <(passionately)> You may rave. You’ve roused me, and I’ll not be trifled with. RACHAEL: Help! help! <(they struggle)> My husband! – he is here! – <(JACK, surprised, lets her go, and falls back. She rushes to the door, and seizes a woodcutter’s bill that is lying on some wood near the wall, (Jerrold, Douglas. 1832. The Rent Day)

(9) I am a personal friend of the King. The King, you know – my personal friend. <(Suddenly serious again and looking in front of him.)> Unhappily he has taken my castle. The board is very complicated, and the castles are all black. It is too complicated, really. I go from castle to castle, and they are all black. (Hamilton, Patrick. 1943. The Duke in Darkness: A Play in Three Acts)

Although the tag PRP suggests it is the closest to what has been termed Left Dislocation (LDis) in the literature, it should be pointed out that PRP includes many instances where the left-detached NP is followed by an -ing modifier (cf. example (10) below). In those cases, the ascription of the detached constituent to either the matrix or a subordinate clause is not straightforward. This issue, raised by Pérez-Guerra and Tizón-Couto (2009), has been resolved here by considering as LDet-sequences those instances where the initial nominal element (although followed by a non-finite modifier) corefers with the copy and where the whole unit of NP and -ing clause does not function as an adverbial modifier of the ensuing clause. In instances such as (10), the head of the NP followed by the non-finite modifier and the proform are coreferential and produce an LDet-sequence.

(10) The weaver, as wary as he was, was now blinded. He saw everything carried so cleverly that he had not the least distrust, but [the bareheaded fellow sitting down and drinking]i, hei delivered the silk to the other who goes directly in at the great gate, and the weaver, seeing that and having the other fellow with him, thought all was well, but he did not find it so. (Kirkman, Francis. 1673. The Counterfeit Lady Unveiled)

However, in cases such as (11) the supposed copy proform they does not resume the singular referent expressed by the butler plus its non-finite postmodifier, but rather the PP-embedded them; in other words, the constituent the butler being desired to join himself with them, but he refusing this also does not convey one singular referent and thus cannot be analysed as a left-detached NP since its head does not hold a one-to-one relationship with the resumptive pronoun. The constituent occurring before the comma is, in consequence, a subordinate clause functioning as an adjunct or adverbial (whether of time, reason or manner).

(11) [And the butler being desired to join himself with [them]j, but he refusing this also]i, [they]j all fall to work, and [he not being to be prevailed with to accompany [them]j in working, any more than in feasting or dancing]i, [they]j all disappeared, and the butler is now alone; but instead of going forwards, home he returns, as fast as he could drive, in a great consternation; (Defoe, Daniel. 1720. Life and Adventures of Duncan Campbell)

Summarising LDet (12) is formally quite similar to PRP, but it sets up a series of NPs which are regrouped in the ensuing clause via a nominal proform (most typically; cf. (12)) or a full noun phrase (13):

(12) SIR C: Ugh. if Sir Lennox had less confidence in dancing after Lady Cranberry, it would be quite as well, but it’s the fashion now, a married couple can never travel without a bodkin, eh, ugh, here they come, oh, what shall I do? the hyena, the hippopotamus of Africa, the rhinoceros, the domestic pig, all love their young, but-ugh. (Serle, Thomas. 1820. Exchange no Robbery; or the Diamond Ring)15

(13) Adam and Eve standing under a tree, she, with the apple in her hand; – the patriarch Abraham, with a tree growing out of his body, and his descendants sitting owl-like upon its branches; – ladies with flowing locks of gold; knights in armour, with most fantastic, long-toed shoes; jousts and tournaments; and Minnesingers, and lovers, whose heads reach to the towers, where their ladies sit; – and all so angular, so simple, so childlike, – all in such simple attitudes, with such great eyes, and holding up such long, lank fingers! – These things are characteristic of the Middle Ages, and persuade me of the truth of history. (Longfellow, Henry Wadsworth. 1839. Hyperion. A Romance)

Attributive LDet is usually headed by an NP which is modified by an adjective or which expresses per se a quality attached to the referent resumed by the copy in the ensuing clause.16 It carries out a predicating function (cf. Ono and Thompson 1994: 415), i.e. left-detached items of this kind epithetically characterise a previous or new referent that is reiterated in the following clause. Therefore, one of the essential features of Attributive LDet is the speaker’s marking of a stance of ‘affect’ towards the entity which occupies a central argument in the subsequent clause, as exemplified in (14) and (15).

15 The discourse effect of the Summarising type seems to be similar to that discussed by Miller (2001) for long subjects, namely to sum up the content of the preceding lines and thus make it available as a discourse referent for a subsequent judgement.

(i) [That much of what he calls folklore is the result of beliefs carefully sown among the people with the conscious aim of producing a desired mass emotional reaction to a particular situation or set of situations]SUBJ is irrelevant.

Miller (2001: 688) quotes (i) in order to show that heaviness cannot be the only factor that accounts for extraposition, as it is often understood.

16 This type of LDet-sequence would include what Biber et al. (1999: 136-137) term a ‘detached predicative’ headed by a nominal (but not an adjective), characteristic of descriptive writing, by means of which it is possible to express a great deal of information very concisely and background part of the message (initial position) or provide supplementary information (end position):

(i) A Saxon princess, she was born at Exning near Newmarket around AD 360, the daughter of Anna, King of East Angles.

(14) It was different, however, with the limping horse. Misfortunate brute! one of its fore-legs had folded below it, and snapped through at the fetlock joint. (Moir, David Macbeth. 1828. The Life of Mansie Wauch, Tailor in Dalkeith).

(15) SPARKISH: No, by the Universe, Madam, he does not rally now; you may believe him: I do assure you, he is the honestest, worthyest, true hearted Gentleman – A man of such perfect honour, he wou’d say nothing to a Lady, he does not mean. PINCHWIFE: Praising another Man to his Mistriss! (Wycherley, William. 1675. The country-wife)

Lastly, Acknowledge LDet relies on an acknowledgment by a different speaker, who uses repetition as a cohesive device to continue with the dialogue immediately after an intervention by another speaker.17 Hence, lexical repetition or grammatical parallelism is employed to provide cohesion before a new statement (cf. Kies 1988: 62). Acknowledge LDet (16) most frequently fronts a short NP that is reiterated via a nominal proform or full NP, although virtually any constituent may enter this non-prototypical echoing construction and then be repeated in the ensuing clause.

(16) SIR Dav: If you were in Petticoats, I shou’d take you for the Kentish Miracle – What is this Officer’s Name, Friend, that you serve? MANAGE: Captain Bounce, Sir. SIR Dav: Bounce! I fancy you are related to him, are you not, Friend? (Centlivre, Susanna. 1709. The Man’s bewitch’d, or, The Devil to do about Her)

Table 4 shows the distribution of each of the four LDet-sequences in the periods under study.

17 The fact that this type of LDet-sequence occurs at the beginning of a turn, directly after another speaker’s intervention, sets it far apart from LDis, where a new or recoverable element can be promoted to initial position at the middle of a turn or inside a monologue. In example (i), the exclamation Sincerity! refers to something the same speaker has just uttered (his sincerity):

(i) I suspected him, and determined to test his sincerity. Sincerity! It seems like a profanation of the word to write it in connection with such a monster, so asked him point-blank: “Why may I not go to-night?” (Stoker, Bram. Dracula)


Type               EModE (1650-1749)   LModE (1800-1849)   PDE (1900-1949)   TOTAL
LDet proper        1.9 (n=64)          1.6 (n=48)          0.96 (n=17)       1.59 (n=129)
Summarising LDet   0 (n=0)             0.23 (n=7)          0.17 (n=3)        0.12 (n=10)
Attributive LDet   0.44 (n=15)         0.87 (n=26)         0.34 (n=6)        0.57 (n=47)
Acknowledge LDet   0.71 (n=24)         1.2 (n=36)          1.07 (n=19)       0.97 (n=79)
All types          3.06 (n=103)        4.15 (n=117)        2.65 (n=45)       3.26 (n=265)

Table 4. Distribution of LDet-sequences (Normalised frequency per 10,000 words)

5. METHODOLOGY

After a manual search was completed, the examples were coded for different categorical and numeric values concerning several variables. The main aspects analysed are listed below and concern syntactic (A-D), semantic (E) and textual (F-H) behaviour:

A. Syntactic function of the resumptive element
B. Potential replacement of the resumptive element by its corresponding left-detached item
C. Punctuation [P]: combinations of Ps of both the left-detached constituent and the ensuing clause
D. Length of the left-detached item (in number of words)
E. Semantic relationship between the left-detached item and the copy: total, partial or metonymic
F. Information status of the left-detached referent: given, inferrable or new
G. Topic Continuity [TC] and Subsequent Mention [SM] of the left-detached referent (in number of clauses)
H. Genre

A. In their landmark study of LDis in spoken English, Gregory and Michaelis (2001: 9) found that in 89.3% of their examples, the resumptive pronoun which corefers with the preclausal NP has the grammatical function of subject (cf. examples (8) and (9) above). In their view, this finding suggests that LDis ensures that only discourse-active referents appear in the subject role (cf. also Prince 1997: 123). Similarly, Snider (2005: 17) found that in 69.4% of his data the resumptive element acts as subject. Lastly, Traugott (2007: 6), who only considers subject and object LDis in her analysis, found that 84.4% of her examples are subject LDis. This study looks into the syntactic functions fulfilled by resumptive items in written historical data to establish whether they follow a similar subject-related trend and to see whether this variable suggests distinctions among the range of LDet-sequences attested.

B. Each instance has also been coded for the possibility of directly replacing the resumptive element with the left-detached constituent without causing an ungrammatical clause. In example (14) above, for instance, the left-detached constituent (Misfortunate brute) cannot replace its resumptive element (one of its forelegs): *One of misfortunate brute’s fore-legs had folded… In contrast, in examples (15) and (16) above the resumptive element could be, respectively, erased and substituted by the preceding left-detached NPs, yielding the grammatical clauses A man of such perfect honour wou’d say nothing to a Lady and I fancy you are related to Bounce. This variable has its roots in the intuition that replacement should usually be possible if LDis is a phenomenon that is tightly linked to the clause (as in the prototypical London, I love it). This aspect has not, to my knowledge, been previously quantified.

C. Previous analyses of LDis in the literature do not account for punctuation [‘P’] variation across its two essential members (i.e. the left-detached constituent and the ensuing clause). P variation is established here on the basis of the punctuation attested in the texts and recorded on the basis of basic terms for sentential force (i.e. interrogative, exclamative or statement).18 It is assumed that, although a left-detached item cannot be claimed to constitute a clause or a sentence by itself, it may exhibit a neutral mode (coded as ‘statement’) or a more specific force coded either as ‘interrogative’ or ‘exclamative’ when the left-detached item is immediately followed by a question mark or exclamation mark respectively. Thus, in (16) above, the left-detached constituent has E[xclamative] punctuation (Bounce!), and the ensuing clause features I[nterrogative] punctuation (I fancy you are related to him, are you not, Friend?). In contrast, both the left-detached constituent and the resumptive pronoun in (17) below display S[tatement] punctuation (The Inhabitants of this delicious Isle, as they are without Riches … so are they without the Vices …; see below).
The decision to consider P as a variable was made upon observing the wide range of possible combinations during the process of manually extracting the instances. This variable proves useful in distinguishing the attested types (cf. Section 6.1).

D. The length of the left-detached constituent was measured in number of words in order to provide a quick means of making (a) a general comparison between LDis/LDet-sequences and both unmarked themes and Topicalisation in Modern English texts and (b) a specific comparison between the attested types of LDet-sequences. Data for comparison have been adapted from Pérez-Guerra (1999).

E. The character of the link held between the left-detached referent and the resumptive item is also investigated. Three different kinds of semantic relations have been attested between the left-detached constituent and the resumptive phrase: total identity (e.g. Lemon, I love it), partial identity (e.g. Lemons, he likes a rare sort) or metonymic identity (e.g. Lemons, I hate to find seeds in my drink). Partial identity and metonymic identity may be felt to overlap. However, notice that the resumptive phrase a rare sort (of lemon) in the example of partial identity does not necessarily have to be a lemon but may well be a different fruit; in contrast, seeds are always an essential part of the whole lemon. In addition, the relationship between John and his office in John, I can’t find his office cannot be labelled as metonymic.

18 Only four instances of an imperative clause (following the left-detached constituent) were attested. To avoid a fourth category in P, these four instances were analysed as statements [S] since none of them includes exclamative or interrogative punctuation.


F. The analysis adopted for the information status and topicality features is of a strictly anaphoric kind – that is, it relies on what can be found in the text, not on speaker assumptions. In order to account for the information status of left-detached referents, then, any kind of speaker- or hearer-presuppositional category, such as ‘universal knowledge’ or ‘inferrable from the situation’, was avoided. Hence, the count of anaphoric items relies on formal aspects for the evaluation of both topic persistence and information status, not only because of the written and historical nature of the data but also because speaker- or hearer-based intuition, in terms of operationalization, is known to be unreliable.19 Accordingly, an item was considered ‘given’ only when it was already mentioned in the text prior to its occurrence as a left-detachment (cf. example (7) above). In this study, the distance between the detached item and its referent in previous discourse is measured in terms of the number of intervening clausal units. Thus, a left-detached referent is ‘inferrable’ as long as some semantically/cohesively similar element occurs in the 20 clauses preceding the left-detached referent (cf. Givón 1983: 354). Finally, I have characterised a detachment as ‘new’ when the referent does not appear at all in the preceding discourse (cf. example (6) above). To summarise:

1. The referent occurs at least once in the 20 preceding clauses and is, thus, given.
2. The referent is inferrable from linguistic context, i.e. from previously mentioned elements (in the 20 preceding clauses).
3. The referent does not appear in the text and can be considered brand new.
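The three-way coding summarised above can be illustrated with a minimal sketch. It assumes a hypothetical annotation in which every preceding clause is stored as a set of referent labels it mentions (`preceding_mentions`) and a set of labels it makes inferrable (`preceding_related`); these names, and the decision to treat anything outside the 20-clause window as new, are illustrative assumptions and not the coding procedure actually used for the data.

```python
from typing import List, Set

WINDOW = 20  # look-back window in clauses, as in the summary above


def information_status(referent: str,
                       preceding_mentions: List[Set[str]],
                       preceding_related: List[Set[str]]) -> str:
    """Classify a left-detached referent as 'given', 'inferrable' or 'new'.

    Both lists are hypothetical annotations, ordered from the start of the
    text up to the clause immediately before the detachment.
    """
    recent_mentions = preceding_mentions[-WINDOW:]
    recent_related = preceding_related[-WINDOW:]
    # 1. Explicit earlier mention inside the window: given.
    if any(referent in clause for clause in recent_mentions):
        return "given"
    # 2. Only a semantically/cohesively related element in the window: inferrable.
    if any(referent in clause for clause in recent_related):
        return "inferrable"
    # 3. No textual trace (assumption: mentions outside the window do not count).
    return "new"
```

For instance, a referent labelled "notes" that is mentioned two clauses before its detachment would be returned as given, whereas one that is only linked to an earlier "baggage" through the related set would be returned as inferrable.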

G. LDis has often been claimed to be a device for the introduction of topical entities new to the discourse, or topic promotion of items that have been mentioned earlier but are now propped up to the status of topic (cf. Section 3.3). The usual diagnostic to find out whether the item has indeed become the new topic is topic persistence, i.e. the degree to which reference to the item persists in the subsequent discourse, as proposed by Givón (1983: 357). The operational measure of when an item can be identified as ‘persisting’ varies; I will follow Givón in taking as my measure the 20 clauses following the left-dislocated item. Following Traugott (2007: 13-14), I will make a further distinction with respect to ‘persisting’ topics in that I will mark them as either Topic Continuity [TC] or as just Subsequent Mention [SM]. Topic Continuity is a subtype of Subsequent Mention and applies to items that in a subsequent mention show up as the subject of a main clause. This reflects the special status of topics that persist over longer stretches of discourse; Givón (1983: 8) refers to them as “discourse topics”:

[A discourse topic is] the most crucially involved [participant] in the action sequence of the paragraph; it is the participant most closely associated with the higher-level ‘Theme’ of the paragraph; and finally it is the most likely participant to be coded as the primary topic – or grammatical subject – of the vast majority of sequentially ordered clauses comprising the thematic paragraph.

19 In this vein, Geluykens (1992: 10) argues that “it may well be that the givenness status of a linguistic item depends, ultimately, on speaker-assumptions; however, it might be wiser to disregard this and to develop concepts which are, first and foremost, operational”.


Wald (1983: 104) links discourse topics to what he calls “the FIRSTNESS property of topics”, a property associated with one and the same referent recurring as subject over a larger stretch of discourse, “as a consequence of ‘aboutness’”. Charting Topic Continuity in the data, then, will allow us to find out whether some of the LDet-sequences more frequently introduce discourse topics than others. Example (17) shows Topic Continuity in an instance of LDet proper:

(17) The Inhabitants of this delicious Isle, as they are without Riches and Honours, so are they without the Vices and Follies that attend them; and were they but as much strangers to Revenge, as they are to Avarice or Ambition, they might in fact answer the poetical Notions of the Golden Age. (Pope, Alexander. 1717. The Correspondence of Alexander Pope Volume I: 1704-1718)
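The distinction between Topic Continuity and Subsequent Mention introduced under G can be sketched as a simple count over the 20 clauses following the detachment. The representation below is an assumption made for illustration only: each following clause is a hypothetical pair consisting of the set of referents it mentions and the referent of its main-clause subject (or `None`). Since TC is treated as a subtype of SM, every TC clause is also counted as an SM clause.

```python
from typing import Dict, List, Optional, Set, Tuple

WINDOW = 20  # look-ahead window in clauses

# Hypothetical clause annotation: (referents mentioned, main-clause subject referent)
Clause = Tuple[Set[str], Optional[str]]


def persistence(referent: str, following: List[Clause]) -> Dict[str, int]:
    """Count Subsequent Mention (SM) and Topic Continuity (TC) clauses."""
    window = following[:WINDOW]
    # TC: the referent recurs as the subject of a main clause.
    tc = sum(1 for _, subject in window if subject == referent)
    # SM: any later reference to the referent, whatever its position.
    sm = sum(1 for mentioned, subject in window
             if referent in mentioned or subject == referent)
    return {"SM": sm, "TC": tc}
```

With three following clauses annotated as `[({"inhabitants"}, "inhabitants"), ({"inhabitants", "vices"}, None), ({"isle"}, "isle")]`, the referent "inhabitants" would receive two SM clauses, one of which also counts as TC.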

Subsequent Mention, on the other hand, is understood in the most general sense. That is, the position that the persisting item occupies is not significant as long as some kind of anaphoric or semantic reference to the left-detached constituent is evident.

H. The data analysed in this study, which cannot claim to investigate the genre variable in full depth (cf. Tizón-Couto, forthcoming), belong to the following text types: drama (146,820 words), fiction (174,615), journals (89,700), legal (52,277), letters (59,153), medicine (71,792), news (91,601), science (87,257) and sermons (37,296).20

6. RESULTS AND DISCUSSION

6.1 SYNTACTIC AND SEMANTIC BEHAVIOUR

Generally, the subject position (58.9%) is far more frequent than the object slot (22.3%) for the resumptive elements in the data (for which complement of preposition (10.6%) and subject complement (8.3%) were also considered). These results confirm the claim that left-detachments in written and spoken English serve a similar basic function, namely to turn referents of different information status into a subject that has ‘given’ status (cf. Lambrecht 1994).

20 Although very robust genre typologies have been proposed in the literature (cf. especially Culpeper and Kytö 2010: 16-18), the relatively scarce amount of data from letters or sermons, in this case, does not allow for the quantitative conflation of separate genres into clusters such as ‘speech-like’ (e.g. diaries and letters) or ‘speech-purposed’ (e.g. sermons and drama). The relationship between such clusters is part of ongoing research on a much larger set of corpora where items may be extracted semi-automatically (Penn Parsed Corpora series).


Figure 1. Type of LDet-sequence and syntactic function of resumptive element: S[ubject], O[bject], C[omplement of] P[reposition] or S[ubject] C[omplement]

Figure 1, which gives percentages relative to the total number of LDet instances, shows that the subject function [S] dominates the picture for LDet proper [PRP] and Attributive LDet [ATR], and the same can be said (despite the low frequency of the type) of Summarising LDet [SUM]. The figure also reveals the heterogeneous distribution of syntactic arguments that corefer with the left-detached item in Acknowledge LDet [ACK] and the fact that the resumptive element appears most frequently in the post-verbal position (object, subject complement or complement of preposition). Thus, Acknowledge LDet is different from the other types in that it runs counter to the strong tendency of resumptive elements to gravitate towards the subject function (Gregory and Michaelis 2001: 9; Snider 2005: 17). In the data considered here, ACK is typically employed by characters in drama or fiction to reply to other characters by means of a quick allusion to an active referent (cf. example (16) above). It almost exclusively detaches given items from the immediately preceding context (cf. Section 6.2, Figure 5) by means of a separate intonation unit with a different punctuation (cf. Table 5 below). Arguably, then, ACK’s echoing character accounts for its divergent distribution as regards the syntactic function of the resumptive item.

Figure 2 below shows percentages relative to the total number of instances of resumptive items that may or may not be grammatically replaced by their coreferring left-detached constituents.

[Figure 1 plot: panels PRP, SUM, ATR and ACK; x-axis: syntactic function of resumptive item (S, O, CP, SC); y-axis: percentage relative to all instances.]

Figure 2. Type of LDet-sequence and potential replacement of resumptive item by left-detached constituent (percentages relative to total number of LDet-sequences)

[Figure 2 plot: panels PRP, SUM, ATR and ACK; x-axis: N[o]/Y[es]; y-axis: percentage relative to all instances.]

If LDis were closer to the purely syntactic behaviour of Topicalisation (a view that no one has really held in the literature), a much lower value would be expected for the negative option (‘N’) in LDet proper (PRP). This is not the case. Furthermore, LDet-sequences which are expected to be more discourse-like, due to their turn-initiating role and their ability to echo any constituent from the previous speaker’s contribution (i.e. ACK), do not block replacement much more radically than those which are more integrated into the ensuing clause in the sense that they carry out an information structural function associated with the clausal phenomenon of Topicalisation (i.e. PRP and ATR). Thus, PRP or ATR may be employed to simplify processing and create contrast, just like Topicalisation (cf. Prince 1997, 1998), but do not clearly favour replacement with reference to ACK, which carries out a more interactional task: it specialises in fronting a given referent as a deictic element in conversation while simultaneously taking the floor (cf. Geluykens 1992, 1993). In sum, the results generally point in the direction of a weak syntactic relation with the ensuing clause in terms of the attempted ‘reconstruction’ of the corresponding canonical clause. Therefore, potential replacement of the resumptive item, which is not mentioned in previous treatments of LDis in the literature (except for Pérez-Guerra’s 1999 basic definition of LDis), is not really the best test for drawing distinctions between the different types of LDet-sequences discussed here; however, it does reflect the overall ‘clause-detached’ character of these sequences.

Another aspect that is ignored in previous accounts of LDis is P[unctuation] (in terms of what it can tell us about the declarative, exclamative or interrogative modulation of a detached constituent or clause). Table 5 shows the considerable variety of P combinations that LDet-sequences may deploy (in the shape of percentages distributed along type). P1 stands for the punctuation of the left-detached constituent and P2 stands for the punctuation of the ensuing clause.

P1    P2    PRP %    SUM %    ATR %    ACK %    TOTAL %
E+    E      3.87      0       8.51     2.53     4.15
      I      0         0       0        0        0
      S     13.17      0      55.32    45.56    29.81
I+    E      0         0       0        0        0
      I      0.77      0       0       12.65     4.15
      S      0.77      0       0       13.92     4.52
S+    E      0.77     20       4.25     8.86     4.52
      I      3.87      0       2.12     6.32     4.15
      S     76.74     80      29.78    10.12    48.67
Totals     100       100     100      100      100

Table 5. Type of LDet-sequence and P pattern: S[tatement], E[xclamative], I[nterrogative]

Two tendencies may be highlighted regarding P combinations. First, each type of LDet-sequence selects a preferred P1+P2 pattern (indicated in bold in the table), and two patterns (E+S and S+S – in italics in the table) dominate the total figures. Second, an even stronger tendency can be derived from the figures for ‘P2’, namely that taken together, 83% of the instances have the value ‘S’ (i.e. they show the pattern ‘_+S’). If a more restricted or wider range of P combinations, respectively, were taken as a clue pointing towards the syntax- or discourse-orientation of LDet-sequences, then LDet proper and Summarising LDet would be the most syntax-like in that they mostly combine statements (in the left-detachment) with statements (in the main clause). The main clause in Attributive LDet also shows a strong tendency to be a statement, although ATR left-detachments are most usually exclamative. Lastly, the scattered distribution of punctuation patterns for Acknowledge LDet shows that it is the least regular of the four types: the dispersion of mid-to-low values along the ACK column in Table 5 reflects its echoing and turn-taking character, as well as suggesting a less purely information structural function with reference to LDet proper.
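The 83% figure for ‘_+S’ patterns can be checked against the TOTAL % column of Table 5 by adding up the three rows whose second member is S. The short sketch below does this for all three P2 values; the dictionary layout and variable names are illustrative choices, and the percentages are simply copied from the table.

```python
# TOTAL % column of Table 5, keyed by (P1, P2)
total_pct = {
    ("E", "E"): 4.15, ("E", "I"): 0.0,  ("E", "S"): 29.81,
    ("I", "E"): 0.0,  ("I", "I"): 4.15, ("I", "S"): 4.52,
    ("S", "E"): 4.52, ("S", "I"): 4.15, ("S", "S"): 48.67,
}

# Aggregate the percentages by the punctuation of the ensuing clause (P2)
p2_share = {}
for (p1, p2), pct in total_pct.items():
    p2_share[p2] = p2_share.get(p2, 0.0) + pct

for force in ("S", "E", "I"):
    print(force, round(p2_share[force], 2))
```

The S share comes out at 83.0, with the remaining instances split between exclamative and interrogative ensuing clauses.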

Table 6 offers an exploratory comparison of the results obtained in this study for the length (measured in number of words) of the left-detached constituents (DTC, i.e. Tizón-Couto) with those of Pérez-Guerra (1999) (JPG) for both left-dislocates and a range of other first-position items.


                                   Period in the History of English
                                   LME          EModE        LModE        PDE
Type of theme                      JPG   DTC    JPG   DTC    JPG   DTC    JPG   DTC
Non-pronominal unmarked themes     2.9   -      2.7   -      -     -      3.5   -
Topicalised items (Mean)           -     -      3.4   -      -     -      5     -
Top. Objects                       -     -      4.7   -      -     -      5.2   -
Top. Indirect objects              -     -      3.5   -      -     -      7     -
Top. Subj. Complements             -     -      2.2   -      -     -      2.8   -
Left Dislocation                   8.9   -      8.8   -      -     -      7.4   -
LDet proper                        -     -      -     9      -     7.1    -     6.3
LDet-sequences (Mean)              -     -      -     6.4    -     5.6    -     4.6

Table 6. Length of Left-dislocated and Left-detached themes in different periods of English in contrast with unmarked non-pronominal and topicalised items (based on Pérez-Guerra 1999)

According to Table 6, both Topicalisation and LDis must be characterised as marked organisation strategies with respect to the length of the sentence-initial material. On the one hand, Topicalisation presents an approximate mean length of 4.2 words if the EModE and PDE periods from Pérez-Guerra (1999) are aggregated.21 As for LDis, in all the periods the mean number of words in the left-dislocated constituents is significantly higher than the length of both topicalised (4.2 words) and unmarked subjects – 2.08 words (as reported in Pérez-Guerra 1999: 56). Thus, the lengthy character of topicalised and, especially, left-dislocated constituents is what made them particularly vulnerable to the process of syntactisation undergone by the English thematic/clausal system after LME (cf. Section 1, cf. Pérez-Guerra 2005). In agreement with this view, the length of the left-detached items calculated here for written English across different periods exceeds that of LDis in oral production as measured by Snider (2005: 23), in whose data “the weight distribution peaks at 3 words and then decreases”.

The length measure also reveals clear distinctions between the different types of LDet-sequences attested, as shown by Table 7.

21 Only the periods which more closely resemble the ones under investigation in this study have been selected from the wide-span analysis offered in Pérez-Guerra (1999: 226), namely from late Middle English (LME) to PDE. It must be pointed out that LModE data was not included for Topicalisation since it is not part of Pérez-Guerra’s analysis of the length variable.


| Type | EModE (1650-1749) | LModE (1800-1849) | PDE (1900-1949) | Total |
|---|---|---|---|---|
| LDet proper | 9.01 | 7.18 | 6.35 | 7.98 |
| Summarising LDet | - | 21.28 | 10.33 | 18 |
| Attributive LDet | 3.33 | 3.96 | 4.33 | 3.8 |
| Acknowledge LDet | 1.66 | 1.77 | 2.26 | 1.86 |
| All types | 6.47 | 5.64 | 4.62 | 5.79 |

Table 7. Length (in number of words) of left-detached constituents per type of LDet-sequence and period

Summarising LDet is the most marked strategy in terms of sentence-initial weight. In stark contrast, Acknowledge LDet features a short initial item (1.86 words) which approaches the length of unmarked subjects (2.08 according to Pérez-Guerra 1999), while Attributive left-detachments (3.8 words) rank closer to the weight figure for Topicalisation (4.2 words approximately, according to Pérez-Guerra 1999). The most significant conclusion is that Acknowledge LDet does not deliberately detach heavy items, whereas the other types do to varying degrees.

An additional word-count measure was also used that further sets Acknowledge LDet apart from the rest of LDet-sequences in terms of linear connection with the main clause, namely the material separating the left-detached constituent and the ensuing clause or ‘Intervening Material’ (between dashes in (18)).

(18) MRS FARAKER: I feel so well I should just love a cigarette, and there isn’t one.

MACLAREN: Cigarette – why, of course – I’ve got some. <(Takes his case from his pocket and opens it). > (J.B. Fagan, The Wheel of Life, 1922, 154)

The mean Intervening Material is slightly higher than one word for ACK, while it is significantly lower than one word for the rest of the types (except for Summarising, where the low number of examples clearly skews the mean).

| Type | EModE (1650-1749) | LModE (1800-1849) | PDE (1900-1949) | Total |
|---|---|---|---|---|
| LDet proper | 0.2 | 0.3 | 0.4 | 0.3 |
| Summarising LDet | NA | 1.0 | 0.0 | 0.7 |
| Attributive LDet | 0.2 | 0.5 | 0.7 | 0.4 |
| Acknowledge LDet | 0.5 | 1.9 | 0.9 | 1.3 |
| All types | 0.3 | 0.9 | 0.6 | 0.6 |

Table 8. Intervening material [between the left-detached constituent and the main clause] per type of LDet-sequence and period (number of words)

Figure 3 suggests that the semantic links between the left-detached referents and the resumptive phrases are fairly traditional (i.e. mostly total identity or ‘T’) in most cases except, once again, for Acknowledge LDet in that, although T also dominates, more than one third of the links are partial (‘P’) or metonymic (‘M’).

Figure 3. Type of LDet-sequence and semantic link between left-detached constituent and resumptive phrase (percentages relative to total): M[etonymic], P[artial], T[otal]

[Figure 3 layout: panels PRP, SUM, ATR, ACK; x-axis: semantic link between left-detached item and resumptive phrase (M[etonymic], P[artial], T[otal]); y-axis: percentage relative to all instances]

In general, then, the semantic connection between the left-detached referent and the copy is not as relaxed for those LDet-sequences that carry a more similar function with reference to Topicalisation (i.e. SUM, PRP and ATR). The listing format imposes a total identity relation in Summarising LDet that could be claimed to be the second essential feature of this type (besides considerable weight of the left-detached constituent). Lastly, although they tend to be linked via ‘total identity’ (‘T’), LDet proper and Attributive LDet are also productive across non-prototypical – and especially metonymic – reference (‘M’).

6.2 TEXTUAL BEHAVIOUR: INFORMATION STATUS AND TOPICALITY

As previously pointed out in Section 3.3, most authors dealing with contemporary spoken English report high percentages of new left-detached referents or referents that can be inferred from non-linguistic material (cf. Montgomery 1982; Geluykens 1992; Gregory and Michaelis 2001; Snider 2005). In contrast with this body of evidence from spoken English, the results of this study accord with previous accounts of LDis in written historical genres in the sense that (a) given items are, generally, more frequent (57.3%) and (b) many left-detached referents (14.3%) can be inferred from the textual material found in the preceding 20 clauses. Figure 4 shows a relatively higher percentage of new elements in LModE; however, the picture where given items dominate is replicated across periods, i.e. the difference between the three stages is not statistically significant (chisq.test: first to second period: p = 0.1; second to third period: p = 0.3; all periods included: p = 0.2).

Figure 4. Information status of left-detached referent and period (percentage relative to all instances of LDet-sequences)
[Figure 4 layout: panels EModE, LModE, PDE; x-axis: information status of left-detached item (given, inferrable, new); y-axis: percentage relative to all instances]

Figure 5. Information status of left-detached constituent and type of LDet-sequence (percentage relative to all instances of LDet-sequences)
[Figure 5 layout: panels PRP, SUM, ATR, ACK; x-axis: information status of left-detached item (given, inferrable, new); y-axis: percentage relative to all instances]

Figure 5 shows the figures for the information status of the different types of LDet-sequences. In contrast with the aggregate numbers (cf. Figure 4), the results for each type complicate the view that left-detached items most frequently set up given items for the periods under study. In fact, 47.3% of the most prototypical LDet-sequences (i.e. PRP) promote new items (34.4% in EModE, 64.5% in LModE and 47% in PDE; cf. also bolded figures in Table 9 below for percentages relative to all instances across all periods). Likewise, Summarising LDet (SUM) seems to specialise in fronting unknown information. Nonetheless, Attributive and Acknowledge LDet strings clearly favour given and inferrable items across the board. Thus, most speaker judgements are applied to referents that are already present in the discourse (ATR) and turn-chasing repetitions of a referent (ACK) usually left-isolate an item which has been mentioned in the closest leftward vicinity (i.e. the previous clause). As expected, the differences are highly significant for type distribution (p < 0.001). As a means of both summary and further detail, Table 9 below offers the percentages for the distribution of each LDet-sequence according to period and information status.

| Period | Information status | PRP | SUM | ATR | ACK | All types |
|---|---|---|---|---|---|---|
| EModE (1650-1749) | given | 23.7 | 0 | 19.1 | 30 | 23.8 |
| | inferrable | 8.4 | 0 | 12.8 | 0 | 6.4 |
| | new | 16.8 | 0 | 0 | 1.1 | 8.7 |
| LModE (1800-1849) | given | 6.9 | 10 | 46.8 | 43.3 | 24.5 |
| | inferrable | 6.1 | 0 | 6.4 | 2.2 | 4.9 |
| | new | 24.4 | 60 | 2.1 | 1.1 | 14.7 |
| PDE (1900-1949) | given | 3.1 | 0 | 6.4 | 20 | 9.1 |
| | inferrable | 4.6 | 0 | 2.1 | 2.2 | 3 |
| | new | 6.1 | 30 | 4.3 | 0 | 4.9 |
| All periods | given | 33.6 | 10 | 72.3 | 93.3 | 57.3 |
| | inferrable | 19.1 | 0 | 21.3 | 4.4 | 14.3 |
| | new | 47.3 | 90 | 6.4 | 2.2 | 28.3 |

Table 9. Type of LDet-sequence, period and information status (percentages relative to all instances across all periods)

In a nutshell, the information status of left-detached constituents is very heterogeneous in written genres. The aggregate percentages in this study of LDet-sequences line up with the existing literature on LDis in written (historical) genres in suggesting that, unlike in the spoken medium, the information status of left-detached referents is not often ‘new’. On the other hand, the figures for LDet proper and Summarising LDet separately do rank much closer to the general claim made in the literature accounting for LDis in spoken English, namely that LDis specialises in setting up new referents.
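The period comparison reported earlier in this section (chisq.test) can be sketched in Python. The raw counts are not given in the text, so the sketch below reconstructs them from the rounded ‘All types’ percentages in Table 9, under the assumption that the overall total is 265 (the sum of the n values in the Total row of Table 12). The function names are illustrative, and the closed-form p-value only holds for an even number of degrees of freedom. Because the counts are approximations, the exact p-value need not coincide with the one reported above; the sketch only illustrates the procedure.

```python
import math

# Percentages from the "All types" column of Table 9 (relative to all instances).
PERCENT = {
    "EModE": {"given": 23.8, "inferrable": 6.4, "new": 8.7},
    "LModE": {"given": 24.5, "inferrable": 4.9, "new": 14.7},
    "PDE": {"given": 9.1, "inferrable": 3.0, "new": 4.9},
}
TOTAL = 265  # assumption: sum of the n values in the Total row of Table 12


def to_counts(percent, total):
    """Turn rounded percentages back into approximate raw counts."""
    return [[round(p * total / 100) for p in row.values()] for row in percent.values()]


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    stat = sum(
        (obs - rs * cs / n) ** 2 / (rs * cs / n)
        for row, rs in zip(table, row_sums)
        for obs, cs in zip(row, col_sums)
    )
    return stat, (len(table) - 1) * (len(table[0]) - 1)


def p_value_even_df(stat, df):
    """Upper-tail chi-square probability (closed form, even df only)."""
    half = stat / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))


counts = to_counts(PERCENT, TOTAL)
stat, df = chi_square(counts)
p = p_value_even_df(stat, df)
print(counts)
print(round(stat, 2), df, round(p, 2))
```

With these reconstructed counts the all-periods comparison also stays above the conventional 0.05 threshold, in line with the conclusion that given items dominate in all three periods.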


Turning to topicality, Tables 10 and 11 include the mean Topic Continuity and Subsequent Mention scores (cf. Section 5), respectively, calculated in terms of number of clauses in which the left-detached referent persists per type of LDet-sequence for each of the periods.

| Type | EModE (1650-1749) | LModE (1800-1849) | PDE (1900-1949) | Total |
|---|---|---|---|---|
| LDet proper | 1.31 | 0.75 | 1 | 1.06 |
| Summarising LDet | - | 1.28 | 1 | 1.2 |
| Attributive LDet | 1.13 | 1.11 | 0.5 | 1.04 |
| Acknowledge LDet | 0.16 | 0.38 | 0.78 | 0.41 |
| Mean | 1 | 0.75 | 0.84 | 0.87 |

Table 10. Topic Continuity (i.e. mean number of clauses in which the left-detached referent persists as subject) in the LDet-sequences and periods researched

| Type | EModE (1650-1749) | LModE (1800-1849) | PDE (1900-1949) | Total |
|---|---|---|---|---|
| LDet proper | 2.45 | 2.08 | 2.76 | 2.35 |
| Summarising LDet | - | 1.28 | 1.33 | 1.3 |
| Attributive LDet | 2.66 | 2.57 | 2.83 | 2.63 |
| Acknowledge LDet | 2.04 | 1.97 | 2.94 | 2.22 |
| Mean | 2.38 | 2.11 | 2.75 | 2.32 |

Table 11. Subsequent Mention (i.e. mean number of clauses in which the left-detached referent persists) in the LDet-sequences and periods researched

The findings, especially for LDet proper and Attributive LDet, support the view that LDet-sequences enhance topicality at the paragraph level, which is, according to Givón (1983: 7), “the most immediately relevant level of discourse within which one can begin to discuss the complex process of continuity in discourse”. In fact, the numbers for persistence (corresponding to my category of Subsequent Mention) calculated by Brown (1983: 336) as well as by Givón (1983: 355) for LDis in oral language, namely 1.95 clauses and 2.16 clauses respectively, are not significantly lower than the overall results obtained in this study for LDet-sequences. The data described by Montgomery (1982: 429) for LDis in oral interviews of a specific community (a small East Tennessee hill town) are even higher: his mean subsequent mention measure adds up to approximately 3.7 clauses.22 Therefore, it seems safe to conclude that left-detached items generally act as boosters concerning Subsequent Mention. However, it cannot be generally concluded, from the current data at least, that they have a significant effect on the ‘firstness’ property (or ‘subjectness’) of the topics following the left-detached constituent (cf. Givón 1983: 8; Wald 1983: 104). The mean Topic Continuity figure never goes far beyond one clause, i.e. the main clause following the left-detached constituent.

22 This persistence figure is not explicitly offered by Montgomery (1982: 429). However, it can be calculated by taking into account his general figures: roughly 75% of his tokens of LDis are thematic in a span of 1-5 clauses, 18% extend their thematic scope to the span of 6-10 clauses and around 7% surpass the 10 clause span.

The beanplots in Figure 6 provide a more intuitive visualisation of the data. For both plots, the y-axis specifies how far the left-detached referents may persist for each type of LDet-sequence in number of clauses, and the beans capture the estimated density of the continuity distributions based on each individual observation. The mean for each type of LDet-sequence is signposted by the four thick horizontal bars crossing each bean, while the average calculated for all LDet-sequences is indicated by the (also horizontal) thin dotted line crossing all four beans. The bean plot on the left shows the Subsequent Mention distributions, while the one on the right illustrates the Topic Continuity distributions. For instance, the bean plot on the left suggests that the estimated mean Subsequent Mention for PRP is almost 2.5 clauses (as signalled by the horizontal bolded bar crossing the PRP bean), or that a few instances of PRP may extend beyond 10, and even 15, clauses in contrast with left-detached referents in the SUM, ATR or ACK, which are noticeably blocked before reaching the 10th clause.

Figure 6. Beanplots of Subsequent Mention and Topic Continuity by type of LDet-sequence
[Figure 6 layout: left panel Subsequent Mention, right panel Topic Continuity; x-axis: type of LDet-sequence (PRP, SUM, ATR, ACK); y-axis: number of clauses]

A comparison of the two sets of beanplots suggests that SUM ranks noticeably below the mean Subsequent Mention (thin horizontal dotted line on left plot), and ACK ranks below the mean Topic Continuity (thin horizontal dotted line on right plot). Their deviation from PRP’s mean scores for SM and TC, respectively, suggests that these two LDet-sequences may not only be formally, but also functionally, less central with reference to traditional LDis. In contrast, PRP’s mean scores correspond with the continuity features previously reported in the literature for modern-day LDis. Thus, comparison of the two plots highlights that left-detached elements in PRP present the most consistent picture regarding the topical dimension: (a) they stay on as subjects for a mean of 1.06 clauses and may continue as subject for up to 6.5 clauses. Also, (b) they set up a referent which is continued as any constituent in the 2.35 following clauses and may continue for up to 19 clauses. Lastly, comparison of the two plots also suggests the more general conclusion pointed out above that the means for Subsequent Mention reflect the general ability of left-detached constituents to promote a topic, while the means for Topic Continuity do not.

Let us then look at the interactions of Subsequent Mention, the variable which truly captures the ability of LDet-sequences to successfully promote a topic in discourse, with both type of LDet-sequence (‘type_X’ in Appendix A) and information status (‘IS’ in Appendix B). First, the linear model in Appendix A suggests that there may be an effect of ‘type’ on ‘subsequent_mention’ (with SUM and ATR leading to lower and higher values, respectively); however, the effect is not very strong and a little uncertain: p-values are near the 0.05 mark but do not quite reach it. A Kruskal-Wallis chi-squared test (i.e. a non-parametric alternative to the linear model) over the same data provides a p-value of 0.035, which confirms the significance of the distances between the means for SUM and the other three types; however, it seems fairly risky to generalise and conclude that one type of LDet-sequence is much more effective than the rest in the task of promoting Subsequent Mention since the data for SUM are scarce.

Second, concerning the information status of the left-detached item, it is key to remember Givón’s (1983: 357) claim that “the high persistence values of definite left-dislocation, together with the high referential distance for that category, marks it clearly as a typical chain-initial, new topic introducing category”. The linear model in Appendix B summarises the interaction between information status and Subsequent Mention. The p-values suggest that, in general, given items persist longer (0.66 clauses longer). The interaction is also significant when applied to the LDet proper data alone (p = 0.04), and there is a tendency to distinguish between ‘given’ and ‘inferrable’ in the Attributive LDet data (p = 0.07).
These interaction results suggest that, after introduction by means of an LDet-sequence, new referents meet an unreceptive context where they are more likely to fade out than other items which have been around for longer in the text. On the other hand, given and inferrable items seem easier to project forward by means of a left detachment.

7. PERIOD AND GENRE

Pérez-Guerra and Tizón-Couto (2009) report on a significant decrease of LDis in historical texts from Late Middle English (LME) to PDE due to the process of syntactisation that they claim drove English clause syntax towards SVO order (cf. Section 1) and the avoidance of heavy initial elements (cf. Section 6.1 on the ‘length’ of the left-detached constituent). As Figure 7 shows, such a declining trend can only be observed in this study for the LDet-sequence that most closely resembles spoken English LDis, namely PRP (the diamond-marked line).


Figure 7. Historical development of four LDet-sequences across three periods

Figure 7 displays a clear negative correlation between time and usage for LDet proper (Kendall’s Correlation Test = -1, p = 0.3).23 However, the p-value indicates this is not a reliable computation. The high p-value has to do with the small number of data points (i.e. only three) that the statistical calculation of Kendall’s tau has to use in this case. Therefore, more data points, including for Middle English, are required in order to calculate a more reliable outcome that can be associated with the aforementioned process of syntactisation. Ongoing research on data from 11 periods from the Penn Helsinki corpora series24 suggests a significant decrease of left-dislocated NPs since Middle English (p < 0.01; cf. Tizón-Couto, forthcoming). The steep decline of left-detached NPs attested since Middle English is accounted for by the synchronous decline of V2 word-order. Thus, the loss of V2, and the resulting loss of a first multifunctional position that could host both unmarked topics and focused material, led to the rise of new constructions and a redefinition of older positions (Los and Komen 2012: 896).

23 Hilpert and Gries (2009: 390) suggest the use of Kendall’s τ, which is a correlation statistic, to find out whether an increase in one ordinal or continuous variable corresponds to an increase in a second variable. Correlations can be both positive and negative, so that increases in one variable may also systematically relate to decreases in another (cf. Hilpert 2013: 30).

24 The data employed in this section (cf. Figure 8) were automatically extracted, by using CorpusSearch 2, from the Penn-Helsinki Parsed corpora of Middle English (PPCME2), Early Modern English (PPCEME), Modern British English (PPCMBE) and the Parsed Corpus of Early English Correspondence (PCEEC). The Penn-Helsinki corpora of historical English consist of running texts and text samples of British English prose across its history – from the earliest Middle English documents up to the First World War. These corpora comprise a wide range of genres including biography, diary, drama, educational treatise, handbook, history, law, private and non-private letters, philosophy, science, sermon, travelogue and trial proceedings.

More precisely, Los and Komen (2012: 884) argue that this decline of V2 word-order was accompanied by a rise of the cleft construction as a resolution strategy after “the loss of a first position that could host contrastive constituents”. They show that the cleft construction undergoes a clear and significant increase after 1500 (i.e. right after the loss of V2; cf. Jespersen 1937: 76, Traugott 1972) until 1910 (also with data from the annotated Penn Helsinki corpora). Their hypothesis is that the increase of the cleft construction both remediates the decline and prompts the recycling of two available constructions that marked focus, namely pre-verbal nominal constituents with a focus particle (e.g. only John, right in the middle of town; cf. Los and Komen 2012: 889) and Contrastive Left Dislocation, which they define as “noun phrase left dislocation with resumptive demonstratives” (2012: 896), as in (19):

(19) The people who earn millions and pay next to no tax, those are our targets. (Birner and Ward 2002: 1413)

Thus, on their way to PDE, pre-verbal nominal constituents were recycled and recuperated (especially only) and Contrastive Left Dislocation “remained available for dislocates with a resumptive subject” (Los and Komen 2012: 896).

Beyond Middle English, the Penn Helsinki data show an overall decrease of left-dislocated NPs from the Early Modern (n.f. 2.45) to the Late Modern (n.f. 1.32) period (cf. Tizón-Couto, forthcoming). Figure 7 above also shows a declining trend for PRP across these two periods (n.f. 1.9 > n.f. 1.6). This decline could be linked to the establishment of the syntactic and orthographical bases of the sentence. Lennard (1995: 67), for instance, states that “a conceptual change from the ‘period’, defined aurally and rhetorically, to the ‘sentence’, defined visually and syntactically, should be dated to the mid-to-late seventeenth century”. Left-dislocated NPs show clear genre variation (in the Penn Helsinki data) as this shift towards new sentence punctuation takes place.


Figure 8. Distribution of LDisNPs per genre-cluster according to period from 1500 to 1914 in annotated Penn Helsinki corpora (norm. freq. 10,000 words)

The normalised frequencies for the four different genre-clusters25 in the last six subperiods included in the Penn Helsinki corpora (cf. footnote 25) reveal a negative correlation between time and usage for both the speech-like (Kendall’s Correlation Test = -1, p = 0.002) and the writing-related (KCT = -0.73, p = 0.05) clusters. On the one hand, left-dislocated NPs seem to have been gradually banned from letter writing and most formal non-religious texts in the period from EModE to LModE. On the other hand, they do not significantly decrease in sermons and drama (speech-purposed) or fiction (mixed). Thus, left-dislocated NPs apparently undergo rhetorical specialisation in written English to signal emphasis, in religious contexts, or features of orality (i.e. conversational exchanges), in fiction and drama. In fact, there is a slight expansion in speech-purposed genres (drama and sermons) and mixed genres (fiction and trial proceedings) after the 1700-1769 period for the Penn Helsinki data (cf. Figure 8). Such a rise is also materialised in this study’s data from ARCHER (cf. Table 12) in the higher number of instances found for drama and fiction in the 1800-1849 period, and the later (1900-1949) higher figure for sermons (n.f. 0.1 > n.f. 0.28). In agreement with the findings for the Penn Helsinki data, the figure for sermons was also expected to have increased more steadily from the EModE to the LModE period in this study; however, the insignificant leap attested by Table 12 (n.f. 0.09 > n.f. 0.1) is accounted for by the underrepresented status of the homiletic genre in this study (only 37,296 words; cf. footnote 19).

25 Following the typology in Culpeper and Kytö (2010: 16-18), the genres in the Penn Parsed Corpora series were classified into four different clusters: (i) writing-related registers, represented by biography, educational treatise, handbook, history, law, philosophy, science and travelogue; (ii) mixed registers such as prose fiction and trial proceedings; (iii) speech-purposed registers, like drama and sermons; and (iv) speech-like registers, like diaries and letters.
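The dependence of Kendall’s τ on the number of data points, noted in this section for the three periods of Figure 7 and the six subperiods of Figure 8, can be illustrated with an exact permutation test. The Python sketch below is not the procedure used in this study: the helper names are illustrative and the input values are hypothetical rank stand-ins for a perfectly decreasing series, since only the rank order matters for τ.

```python
from itertools import combinations, permutations


def kendall_tau(x, y):
    """Kendall's tau for two equally long sequences without ties."""
    pairs = list(combinations(range(len(x)), 2))
    score = sum(
        1 if (x[i] - x[j]) * (y[i] - y[j]) > 0 else -1
        for i, j in pairs
    )
    return score / len(pairs)


def exact_two_sided_p(x, y):
    """Share of all orderings of y whose |tau| reaches the observed |tau|."""
    observed = abs(kendall_tau(x, y))
    taus = [abs(kendall_tau(x, perm)) for perm in permutations(y)]
    return sum(t >= observed - 1e-12 for t in taus) / len(taus)


# Hypothetical stand-ins for a strictly decreasing usage series.
periods3, usage3 = [1, 2, 3], [3, 2, 1]
periods6, usage6 = [1, 2, 3, 4, 5, 6], [6, 5, 4, 3, 2, 1]

tau3, p3 = kendall_tau(periods3, usage3), exact_two_sided_p(periods3, usage3)
tau6, p6 = kendall_tau(periods6, usage6), exact_two_sided_p(periods6, usage6)
print(tau3, round(p3, 3))
print(tau6, round(p6, 4))
```

With three data points there are only six possible orderings, two of which are perfectly monotonic, so the two-sided p-value cannot fall below one third even when τ = -1. With six data points the same perfect ordering is one of two among 720, which is why the longer Penn Helsinki series can reach significance.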

| Period | drama | fiction | journal | sermon | letter | news | science | medicine | legal |
|---|---|---|---|---|---|---|---|---|---|
| 1650-1749 | 1.46 (n=49) | 0.54 (n=18) | 0.3 (n=10) | 0.09 (n=3) | 0.06 (n=2) | 0.21 (n=7) | 0.21 (n=7) | 0.21 (n=7) | 0 (n=0) |
| 1800-1849 | 2.62 (n=78) | 0.87 (n=26) | 0 (n=0) | 0.1 (n=3) | 0.13 (n=4) | 0.03 (n=1) | 0.03 (n=1) | 0 (n=0) | 0.13 (n=4) |
| 1900-1949 | 1.75 (n=31) | 0.23 (n=4) | 0.11 (n=2) | 0.28 (n=5) | 0.11 (n=2) | 0 (n=0) | 0 (n=0) | 0 (n=0) | 0.06 (n=1) |
| Total | 1.95 (n=158) | 0.59 (n=48) | 0.15 (n=12) | 0.14 (n=11) | 0.1 (n=8) | 0.1 (n=8) | 0.1 (n=8) | 0.09 (n=7) | 0.06 (n=5) |

Table 12. Period and genre in data from ARCHER (norm. freq. 10,000 words)

The data from ARCHER in Table 12 suggest that LDet-sequences were employed far more frequently in those genres which attempt to recreate orality, namely a speech-purposed genre like drama (1.9) or a mixed genre that recreates conversation such as fiction (0.6), than in others which could have been expected to be more influenced by orality on the part of the writer such as letters (speech-like) or sermons (speech-purposed). As for the relationship between the different types of LDet-sequences and genre (cf. Table 13), Acknowledge LDet turns out to be the most frequent in drama (4.7), where the repetition of a character’s words by another via this type may be used (a) to create cohesion and keep the reader on track or (b) as a rhetorical marker of astonishment, etc. on the part of a particular character. LDet proper (3.5) and Attributive LDet (2.4) are also more common within this genre, as strategies by means of which the author or the characters could include new ideas/characters or assign features to existing referents/characters. Beyond ‘drama’, the results suggest that LDet proper (PRP) is the most frequent across genres.

| Type | drama | fiction | journal | sermon | letter | news | science | medicine | legal |
|---|---|---|---|---|---|---|---|---|---|
| LDet proper | 3.47 (n=51) | 1.55 (n=27) | 1.34 (n=12) | 1.88 (n=7) | 0.85 (n=5) | 0.76 (n=7) | 0.92 (n=8) | 0.98 (n=7) | 0.96 (n=5) |
| Summarising LDet | 0.2 (n=3) | 0.23 (n=4) | 0 (n=0) | 0.54 (n=2) | 0.17 (n=1) | 0 (n=0) | 0 (n=0) | 0 (n=0) | 0 (n=0) |
| Attributive LDet | 2.38 (n=35) | 0.46 (n=8) | 0 (n=0) | 0.27 (n=1) | 0.34 (n=2) | 0.11 (n=1) | 0 (n=0) | 0 (n=0) | 0 (n=0) |
| Acknowledge LDet | 4.7 (n=69) | 0.52 (n=9) | 0 (n=0) | 0.27 (n=1) | 0 (n=0) | 0 (n=0) | 0 (n=0) | 0 (n=0) | 0 (n=0) |

Table 13. LDet-sequence and genre (norm. freq. 10,000 words)
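The normalisation behind these figures can be sketched as follows. The Python example assumes that the sermon column of Table 13 is normalised by the size of the homiletic subcorpus mentioned above (37,296 words); this denominator is an inference from the text rather than a documented step, and the function name is illustrative.

```python
def normalised_frequency(tokens, words, per=10_000):
    """Occurrences per `per` words of running text."""
    return tokens / words * per


# Assumption: the sermon figures in Table 13 are normalised by the size of
# the homiletic subcorpus reported in the text.
SERMON_WORDS = 37_296

# Raw sermon counts (n) from Table 13.
sermon_counts = {
    "LDet proper": 7,
    "Summarising LDet": 2,
    "Attributive LDet": 1,
    "Acknowledge LDet": 1,
}

sermon_nf = {
    ldet_type: round(normalised_frequency(n, SERMON_WORDS), 2)
    for ldet_type, n in sermon_counts.items()
}
print(sermon_nf)
```

Under this assumption the computed values agree with the sermon column of Table 13 once rounded to two decimals, which suggests that Table 13 is normalised per genre subcorpus.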


Figure 9 illustrates the diachronic development of PRP in general (diamonds), in drama (squares) and in fiction (triangles) in contrast with the flatter developments of LDet-sequences in drama (crosses) and fiction (stars).

Figure 9. Historical development of LDet proper (aggregate, in drama, and in fiction) vs. historical development of all LDet-sequences (in drama and in fiction) across three periods

In Figure 9, PRP follows a decreasing trend in the aggregate for all genres and even in fiction; it is the usage of PRP in drama that makes the decreasing trend less significant. The rising trends for ‘PRP_drama’ in Figure 9 and for ‘speech-purposed’ in Figure 8 support the hypothesis that, after around 1750, LDet-sequences (and left-dislocated NPs) specialised as devices that authors employed to deliberately recreate orality or conversation, noticeably, in drama and, probably, in fiction (cf. the slightly increasing trend of left-dislocated NPs evidenced in Figure 8 for ‘mixed’).

To conclude, although the results from ARCHER offer some valuable insight into the issue of genre, it must be pointed out that the normalised frequencies obtained for genre distribution in this dataset are rather low and inconclusive; this is why the decision was made to complement them by means of results from the annotated Penn Helsinki corpora. The interaction between left-dislocated NPs (as well as LDet-sequences) and historical genres clearly constitutes a topic for further research to be undertaken on a wider range and a more balanced number of text-types (cf. Tizón-Couto, forthcoming).


8. CONCLUDING REMARKS

This paper has looked into the syntactic, semantic and textual behaviour of left-detached constituents which are followed by a main clause including a coreferential resumptive phrase. The behaviour of such ‘LDet-sequences’ in written historical texts (since the Early Modern English period) is compared with the behaviour of LDis in contemporary spoken English as described in the literature.

LDet-sequences in the recent history of written English behave like contemporary spoken English LDis as regards several formal variables and textual features. In terms of their formal characteristics, they mostly corefer with subject resumptives and show clear signs of detachment from clause-syntax in that (a) the left-detached constituent may not easily replace the resumptive element, (b) punctuation marks of the two items are far from homogeneous in writing (around 50% of the instances do not combine two matching Ps) and (c) metonymic or partial semantic relationships may often hold between the left-detached referent and the resumptive phrase. As expected, sequences headed by a markedly detached LDet-sequence (i.e. Acknowledge LDet) show more radical differences with reference to LDis in that resumptive elements appear postverbally, P combinations show a clearly dispersed distribution and semantic links are reasonably often partial or metonymic. Besides, Acknowledge LDet does not front items that exceed the length of prototypical subjects, as is the case for the other three types attested. Thus, ACK can be considered as a reference-point of pure discoursal behaviour in contrast with the other three LDet-sequences whose behaviour is more like LDis.

Although these other three types of LDet-sequences (i.e. ATR, PRP and SUM) mostly comply with the formal and functional tendencies reported in the literature for contemporary spoken LDis, they also deviate in that the word-lengths of PRP and Summarising LDet clearly exceed the length reported for LDis in spoken English (except for Attributive LDet which ranks closer to the 3 word average reported by Snider 2005). Conversely, Acknowledge and Attributive LDets do not feature as heavy an initial item since they do not aim at setting up (a list of) new items, but rather at establishing cohesion more quickly by invoking a previous referent or attaching a quality to a given referent, respectively.

The informative character of LDet-sequences aligns with the results previously reported in the literature for written historical data: although there is a fairly high level of heterogeneity, items are mostly given. Nonetheless, a closer look at each separate type shows that LDet proper ranks higher in the number of new elements introduced and, thus, closer to the figures reported for LDis in contemporary spoken English. This fact speaks for the need, when dealing with written (historical) texts, to establish types of left detachments rather than interpreting any sequence including a ‘left-dislocated constituent’ and a ‘resumptive item’ (within an ensuing clause) as ‘Left Dislocation’.

The textual behaviour of LDet-sequences matches that previously reported for LDis in contemporary spoken English, especially if topicality is considered. The left-detached referent persists (in terms of Subsequent Mention) for an average of 2.32 clauses and, thus, cohesively expands beyond the clause containing the resumptive element, i.e. into the paragraph (cf. Givón 1983: 7). However, the ‘firstness’ property of topics (Topic Continuity) does not hold for any of the four types attested: left-detached referents typically occupy the subject position in the ensuing main clause, but shift to other clause-argument positions in the second and subsequent clauses.


The interaction between type of LDet-sequence and persistence (Subsequent Mention) does not suggest reliably significant differences, but the interaction between information status and Subsequent Mention reveals that left-detached items which are already active or semi-active in the text have a slight tendency to persist longer than new items. Thus, LDis and LDet-sequences do not only front given items more frequently in written (historical) texts, but they can also be claimed to do so more successfully. This outcome reinforces the assumption that the (co)referential networks created in the written medium quantitatively and qualitatively exceed those usually provided in the spoken language.

Lastly, only the diachronic evolution of LDet-sequences that closely resemble LDis, namely PRP, concurs with previously reported declining trends for left-detached constituents (cf. Pérez-Guerra and Tizón-Couto 2009). Complementary data from the annotated Penn Helsinki corpora suggest a starker decline of left-dislocated NPs since Middle English and a possible rhetorical or stylistic specialisation of these detachments after EModE (and into LModE) to recreate features of orality in speech-purposed texts (drama and sermons) and fiction. This hypothesis is reinforced by the ARCHER data analysed in this study, according to which LDet-sequences show up far more frequently in drama and fiction than in any other genre. Thus, the prediction that LDis is employed as a trait of orality in order to recreate conversation (cf. Geluykens 1992, 1993) is borne out. The extent to which these items are true markers of orality (as is reported to be the case for clause-initial and; cf. Culpeper and Kytö 2010) is the object of ongoing research on a larger dataset dealing with a wider range of text-types (cf. Tizón-Couto, forthcoming).

REFERENCES

Acuña-Fariña, Juan Carlos. 1996. The Puzzle of Apposition. On So-called Appositive Structures in English. Santiago de Compostela: Universidade de Santiago de Compostela (Servicio de Publicacións e Intercambio Científico).

Acuña-Fariña, Juan Carlos. 1999. On apposition. English Language and Linguistics 3 (1): 59-81.

Anagnostopoulou, Elena, Henk van Riemsdijk and Frans Zwarts (eds.). 1997. Materials on Left Dislocation. Amsterdam: John Benjamins.

ARCHER 3.1 = A Representative Corpus of Historical English Registers version 3.1. 1990-1993/2002/2007/2010/2013. Originally compiled under the supervision of Douglas Biber and Edward Finegan at Northern Arizona University and University of Southern California; modified and expanded by subsequent members of a consortium of universities. Current member universities are Bamberg, Freiburg, Heidelberg, Helsinki, Lancaster, Leicester, Manchester, Michigan, Northern Arizona, Santiago de Compostela, Southern California, Trier, Uppsala, Zurich. Examples of usage taken from ARCHER were obtained under the terms of the ARCHER User Agreement (available on the Documentation page of the ARCHER website, <http://www.manchester.ac.uk/archer/>, last accessed on 30 June 2014).

Barcelona Sánchez, Antonio. 1988. El tópico desgajado en inglés: Motivación pragmática. Atlantis 10 (1-2): 9-20.

Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan. 1999. Longman Grammar of Spoken and Written English. Harlow: Longman.

Birner, Betty. 2004. Discourse functions at the periphery: Noncanonical word order in English. ZASPIL 35 (1): 41-62.

Birner, Betty and Gregory Ward. 2002. Information packaging. In The Cambridge Grammar of the English Language, Rodney Huddleston and Geoffrey Pullum (eds.). Cambridge: Cambridge University Press, 1363-1447.

Brown, Cheryl. 1983. Topic continuity in written English narrative. In Topic Continuity in Discourse: A Quantitative Cross-language Study, Talmy Givón (ed.). Amsterdam: John Benjamins, 313-341.

Burton-Roberts, Noel. 1975. Nominal apposition. Foundations of Language 13 (3): 391-419.

Chomsky, Noam. 1977. On wh-movement. In Formal Syntax, Peter Culicover, Thomas Wasow and Adrian Akmajian (eds.). New York: Academic Press, 71-132.

Culpeper, Jonathan and Merja Kytö. 2010. Early Modern English Dialogues: Spoken Interaction as Writing. Cambridge: Cambridge University Press.

De Vries, Mark. 2007. Dislocation and backgrounding. In Linguistics in the Netherlands, Bettelou Los and Marjo van Koppen (eds.). Amsterdam and Philadelphia: John Benjamins, 235-247.

Dik, Simon. 1997. The Theory of Functional Grammar. Part II: Complex and Derived Constructions. Kees Hengeveld (ed.). Berlin: Mouton de Gruyter.

Downing, Angela and Phillip Locke. 2006. English Grammar: A University Course. Abingdon and New York: Routledge.

Ford, Cecilia, Barbara Fox and Sandra Thompson. 2003. Social interaction and grammar. In The New Psychology of Language. Cognitive and Functional Approaches to Language Structure. Vol. 2, Michael Tomasello (ed.). London: Lawrence, 119-143.

Geluykens, Ronald. 1992. From Discourse Process to Grammatical Construction: On Left Dislocation in English. Amsterdam: John Benjamins.

Geluykens, Ronald. 1993. Syntactic, semantic and interactional prototypes: the case of left-dislocation. In Conceptualizations and Mental Processing in Language, Richard Geiger and Brygida Rudzka-Ostyn (eds.). Berlin: Mouton de Gruyter, 709-730.

Givón, Talmy. 1979. On Understanding Grammar. Orlando: Academic Press.

Givón, Talmy. 1983. Topic Continuity in Discourse: A Quantitative Cross-language Study. Amsterdam: John Benjamins.

Gómez-González, María Ángeles. 2001. The Theme-topic Interface: Evidence from English. Amsterdam: John Benjamins.

González Álvarez, Dolores. 2002. Disjunct Adverbs in Early Modern English: A Corpus-based Study. Vigo: University of Vigo.

Gregory, Michelle and Laura Michaelis. 2001. Topicalization and left-dislocation: A functional opposition revisited. Journal of Pragmatics 33 (11): 1665-1706.

Grohmann, Kleanthes. 2000. Copy left dislocation. In Proceedings of the Nineteenth West Coast Conference on Formal Linguistics, Roger Billerey and Brook Danielle Lillehaugen (eds.). Somerville, MA: Cascadilla Press, 139-152.

Gundel, Jeanette. 1977. The Role of Topic and Comment in Linguistic Theory. Bloomington, In.: Indiana University Linguistics Club.

Hilpert, Martin. 2013. Constructional Change in English: Developments in Allomorphy, Word Formation, and Syntax (Studies in English Language). Cambridge: Cambridge University Press.

Hilpert, Martin and Stefan Gries. 2009. Assessing frequency changes in multi-stage diachronic corpora: Applications for historical corpus linguistics and the study of language acquisition. Literary and Linguistic Computing 24 (4): 385-401.

Hirschbühler, Paul. 1997. The source of lefthand NPs in French. In Materials on Left Dislocation, Elena Anagnostopoulou, Henk van Riemsdijk and Frans Zwarts (eds.). Amsterdam: John Benjamins, 55-66.

Jespersen, Otto. 1937. Analytic Syntax. Copenhagen: Levin and Munksgaard.

Keenan-Ochs, Elinor and Bambi Schieffelin. 1976. Foregrounding referents: A reconsideration of left dislocation in discourse. Proceedings of the 2nd Annual Meeting of the Berkeley Linguistics Society, 240-57.

Keizer, Evelien. 2005. The discourse function of close appositions. Neophilologus 89 (3): 447-467.

Kies, Daniel. 1988. Marked themes with and without pronominal reinforcement: their meaning and distribution in discourse. In Pragmatics, Discourse and Text: Some Systemically-inspired Approaches, Erich Steiner and Robert Veltman (eds.). London: Pinter, 47-72.

Kroch, Anthony and Lauren Delfs. 2004. The Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME). Department of Linguistics, University of Pennsylvania. CD-ROM, first edition, <http://www.ling.upenn.edu/hist-corpora/> (Last accessed on 30 June 2014).

Kroch, Anthony and Ariel Diertani. 2010. The Penn-Helsinki Parsed Corpus of Modern British English (PPCMBE). Department of Linguistics, University of Pennsylvania. CD-ROM, first edition, <http://www.ling.upenn.edu/hist-corpora/> (Last accessed on 30 June 2014).

Kroch, Anthony and Ann Taylor. 2000. The Penn-Helsinki Parsed Corpus of Middle English (PPCME2). Department of Linguistics, University of Pennsylvania. CD-ROM, second edition, <http://www.ling.upenn.edu/hist-corpora/> (Last accessed on 30 June 2014).

Lambrecht, Knud. 1994. Information Structure and Sentence Form: Topic, Focus, and the Mental Representations of Discourse Referents. Cambridge: Cambridge University Press.

Lambrecht, Knud. 1996. On the formal and functional relationship between topics and vocatives. Evidence from French. In Conceptual Structure, Discourse and Language, Adele Goldberg (ed.). Stanford, Cal.: CSLI Publications, 267-288.

Larsson, Eva. 1979. La Dislocation en Français. Étude de syntaxe generative (Études Romanes de Lund). Lund: CWK Gleerup.

Lennard, John. 1995. Punctuation: and – Pragmatics. In Historical Pragmatics, Andreas Jucker (ed.). Amsterdam: John Benjamins, 65-98.

Los, Bettelou and Erwin Komen. 2012. Clefts as resolution strategies after the loss of a multifunctional first position. In The Oxford Handbook of the History of English, Terttu Nevalainen and Elizabeth Closs Traugott (eds.). New York: Oxford University Press, 884-898.

Meyer, Charles. 1992. Apposition in Contemporary English. Cambridge: Cambridge University Press.

Miller, Philip. 2001. Discourse constraints on (non)extraposition from subject in English. Linguistics 39 (4): 683-701.

Montgomery, Michael. 1982. The functions of left dislocation in spontaneous discourse. In The Ninth LACUS Forum, John Morreall (ed.). Columbia: Hornbeam, 425-432.

Netz, Hadar, Ron Kuzar and Zohar Eviatar. 2011. A recipient-based study of the discourse functions of marked topic constructions. Language Sciences 33 (1): 154-166.

Newmeyer, Frederick. 2004. On split-CPs, uninterpretable features, and the ‘perfectness’ of language. ZASPIL 35 (2): 399-422.

Ono, Tsuyoshi and Sandra Thompson. 1994. Unattached NPs in English conversation. BLS 20 (1994): 402-419.

Parsed Corpus of Early English Correspondence, parsed version. 2006. Annotated by Ann Taylor, Arja Nurmi, Anthony Warner, Susan Pintzuk and Terttu Nevalainen. Compiled by the CEEC Project Team. York: University of York and Helsinki: University of Helsinki. Distributed through the Oxford Text Archive.

Pérez-Guerra, Javier. 1999. Historical English Syntax: A Statistical Corpus Based Study on the Organisation of Early Modern English Sentences. Muenchen: Lincom Europa.

Pérez-Guerra, Javier. 2005. Word order after the loss of the verb-second constraint or the importance of early Modern English in the fixation of syntactic and informative (un-)markedness. English Studies 86 (4): 342-369.

Pérez-Guerra, Javier and David Tizón-Couto. 2009. On left dislocation in the recent history of English: Theory and data hand in hand. In Dislocated Elements in Discourse: Syntactic, Semantic, and Pragmatic Perspectives, Benjamin Shaer, Philippa Cook, Werner Frey and Claudia Maienborn (eds.). London: Routledge, 31-48.

Prince, Ellen. 1997. On the functions of left-dislocation in English discourse. In Directions in Functional Linguistics, Akio Kamio (ed.). Philadelphia: John Benjamins, 117-144.

Prince, Ellen. 1998. On the limits of syntax, with reference to left-dislocation and topicalization. In The Limits of Syntax, Peter Culicover and Louise McNally (eds.). San Diego: Academic Press, 281-302.

Reinhart, Tanya. 1981. Pragmatics and Linguistics: An analysis of sentence topics. Philosophica 27 (1): 53-94.

Rodman, Robert. 1974. On left dislocation. Papers in Linguistics 7: 437-66. [1997. Reprinted in Materials on Left Dislocation, Elena Anagnostopoulou, Henk van Riemsdijk and Frans Zwarts (eds.). Amsterdam: John Benjamins, 31-54].

Ross, John. 1967. Constraints on Variables in Syntax. MIT, Mass.: Doctoral Dissertation. [1986. Published as Infinite Syntax! Norwood, NJ: Ablex].

Ross, John. 1973. A fake NP squish. In New Ways of Analyzing Variation in English, Charles-James Bailey and Roger Shuy (eds.). Washington, DC: Georgetown University Press, 96-140.

Snider, Neal. 2005. A corpus study of left dislocation and topicalization. Stanford University: Linguistics Department, TS. <http://www.stanford.edu/~snider/pubs/qp1.pdf> (Last accessed on 29 April 2011).

Tizón-Couto, David. 2012. Left Dislocation in English. A Functional-discoursal Approach (Linguistic Insights 143). Bern: Peter Lang.

Tizón-Couto, David. Forthcoming. A corpus-based account of the decline of left-dislocated noun phrases in the recent history of English. Proceedings of the 18th International Conference on English Historical Linguistics, Leuven.

Traugott, Elizabeth Closs. 1972. The History of English Syntax: A Transformational Approach to the History of English Sentence Structure. New York: Holt, Rinehart and Winston.

Traugott, Elizabeth Closs. 2007. Old English left-dislocations: Their structure and information status. Folia Linguistica 41 (3-4): 405-441.

Van Riemsdijk, Henk. 1997. Left dislocation. In Materials on Left Dislocation, Elena Anagnostopoulou, Henk van Riemsdijk and Frans Zwarts (eds.). Amsterdam: John Benjamins, 1-10.

Vat, Jan. 1997. Left dislocation, connectedness and reconstruction. In Materials on Left Dislocation, Elena Anagnostopoulou, Henk van Riemsdijk and Frans Zwarts (eds.). Amsterdam: John Benjamins, 67-92.

Villalba, Xavier. 2000. The syntax of sentence periphery. PhD dissertation, Universitat Autònoma de Barcelona.

Wald, Benji. 1983. Referents and topics within and across discourse units: Observations from current vernacular English. In Discourse Perspectives on Syntax, Flora Klein-Andreu (ed.). London/New York: Academic Press, 91-106.

APPENDIX

A. Linear model: interaction between ‘type’ [of LDet-sequence] and ‘subsequent_mention’ (logarithmic values have been used to soften the scarcity of data for SUM)

lm(formula = log(subsequent_mention) ~ type, data = archer)

                       Estimate  Std. Error  t value  Pr(>|t|)
Intercept (type_prp)    0.57760     0.05547   10.412    <2e-16  ***
type_sum               -0.36966     0.20830   -1.775    0.0771  .
type_atr                0.20632     0.10796    1.911    0.0570  .
type_ack                0.02135     0.08693    0.246    0.8061
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.6349 on 274 degrees of freedom
Multiple R-squared: 0.028, Adjusted R-squared: 0.01736
F-statistic: 2.631 on 3 and 274 DF, p-value: 0.0504

B. Linear model: interaction between ‘subsequent_mention’ and ‘IS’ [information status]

lm(formula = subsequent_mention ~ IS, data = archer)

                       Estimate  Std. Error  t value  Pr(>|t|)
Intercept (IS_given)     2.5987      0.1541   16.859    <2e-16  ***
IS_inferrable           -0.5724      0.3447   -1.661    0.0980  .
IS_new                  -0.6654      0.2682   -2.481    0.0137  *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 1.9 on 262 degrees of freedom
Multiple R-squared: 0.02702, Adjusted R-squared: 0.0196
F-statistic: 3.638 on 2 and 262 DF, p-value: 0.02764
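As a reading aid, the coefficients of the two models can be converted into fitted values per group. The sketch below assumes that both models use R's default treatment coding, with the level named in the intercept row (‘given’ in B, ‘prp’ in A) as the reference level, so that each remaining estimate is a difference from that reference. The labels PRP, SUM, ATR and ACK are simply the coefficient names in upper case. For model A the dependent variable is logged (R's log is the natural logarithm), so the back-transformed values are geometric rather than arithmetic means and are only approximate. The block assumes the amsmath package.

```latex
\begin{align*}
% Model B (raw scale): fitted mean Subsequent Mention per information status
\hat{y}_{\text{given}}      &= 2.5987 \\
\hat{y}_{\text{inferrable}} &= 2.5987 - 0.5724 = 2.0263 \\
\hat{y}_{\text{new}}        &= 2.5987 - 0.6654 = 1.9333 \\[6pt]
% Model A (log scale): back-transformed fitted values (geometric means)
\hat{y}_{\text{PRP}} &= \exp(0.57760) \approx 1.78 \\
\hat{y}_{\text{SUM}} &= \exp(0.57760 - 0.36966) = \exp(0.20794) \approx 1.23 \\
\hat{y}_{\text{ATR}} &= \exp(0.57760 + 0.20632) = \exp(0.78392) \approx 2.19 \\
\hat{y}_{\text{ACK}} &= \exp(0.57760 + 0.02135) = \exp(0.59895) \approx 1.82
\end{align*}
```

On this reading, given referents are mentioned again in roughly 2.6 subsequent clauses on average, as against roughly 1.9 for new referents. Only this contrast reaches the 0.05 significance level in model B.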