Taking permissible shortcuts? Limited evidence, heuristic reasoning and the modal auxiliaries in early Canadian English*

Stefan Dollinger

1. Introduction

The present study focusses on the ‘other’ North American variety of English, Canadian English (CanE), from a diachronic perspective. While CanE is a relative newcomer to the field of linguistics, diachronic studies in real-time, as Brinton and Fee (2001: 426) point out, have been almost entirely missing. Much of what is known about historical CanE is in the field of lexicology, as a direct result of the work on the Dictionary of Canadianisms on Historical Principles (Avis et al. 1967), which has become seriously out of date.1

In the present study, an attempt is made to illustrate a kind of ‘bad’ data problem that is of particular relevance to the study of Late Modern English (LModE) varieties in colonial contexts. The modal auxiliaries in Ontario English (OntE) in the period from 1776 to 1849 shall serve as a test case here and are surveyed in relation to British English (BrE), with some comparisons to American English (AmE). One of the problems diachronic linguists are confronted with in many colonial contexts, such as in early Ontario, is a large array of input varieties and only exemplary real-time evidence that is available in machine-readable data formats. In this paper, an approach is suggested that aims to circumvent these data limitations in colonial contexts for the period between 1700 and 1900. The focus of the present paper is therefore not on the variables and their developments as such (see Dollinger forthc.c in this respect), but on what is best called a heuristic method to overcome, at least in part and as an approximation, a) the lack of readily-available data, b) low token discourse frequencies and c) genre-specific variation that makes the characterization of linguistic variables within national varieties a particular challenge.

356 Stefan Dollinger

After a case study on the modals CAN and MAY that illustrates the approach adopted here, the results of this kind of reasoning will be presented for eleven modals and semi-modals in OntE in relation to BrE and an attempt will be made to assess their behaviour in the light of later developments on a cline from conservative to progressive behaviour when compared to BrE. It will become clear that we will inevitably have to rely on some kind of heuristic reasoning in order to arrive at generalizations. A heuristic process can be described as

A process, such as trial and error, for solving a problem for which no algorithm exists. A heuristic for a problem is a rule or method for approaching a solution. (Blackburn 2005: s.v. heuristic).

For our purposes, we can equate the term ‘algorithm’ with the fully-fledged, empirical, corpus-based study, firmly based in the philological tradition that has traditionally been centred on a standard variety. Heuristic reasoning seeks to make the best of limited data and aims to allow statements on the development of colonial varieties when not all data is available. Heuristic principles, while used pervasively in other disciplines, such as in engineering, computer science, social sciences (see Michalewicz and Fogel 2004, Tversky and Kahneman 1982), have not yet been explicitly applied to English historical linguistics (while its principles may have played a role), because, perhaps, they do seem to be incompatible with the standard of description that is usually found in English historical linguistics. The test case presented here, however, will make it clear that the time may be ripe for embracing ‘good enough’ solutions and approximations more directly even in this discipline, and thus follow the lead of the ‘harder’ sciences. It will be shown that heuristic means are a kind of reasoning that may provide fairly reliable shortcuts to answers to research questions that would otherwise be years, if not decades, away.

Two kinds of ‘bad data’ scenarios are usually discussed in historical linguistics. While neither point is new, I will refer to two more recent contributions to this discussion. First, the problem that historical documents “survive by chance, not by design”, often containing “a normative dialect” and not the vernacular (Labov 1994: 11) is something one has to deal with. This problem becomes less pervasive in the LModE period, when data produced by minimally-schooled writers become available, which has proven to be a rich resource. Second, Nevalainen (1999) addresses the problem of acquiring social information on the informants and points out that in Early Modern English (EModE), and to an increased extent in LModE, the chances to reconstruct the social milieu of a writer increase dramatically. In the colonial context, however, I would like to suggest a third ‘bad’ data problem which is revealed in scenarios of new-dialect formation (Trudgill 2004) and the complex nature of input varieties in colonial Englishes. Put in a nutshell, the problem is the sheer variety of input varieties that came to form new colonial Englishes. To exemplify the problem let us review briefly the external language history of the sociohistorically most important variety of Canadian English, Ontario English.

1.1. Late Modern English, colonial English, Ontario English

The Late Modern English period (LModE) has only fairly recently become the object of linguistic enquiry. Although Jones (1989: 279) referred to the eighteenth and nineteenth centuries as “the Cinderellas of English historical linguistic study” not long ago, the late 1990s saw a considerable increase of research activity.2 The LModE period coincides with an accelerated spread of colonial Englishes and entails a focus on what Clyne (1992) refers to as ‘non-dominant’ varieties of English, i.e. varieties other than British and American English. Indeed, Trudgill (2002: 44) proclaims that for the period after 1700 there is “no real excuse” to confine one’s focus to the two dominant varieties of English in more general works of English historical linguistics.

Little pre-1900 historical work has been done on the historically most important variety of CanE, namely OntE, which provided the model for Standard Canadian English as an urban middle class dialect (cf. Chambers 1998: 252). Thomas (1991) is a notable exception. Based on linguistic atlas data from the 1950s and employing the apparent-time design, he traces the development of Canadian Raising back to the late nineteenth century. Chambers’ (1981, 1993, 2004) work on language attitudes in Victorian Ontario adds another important layer to the available documentation, as does Hultin (1967). M. Bloomfield (1948) and Scargill (1957), the two classic contributions on the origins of CanE, are, however, based on settlement history alone and are devoid of linguistic data. Only recently, a real-time study of pre-Confederation OntE on the modal auxiliaries from the viewpoint of new-dialect formation (Trudgill 2004) has been completed (Dollinger forthc.a, forthc.c).


1.2. External history

The external history of Canada is divided into four distinct settlement waves (Chambers 1991, 1998). Because of the temporal focus on pre-Confederation (pre-1867) OntE in the present paper, settlement waves I and II are of importance here. Wave I is comprised of early immigration from the United States into Ontario, from roughly around 1776 to 1793, with a small trickle continuing until 1812, when war with the United States broke out. This early immigration resulted in the first peopling of Ontario and the steady transformation of the “primeval forest west of Montreal” that had previously been inhabited by only “a few hundred English-speaking” people (Orkin 1970: 52f), most of whom were transient fur traders.

This first wave, however, was a rather heterogeneous mix of ‘American loyalists’. Besides native English speakers, most of them of early AmE, there were significant elements of disbanded German soldiers (including some Swiss German speakers), large contingents of Scottish Gaelic speakers and sizeable Dutch elements, which were joined by some Scots speakers, Irish English speakers and French speakers (Dollinger forthc.c: section 3.2). As those minorities came early, they are expected to have had an influence on early CanE via language contact phenomena, which included L2 varieties that waned in the late 1800s.

Wave II is comprised of post-1815 immigration. After the War of 1812 had come to an end (in 1814) and the Napoleonic wars had ended in Europe, the European population surplus was dumped on the colonies. This wave is characterized by massive immigration from the British Isles (although sizeable immigration from the German states is reported prior to 1850, Bausenhart 1989). The second wave included Scottish Gaelic speakers and Scottish English speakers, speakers of Northern English dialects, large numbers of Southern Irish and Ulster Scots (from Northern Ireland and also from the USA) and only a minority of Welsh, southern English dialect speakers and AmE speakers. In 1812 the Ontarian (then Upper Canadian) population was somewhere in the vicinity of 83,000 inhabitants (Gourlay 1822: 612), the majority of which were Americans (cf. Akenson 1999: fn 117). This demographic make-up was changed by wave II.

Figure 1 shows the relative population input to Ontario for the years 1829–1859, giving an impression of the numerical swamping of the American base with immigrants from the British Isles. For instance, immigration in 1829–31 alone outnumbered the entire population of the province in the year 1812, with almost 16,000 coming in 1829, 28,000 in 1830, and 50,000 in 1831, and was to continue, with ups and downs, in that fashion:

[Chart: “Arrivals at Port of Quebec from overseas”; x-axis: years 1829–1859; y-axis: percentage of all immigrants (0–80); series: England %, Ireland %, Scotland %, Europe %, Maritime]

Figure 1. Arrivals at the Port of Quebec 1829–1859 (raw data from Cowan 1961: 289, table II; classification based on ports of departure)3

We have some evidence from recent socio-historical studies that the overwhelming majority of these second wave immigrants did, contrary to one’s intuition, not settle in the established centres founded by the American loyalists. Akenson (1999: 36) reasons that for Irish immigrants, who represented the bulk in the later years in figure 1, “one has to conclude that the overwhelming majority of Irish migrants to Ontario settled in the countryside”. Indeed, there was plenty of unsettled land in Ontario at the time, just waiting to be cleared (Wood 2000: 29–31).

This finding has important implications from a sociolinguistic point of view, as it contradicts the otherwise logical reasoning that most of these immigrants would have settled “in towns and villages founded by the Loyalists” (Chambers 1998: 252). It also implies that AmE varieties – both native as well as L2 varieties – would have tended to be used in Ontario’s centres and that dialects from the British Isles would have been evident in the countryside, before they eventually would have merged (except in the case of Ontario’s linguistic enclaves such as the Ottawa Valley or Peterborough). Linguistically, however, the second wave has usually been considered from a twentieth-century perspective as having had only a highly limited influence on OntE (Avis 1978: 4, Chambers 1998: 263). Real-time data show some patterns that do not necessarily corroborate the limited influence of BrE variants for all areas of modal usage (Dollinger forthc.b, data on SHALL and WILL).

1.3. Data and variables

The data come from three corpora: the Corpus of Early Ontario English, pre-Confederation section (CONTE-pC) (see appendix 3), provides the CanE material, A Representative Corpus of Historical English Registers, version 1 (ARCHER-1)4, the AmE and BrE data, and the Corpus of Late Eighteenth-Century Prose (CL18P)5 is used for data from NW England. Table 1 summarizes the varieties, corpora, genres and periods for the available data:

Table 1. Overview of corpora, periods and genres (shaded areas: nonexistent corpus data)

Variety (corpus): CanE6 (CONTE-pC) | AmE (ARCHER-1) | BrE (ARCHER-1) | NW-BrE (CL18P)
Genres: letters, diaries, newspapers | letters
Period 1: 1776–1799 | 1750–1799 | 1750–1799 | 1761–1790
Period 2: 1800–1824
Period 3: 1825–1849
Period 2+3: 1800–1849
Period 4+5: 1850–1899 | 1850–1899 | 1850–1899

While most findings are based on periods 1–3, period 4+5 was used as an additional benchmark in AmE in one instance (item no. 6 in Table 3). Period 2+3, which is referred to later on, comprises the years 1800–1849. Grey shadings in Table 1 indicate data that are not available. While these gaps are apparent, the bigger gaps are not shown in Table 1 as they concern the input varieties for which we do not even have corpora. When aiming to characterize regional varieties (or national varieties),7 we would wish for corpus data from all input varieties. As we know from the external history in Section 1.2, this is a formidable task. Given the immensity of data needed to compare colonial varieties with their input, there is of course the question whether we may be in the position to produce findings at all.

In order to fully address questions such as the influence of input varieties on newly-formed dialects or the conservatism or progressivism of a variety in relation to another one, we would need a complete set of input data corpora, which would include a number of varieties in the Ontarian context. For the pre-1800 period, besides AmE, (standard) southern BrE and north-western regional BrE, we would need to have Irish English data, Scottish English, L2 varieties of Scots Gaelic speakers, German, Dutch, French speakers and First Nations speakers, among others. For the post-1800 period, we would require data from Northern British English, southern BrE, both Southern as well as Northern Irish English (Ulster Scots) information, AmE, Scottish and Scots Gaelic L2 speakers, AAVE and First Nations in Canada and L2 varieties from immigrants (differences in the genres in different varieties, e.g. personal letters, further complicate the picture and are almost beyond the control of the researcher).

One way towards problem solving in heuristics is to consider the available data carefully, which is easily illustrated by examples from logic (see Michalewicz and Fogel 2004: 9). In historical linguistics the data may be – within the limitations of ‘bad’ data outlined above – ‘available’ in libraries and archives, but not readily accessible. Often this catch is interpreted as a deficit of the researcher, and not as a result of the vastness of the task of data mining. While it is always a good idea to increase one’s baseline data, I would argue that we should seriously consider heuristic approaches in historical linguistics and make use of ‘good-enough’ solutions that have proven useful in other disciplines.

Related to the data question is the problem of quantitative vs. qualitative studies. Concerning the former, the cut-off point of what constitutes eligible frequencies (e.g. at least 10 tokens) is, of course, inspired by the available corpus sizes. The modal auxiliaries have usually tended to yield, on the whole, acceptable frequencies in the standard corpora. But what are we to do if conventional corpus sizes do not produce the expected token counts? I would argue, again thinking heuristically, that there may be an alternative route to going straight back to the archives in search for more data. This line of reasoning, making the proverbial best of ‘bad’ data, will be illustrated in the quantitative treatment – and statistical testing – of limited data that, combined with Present Day English (PDE) findings, may prove just enough to characterize the LModE development of the modals in the Ontarian context.

Even in the LModE period the modal auxiliary complex is undergoing considerable changes in usage and most of those changes tend to be stylistic rather than categorical in nature (Denison 1998: 165, 1993). In Section 2, CAN and MAY serve as examples for the kind of heuristic reasoning used for an assessment of whether CanE showed progressive or conservative behaviour in the modal auxiliaries in relation to BrE. In Section 3, the overall picture for the modals in early OntE will be presented.

2. A close-up: CAN and MAY in early OntE

The early development of CAN and MAY is well documented. In OE mæg, the formal ancestor of MAY, is prevalent in the sense ‘to be able to’. Following a pragmatic/semantic cline of grammaticalization, this ability use was extended to denote possibility and was subsequently developed into permissive uses. Permissive uses can be occasionally found in OE (Traugott 1972: 71f), but the “full performative use of may, as in You may go ‘I permit you to go’, did not gain wide currency until the sixteenth century” (Traugott 1972: 118). OE cann, on the other hand, originally meant ‘to know, be acquainted with, know how to’. Once CAN acquired the meaning of ‘to be able to’, MAY gave up the meaning of ‘to be able, be strong’ (Kytö 1991: 65). After this semantic change, it was only a question of time before the meanings of ‘possibility’ and finally ‘permission’ developed by implication, i.e. pragmatic factors in the context favouring these interpretations rather than ‘ability’ readings (Kytö 1991: 65).

CAN and MAY, revolving around the notions of ability, permission and possibility, have been studied at the synchronic level and for older language stages up to EModE (e.g. Coates 1983; Facchinetti 2003, 2002, 2000; Kytö 1991: 81–258, Warner 1993: 176–178, Denison 1993: 292–325), while in LModE general developments have been identified in the area of permission (e.g. Traugott 1972: 170f; Kytö 1991: 65; Denison 1993: 303). In comparison to the long-term development from OE to EModE, the study in LModE has not been pursued as rigorously. Simon-Vandenbergen (1984) seems to be the only quantitative study of CAN and MAY that includes semantic categories as well as a sizeable proportion of LModE texts (based on personal and official letters and drama data). She shows (1984: 364) that root uses of CAN, in the sense of permission, first occur in nineteenth-century plays and encroach on the territory of root MAY. A number of examples are found in the three corpora that illustrate this shift of functions of CAN, as shown in example 1(a–c):

(1) a. ability (usually with animate subject): I fancy Kitty can do nothing better … (ARCHER-1, BrE, 1750–99, letter section)

b. neutral, root possibility (ambiguous case): I should like to by it if I can by it two Advantage (CONTE-pC, 1825–49 [a minimally-schooled letter writer])

c. permission: We have decided upon my sitting with Mamma every night during tea and Minnie during dinner as then I can read Mamma's prayers to her. (CONTE-pC, 1825–1849; a teenaged diary writer) [the girl is allowed to read to her sick mother then]

As pointed out by Coates (1983: 139 for 1960s BrE), 1 (b, c) illustrate the core functions of MAY, which means that CAN was beginning to compete with MAY in the latter’s core domain. This long-term developmental scenario provides the background for the identification of progressive and conservative forms – CAN as progressive, and MAY as conservative – with CAN conveying more informal undertones than MAY (Coates 1983: 103).

Table 2 shows the areas of competition for CAN and MAY in the late eighteenth and nineteenth centuries and gives examples of their uses. The grey shaded cells highlight the semantic areas in which CAN and MAY compete. The example in MAY/permission can be paraphrased in PDE by CAN, and it is possible to substitute CAN/permission with MAY without a significant change in meaning. MAY/root possibility can again be rephrased with CAN, while CAN/root possibility with MAY is more of a borderline case, but acceptable in some varieties. We are attempting to trace changes in these two areas, which expanded in LModE, in CanE in relation to BrE. As MAY had become obsolete as a marker of ability in ME and CAN was not used for epistemic possibility, the competition is limited to the areas of root possibility and permission. From a twentieth-century perspective, we know that CAN was to gain the upper hand in both permission and, to some extent, in root possibility meanings, as a result of influences from informal genres (see Coates 1983: 106f) and is now showing first uses in epistemic readings (Coates 1995 cites first examples of 1990s AmE for epistemic CAN, Facchinetti 2000 finds first examples in early 1990s BrE).

Table 2. CAN and MAY – semantic categories and examples

Prototype FORM | NOTIONAL FUNCTION | EXAMPLE
MAY | Permission | If he show any Disposition to write me a penitential Letter, you may encourage it; not that I think it of any Consequence to me, but because it will ease his Mind and set him at rest. (BrE-1)
MAY | Root possibility | Any person wishing to purchase may depend upon getting a great bargain (CanE-2)
MAY | Epistemic possibility | … and my ideas upon the several points which may, between this and then, occur to me (AmE-1)
CAN | Ability | I fancy Kitty can do nothing better … (BrE-1)
CAN | Permission | I have your Certificate that the land is not leased or vacant of course none [of the settlers] can be located without the sanction of the Lt. Gov. as in other cases. (CanE-2)
CAN | Root possibility | Nothing can be more satisfactory than the readiness and unanimity with which the Legislature have applied to meet the emergencies (CanE-3)

Figures 2 and 3 show the results for CanE and BrE in periods 1 and 2+3 and for AmE in period 1. All instances of CAN are shown in relation to MAY; the functions of ‘permission’ and ‘root possibility’ are shown by genre, as the choice of CAN and MAY seems to have been influenced by text type (see note 8 for abbreviations used), but we will need to summarize the usage across different genres later on.
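The percentages plotted in such figures express CAN as a share of all CAN and MAY tokens in one function, genre and period. A minimal sketch of that calculation follows; the function name, the cell labels and the token counts are hypothetical illustrations and are not taken from the appendices.

```python
def share_of_can(can_tokens: int, may_tokens: int) -> float:
    """Return CAN as a percentage of all CAN + MAY tokens, rounded to one decimal."""
    total = can_tokens + may_tokens
    if total == 0:
        raise ValueError("no CAN or MAY tokens in this cell")
    return round(100 * can_tokens / total, 1)


# Hypothetical (CAN, MAY) token counts for two genre/period cells.
# These are NOT the study's counts; they only illustrate the calculation.
hypothetical_counts = {
    "d1": (7, 3),  # diaries, period 1
    "l1": (2, 6),  # letters, period 1
}

shares = {cell: share_of_can(can, may) for cell, (can, may) in hypothetical_counts.items()}

for cell, pct in shares.items():
    print(f"{cell}: CAN = {pct}% of CAN + MAY")
```

With cells as small as these, a single additional token shifts the percentage by several points, which is one reason why percentages based on low token counts need to be read with caution.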

First, we will focus on root possibility uses of CAN, as shown in Figure 2. By comparing the CanE with the BrE data, we can see a picture that is largely parallel in incidence in the diary and letter genres. Only in the newspaper genre does the pattern diverge. The AmE data are closer to the CanE values across all genres, which indicates the loyalist base of CanE. Viewed in the larger diachronic context, CanE seems more progressive in its use of CAN in diaries (i.e. closer to 1960s BrE usage, where CAN is predominantly used in possibility readings, Coates 1983: 101), but more conservative in letters and newspapers than BrE. The apparent tendency for an increase of CAN in diaries and letters is, however, not confirmed by statistical testing, which yields no significance for an increase in CAN for the changes in Figure 2 at the 95% level (Appendix 1).

From our background knowledge, these data seem to suggest, by heuristic reasoning, a drift scenario in diaries and letters, even though statistical testing does not help us here. The different developmental patterns in CanE and BrE newspapers could either be a result of chance, or alternatively, as it shows the statistically strongest change (Appendix 1, n–1 and n–2+3), a stylistic change towards more BrE norms in Canadian newspapers. CanE uses MAY more often than CAN in letters, but not in diaries, which indicates a more formal style in CanE letters than in British ones. It is probably best to interpret these figures as reflecting instances of directional drift and stylistic variation in CanE, with more formal, possibly more conservative, tendencies in the letter and newspaper genres. We can say that ‘root possibility’ in newspapers tended to be expressed by MAY in CanE, as opposed to BrE, but with both varieties converging in the 30% range in period 2+3 (Figure 2). Overall, this change at the time can be perceived as a change from above the level of consciousness, as the use of CAN is found to be documented – and castigated – by eighteenth-century grammarians (Sundby et al. 1991: 211).

For the second function in which CAN and MAY overlap, permission uses, a change has been reported for LModE. Figure 3 shows the development for CAN denoting permission and we see an increase of CAN in both CanE and BrE in all genres. While differences in percentages appear in part to be considerable, again, no change of CAN is statistically significant (cf. appendix 2), which is partly a result of low token incidences. In 1960s BrE, informal texts tend to show more uses of permission CAN (Coates 1983: 101), but are generally used less frequently than possibility readings of CAN. Using a different classification system, Facchinetti (2002: 239) shows for early 1990s BrE that 5% of all uses of CAN are deontic readings, which are one core of our permission uses.

What should we make of these results? It is clear that CAN is expanding its use in both CanE and BrE in the areas of permission and, for diaries and letters, root possibility uses. But neither the differences between CanE and BrE in each period, nor their increases from period 1 to 2+3 are significant, and statistical testing (whether chi-square or Fisher’s Exact) does not help us much to confirm a long-term change corroborated in previous studies. The change appears to be too slow to reach the 95% significance level in 25–50-year periods in the data. We know by hindsight, however, that in twentieth-century BrE, CAN is usually the default variant and not MAY, which “is marked for formality” (Coates 1983: 103 for BrE, Ehrman 1966: 12 notes the use of permission CAN in 1961 AmE dialogue data). With this knowledge, we would interpret the diachronic changes in Figures 2 and 3 as a LModE parallel development in both CanE and BrE leading up to the PDE distribution, despite their lack of statistical significance in our data.

This ‘heuristic’ conclusion could easily be tested in a quantitative framework. As Appendices 1 and 2 show, the token frequencies are low. In BrE diaries (Appendix 1, first table), which show a solid increase of root possibility CAN from period 1 to 2+3, the chi-square value is 1.04, and therefore short of the 3.84 required to reach significance at the 95% confidence level. If we quadrupled the token frequencies, we would reach a value of 4.18, and this gives us an idea of how much data are needed. We would need at least four times as much as we have now, or c. 500,000 words for the late eighteenth and early nineteenth century to be able to base our study on firm statistical grounds. And here, even CONCE (Kytö, Rudanko and Smitterberg 2000: 89), the biggest corpus of nineteenth-century BrE to date, would provide too little data for our three genres to meet the baseline data criterion (although the c. 250,000 words of letter data are a substantial body of evidence). Clearly, the data mining needed to test our heuristic reasoning on strict quantitative terms would be a project on its own.
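The scaling argument rests on a property of the chi-square statistic: if every cell of a contingency table is multiplied by the same factor while the proportions stay fixed, the statistic is multiplied by that factor as well. The sketch below illustrates this for a 2×2 table without continuity correction. The counts are hypothetical and are not the figures from Appendix 1, the function names are illustrative, and whether the original test used a correction is not stated here.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]], no continuity correction."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def scaling_needed(chi_square: float, critical: float) -> float:
    """Factor by which all counts must grow (proportions unchanged) to reach `critical`."""
    return critical / chi_square


CRITICAL_95 = 3.84  # critical value for 1 degree of freedom at the 95% level

# Hypothetical counts (NOT from the appendices):
# rows = period 1, period 2+3; columns = CAN, MAY
base = (6, 9, 10, 7)
quadrupled = tuple(4 * x for x in base)

chi_base = chi_square_2x2(*base)
chi_quad = chi_square_2x2(*quadrupled)

print(f"original counts:   chi-square = {chi_base:.2f}")
print(f"quadrupled counts: chi-square = {chi_quad:.2f}")
print(f"scaling needed to reach {CRITICAL_95}: {scaling_needed(chi_base, CRITICAL_95):.2f}x")
```

The same proportions that fall short of the critical value in the small table exceed it once the table is four times as large, which is the logic behind the estimate of how much additional data would be required. The calculation assumes that the proportions observed in the small sample would remain stable in a larger one, which is itself a heuristic assumption.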

The examples show that changes which appear to be instances of slow moving drift or parallel development do not necessarily reach levels of significance in the data. We have also said, however, that based on our knowledge of the further development, it is justifiable to interpret the data as exhibiting a parallel development, since five out of six instances of a change point in the same direction (all except for the CanE and BrE newspaper data). This approach is best described as heuristic reasoning. However, we do not have AmE data from the early nineteenth century to confirm our reasoning of drift in the last instance, and we are at a loss for other input varieties of OntE (such as Scottish English, Ulster English or Irish English, or post-1800 Northern English). We do know, however, that in twentieth-century BrE, CAN has become the default form for root possibility meanings (cf. Coates 1983: 101) and more and more in permission uses, despite its condemnation by prescriptive grammarians.

The method applied here is hardly new as it has been implicitly applied in many diachronic studies. What is new, though, is the claim that we cannot, and should not, expect studies to include data that are, stricto sensu, needed to make any generalization about colonial varieties, but should embrace the educated guessing applied in heuristics. The method is an approximate comparison of the situation in early OntE with BrE (and AmE varieties) and it needs to be stressed that the method cannot rule out reverse developments. However, it would be somewhat unlikely for CAN to move backwards given clines of grammaticalization (e.g. Traugott 1989). As the method makes use of the data available and applies other sources and benchmarks, its heuristics provides us with insights into otherwise uncharted waters. Section 3 shows the results of this heuristic reasoning for eleven modals in a total of 19 contexts in early OntE.

[Chart: “'Root possibility' CAN (vs. MAY) in three varieties”; x-axis: genres & periods (d1, d2+3, l1, l2+3, n1, n2+3); y-axis: percent (0–100); series: CanE, BrE, AmE]

Figure 2. Diachronic development of CAN and MAY coding for ‘root possibility’ in CanE and BrE (AmE-1 for comparison)

[Chart: “'Permission' CAN (vs. MAY) in three varieties”; x-axis: genres & periods (d1, d2+3, l1, l2+3, n1, n2+3); y-axis: percent (0–100); series: CanE, BrE, AmE]

Figure 3. Diachronic development of CAN and MAY coding for ‘permission’ in CanE and BrE (AmE-1 for comparison)

3. The bigger picture: eleven modals in early OntE

In this section, the nine core modals, CAN/MAY, COULD/MIGHT, SHALL/WILL, SHOULD/WOULD, MUST, plus OUGHT TO, and semi-modal HAVE TO, will be presented in terms of progressive or conservative behaviour in CanE in relation to BrE. For this purpose, the results for each period, the diachronic development as well as the empirical base have been considered, gauged and assigned to one category in the manner illustrated in the previous section. The assessment is carried out on a 5-tier grid, classifying each function into ‘conservative’, ‘neutral’ and ‘progressive’, and ‘towards conservative’ or ‘towards progressive’ for intermediate cases.

Clearly, this kind of attempt to classify the overall behaviour across the three genres also requires some form of heuristics, i.e. the consideration of the available data and a principle to synthesize the variables into one measure. As the empirical base for permission uses of CAN (cf. Appendix 2) is especially slim in CanE diaries and newspapers (CanE-d, CanE-n), preference is given to the letter data, which are slightly more conservative than BrE data. The overall assessment for permission uses of CAN in CanE is perhaps best described as ‘towards conservative’ (item no. 1 in Table 3). Root possibility CAN, with diaries more progressive, letters more conservative and newspapers conservative (but approaching BrE values), is also best characterized as ‘towards conservative’ (item no. 2). The remaining 17 contexts were assessed in a similar manner, resulting in Table 3. Table 3 shows that pre-Confederation OntE does not lean heavily in any direction. If anything, it appears to be slightly more progressive in its overall use of the modals than BrE. Interestingly, in six out of nine cases it patterns with AmE data from period 1, which linguistically confirms the loyalist base hypothesis for AmE input. The data from NW England (NW-BrE) in period 1, however, are rather distant from both CanE and AmE where available (items no. 6 and 15). Item no. 6, use of SHALL and WILL in the first person, is subject to massive change from period 1 to period 2+3, which is the reason for its occupying two slots (see note 9). Without more BrE data we cannot decide on the two categories, but it seems clear that the change was the result of regional BrE influence (Dollinger forthc.b).

The overall pattern shown in Table 3 suggests that early CanE did not simply follow AmE or BrE usage, but was beginning to show idiosyncratic developments in the modal auxiliary complex and most likely elsewhere. For the area of vocabulary, the unique Canadian character has been long accepted (Lovell 1955: 5) and researched (Avis et al. 1967; de Wolf 1997; Barber 2004). Table 3 provides some indicators for developments specific to CanE in the use of modal auxiliaries. For item no. 18, COULD in epistemic uses, for instance, the loyalist base theory of American input cannot account for this behaviour, as AmE and CanE are placed at opposite ends of the spectrum. In other areas, such as the use of first person WILL (item no. 6), CanE may well have been more progressive than AmE, although this remains a hypothesis until post-1800 AmE data can be found. An overall assessment such as that represented in Table 3 allows new insights into the behaviour and genesis of colonial varieties and would warrant the element of ‘educated guesswork’ inherent in heuristic approaches, of which the summarization in Table 3 is the result.

4. Possible conclusions: evidence and reasoning

The overall results in Table 3 show that CanE modal use appears to have been slightly on the progressive side when compared with BrE. While largely matching AmE use, early OntE shows some developments that may have been specific to the variety.

As LModE varieties are a very recent area of inquiry, and LModE dialectal variation largely remains to be studied (but see for instance Watts and Trudgill 2002), the present situation may be comparable to the state of knowledge of EModE variation in the 1980s (Görlach 1988). If one wishes to make generalizations along the lines suggested in Table 3 between a colonial variety and BrE – with some reasoning for AmE and regional BrE – some approximations and heuristic ways of reasoning are necessary in the light of gaps in the historical corpus inventory. Given the usual production phase of a historical corpus of about three to four years and the multitude of input varieties for LModE colonial varieties, we are still some time off from arriving at more complete data sets of input varieties which were, as is well known, mainly regional dialects (Hickey 2004a: 1). We have seen in Section 1.3 that in the Ontarian context, at least ten groups in wave I and nine groups in wave II prior to 1850 would need to be considered. Of the required 19 corpora, we have just one regional corpus for period 1 (NW-BrE), one AmE for period 1 and two BrE text collections (ARCHER-1 and CONCE), the latter two of which include texts that are closer to standard varieties and may not be the type of material one would ideally wish for.¹⁰ In some respects, we are 16 corpora short of being able to reach hard and fast conclusions. Moreover, given current copyright practices, we may be a couple of generations of researchers away from the necessary research tools.

Table 3. Eleven modals in CanE in comparison to BrE (data for AmE-1 and NW-BrE-1 is indicated in relation where applicable) (adapted from Dollinger forthc.c: table 11.1)

Assessment columns: Conservative, Towards cons., Neutral, Towards progr., Progressive

VARIABLES          No.   function / context      comparison data
CAN & MAY           1)   permission
                    2)   root poss.
                    3)   affirmative contexts    (AmE-1)
                    4)   negative contexts       (AmE-1)
OUGHT TO            5)   overall
SHALL & WILL        6)   1st person⁹             (NW-BrE) (AmE-1)
                    7)   2nd person
                    8)   3rd person
                    9)   inanimate subjects
                   10)   passive structures
SHOULD & WOULD     11)   hypotheticals
                   12)   non-hypotheticals
MUST & HAVE TO     13)   root uses               (AmE-1)
                   14)   epistemic uses          (AmE-1)
                   15)   ep. MUST NECESS.        (NW-BrE) (AmE-1)
COULD & MIGHT      16)   affirmative contexts    (AmE-1)
                   17)   negative contexts       (AmE-1)
                   18)   epistemic uses          (AmE-1)
                   19)   non-epistemic uses

Circumventing this lack of resources, the approach laid out in Sections 2 and 3 outlines a method that synthesizes the OntE data with existing real-time resources in combination with studies of the variables in later periods. While this approach is somewhat compromised by the possibility of retrograde developments in the nineteenth and twentieth centuries, it offers a more precise scenario of language use in colonial varieties and their relations to dominant varieties of English.¹¹ All that is needed is one reliable corpus of the variety to be surveyed, such as CONTE-pC, and access to the available corpora of input varieties. Findings gleaned from studies such as the present one are, given the lack of resources, likely to hold for many years to come.

While the current approach employs both percentage changes in the distributional frequency of features and statistical testing, it does not necessarily rely on levels of significance in the interpretation of its findings. Given the short period intervals of 25 years, which are necessary to tap into the process of new-dialect formation (Trudgill 2004), the customary sample sizes in historical linguistics do not easily produce statistical significance at the 95% level even for relatively high-frequency items like the modals. However, the percentage distributions appear to show a clear long-term trend from a present-day perspective.

The biggest challenge for LModE variationist linguistics is the lack of specialized corpora for regional English dialects, and this is compounded by the lack of sufficiently large corpora for historical linguistic analysis using corpus linguistics techniques. We have seen that – in theory – at least four times the customary sample sizes would be the base line needed to reach significance levels for the modal auxiliaries in semantic studies. However, as I have tried to show in the discussion of CAN and MAY, one can arrive at meaningful interpretations of even limited data sets by exploiting the relative temporal proximity of LModE to twentieth-century English, using Coates’, Ehrman’s and Facchinetti’s studies.
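The relation between sample size and significance can be made concrete with a small Python sketch. It is a thought experiment under a strong assumption, namely that a larger corpus would show exactly the same proportions of CAN and MAY, and it is not a calculation taken from the study. The counts are the BrE-d root possibility counts from Appendix 1; `chi_square_for_can` and `scaled` are hypothetical helper names, and the test statistic is the one described in the Appendices (observed CAN counts against frequencies expected from the row totals).

```python
CRITICAL_95 = 3.84  # chi-square critical value, 1 degree of freedom, 95% level


def chi_square_for_can(table):
    """Chi-square for the CAN column only.

    table: two rows of (CAN, MAY) counts, one row per period or variety.
    Expected CAN counts are proportional to each row's total.
    """
    can = [row[0] for row in table]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    can_total = sum(can)
    expected = [can_total * total / grand_total for total in row_totals]
    return sum((o - e) ** 2 / e for o, e in zip(can, expected))


def scaled(table, factor):
    """Multiply every cell by the same factor, keeping proportions fixed."""
    return [tuple(count * factor for count in row) for row in table]


bre_d = [(5, 12), (5, 4)]  # BrE-d, periods 1 and 2+3 (Appendix 1)

for factor in (1, 2, 4):
    value = chi_square_for_can(scaled(bre_d, factor))
    print(factor, round(value, 2), value > CRITICAL_95)
```

Because the statistic grows in proportion to the sample size when the proportions are held constant, the BrE-d distribution crosses the 3.84 threshold only at roughly four times the attested counts. Tables with smaller differences between the periods would require still larger samples.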

As a result, given the limited resources for corpus compilation, the production of smaller, specialized corpora of more diverse LModE varieties, rather than the production of a bigger-sized corpus of a single variety, should remain a priority. Logically, the “three main source regions of extraterritorial varieties of English” in the British Isles (Hickey 2004b: 33), English English (with a differentiation of north and south [Trudgill 2004]), Scottish English and Irish English (both Ulster Scots and Southern Irish English), from around 1700 to 1900, would be the first choices for new research tools for the spread of English in the LModE period. In the Ontarian context, 19th-century AmE data is another prime desideratum.

While we would wish to have access to complete dialect lineages of all input dialects involved in the formation of a given variety (as called for by Wolfram and Schilling-Estes 2004: 182), corpora of LModE Scottish and Irish English would tremendously facilitate the line of heuristic reasoning proposed in Sections 2 and 3. Ultimately, only complete data sets can prevent us from drawing tentative conclusions about influences, independent developments and questions of colonial lag that may not quite stand the test of time. However, by taking a more explicit, exploratory heuristic point of view, we do not necessarily have to call it quits until these resources materialize.

Appendices: Statistical testing (95% level)

For the following contingency tables, both the Chi-Square Test scores (the standard in much of corpus linguistics when statistical testing is applied) and Fisher’s Exact Test scores are provided. Please note that Fisher’s Exact is to be preferred in all cases where one of the cells contains fewer than five instances (Vogt 2005: 122; Oakes 1998: 25; cf. Woods et al. 1986: 144), which is the case in all but five tables. In all tables shown here, both tests arrive at the same conclusion about statistical significance, which supports Woods et al.’s assessment – given the ‘bad’ data situation in historical linguistics – to “go ahead and carry out the chi-squared test even if some expected frequencies are rather too small” (1986: 145). Provided one is aware that the chi-square values tend to be rather larger than they ought to be and one would consider this in one’s interpretations, this seems to be acceptable practice. Chi-square calculations (χ2) follow the test statistic used in Nelson, Wallis and Aarts (2002: 264–7) and are calculated to answer the question whether the choice of CAN is affected by the independent variable (periods 1 and 2+3, BrE and CanE respectively) and not whether the entire grammatical choice (table) is affected (see Nelson, Wallis and Aarts 2002 for an account). The latter method, usually offered by online tools (e.g. Georgetown Linguistics Web Chi Square Calculator), reaches significance more easily but says nothing about which of the two variables (CAN or MAY) is significant. Fisher’s Exact p-values are calculated with Preacher and Briggs’ (2001) online tool and show the two-tailed p-values (p), which lie in between two more extreme values (one of which might be chosen according to the expected distribution of the variables, which was not applied in the following calculations).
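The two statistics can be sketched in Python using only the standard library. This is a minimal reconstruction under stated assumptions, not the tools named above: `chi_square_for_can` compares the observed CAN counts with frequencies expected from the row totals, and `fisher_exact_two_tailed` uses one common two-tailed definition (summing all tables that are no more probable than the observed one). Both function names are hypothetical, and whether the online tool uses the same two-tailed definition is an assumption.

```python
from math import comb


def chi_square_for_can(table):
    """Chi-square for the CAN column of a 2 x 2 table.

    table: two rows of (CAN, MAY) counts, one row per period or variety.
    """
    can = [row[0] for row in table]
    row_totals = [sum(row) for row in table]
    grand_total = sum(row_totals)
    can_total = sum(can)
    expected = [can_total * total / grand_total for total in row_totals]
    return sum((o - e) ** 2 / e for o, e in zip(can, expected))


def fisher_exact_two_tailed(table):
    """Two-tailed Fisher's Exact p-value for a 2 x 2 table.

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    observed = probability(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance so that ties with the observed table count.
    return sum(
        probability(x)
        for x in range(lowest, highest + 1)
        if probability(x) <= observed * (1 + 1e-9)
    )


bre_d_root = [(5, 12), (5, 4)]  # BrE-d, root possibility (Appendix 1)
print(chi_square_for_can(bre_d_root))
print(fisher_exact_two_tailed(bre_d_root))
```

For the BrE-d root possibility table this sketch agrees with the appendix values to within rounding, and both results stay on the non-significant side of the 3.84 and 0.05 thresholds. A table with an empty row has an expected frequency of zero and cannot be passed to `chi_square_for_can`.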

Appendix 1: Root possibility uses of CAN

BrE-d      CAN   MAY
1            5    12
2+3          5     4
χ2 = 1.04 < 3.84, not significant
p = 0.23 > 0.05, not significant

CanE-l     CAN   MAY
1           18    16
2+3         39    25
χ2 = 0.24 < 3.84, not significant
p = 0.52 > 0.05, not significant

CanE-n     CAN   MAY
1            2     2
2+3          4     1
χ2 = 0.3 < 3.84, not significant
p = 0.52 > 0.05, not significant

CanE-d     CAN   MAY
1            2     8
2+3         25    49
χ2 = 0.52 < 3.84, not significant
p = 0.49 > 0.05, not significant

BrE-n      CAN   MAY
1           16    15
2+3         15    23
χ2 = 0.56 < 3.84, not significant
p = 0.34 > 0.05, not significant

n-1        CAN   MAY
CanE         2     8
BrE         16    15
χ2 = 1.72 < 3.84, not significant
p = 0.14 > 0.05, not significant

BrE-l      CAN   MAY
1           24    16
2+3         19     8
χ2 = 0.27 < 3.84, not significant
p = 0.44 > 0.05, not significant

n-2+3      CAN   MAY
CanE        25    49
BrE         15    23
χ2 = 0.2 < 3.84, not significant
p = 0.68 > 0.05, not significant

Appendix 2: Permission uses of CAN

CanE-d     CAN   MAY
1            1     1
2+3          1     0
χ2 = 0.25 < 3.84, not significant
p = 1.0 > 0.05, not significant

CanE-n     CAN   MAY
1            0     0
2+3          5     0
χ2 cannot be calculated
p = 1.0 > 0.05, not significant

BrE-l      CAN   MAY
1            1     6
2+3          3     2
χ2 = 1.83 < 3.84, not significant
p = 0.22 > 0.05, not significant

CanE-l     CAN   MAY
1            2    11
2+3          8    16
χ2 = 1.01 < 3.84, not significant
p = 0.44 > 0.05, not significant

BrE-d      CAN   MAY
1            2     2
2+3          4     2
χ2 = 0.3 < 3.84, not significant
p = 1.0 > 0.05, not significant

BrE-n      CAN   MAY
1            4     4
2+3          6     2
χ2 = 0.4 < 3.84, not significant
p = 0.61 > 0.05, not significant

l-1        CAN   MAY
CanE         2    11
BrE          1     6
χ2 = 0.37 < 3.84, not significant
p = 1.0 > 0.05, not significant

l-2+3      CAN   MAY
CanE         8    16
BrE          3     2
χ2 = 0.78 < 3.84, not significant
p = 0.34 > 0.05, not significant


Appendix 3

Textual sources of the Corpus of Early Ontario English: periods, texts and sample sizes (number of words). CONTE-pC size: 125,000 words (CONTE size: ca. 225,000 words).

CONTE-pC is the first part of the CONTE corpus (1776–1899, see Dollinger 2006) and will be made accessible to researchers as soon as copyright has been fully cleared and the manual has been completed (for the time being, please contact the author for more information).

Period 1 (1776–1799)
  newspapers: Upper Canada Gazette, ca. 2,800; Canada Constellation, ca. 1,700 (sum: 4,500 words)
  diaries: Benjamin Smith, ca. 1,800; Anne Powell, ca. 6,200 (sum: 8,000 words)
  (semi-)official letters: 61 letters (by 48 authors) (sum: 15,000 words)

Period 2 (1800–1824)
  newspapers: Upper Canada Guardian, ca. 8,200; Upper Canada Gazette, ca. 5,000; Kingston Gazette, ca. 1,400 (sum: 14,600 words)
  diaries: Benjamin Smith, ca. 8,500; Ely Playter, ca. 8,500; (Eleanora Hallen, ca. 3,700, not incl. in pC version)¹² (sum: 17,000 words)
  (semi-)official letters: 65 letters (by 48 authors) (sum: 15,000 words)

Period 3 (1825–1849)
  newspapers: Upper Canada Gazette, ca. 8,500; Niagara Argus, ca. 4,500; Gore Gazette, ca. 3,000 (sum: 16,000 words)
  diaries: Sophia MacNab, ca. 11,400; Charlotte Harris, ca. 9,200 (sum: 20,600 words)
  (semi-)official letters: 77 letters (by 64 authors) (sum: 15,000 words)

Genre totals: newspapers 35,100; diaries 45,600; (semi-)official letters 45,000
Total: 125,700

Social (and regional) stratification within CONTE-pC: sample sizes for Scottish-Irish-Northern English speakers (SIN) and social class subsections.

Period   SIN-speakers   Lower Class   Middle Class
1               2,200         3,200         11,700
2               1,700         1,000         14,000
3               1,000         3,100         12,000

Notes

* The research for this paper was funded by the Austrian Academy of the Humanities and Sciences, Österreichische Akademie der Wissenschaften, DOC grant 21701. I would like to thank three anonymous reviewers for their feedback on an earlier version of this paper.

1. Recently, a revision process has begun. See www.dchp.ca (21 Dec. 2006) for more details.

2. The best indicators are probably the founding of the Late Modern English Conference series (LMEC) in 2001 and the appearance of special collections within bigger English historical linguistics conferences (Bueno Alonso et al. eds. forthc., Dalton-Puffer et al. eds. 2006), the appearance of textbooks and reference guides (Bailey 1996, Romaine 1998, Görlach 1999, 2001, Beal 2004) and major projects such as Tieken-Boon van Ostade’s “The codifiers and the English language” project, see http://www.ulcl.leidenuniv.nl/index.php3?m=9&c=122 (31 Jan. 2006).

3. Inner-British migration is a factor to be considered here. While a departure port in England or Scotland does not necessarily mean that the immigrant was English or Scottish (Liverpool as a gateway for the Irish is a prime example, or Glasgow and Greenock in Scotland), “relatively few Britishers [English and Scottish] sailed from Irish ports” (Akenson 1999: 14). Cowan (1961: 287) stresses the fact that the Irish were using British ports, especially once steamboat transportation from Ireland to Britain had become available (probably around 1825), reducing the costs. This would cause the Irish element to be underrepresented in figure (1), further increasing their share of the total (later) immigration.

4. I am indebted to Christian Mair at Freiburg University for granting me access to ARCHER-1 (compiled by Douglas Biber et al.)

5. Compiled by David Denison, Linda van Bergen and Joana Soliva Proud. My thanks go to David Denison for granting me online access to the corpus (cf. http://lings.ln.man.ac.uk/info/staff/dd/papers/newcastle_late_18c.pdf, 24 Jan. 2006).

6. OntE is referred to as CanE in the context of BrE and AmE, unless stated otherwise.

7. In the past two decades, genre-specific analyses have made big strides (Biber 1988, Kytö and Rissanen 1983), and while it is self-evident that variables differ between genres, regional provenance will be foregrounded as the prime independent variable. We will therefore need to devise a means of assessing the behaviour of a linguistic variable across the genres used. In this respect, we aim to complement detailed genre analysis with statements of the overall behaviour of a variable in a given variety in relation to other varieties.

8. Both the diagrams and the appendices use the following abbreviations: d = diaries, l = letters, n = newspapers, periods 1, 2+3 as defined in table (1). CanE-l would therefore mean “Canadian English letters” as used in the appendices.

9. 1st person WILL poses challenges for the 5-tier grid, without referring to the diachronic development, and shows the limitations of this, admittedly rough, assessment.

10. Note that Southern AmE corpora, SPOC and BLUR [see Schneider 2007: 355, 358] would not constitute input varieties of OntE.

11. However, given the steady, long-term development of the modals along clines [Traugott 1989, Abraham 2002], it is somewhat unlikely that reversals would have occurred.

12. Cf. Dollinger (2006: 25) for the provisional CONTE design (full version).

References

Abraham, Werner
2002 Modal verbs: epistemics in German and English. In Modality and its interaction with the verbal system (Linguistik Aktuell, 47), Sjef Barbiers, Frits Beukema and Wim van der Wurff (eds.), 19–50. Amsterdam: Benjamins.

Akenson, Donald H.
1999 The Irish in Ontario. A study in rural history. 2nd ed. Montreal: McGill-Queen’s University Press.

Avis, Walter S.
1978 Problems in the study of Canadian English. In Walter S. Avis: essays and articles. Selected from a quarter century of scholarship at the Royal Military College of Canada, Kingston, Thomas Vincent, George Parker and Stephen Bonnycastle (eds.), 3–12. Kingston: Royal Military College of Canada.

Avis, Walter S., Charles Crate, Patrick Drysdale, Douglas Leechman, Matthew H. Scargill and Charles J. Lovell (eds.)
1967 A dictionary of Canadianisms on historical principles. Toronto: Gage.


Barber, Katherine (ed.)
2004 [¹1998] Canadian Oxford dictionary. 2nd ed. Toronto: Oxford University Press.

Bausenhart, Werner
1989 German immigration and assimilation in Ontario, 1793–1918. New York: Legas.

Beal, Joan C.
2004 English in modern times 1700–1945. London: Arnold Hodder.

Biber, Douglas
1988 Variation across speech and writing. Cambridge: Cambridge University Press.
2004 Modal use across registers and time. In Studies in the history of the English language II. Unfolding conversations, Anne Curzan and Kimberly Emmons (eds.), 189–216. (Topics in English Linguistics, 45) Berlin: Mouton de Gruyter.

Biber, Douglas, Susan Conrad and Randi Reppen
1998 Corpus linguistics. Investigating language structure and use. Cambridge: Cambridge University Press.

Blackburn, Simon
2005 The Oxford dictionary of philosophy. 2nd ed. Online version. Oxford: Oxford University Press.

Bloomfield, Morton W.
1948 Canadian English and its relation to eighteenth century American speech. Journal of English and Germanic Philology 47: 59–66 [reprinted in Chambers (1975), 3–11].

Brinton, Laurel J. and Margery Fee
2001 Canadian English. In The Cambridge History of the English Language. Vol. VI. English in North America, John Algeo (ed.), 422–440. Cambridge: Cambridge University Press.

Bueno Alonso, Jorge L., Dolores González Álvarez, Javier Pérez Guerra, and Esperanza Rama Martínez (eds.)
Forthc. ‘Of varying language and opposing creed’: New Insights into Late Modern English. Bern: Peter Lang (= Linguistic Insights).

Chambers, J. K. (ed.)
1975 Canadian English. Origins and structures. Toronto: Methuen.

Chambers, J. K.
1981 ‘Lawless and vulgar innovations’: Victorian views of Canadian English. Toronto Working Papers in Linguistics 2: 13–44.
1993 “Lawless and vulgar innovations”: Victorian views on Canadian English [rev. version]. In Focus on Canada, Sandra Clarke (ed.), 1–26. (Varieties of English Around the World, G11) Amsterdam: Benjamins.
1998 English: Canadian varieties. In Language in Canada, John Edwards (ed.), 252–272. Cambridge: Cambridge University Press.
2004 ‘Canadian Dainty’: the rise and decline of Briticisms in Canada. In Legacies of colonial English. Studies in transported dialects, Raymond Hickey (ed.), 224–241. (Studies in English Language) Cambridge: Cambridge University Press.

Clyne, Michael (ed.)
1992 Pluricentric languages. Differing norms in different nations. Berlin: Mouton de Gruyter.

Coates, Jennifer
1983 The semantics of the modal auxiliaries. London: Croom Helm.
1995 The expression of root and epistemic possibility in English. In Modality in grammar and discourse (Typological Studies in Language 32), J. Bybee and S. Fleischmann (eds.), 55–66. Amsterdam: Benjamins.

Cowan, Helen
1961 British immigration to British North America. The first hundred years. Rev. and enl. ed. Toronto: University of Toronto Press.

Dalton-Puffer, Christiane, Dieter Kastovsky, Nikolaus Ritt and Herbert Schendl (eds.)
Forthc. Syntax, style and grammatical norms: English from 1500–2000 (Linguistic Insights, 39). Bern: Peter Lang.

Denison, David
1993 English historical syntax: verbal constructions. (Longman Linguistics Library) London: Longman.
1998 Syntax. In The Cambridge history of the English language. Vol. IV: 1776–1997, Suzanne Romaine (ed.), 92–329. Cambridge: Cambridge University Press.

de Wolf, Gaelan Dodds, Robert J. Gregg, Barbara P. Harris and Matthew H. Scargill (eds.)
1997 [¹1967] Gage Canadian Dictionary. 5th ed. Toronto: Gage.

Dollinger, Stefan
2006 Oh Canada! Towards the Corpus of Early Ontario English. In The changing face of corpus linguistics (Language and Computers, 55), Antoinette Renouf and Andrew Kehoe (eds.), 7–25. Amsterdam: Rodopi.
forthc.a. The modal auxiliaries HAVE TO and MUST in the Corpus of Early Ontario English: gradience and colonial lag theory in Early Canadian English. Canadian Journal of Linguistics 51/2&3: 189–210.
forthc.b. The importance of demography for the study of historical Canadian English: three examples from the Corpus of Early Ontario English. In J. L. Bueno Alonso et al. (eds.).
forthc.c. New-dialect formation in Canada: evidence from the modal auxiliaries (Studies in Language Companion Series). Amsterdam: Benjamins.

Ehrman, Madeline E.
1966 The meanings of the modals in present-day American English. The Hague: Mouton.

Facchinetti, Roberta
2000 Can and could in contemporary British English: a study of the ICE-GB corpus. In New frontiers of corpus research (Language and Computers. Studies in Practical Linguistics, 36), Pam Peters, Peter Collins and Adam Smith (eds.), 229–246. Amsterdam: Rodopi.
2002 Can. In Variation in central modals. A repertoire of forms and types of usage in Middle English and Early Modern English, Maurizio Gotti, Marina Dossena, Richard Drury, Roberta Facchinetti and Maria Lima (eds.), 45–65. Bern: Lang.
2003 Pragmatic and sociological constraints on the functions of may in contemporary British English. In Modality in contemporary English (Topics in English Linguistics, 44), Roberta Facchinetti, Manfred Krug and Frank Palmer (eds.), 301–327. Berlin: Mouton de Gruyter.

Georgetown Linguistics Web Chi Square Calculator
Programmed by Catherine N. Ball, Jeffrey Connor-Linton, Kathi Taylor. http://www.georgetown.edu/faculty/ballc/webtools/web_chi.html (28 Dec. 2006).

Görlach, Manfred
1987 Colonial lag? The alleged conservative character of American English and other ‘colonial’ varieties. English World-Wide 8 (1): 41–60.
1988 The study of early Modern English variation - the Cinderella of English historical linguistics. In Historical dialectology. Regional and social, Jacek Fisiak (ed.), 211–228. (Trends in Linguistics. Studies and Monographs, 37) Berlin: Mouton de Gruyter.

Gourlay, Robert
1822 Statistical account of Upper Canada. Compiled with a view to a grand system of emigration (CIHM 35937). London: Simkin & Marshall.

Hickey, Raymond
2004a Introduction. In Legacies of colonial English. Studies in transported dialects, Raymond Hickey (ed.), 1–30. (Studies in English Language) Cambridge: Cambridge University Press.
2004b Dialects of English and their transportation. In Legacies of colonial English. Studies in transported dialects, Raymond Hickey (ed.), 33–58. (Studies in English Language) Cambridge: Cambridge University Press.

Hultin, Neil C.
1967 Canadian views of American English. American Speech 42: 243–260.

Jones, Charles
1989 A history of English phonology. London: Longman.

Kytö, Merja
1991 Variation in diachrony, with early American English in focus. Studies on CAN/MAY and SHALL/WILL. Bern: Lang.

Kytö, Merja, Juhani Rudanko and Erik Smitterberg
2000 Building a bridge between the present and the past: a corpus of 19th-century English. ICAME Journal 24: 85–97.

Kytö, Merja and Matti Rissanen
1983 The syntactic study of Early American English. The variationist at the mercy of his corpus? Neuphilologische Mitteilungen 84: 470–490.

Labov, William
1994 Principles of linguistic change. Volume 1: Internal factors. Oxford: Blackwell.

Lovell, Charles J.
1955 Lexicographic challenges of Canadian English. Journal of the Canadian Linguistic Association (Canadian Journal of Linguistics) 1/(1) (Mar.): 2–5.

Marckwardt, Albert H.
1958 American English. New York: Oxford University Press.

Michalewicz, Zbigniew and David B. Fogel
2004 How to solve it: modern heuristics. 2nd ed. Berlin: Springer.

Nelson, Gerald, Sean Wallis and Bas Aarts
2002 Exploring natural language. Working with the British component of the International Corpus of English. Amsterdam: Benjamins.

Nevalainen, Terttu
1999 Making the best use of ‘bad’ data. Neuphilologische Mitteilungen 100 (4): 499–533.

Oakes, Michael P.
1998 Statistics for corpus linguistics. Edinburgh: Edinburgh University Press.

Orkin, Mark M.
1970 [1971] Speaking Canadian English. An informal account of the English language in Canada. [Reprint]. New York: McKay Company.


Preacher, Kristopher J. and Nancy E. Briggs
2001 Calculation of Fisher’s Exact Test: an interactive calculation tool for 2 x 2 tables. Avail. from http://www.quantpsy.org (20 Dec. 2006).

Scargill, Matthew H.
1957 Sources of Canadian English. Journal of English and Germanic Philology 56: 611–614 [reprinted in Chambers (1975), 12–15].

Schneider, Edgar W.
2007 MY BABY LOVES ME, SHE LOVE ME: verbal -s variability in the history of black and white dialects of the southern United States. In From Beowulf to But’n’ben a-go-go: varieties of English throughout time (Austrian Studies in English, 95), Ute Smit, Stefan Dollinger, Julia Hüttner, Gunther Kaltenböck and Ursula Lutzky (eds.), 349–362. Vienna: Braumüller.

Simon-Vandenbergen, A. M.
1984 Deontic possibility: a diachronic view. English Studies 65 (4): 362–365.

Sundby, Bertil, Anne Kari Bjørge and Kari E. Haugland
1991 A dictionary of English normative grammar 1700–1800. Amsterdam: Benjamins.

Thomas, Eric R.
1991 The origin of Canadian Raising in Ontario. Canadian Journal of Linguistics 36: 147–170.

Traugott, Elizabeth Closs
1972 A history of English syntax. A transformational approach to the history of English sentence structure. New York: Holt, Rinehart & Winston.
1989 On the rise of epistemic meanings in English: an example of subjectification in semantic change. Language 65 (1): 31–55.

Trudgill, Peter
2002 The history of the lesser-known varieties of English. In Alternative histories of English, Richard Watts and Peter Trudgill (eds.), 29–44. London: Routledge.
2004 New dialect formation. The inevitability of colonial Englishes. Edinburgh: Edinburgh University Press.

Tversky, Amos and Daniel Kahneman
1982 Judgment under uncertainty: heuristics and biases. In Judgment under uncertainty: heuristics and biases, Daniel Kahneman, Paul Slovic and Amos Tversky (eds.). Cambridge: Cambridge University Press.

Vogt, W. Paul
2005 Dictionary of statistics & methodology. Thousand Oaks, CA: Sage.


Warner, Anthony R.
1993 English auxiliaries. Structure and history. Cambridge: Cambridge University Press.

Watts, Richard and Peter Trudgill (eds.)
2002 Alternative histories of English. London: Routledge.

Wolfram, Walt and Natalie Schilling-Estes
2004 Remnant dialects in the coastal United States. In Legacies of colonial English. Studies in transported dialects, Raymond Hickey (ed.), 172–202. (Studies in English Language) Cambridge: Cambridge University Press.

Wood, J. David
2000 Making Ontario. Agricultural colonization and landscape re-creation before the railway. Montreal: McGill-Queen’s University Press.

Woods, Anthony, Paul Fletcher and Arthur Hughes
1986 Statistics in language studies. Cambridge: Cambridge University Press.