: ottawa-hull: 3,5 . tlf: 150 . : pisa corpus: 10 . : inl ... · barnbrook, geoff. 1996. ......



Page 1: : Ottawa-Hull: 3,5 . TLF: 150 . : Pisa Corpus: 10 . : INL ... · Barnbrook, Geoff. 1996. ... Corpus, Concordance, Collocation. Oxford: Oxford University Press. Sinclair, John. 1996


:

Abstract

This paper is the first public presentation of the research programme Basic Corpus of Greek Texts, a collaboration between the University of Athens and the University of Cyprus that aims to build a new, extensive and representative corpus of Greek. The Corpus of Greek Texts (CGT) is designed to collect a substantial amount of data (30 million words) in a short time span (1-2 years), as a basis for linguistic research and a resource for teaching applications. The range and representativeness of the genres included, together with free access to the corpus, should make the CGT one of the most essential tools for the study of Greek. The paper presents the research area, the aims and needs of the programme, the identity and structure of the CGT, and the methodological issues, linguistic implications and applications related to the compilation of the corpus.

-

, ( )

1.

« »,

.

,

.1

( ), ( )

.

, ,

,

.

(computer corpus

linguistics).

,

, , ,

. . ( 1999: 168,

Leech 1992, Sinclair 1991). «

, , »

( 1999: 170, . Aarts 1991)


( . .: 219). ,

,

(Leech 1992: 106).

Sinclair (1996),

,

,

.

, ,

,

.

,

( . ).

Kennedy, «

, » (1998:

291). ,

.

, , , , ,

, , ,

( 1999: 170).

.

2.

,

, ,

,

.

:

-

/

- ,

- ,

- ( . . , . .)


-

.

, (30

) (1-2 )

.

,

.

.

,

( . Goutsos, King & Hatzidaki 1994, ). ,

90,

( . .) 10 .

,

( . . ),

, , ,

.

, ,

. ,

, :

:

BNC Corpus: 100 . (10 . )

Bank of English Corpus: 329 .

(60 . )

Cancode corpus: 5 .

:

Ottawa-Hull: 3,5 .

ELRA Parole Corpus: 20 .

TLF: 150 .

:

Mannheim Corpus: 8 .

Muenster Textbank: 94 .

:

Pisa Corpus: 10 .

: Corpus Oral De Referencia Del Espanol: 1,1 .

Mark Davies Modern Newspapers: 35 .

: Mark Davies Modern Newspapers 26 .

: INL 1995 27 .

INL 1996 38 .

: 10 . .

( . Goutsos, Hatzidaki & King 1994)

( ) ,


, ,

.

.

( . , . Renouf 1987 ,

). (30 )

, . ,

, Cobuild ,

, 20 (Sinclair

1987).

. ,

,

(Georgakopoulou & Goutsos 1998).

3.

,

,

. ,

.

1990

. , .

:

: 30

:

3 . (10 %)

:

0,5 .

:

1,5 .

, :

1 .

:

27 . (90 %)

: :

5 .

: :

5 .

:

5 .

: :

5 .

: :

5 .

:

2 .
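As a quick consistency check, the figures in the breakdown above can be summed programmatically. This is only a minimal sketch: it assumes every figure is in millions of words (as the 30-million-word target in the abstract suggests), and because the category names are not legible in this copy of the text, the two groupings carry placeholder names that are not the authors' own.

```python
from fractions import Fraction

# All figures are in millions of words (assumption based on the
# 30-million-word target stated in the abstract).
TOTAL = Fraction(30)

# Placeholder groupings (hypothetical names): the original category
# labels are not legible in this copy of the text.
smaller_component = [Fraction("0.5"), Fraction("1.5"), Fraction(1)]
larger_component = [Fraction(5)] * 5 + [Fraction(2)]


def percentage(part, whole):
    """Return part as an exact percentage of whole."""
    return part / whole * 100


smaller_total = sum(smaller_component)  # 0.5 + 1.5 + 1
larger_total = sum(larger_component)    # 5 * 5 + 2

print(f"smaller component: {smaller_total} million "
      f"({percentage(smaller_total, TOTAL)} %)")
print(f"larger component: {larger_total} million "
      f"({percentage(larger_total, TOTAL)} %)")
print(f"components add up to the target: {smaller_total + larger_total == TOTAL}")
```

Exact fractions are used instead of floats so that the 0,5 and 1,5 figures sum without rounding error.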


3

(1 2

).

,

.

( , , , . .)

.

. ,

-

.

4.

,

« » , ,

Sinclair (1996), .2

« »

. Barnbrook (1996:

24),

.

, ,

,

BNC ( . )

,

, , , Bank of English

(monitor corpus),

( . Barnbrook 1996: 25).

(Biber, Conrad & Reppen 1998: 248).

(

, , , . .),

( 1999: 56)

.

Kučera (2002)

,


, , , « »

.

(30 ),

.

,

,

( . . , Cobuild ,

).

, ,

, . ,

( ) .

( .

) ( )

.

,

,

,

.3

, ,

,

.

.

,

, , ,

.4

,

,

. (

,

).

,

,

- .

,

( . )

,

,

. ,

,

. ,

.


5.

:

/

( )

( )

,

.

, ,

. ,

, , ,

. , ,

( , , , . .).5

, , ( . .

, , . .) ASCII. ,

, , , . .,

. ,

. ,

, , ,

, . ., .

:

: -

: , , , , , , , ,

: -

: , , - , , , , , , ,

,

- : 01-99

: -

-

,

. ,


- · ,

-

. , ,

( -

)

. - ,

.

,

.

. (2003),

,

,

. (2004)

, ,

.

. ,

. ,

.

12 . ,

:

www.ucy.ac.cy/ sek.

6.

,

. ,

:

) :

,

, ,

, -

( . Chafe, Du Bois & Thompson 1991: 64-66).

,

, Goutsos, Hatzidaki & King (1994), , King

(1995), Georgakopoulou & Goutsos (1998) and Goutsos (1999).

) :

( .

Wichmann, Fligelstone, McEnery & Knowles 1997, , ).


, ,

, CD .

) :

:

1) ,

,

2) ,

(1998),

3) ,

, ( . . Perseus Project

).

) :

.

,

(tagging)

(annotation) .

,

.

, , .

-

50.000

200.000

50.000

200.000

1 1

50.000

1 1

100.000

1 1

50.000

1 2-3

50.000

1

150.000

1

100.000

250.000

1.250.000

1 1

200.000

1 2-3

200.000

1 1

90.000

10.000

2.052.000

1.539.000

564.300


410.400

359.100

102.600

51.300

51.300

270.000

/

1.166.400

1.166.400

.

583.200

/

1.270.080

1.270.080

.

635.400

64.800

1.382.400

/

1.382.400

691.200

259.200

/

259.200

129.600

86.400

/

86.400

43.200

777.600

/

648.000

-

388.800

- -

259.200

129.600

129.600

259.200

259.200

/

216.000

-

129.600

- -

86.400

43.200

43.200

86.400

/

777.600

777.600

-

518.400

- -

518.400

518.400

518.400

518.400

518.400

259.200

259.200

-

756.000

-

756.000

32.400

.

81.000

32.400


.

81.000

32.400

64.800

108.000

216.000

1 « » :

:

( )

( )

:

( , )

(Department of Byzantine and Modern Greek Studies, King's College London)

Philip King (School of English

EISU, Birmingham)

-

-

- ( )

:

« »,

:

,

( )

:

,

, ,

, .

2 .

3 , (10 %), , ,

, (Goutsos, King & Hatzidaki 1994).

4 .

5 , ,

.

Aarts, Jan. 1991. Intuition-based and observation-based grammars. English Corpus Linguistics, ed. Karen Aijmer & Bengt Altenberg, 44-62. London: Longman.

Barnbrook, Geoff. 1996. Language and Computers. Edinburgh: Edinburgh University Press.


Biber, Douglas, Conrad, Susan & Reppen, Randi. 1998. Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.

Chafe, Wallace L., Du Bois, John W. & Thompson, Sandra A. 1991. Towards a new corpus of spoken American English. English Corpus Linguistics, ed. Karen Aijmer & Bengt Altenberg, 64-82. London: Longman.

Georgakopoulou, Alexandra & Goutsos, Dionysis. 1998. Conjunctions versus discourse markers in Greek: The interaction of frequency, positions and functions in context. Linguistics 36 (5). 887-917.

, . 1999. . :

.

Goutsos, Dionysis. 1999. Translation in bilingual lexicography. Editing a new English-Greek Dictionary. Babel 45 (2). 107-126.

Goutsos, Dionysis, Hatzidaki, Rania & King, Philip. 1994. A corpus-based approach to Modern Greek language research and teaching. Themes in Greek Linguistics: Papers from the First International Conference on Greek Linguistics. Reading, September 1993, ed. Irene Philippaki-Warburton, Katerina Nicolaidis & Maria Sifianou, 507-513. Amsterdam/Philadelphia: John Benjamins.

Goutsos, Dionysis, King, Philip & Hatzidaki, Rania. 1994. Towards a Corpus of Spoken Modern Greek. Literary and Linguistic Computing 9 (3). 215-223.

, , King, Philip

, . 1995. corpus

. .

15

, 11-14 1994, 843-854. :

.

Kennedy, Graeme. 1998. An Introduction to Corpus Linguistics. London: Longman.

Kučera, Karel. 2002. The Czech National Corpus: Principles, design and results. Literary and Linguistic Computing 17 (2). 245-257.

Leech, Geoffrey. 1992. Corpora and theories of linguistic performance. Directions in Corpus Linguistics, ed. Jan Svartvik, 105-122. Berlin/New York: Mouton de Gruyter.

, . 1998. . :

.

, . 1999. . . :

Gutenberg.

Renouf, Antoinette. 1987. Corpus development. Looking Up, ed. John Sinclair, 1-40. London and Glasgow: Collins ELT.

Sinclair, John (ed.). 1987. Looking Up. London and Glasgow: Collins ELT.

Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.

Sinclair, John. 1996. Preliminary recommendations on corpus typology. EAGLES (http://www.ilc.pi.cnr.it/EAGLES/corpustyp/corpustyp.html).


Wichmann, Anne, Fligelstone, Steven, McEnery, Tony & Knowles, Gerry (eds.). 1997. Teaching and Language Corpora. London: Longman.

, . ( ).

.

(19-22 2001).
