7/29/2019 A Corpus-based Approach to Argument Structure Goals Functional
A corpus-based approach to argument structure
José M. García-Miguel
Universidade de Vigo
2007 RRG International Conference (México)
RRG 2007 García-Miguel: Corpus-based approach to argument structure
Goals
To present some possibilities offered by a syntactico-semantic database (ADESSE) for the study of the interaction of verbs and constructions in Spanish
To introduce the main criteria used in building such a database
To give an overview of some general features of argument structure in Spanish
To compare both the approach and the results of ADESSE with some insightful proposals of RRG theory
Functional Grammar(s)
This talk is not specifically about RRG, but it takes as background many ideas shared by most functionalists
Functionalists share the idea that grammar associates forms with meanings and discourse functions: language is a system of communicative social action in which grammatical structures are employed to express meaning in context (Van Valin 2005: 1)
They differ with respect to (among other things):
Standards of adequacy (typological, psychological, ...)
How they conceive the relation between system and use
The role of formalization and the specifics of the formalism
Moderate functionalists (RRG) Extreme functionalists
(A simplified model of) Functional grammar
[Diagram labels: LEXICON; CONSTRUCTIONAL SCHEMAS; LANGUAGE USE; 'entrenchment' (frequency, memory)]
Variation and Grammar: the 'emergentist' view
"Grammar is built up from specific instances of use which marry lexical items with constructions; it is routinized and entrenched by repetition and schematized by the categorization of exemplars" (Bybee 2006)
"Grammar is not fixed and absolute with a little variation sprinkled on the top, but it is variable and probabilistic to its very core" (Bybee & Hopper 2001)
Corpus linguistics
Nowadays, studying language use and the frequency of words and constructions means doing corpus linguistics
Computers have made it possible to search large bodies of real text quickly, and they make the task of analysing, annotating and storing linguistic data easier
Some linguists (e.g. Butler) think that functional linguistics should be not only corpus-based but corpus-driven
Corpus linguistics
Some problems:
The words and patterns that we find in the corpus should not be confused with the words and patterns that are possible in the language
A corpus cannot tell us what is not possible
Every (fragment in the) corpus needs some analysis and interpretation by the linguist
"The conclusion is that 'intuition-based' linguists and 'corpus-based' linguists need each other. Or better, that the two kinds of linguists, wherever possible, should exist in the same body" (Fillmore 1992: 35)
Corpus linguistics and syntax
"Corpus linguistic research has been largely limited to phenomena that can be accessed via searches on particular words (...) However, a (theoretical) syntactician is usually interested in more abstract structural properties that cannot be investigated easily in this way" (Manning 2003: 294)
I.e., a Google search of raw text is not always enough, and neither are morphosyntactically annotated corpora (or texts accompanied by interlinear glosses)
We need detailed syntactic and semantic annotation of corpora
ADESSE
ADESSE = Base de datos de verbos, Alternancias de Diátesis y Esquemas Sintáctico-Semánticos del Español [Syntactic Database of Verbs, Diathesis Alternations and Constructional Schemas of Spanish]
http://adesse.uvigo.es/
An ongoing project funded by the Spanish MEC and EU funds
Goal
A database with syntactic and semantic information for all the verbs and clauses in a corpus of Spanish
ADESSE antecedent
BDS: Base de Datos Sintácticos del español actual (http://www.bds.usc.es/)
A database with the (manual) syntactic analysis of the 159,000 clauses of the ARTHUS corpus
ARTHUS corpus (Archivo de Textos Hispánicos de la Universidad de Santiago de Compostela)
1.5 million words
Textual genres: narrative (37%), spoken (19%), essay (17%), theater (15%), journalistic (12%)
Origin: Spain (79%), Americas (21%)
From BDS to ADESSE
BDS [USC 1990-1999]
Syntactic information: grammatical features of clauses, verbs and arguments of the corpus [each record (clause): 64 fields]
ADESSE [Univ. of Vigo 2002- ]
All the syntactic information from BDS
+ Semantic information
- Verb senses
- Verb classes
- Semantic roles
159,000 clauses
3,500 verbs
13,500 valency patterns
(Part of) a record in ADESSE
Field            Value
TEXT             Al novio le regalaron un automóvil convertible [CRO: 44, 1]
                 'They gave the bridegroom a convertible car as a gift'
PRED             REGALAR
Verb class       Transfer of possession
Voice            Active
                 A0        A1           A2
Synt Function    Subj      DObj         IObj
Agreement        3pl       -            le
Synt Category    -         NP           NP
Sem Role         Donor     Possession   Receiver
(lexical head)   -         automóvil    novio
Synt Schema      S D I
Order            I V D
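A record like the one above could be modelled, for instance, with two small Python dataclasses. This is only an illustrative sketch: the names `ClauseRecord`, `Argument` and `schema` are hypothetical and do not reflect the actual 64-field layout of BDS/ADESSE; the values are the ones shown on the slide.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Argument:
    """One argument of a clause (hypothetical field names)."""
    index: str                      # A0, A1, A2
    function: str                   # Subj, DObj, IObj
    role: str                       # semantic role label
    category: Optional[str] = None  # e.g. NP
    agreement: Optional[str] = None # verbal agreement or clitic
    head: Optional[str] = None      # lexical head, if any


@dataclass
class ClauseRecord:
    """A simplified clause record: verb-level features plus arguments."""
    text: str
    verb: str
    verb_class: str
    voice: str
    arguments: List[Argument] = field(default_factory=list)

    def schema(self) -> str:
        # Abbreviate syntactic functions as on the slide: S, D, I
        abbreviations = {"Subj": "S", "DObj": "D", "IObj": "I"}
        return " ".join(abbreviations[a.function] for a in self.arguments)


record = ClauseRecord(
    text="Al novio le regalaron un automóvil convertible",
    verb="REGALAR",
    verb_class="Transfer of possession",
    voice="Active",
    arguments=[
        Argument("A0", "Subj", "Donor", agreement="3pl"),
        Argument("A1", "DObj", "Possession", category="NP", head="automóvil"),
        Argument("A2", "IObj", "Receiver", category="NP", agreement="le", head="novio"),
    ],
)

print(record.schema())
```

Keeping the arguments as a list ordered by index makes it easy to derive the syntactic schema from the record instead of storing it twice.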
BDS/ADESSE
Other grammatical features
Clause: Clause Type (main, subordinate, ...), Mood, Tense, Modal and Phase Auxiliaries, Negation, Illocutionary force, Voice
Arguments: Definiteness, Number, Person
RRG representation and ADESSE features
[Figure: RRG layered structure of the clause (SENTENCE > CLAUSE > CORE > NUC > PRED, with NP arguments and AGR) for "novio le regalaron un automóvil", with case and macrorole assignment (ACTOR, UNDERGOER, NMR; PSA: NOM; DAT; ACC) and a logical structure ending in CAUSE [BECOME have (novio, automóvil)], set beside the ADESSE features: Cat < NP, NP >; Schema Synt Funct < S D I >; Order; Agr < le, S3pl >; Sem roles]
The BDS/ADESSE database aims to be theory-neutral
it only assumes a common Basic Linguistic Theory (in the sense proposed by Dixon)
but it is fairly compatible with functional and constructional grammars
the approach aims to correct or complement basic linguistic theory (or theories) in the light of corpus evidence
Basic strategies: Verbs and arguments in ADESSE
BDS provides a syntactic characterization of arguments and constructions
ESCRIBIR 'write': Subj DO IO; Subj DO
SUSTITUIR 'substitute, replace': Subj DO por NP; Subj DO
ENSEÑAR 'teach': Subj DO IO; Subj DO a Inf
Verbs and arguments in ADESSE
In many cases, each syntactic construction selects a subset of the potential participants of the scene evoked by the verb
a) Juan [0] le escribió una carta [1] a su madre [2] sobre sus recuerdos de infancia [3]
'John wrote a letter to his mother about his childhood memories'
b) Juan [0] escribió una carta [1]
'John wrote a letter'
c) Juan [0] le escribió a su madre [2]
'John wrote to his mother'
The task in ADESSE is to annotate which of the potential participants are selected in each syntactic schema
Verbs and arguments in ADESSE
The same syntactic construction can be mapped onto different configurations of semantic arguments
Sustituir 'replace'
a) Deco [1] sustituyó a Xavi [2]
'Deco replaced Xavi'
b) Rijkaard [0] sustituyó a Xavi [2]
'Rijkaard replaced Xavi'
c) Rijkaard [0] sustituyó a Xavi [2] por Deco [1]
'Rijkaard replaced Xavi with Deco'
The same set of semantic arguments can be linked to different syntactic patterns
Enseñar 'teach'
a) Ella [0] le [1] enseñaba su idioma [2]
'She taught him her language'
b) Ella [0] enseñó al niño [1] a caminar [2]
'She taught the baby how to walk'
Verbs and Arguments in ADESSE
Valency potential of a lexical entry: which arguments can be selected by a given verb?
Valency realizations (diatheses):
which arguments are actually expressed
what the syntactic realization of each argument is
voice
(The strategy in ADESSE is to define the valency potential of each verb entry and to register all the valency realizations found in the corpus)
Valency patterns and frequency
Valency potential of the verb ENSEÑAR 'teach': A0 (Teacher), A1 (Learner), A2 (Content)
Valency patterns in Active Voice     N     Examples
A0:Subj A1:IObj A2:DObj              58    Me enseñaba su idioma [JOV:134]
A0:Subj A1:IObj                      28    Si su hijo no sabe, él le enseñará [SON:99]
A0:Subj A1:IObj A2:Obl(a)            22    Eso me enseñará a fiarme de ti [PAI:161]
A0:Subj                              4     Kant enseñó en Königsberg [TIE:195]
A0:Subj A1:DObj A2:Obl(a)            ...   A coser la enseñaban desde pequeña [USOS:71]
A0:Subj A2:DObj                      18    Enseñaban cosas útiles [MAD:277]
A0:Subj A2:Obl(a)                    2     Enseñaban a saber comer [MAD:277]
Arguments and gradience
Valency patterns occur in the corpus with different frequencies, ranging from the more usual to the rare and unexpected.
As a consequence, verb arguments are not always syntactically realized
Obligatoriness / optionality of arguments is not a yes-or-no matter, but a gradient
Obligatory arguments are those (referential) elements more frequently tied to the verb in texts
Because obligatoriness is one of the main criteria for the argument / adjunct distinction, this distinction is also a gradient
Arguments and gradience
Arguments of Enseñar 'teach' (N clauses = 139)
Index   Role       N     %
0       Teacher    138   99.3 %
1       Learner    111   79.9 %
2       Content    105   75.5 %
Arguments of Escribir 'write' (N clauses = 321)
Index   Role       N     %
0       Writer     300   93.5 %
1       Text       208   64.8 %
2       Receiver   85    26.5 %
4       Topic      10    3.1 %
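The percentages in these two tables are simply the number of clauses expressing each argument divided by the number of clauses of the verb. A minimal Python sketch of that computation, using the counts reported on the slide (the function name `realization_rates` is a hypothetical helper, not part of ADESSE):

```python
def realization_rates(n_clauses, counts):
    """Percentage of clauses in which each argument is expressed."""
    return {role: round(100 * n / n_clauses, 1) for role, n in counts.items()}


# Counts taken from the slide
ensenar = realization_rates(139, {"Teacher": 138, "Learner": 111, "Content": 105})
escribir = realization_rates(
    321, {"Writer": 300, "Text": 208, "Receiver": 85, "Topic": 10}
)

for verb, rates in (("ENSEÑAR", ensenar), ("ESCRIBIR", escribir)):
    for role, pct in rates.items():
        print(f"{verb:10s} {role:10s} {pct:5.1f} %")
```

Read this way, "obligatoriness" becomes a position on a scale: the Teacher and the Writer are nearly always expressed, while the Topic of escribir is expressed only rarely.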
Frequency and argument structure
"'Argument structure' needs to be replaced by a greatly enriched probabilistic theory capturing the entire range of combinations of predicates and participants that people have stored as sorted and organized memories of what they have heard and repeated over a lifetime of language use" (Thompson & Hopper 2001: 47)
Frequency and argument structure
Manning (2003)
Many subcategorization distinctions presented in the linguistics literature as categorical are actually counterexemplified in studies of large corpora of written language use.
We can get a much better picture of what is going on by estimating a probability mass function (pmf) over the subcategorization patterns for the verbs in question.
we can put a probability function over the kinds of dependents to expect with a verb or class, conditioned on various features. (302)
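As a rough illustration of what estimating a pmf over subcategorization patterns amounts to, the sketch below normalizes the active-voice valency pattern counts given earlier in the talk for ENSEÑAR 'teach'. It rests on two assumptions: relative frequency is used as the (maximum-likelihood) estimate, and the pattern A0:Subj A1:DObj A2:Obl(a) is left out because its count is not legible in the source, so the probabilities are conditional on the six patterns listed. The helper name `pmf` is hypothetical.

```python
from collections import Counter


def pmf(counts):
    """Relative-frequency estimate of a probability mass function."""
    total = sum(counts.values())
    return {pattern: n / total for pattern, n in counts.items()}


# Active-voice valency patterns of ENSEÑAR with their corpus counts
ensenar_patterns = Counter({
    "A0:Subj A1:IObj A2:DObj": 58,
    "A0:Subj A1:IObj": 28,
    "A0:Subj A1:IObj A2:Obl(a)": 22,
    "A0:Subj A2:DObj": 18,
    "A0:Subj": 4,
    "A0:Subj A2:Obl(a)": 2,
})

ensenar_pmf = pmf(ensenar_patterns)
for pattern, p in sorted(ensenar_pmf.items(), key=lambda item: -item[1]):
    print(f"{p:.3f}  {pattern}")
```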
Frequency and argument structure
Proposals
The meaning of a verb determines its contexts of use and is determined by its contexts of use
Argument structure is a generalization over registered use, where frequency/probability of co-occurrence is an especially important factor in the entrenchment of a valency pattern.
Instead of obligatory arguments, or participants inherent to a scene, our past linguistic experience provides us with certain probability expectations about the reference to a certain participant type in the scenes evoked by a verb
Verbs and constructions
A basic problem: polysemy and contextual accommodation. Changes in meaning when a verb enters alternating valency patterns
The formalization of syntactic alternations:
different lexical entries: each different meaning is a different verb entry
lexical rules (RRG), relating different LS
underspecification: only one verb meaning; the differences in meaning should be attributed to the constructions
Verbs and constructions
Strategies in ADESSE: lexical rules or constructions?
Underspecification: reduce lexical entries to a minimum (seeking wider coverage of the corpus and a less time-consuming task)
(New strategy in ADESSE-II): levels of granularity in the definition of verb senses
Nevertheless, this is a 'false dichotomy' (Croft 2003), motivated either by the level of granularity or by the perspective adopted when linking syntax and semantics (Van Valin 2004)
Lexical entries in ADESSE
Two levels:
Level 1: Macro-acceptation [verb 'meaning'], associated with a semantic domain and a set of participant roles
Level 2: (Sub)acceptation [verb 'senses'] {work in progress}
LEVEL 1                       LEVEL 2
Conocer-1 'know'              Conocer 1.1 'get to know'
                              Conocer 1.2 'recognize, distinguish'
                              Conocer 1.3 'understand, know deeply'
Enseñar-1 'show'
Enseñar-2 'teach'
Volver-1 'return, go back'    Volver 1.1 (lit) 'return to a place'
                              Volver 1.2 (met) 'return to a state or activity'
Volver-2 'turn round'
Volver-3 '(cause) become'
Verb Entries
(At level 1) We distinguish verb entries when they are associated with different sets of semantic roles
that means differences in valency potential, not in valency realization
We try to limit sense distinctions to a minimum, but we must distinguish senses that cannot be 'unified':
partir 1 (go away) vs. partir 2 (break)
saber 1 (know) vs. saber 2 (taste)
or (less clearly) senses which are related to different lexical classes:
enseñar 1 ('show') vs. enseñar 2 ('teach')
Anyway, there are no clear boundaries between senses at any level (cf. Kilgarriff 1997)
Unifying verb senses
Typically included in one single verb entry (level 1):
Diathesis alternations (causative / inchoative, locative alternation, ...)
Semantic differences are attributed primarily to the construction, not to the verb
Paradigmatic alternatives within an argument (e.g., write a letter / a novel / a musical work)
Meaning accommodation or co-compositionality, but not different verb senses
Metaphoric and other figurative uses
they are annotated like the literal uses, but marked as figurative
Paths of Schematization
[Diagram: levels of schematization for an utterance with ENSEÑAR]
Mi padre nos enseñaba inglés y francés [SEV: 277]
'My father taught us English and French'
ENSEÑAR: Suj enseñar ODir OInd
Teaching verbs/frame: (somebody) teaches (a language) (to somebody)
Subj Pred DO IO
Syntactic Patterns and Verbs
Once we have defined verb entries and syntactic patterns, we can take two complementary (not necessarily incompatible) points of view concerning the association of verbs and constructions:
Point of view of the verb: alternations (e.g., Levin 1993) of valency patterns, keeping, as far as possible, the lexical elements
Me enseñó a cantar "She taught me to sing"
Me enseñó canto "She taught me singing"
Point of view of the construction: 'surface generalizations' concerning uses of a constructional schema with different lexical elements (Goldberg 1995, 2000; also Dowty 1998)
Me enseñó a cantar "She taught me to sing"
Me obligó a cantar "She forced me to sing"
Verbs and alternations
Many lexicalist approaches have focused on whether verbs admit a certain alternation, e.g. the causative alternation (or, alternatively, whether they can be combined with a certain type of argument)
Verb                A1       A0 A1
Crecer 'grow'       Yes      No
Cambiar 'change'    Yes      Yes
Verb                N1 N2    A0 N1 N2
Aprender 'learn'    Yes      No
Enseñar 'teach'     Yes      Yes
But this is not a yes/no question [...] the syntagmatic axis [...] serve to describe the behavioral profile of verbs
Arguments and behavior profile: Change of state verbs
[Bar chart, 0%-100%: share of clauses expressing arguments 0: AGT and 1: PAT for ABRIR, CAMBIAR, CERRAR, ROMPER, ENCENDER, CRECER, ARREGLAR, APAGAR, LIMPIAR, ORGANIZAR, RESOLVER, AUMENTAR, BORRAR, PROLONGAR, ILUMINAR]
Arguments and behavior profile: Verbs of knowledge
[Bar chart, 0%-100%: share of clauses expressing arguments 0: Inducer, 1: Cognizer and 2: Content for SABER, CONOCER, RECORDAR, COMPRENDER, ESTUDIAR, OLVIDAR, ACORDAR, APRENDER, ENTERAR, IMAGINAR, ENSEÑAR, SOÑAR, DEMOSTRAR]
Valency alternations and behavior profile
Each lexical element can be described in terms of the features and constructions it combines with in context (its syntagmatic profile)
A relevant part of the profile of a verb is its constructional schemas, the realizations of its arguments, and the frequencies of schemas and argument realizations
Two verbs of the same class may have similar syntagmatic combinations and differ in the relative frequency of each combination and, as a consequence, in the relative frequency of their core arguments
It is hypothesized that the differences in frequency are motivated by differences in meaning
"Surface generalizations"
(cf. Goldberg 1995, 2002; also Dowty 2000)
We can see the relation between verbs and constructions from the point of view of the constructions.
That is, any constructional schema (e.g., passive, double object, ...) can be described by itself and not as derived from any other alternating schema.
The meaning of a constructional schema is established (and learned) by generalizing from the meaning of particular utterances that instantiate the schema
An essential part of the characterization of a constructional schema comes from its association with a class of verbs.
The strength of the association of a constructional schema and a verb should be measured on the basis of a (syntactically annotated) corpus
Verbs in the schema
VERB        Meaning          N      Example
DAR         'Give'           1272   Mi padre sólo me da dinero para estudiar [SON:115]
DECIR       'Say, tell'      599    Le dije que no quería volver [TER: 046]
HACER       'Make'           514    Le hizo la promesa de llevarle al local [TER: 119]
CONTAR      'Tell'           308    Rosetta le contaba que el otro iba empeorando [SON:...
PEDIR       'Ask, request'   273    Nos pidió que lo esperáramos [HIS: 013]
PREGUNTAR   'Ask, inquire'   219    Me preguntó que si tenía dinero [LAB: 268]
PERMITIR    'Allow'          124    ¿Me permite usar su teléfono? [SON: 285]
OFRECER     'Offer'          119    No puedo ofrecerles grandes comodidades [LAB: 115...
EXPLICAR    'Explain'        112    Le explicó cómo funcionaba la imprenta [TER: 062]
PONER       'Put'            101    Si se te cae un ojo, te pondrán otro enseguida [MOR:...
TRAER       'Bring'          96     Le trae un kilo de bombones a mamá [BAI:424]
DEJAR       'Lend'           96     El profesor le dejaba la casa para que la habitara [HI...
Verb classes of clauses in the schema
CLASS             Clauses   Verbs   Example verbs
POSSESSION        2568      80      dar 'give'
COMMUNICATION     2419      108     decir 'say, tell'
MODULATION        806       18      hacer 'do, make'
LOC & MOVEMENT    686       88      traer 'bring'
MENTAL            615       64      recordar 'remember'
CHANGES           430       140     abrir 'open'
OTHER FACTS       305       80      tocar 'touch'
ATTRIBUTIVE       180       17      costar 'cost'
EXISTENTIAL       118       20      (causar 'cause')
Active (total)    8127      615
S D/Ian a Inf
Most frequent verbs in the constructional schema S D/Ian a Inf (ADESSE):
VERB       Class          N    Example
OBLIGAR    Obligation     94   Me obligó a salir
AYUDAR     Induction      70   Me ayudó a triunfar
LLEVAR     Displacement   52   Me llevó a ver una casa
INVITAR    Induction      42   Me invitó a ver su casa
ENSEÑAR    Knowledge      23   Me enseñó a coser
IMPULSAR   Induction      11   (esto) Me impulsó a escribir
ANIMAR     Induction      10   Me animó a escribir
SACAR      Displacement   10   Me sacó a bailar // lo sacó a saludar
FORZAR     Obligation     9    Me forzó a abandonar el intento
MANDAR     Displacement   9    Me mandó a comprar pan
Paths of Schematization
Grouping verbs in classes
Two main paths of schematization / generalization:
A verb is related, by its lexical meaning, to other partly similar verbs
enseñar-1 'show': mostrar 'show', ver 'see', mirar 'look'
enseñar-2 'teach': aprender 'learn', estudiar 'study', saber 'know'
On the other hand, by being used in a syntactic schema, it is semantically construed like other verbs that realize the same pattern
enseñar: dar 'give', decir 'say', contar 'tell', preguntar 'ask', ...
That puts each verb in a complex network of semantic relations
[Diagram: ENSEÑAR in a network of relations]
Schema S D I: dar, hacer, decir
Schema S D/I a infinitive: invitar, ayudar, obligar
Lexically related verbs: ver, mostrar, mirar
Lexically related verbs: aprender, saber, conocer, estudiar
Semantic Classes of Verbs/Preds
Two main criteria of semantic classification of verbs (both of them used in RRG)
Aktionsart classes (based on LS): State, Activity, Achievement, Accomplishment, etc.
Ontological/conceptual classes (based on the 'constant' part of LS): Location, perception, cognition, consumption, etc.
ADESSE Verb Classes
Goal of ADESSE verb classification: to represent generalizations over types of conceptual frames evoked by individual verbs
It is a conceptual/ontological classification, inspired by lexical relations of synonymy and hyponymy/troponymy, not aspectual nor primarily syntactic
It is a hierarchical classification, with up to four levels at the present stage
Top-level classes: 6 options [~Halliday's 'process types']
Classes recognized so far: 60 options
With the possibility of increasing granularity in the future
ADESSE hierarchy of semantic classes
Class                    Examples
MENTAL
  11 Feeling             gustar, temer
  12 Perception          ver, escuchar
  13 Cognition
    131 Knowledge        saber, enseñar
    132 Belief           creer
RELATIONAL
  21 Attributive         ser
  22 Possession          tener, dar
MATERIAL
  31 "Space"
    311 Displacement     ir, llevar
    312 Location         estar, poner
  32 Change
    321 Creation         hacer, crear
    322 Modification     romper, cambiar
    323 Destruction      destruir
  33 Other facts         tocar
COMMUNICATION            decir, hablar
  41 Judgement           criticar
  42 Command             pedir
EXISTENTIAL              haber, aparecer
MODULATION
  60 Causative           hacer, obligar
  61 Dispositive         tratar, atreverse
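The numeric class codes on this slide look prefix-coded (131 Knowledge under 13 Cognition, 323 Destruction under 32 Change). Assuming that reading is right, and taking the single-digit top-level codes (1 MENTAL, 3 MATERIAL) from the top-level table later in the talk, the path from a class up to its top-level class can be recovered from the code alone. The sketch below is only an illustration of that idea; the dictionary `CLASSES` holds a small subset of the classes and `class_path` is a hypothetical helper.

```python
# A subset of ADESSE class codes as shown on the slides
CLASSES = {
    "1": "MENTAL",
    "13": "Cognition",
    "131": "Knowledge",
    "132": "Belief",
    "3": "MATERIAL",
    "32": "Change",
    "321": "Creation",
    "322": "Modification",
    "323": "Destruction",
}


def class_path(code):
    """Return the class labels from the top level down to `code`,
    assuming each ancestor's code is a prefix of the descendant's code."""
    prefixes = (code[:i] for i in range(1, len(code) + 1))
    return [CLASSES[p] for p in prefixes if p in CLASSES]


print(class_path("131"))
print(class_path("323"))
```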
ADESSE top-level classes
Class               VERBS   Exs
1 MENTAL
  11 Feeling        271     Gustar 'like'
  12 Perception     115     Ver 'see'
  13 Cognition      171     Saber 'know'
2 RELATIONAL
  21 Attributive    232     Ser 'be'
  22 Possession     185     Tener 'have'
3 MATERIAL
  31 "Space"        629     Ir 'go'
  32 Change         923     Abrir 'open'
  33 Other facts    429     Tocar 'touch'
  34 Behaviour      331     Reír 'laugh'
4 COMMUNICATION     335     Decir 'say'
5 EXISTENTIAL       201     Haber 'exist'
6 MODULATION        139     Hacer 'make'
TOTAL               3978
CLAUSE column (digits run together in the source): 1583113 1171552399 109276413141761544131472
Semantic roles and verb classes
Each (sub)class is associated with a set of semantic roles prototypical for the cognitive domain evoked
Class            Roles
Feeling          Senser, Stimulus
Perception       Perceiver, Perceived
Cognition        Cognizer, Content
Possession       Possessor, Possessed
Transfer         Final possessor (Receiver), Possessed
Displacement     Theme, Goal
Localization     Theme, Locative
Change           Patient
Communication    Sayer, Message, Receiver
Roles listed in column 0: Initiator (causer), Initial-possessor (Donor), Agent, Source
Semantic roles
There is no single list of semantic roles; role definition is made at three levels:
Verb-specific roles (representing valency potential)
Escribir: A0: Writer, A1: Text, A2: Receiver, A3: Topic
Enseñar: A0: Teacher, A1: Learner, A2: Content
Class-specific roles
Communication: Sayer, Message, Receiver, Topic
Cognition: Cognizer, Content
SynSem Schemas (valency realizations), pointing to verb-specific roles
Active: Suj=A0 DObj=A1 IObj=A2
Le escribía canciones de amor
Semantic roles, verbs and semantic classes
Each verb entry is associated with a set of arguments embracing any possible core participant with this verb
By default, verb arguments inherit the role labels from the class(es) to which the verb belongs
COMMUNICATION: Sayer, Message, Receiver, Topic
CREATION: Creator, Effected
escribir 'write': A0 Writer, A1 Text, A2, A3
But, sometimes, verb-specific role labels are also used
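The default-inheritance idea can be sketched in a few lines of Python. This is an assumption-laden illustration rather than ADESSE's actual data model: it supposes that the class roles are listed in argument order (A0, A1, ...) and that a verb entry only stores the labels it overrides; `CLASS_ROLES` and `resolve_roles` are hypothetical names. The role labels are the ones given on the slides for COMMUNICATION and escribir.

```python
# Class-specific roles, assumed to be listed in argument order A0..A3
CLASS_ROLES = {
    "COMMUNICATION": ["Sayer", "Message", "Receiver", "Topic"],
}


def resolve_roles(verb_class, overrides=None):
    """Inherit role labels from the class unless the verb overrides them."""
    overrides = overrides or {}
    return {
        f"A{i}": overrides.get(i, label)
        for i, label in enumerate(CLASS_ROLES[verb_class])
    }


# escribir 'write' uses verb-specific labels for A0 and A1 only
escribir = resolve_roles("COMMUNICATION", {0: "Writer", 1: "Text"})
# A verb without verb-specific labels keeps all the class labels
default_verb = resolve_roles("COMMUNICATION")

print(escribir)
print(default_verb)
```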
RRG: Generalized Semantic Roles
ADESSE roles
[Diagram: increasing generalization from verb-specific roles to class-specific roles]
               CLASS-SPECIFIC ROLES   VERB-SPECIFIC ROLES
Experiencer    Emoter                 gustar.01, temer.01
               Perceiver              ver.01, mirar.01
               Cognizer               saber.01, enseñar.01, creer.01
               Possessor              tener.01, dar.01
Patient        Created                hacer.01, crear.01
               Affected               romper.01, cambiar.01
               Destroyed              destruir.01
Stimulus       Emoted                 gustar.02, temer.02
               Perceived              ver.02, mirar.02
               Content                saber.02, enseñar.02, creer.02
               Possessed              tener.02, dar.02
Generalized Semantic Roles
(still tentative in ADESSE)
Common LS templates of lexical entries:
pred'(x) | BECOME pred'(x)
pred'(x, y) | BECOME pred'(x, y)
[do'(z)] CAUSE [BECOME pred'(x, (y))]
Default indices for arguments:
z = A0 (initiator or causer)
x = A1 (first argument of pred')
y = A2 (second argument of pred')
That is only the default numbering, because many verbs have a more complex semantic structure
Generalized Semantic Roles
z = A0 (initiator or causer)
x = A1 (first argument of pred')
y = A2 (second argument of pred')
A0, A1, and A2 are ADESSE's closest relatives of the macroroles Actor and Undergoer
Generalized Semantic Roles: A0 A1 A2 vs Actor Undergoer
The ADESSE hierarchy A0 A1 A2 is similar to the Actor-Undergoer hierarchy:
ACTOR ... UNDERGOER
Arg of DO > 1st arg. of do'(x, ...) > 1st arg. of pred'(x, y) > 2nd arg. of pred'(x, y) > Arg of pred'(x)
A0 A1 A2
Arg of [do'(x, ...)] CAUSE; 1st arg. of do'(x, ...); 1st arg. of pred'(x, y) or pred'(x); 2nd arg. of pred'(x, y)
But A0-A1-A2 should not be confused with the macroroles themselves
Generalized Semantic Roles: A0 A1 A2 vs Actor Undergoer
ADESSE: Arguments of saber, aprender and enseñar:
Verb        A0         A1         A2
SABER                  Knower     Thing known
APRENDER               Learner    Thing learned
ENSEÑAR     Teacher    Learner / Thing learned (A1 and A2, in either order)
ENSEÑAR: [do'(x, ...)] CAUSE [BECOME know'(y, z)]
GSR and Argument realization
[Diagram labels: A0, A1, A2; Subj, DO]
Frequency of argument realizations (Active voice)
Intransitive: Subj (+ oblique)
S=1                93%
S=0                5%
Other              2%
Transitive: Subj DObj (+ oblique)
S=1 D=2            61%
S=0 D=1            25%
S=0 D=2            3%
Other              10%
Ditransitive: Subj DObj IObj (+ oblique)
S=0 D=2 I=1        36%
S=1 D=2 I=3        30%
other / not set    34%
Subject is almost always higher than DO in the hierarchy of GSRs
GSR and Indirect Objects
[Diagram labels: A1, A2; DObj]
Two participants: Subj IObj (+ oblique)
S=2 I=1            61%
S=1 I=2            25%
Other              10%
Three participants: Subj DObj IObj (+ oblique)
S=0 D=2 I=1        36%
S=1 D=2 I=3        30%
other / not set    34%
The status of IObj in this hierarchy is unclear. In three-participant schemas, it can be conceived as a middle (A1) or as an additional (A3) argument
In two-participant schemas, IO usually outranks the subject, as in psychological verbs [A1:Experiencer A2:Stimulus]
A María le gusta la música "Mary likes music"
Problem: the nature of IO and DO
GSR and Argument realization: "Inversion (?)"
Psychological verbs (feeling, desire, ...)
a) querer, temer, amar, odiar, admirar, ...
Ex: Él la quiere "He loves her"
pred'(A1, A2); stative with human EXP as PSA
b) gustar, interesar, importar, encantar, doler, ...
Ex: A ella le gusta "She likes him/her"
pred'(A1, A2); stative with some PSA properties on IO
c) sorprender, asustar, preocupar, impresionar, ...
Ex: Él la impresionó "He impressed her"
but there are no clear limits between (b) and (c)
Psychological verbs and dative case
Percentage of dative case for 3rd person clitic Experiencer in two-participant clauses with verbs of 'feeling'
A Paula le/la alegró la noticia
'The news pleased Paula'
The main conditioning factors seem to be dynamicity and effectiveness: calmar, for example, is more dynamic and effective than tranquilizar
Verb                      % dative   (dative / total)
Gustar 'like'             100%       (224/225)
Molestar 'bother'         93%        (14/15)
Impresionar 'impress'     75%        (9/12)
Sorprender 'surprise'     74%        (14/19)
Alegrar 'please'          67%        (2/3)
Distraer 'amuse'          60%        (3/5)
Preocupar 'worry'         57%        (4/7)
Tranquilizar 'calm'       55%        (6/11)
Consolar 'console'        43%        (3/7)
Calmar 'calm'             17%        (1/6)
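The ranking from more dative-like to less dative-like verbs follows directly from the raw counts on the slide. A small Python sketch that recomputes the percentages and the ranking (the variable and function names are hypothetical; the counts are those reported above, and several of them are very small, so the percentages should be read with caution):

```python
# (dative clitics, total 3rd person clitic Experiencers) per verb, from the slide
dative_counts = {
    "Gustar": (224, 225),
    "Molestar": (14, 15),
    "Impresionar": (9, 12),
    "Sorprender": (14, 19),
    "Alegrar": (2, 3),
    "Distraer": (3, 5),
    "Preocupar": (4, 7),
    "Tranquilizar": (6, 11),
    "Consolar": (3, 7),
    "Calmar": (1, 6),
}


def dative_share(counts):
    """Percentage of dative clitics per verb, rounded to whole percents."""
    return {verb: round(100 * d / n) for verb, (d, n) in counts.items()}


shares = dative_share(dative_counts)
# Rank on the exact ratio, not on the rounded percentage
ranking = sorted(
    dative_counts,
    key=lambda v: dative_counts[v][0] / dative_counts[v][1],
    reverse=True,
)

for verb in ranking:
    print(f"{verb:13s} {shares[verb]:3d}%")
```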
Objects
Subj & DO & IO = direct core arguments
Rough RRG equivalents: Subject = PSA, DO = Undergoer, IO = NMR
Corpus data reveal a continuum between DO and IO, with several factors at play:
Case: accusative vs dative (3rd person clitic)
Preposition a
Cliticization and clitic-doubling
Variable case marking
Some Objects are referred to by a dative clitic [le(s)] instead of an accusative clitic [lo(s)/la(s)], even with the same verb and within the same text
a. El padre le enseñaba a conocer las hierbas (Jov: 023)
'The father taught him [dat] to know about herbs'
b. A coser la enseñaban desde muy pequeña (Usos: 071)
'She was taught to sew from early childhood'
a. Lo que realmente lo preocupaba era... (Hist: 131)
'What actually worried him [acc] was...'
b. Esos bienes que tanto le preocupan (Hist: 70)
'These goods that worry him [dat] so much'
Le is the canonical form for masculine and feminine Indirect Objects [R] in ditransitive clauses
Le regalé un libro a María
In two-participant clauses we can rank the verbs from more dative-like to less dative-like
Variable object marking
Some Objects are doubled by a cross-referencing pronominal clitic
- ¿Conocés a Elena Garro? [...]
- ¿Y de dónde la conocés vos a Elena Garro? (BAIRES:418, 24-
'Do you know Elena Garro?' 'And where do you know Elena Garro from?'
Doubling by clitic is usual with Indirect Objects
Le regalé un libro a María "I gave Mary a book"
[besides syntactic function, the main conditioning factor is accessibility status (Belloro, yesterday)]
Variable object marking
Some Objects are marked by the preposition a
Encontré a un amigo 'I met a friend'
Encontré un amigo (id.)
The preposition a is used obligatorily to mark full Indirect Objects
Le regalé un libro a María 'I gave Mary a book'
                                 2-participants:   3-participants: Subj DO IO
                                 Subj Obj
                                 Object            O = DO     R = IO
A + NP [vs NP]                   9.9 %             0.7 %      100 %
Clitic doubling [vs (a) NP]      3.8 %             1.0 %      42.8 %
Dative clitic [vs accusative]    31.1 %            0.3 %      99.7 %
Construction          Ditransitive    (Mono)transitive    Ditransitive
Participant           O               P                   R
Marking               NP, non-doubled, accusative ... a NP, (doubled), dative
Syntactic Function    DO ... IO
RRG 2007Garca-Miguel: Corpus-based approach to argument structure 6
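Object variation and the animacy hierarchy
[Chart: percentage of a-marking, clitic doubling and dative clitic by object type; vertical axis from 0% to 100%]
Object types: 1, 2, Vd, Pro 3, anim def, anim indef, inan def, inan indef, Clause
% a: 100%, 100%, 100%, 90%, 42.4%, 2.7%, 0.4%, 0%
% doubling: 99.5%, 100%, 99.2%, 14.7%, 2.8%, 2.7%, 0.3%, 0.2%
% dative: 100%, 83%, 73%, 54%, 2.7%, 0.2%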
Object variation
No dialect of Spanish has categorical rules for the use of a, clitic doubling or leísmo. Everywhere we have a gradient
The general tendency is to have unmarked nominals for O in ditransitive clauses and for P low in the animacy hierarchy. In general, we have morphologically marked objects for referents high in the animacy hierarchy (Silverstein 1976)
Other relevant factors (not explored today):
Clitic doubling is also governed by the animacy hierarchy, but correlates more strongly with discourse status: topicality and accessibility of the referent
Case is also governed by dialect variation, gender, and type of process (dynamicity, effectiveness).
Subject and object in Spanish
Participants in SUBJ-PRED-OBJ (+Obl) pattern [N = 81954 clauses]
                               SUBJ       OBJ
Human                          80.50%     27.14%
Agreement/clitic only          63.80%     25.90%
Definite (if NP)               90.00%     66.33%
Preverbal (if NP or clause)    73.67%     3.84%

Participants in SUBJ-PRED-DO-IO (+Obl) pattern [N = 8455 clauses]
                               SUBJ       DO         IO
Human                          84.18%     2.25%      90.24%
Agreement/clitic only          65.95%     10.65%     74.14%
Definite (if NP)               90.06%     53.57%     89.02%
Preverbal (if NP or clause)    74.50%     2.40%      9.50%
Object and markedness
Transitive clauses
SUBJECT              DIROBJ
Human                Non Human
Highly accessible    Low accessibility
Definite             Less definite
Topic                (Part of) Rheme
[Agent]              [Patient]

Objects
(unmarked)           (marked)
Non Human            Human              a-marking, le
Indefinite           Definite
Low accessibility    More accessible
(Part of) Rheme      Topic              doubling
Patient              Less affected      leísmo
Conclusions
ADESSE: What is it good for?
ADESSE is, above all, a database for the empirical study of the interaction between verbs and constructions in Spanish
Constructional alternatives for a verb, a syntactic function or a semantic role (with frequencies in a corpus)
Verbs and syntactic constructions for a semantic domain
Verbs and semantic domains for a particular construction
...
Additionally, it allows the search and study of many imaginable correlations between syntactic and semantic features (case, person, number, definiteness, tense, mood)
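The first kind of query listed above (constructional alternatives for a verb, with frequencies) and its converse (verbs for a particular construction) can be sketched over a flat list of annotated clauses. The record layout, the field names (`verb`, `pattern`) and the toy clauses are assumptions for illustration only; they do not reflect ADESSE's actual schema, interface or counts.

```python
from collections import Counter

# Hypothetical clause annotations (invented; not ADESSE data or its schema).
clauses = [
    {"verb": "enseñar", "pattern": "SUBJ-PRED-DO-IO"},
    {"verb": "enseñar", "pattern": "SUBJ-PRED-DO-IO"},
    {"verb": "enseñar", "pattern": "SUBJ-PRED-OBJ"},
    {"verb": "regalar", "pattern": "SUBJ-PRED-DO-IO"},
    {"verb": "encontrar", "pattern": "SUBJ-PRED-OBJ"},
]


def patterns_for_verb(records, verb):
    """List (pattern, count, relative frequency) for one verb, most frequent first."""
    counts = Counter(r["pattern"] for r in records if r["verb"] == verb)
    total = sum(counts.values())
    return [(pattern, n, n / total) for pattern, n in counts.most_common()]


def verbs_for_pattern(records, pattern):
    """Return {verb: count} for all clauses instantiating one pattern."""
    return dict(Counter(r["verb"] for r in records if r["pattern"] == pattern))


ensenar_patterns = patterns_for_verb(clauses, "enseñar")
ditransitive_verbs = verbs_for_pattern(clauses, "SUBJ-PRED-DO-IO")
print(ensenar_patterns)
print(ditransitive_verbs)
```

Adding further fields to each record (semantic role, case, definiteness) would allow the same counting approach to be used for the feature correlations mentioned above.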
Conclusions / final remarks
It is necessary to observe (spoken and written) language in use and to take into account frequency as an important factor of language structure and meaning
Many grammatical categories manifest as a gradient and not as discrete categories. We have seen that this is the case with argument structure, argument realization, and grammatical relations. The task is to identify the factors influencing the choice of a form, and the strength of each factor
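One simple way to put a number on the strength of a single factor is an odds ratio over a 2x2 table of counts. This is a generic illustration, not necessarily the measure used in the talk, and the counts below are invented for the example.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table of counts.

                      form used   form not used
    factor present        a             b
    factor absent         c             d

    Values above 1 mean the form is more likely when the factor is present.
    """
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Invented counts: a-marked vs unmarked objects, animate vs inanimate referents.
ratio = odds_ratio(a=90, b=10, c=5, d=95)
print(ratio)
```

When several factors interact (animacy, definiteness, discourse status), a multivariate model would be needed; the odds ratio only describes one factor at a time.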
Conclusions / final remarks
In order to achieve descriptive adequacy, we need corpora annotated with increasingly detailed syntactic and semantic categories
But the categories used in corpus annotation should not be taken for granted, and must be revised in the light of corpus evidence
Therefore, we have to move continually from analytical categories to corpus and from corpus to analytical categories
(That is also a disclaimer: researchers can get ADESSE data at [...] with some analysis, but the final analysis of each example is the responsibility of the user)
References
Belloro, Valeria. (yesterday). The pragmatics of dative doubling in Spanish. 2007 RRG Conference. México.
Butler, Christopher S. (2004). Corpus studies and functional linguistic theories. Functions of Language 11/2: 147-86.
Bybee, Joan. (2006). From Usage to Grammar: The Mind's Response to Repetition. Language 82/4: 711-33.
Bybee, Joan, and Paul Hopper. (2001). Frequency and the emergence of linguistic structure. Amsterdam: John Benjamins.
Company, Concepción. (2001). Multiple dative-marking grammaticalization. Spanish as a special kind of primary object
Croft, William. (2003). Lexical rules vs. constructions: a false dichotomy. Motivation in Language. Studies in honor of Günter Radden. eds H. Cuyckens et al., 49-68. Amsterdam: John Benjamins.
Dowty, David. (2000). 'The garden swarms with bees' and the fallacy of 'argument alternation'. Polysemy: theoretical and computational approaches. eds Y. Ravin and C. Leacock, 110-128. Oxford: Oxford University Press.
Fillmore, Charles J. (1992). "Corpus linguistics" or "Computer-aided armchair linguistics". Directions in Corpus Linguistics. ed. Jan Svartvik, 35-60. Berlin: Mouton de Gruyter.
Goldberg, Adele E. (1995). Constructions: A Construction Grammar Approach to Argument Structure. Chicago / London: University of Chicago Press.
Goldberg, Adele E. (2002). Surface generalizations: An alternative to alternations. Cognitive Linguistics 13/4: 327-56.
Gries, Stefan Th. (2003). Towards a corpus-based identification of prototypical instances of constructions. Annual Review of Cognitive Linguistics 1: 1-27.
Halliday, M. A. K. (2004). An introduction to functional grammar (3rd edition, revised by Christian M. I. M. Matthiessen). London: Arnold.
Kilgarriff, Adam. (1997). "I don't believe in word senses". Computers and the Humanities 31: 91-113.
Langacker, Ronald W. (1991). Foundations of Cognitive Grammar, Vol. II: Descriptive Application. Stanford: Stanford Univ. Press.
Levin, Beth. (1993). English Verb Classes and Alternations: a Preliminary Investigation. Chicago: University of Chicago Press.
Manning, Christopher. (2003). Probabilistic Syntax. Probabilistic Linguistics. eds Rens Bod, Jennifer Hay, and Stefanie Jannedy, 289-341. Cambridge (Mass.): MIT Press.
Nichols, Johanna. (1984). Functional theories of grammar. Annual Review of Anthropology 13: 97-117.
Silverstein, Michael. (1976). Hierarchies of features and ergativity. Grammatical categories in Australian languages. ed. R. M. W. Dixon, 112-71. Canberra: Australian Institute of Aboriginal Studies.
Thompson, Sandra A. and Paul Hopper. (2001). Transitivity, clause structure, and argument structure: evidence from conversation. Frequency and the emergence of linguistic structure. eds Joan Bybee and Paul Hopper, 27-60. Amsterdam: John Benjamins.
Van Valin, Robert D. (2001). Functional Linguistics. The Handbook of Linguistics. eds M. Aronoff & J. Rees-Miller, 319-36. Oxford: Blackwell.
Van Valin, Robert D. (2005). Exploring the Syntax-Semantics Interface. Cambridge: Cambridge University Press.
Van Valin, Robert D. (2004). Lexical representation, co-composition, and linking syntax and semantics. (Ms) http://linguistics.buffalo.edu/research/rrg/vanvalin_papers/LexRepCoCompLnkgRRG.p
Van Valin, Robert D., and Randy J. LaPolla. (1997). Syntax: Structure, meaning and function. Cambridge: Cambridge University Press.