andrey philippovich

45
Andrey Philippovich 04 February 2011

Upload: others

Post on 03-Feb-2022

16 views

Category:

Documents


0 download

TRANSCRIPT

Andrey Philippovich

04 February 2011

Ph.D., Prof. of Bauman Moscow State Technical University

◦ Courses: Artificial Intelligence, Computer linguistics and Semiotics, Architecture of Information systems, Design IT-Curriculums

Chief of Science Education Cluster CLAIM (Computer Linguistics, Artificial Intelligence, Multimedia and more)

Member of Leading Scientific School of Russia “Russian Language Person” (Head of School - Jury Karaulov)

Chief of Laboratory of Technical Education in Russia, BMSTU

Chief Executive Deputy of Multivendor and Academic ICT Consortium

Editor of the Rubric «ICT in Education», Magazine «Quality of Education»

IT-Consultant (20+ projects)

Design IT curriculums for HE and VET (2010) Bulletin of MAC ICT (2010) Integrating Microsoft official academic courses into Russian technical

universities’ IT curriculums (2008) Credit system in Education: automation aspect (2005) Integration of the Situation, Simulation, and Expert Modelling Systems (2003) Computer index of sources of The Hand-written Ancient Cardfile and The

Dictionary of Russian Language of the XI-XVII centuries (2002) Information volume of the Dictionary of Russian Language of the XI-XVII

centuries. 3 part. Backward wordlist (2001)

Information system of associative experimentsDesign methods and tools for automated building associative thesauri, modeling and analysis of language consciousness.

Computer SemiographyDecoding and systemize russian znamenny (semiographic) chants (XI-XVII centuries). Building musical and idiophone thesauri, ontologies.

Computer Historical Lexicography◦ The Hand-written Ancient Cardfile of Russian Language of the XI-XVII

centuries◦ The Dictionary of Russian Language of the XI-XVII centuries◦ The Dictionary of the Russian Academy (1789–94)

SIE-modelling Theory Design methodology of integration and convergence different decision support systems (Situation, Simulation, Expert, Neurogenetic, Fuzzy, Semiotic and other).

Gesture-mimic interface for interaction with computer ◦ Emotion Recognition based on facial gesture◦ Dictionary of gestures in the field of ICT (audiology)◦ Gesture language recognition

New methods of information visualization (Visual semiotics)Implementation and testing of new visualization concepts of familiar information to improve the visibility, convenience and efficiency of its perception and processing.◦ Domain-specific 3D social network ◦ Concept of visualization based on the theory of semiotic design of IT◦ Visualization of web-ontolologies

Semantic Web & Text Mining ◦ Ontology design◦ Text and Data clustering◦ Text’s tonality analysis

1. Theoretical background◦ Semiotics – main ideas◦ Thesaurus and Ontology◦ Cognitive thesauri

2. R&D Projects ◦ Associative-verbal thesaurus◦ Linguacultural thesaurus◦ Dictionary of metaphors◦ Musical and idiophone «Theons»

The sign is described as a "double entity", made up of the signifier, or sound image, and the signified, or concept.

The basic principle of the arbitrariness of the sign in the extract is: there is no natural reason why a particular sign should be attached to a particular concept.

Ferdinand de Saussure (1857–1913)was a Swiss linguist and is widely considered to be one of the fathers of 20th-century linguistics and of semiotics

The relationship between signifier and signified is, however, not quite that simple.

Saussure asserted that there are only two types of relations: syntagmatic and paradigmatic. The latter is associative, and clusters signs together in the mind, producing sets.

Sets always involve a similarity, but difference is a prerequisite, otherwise none ofthe items would be distinguishable from one another: this would result

in there being a single item, which could not constitutea set on its own.

Knowledge element

Paradigmatic relations

Knowledge element

Sign Syntagmaticrelations

Sign

Semiosis - conceptual (cognitive) process, continually unfolding and unending - the chain of meaning-making by new signs interpreting a prior sign or set of signs)

Semiosis as an irreducibly triadic process wherein something, as an object, logically determines or influences something as a signto determine or influence something as an interpretation or interpretant, itself a sign, thus leading to further interpretants.

Charles Sanders Peirce (1839 – 1914)was an American philosopher, logician, mathematician, and scientist. He is one of the fathers of semiotics.

Sign vehicle: the form of the sign Sense: the sense made of the sign Referent: what the sign 'stands for'

Sign functions: Representation Expression Knowledge

Three directions of semiotics Syntax: the relation between signs, how

signs are constituted Semantic: the relation between sign and

object, what the signs are conveying Pragmatic: the relation between signs and

the user, what for signs are used

Charles W. Morris (1901 – 1979)was an American semioticianand philosopher

Visual rhetoric is the fairly recent development of a theoretical framework describing how visual images communicate, as opposed to aural or verbal messages.

Visual semiotics - studies of meaning evolve from semiotics, a philosophical approach that seeks to interpret messages in terms of their signs and patterns of symbolism

Groupe µ (founded 1967) is the collective pseudonym under which a group of Belgian 20th-century semioticians wrote a series of books, presenting an exposition of modern semiotics

These concepts have a lot of definitions, which depends on current subject area, applications in the field of:◦ Linguistics◦ Information retrieval◦ Artificial Intelligence◦ Computer semiotics

Linguistics◦ Thesaurus is most complex

lexicographical object. It is something more then dictionary

Information retrieval◦ Thesaurus and ontology are some

kind of metadata (ISO2788, ISO5964 – thesauri; OWL - ontology)

Artificial Intelligence◦ Thesaurus and ontology

are some kind of formal models for knowledge representation and management

Computer semiotics◦ Thesaurus & Ontology

is centaur with name “THEON”

Person consciousness language consciousness (Linguistics) Logic consciousness (Pragmatics, AI) Expression consciousness (Psychology) etc.

Theon = {LCU} Thesaurus = {LU}

lexicographical (formal) model of language

Ontology = {WKU}Formal model of real world

LCU = WKU + LU LCU - language consciousness unit WKU - world knowledge unit LU - language unit

Thesaurus, Ontology ~ ok. What about epistemology? We need new linguistic (and lexicographical)

objects, models, grammars, etc. Cognitive science, cognitive psychology, cognitive

linguistics, cognitive semiotics study how our consciousness works

Cognizer – cognitive vehicle Cognitive thesaurus – representation of Cognizer’s

activity, structure and dynamics of cognitive processes. Associative-verbal thesaurus Linguacultural thesaurus Dictionary of metaphors Musical and idiophone «Theons» Gesture-mimic thesaurus

армия

1

стрельба

арбалет 1

охотавойна операция топор

кровь

оружие

ружье ствол

пушкаборьба

пистолет

орудие

перо

1 1 1

2

1

1

11

1121 5 3

1

1

1521

1

2 10

7

армия

1

стрельба

арбалет 1

охотавойна операция топор

кровь

оружие

ружье ствол

пушкаборьба

пистолет

орудие

перо

1 1 1

2

1

1

11

1121 5 3

1

1

1521

1

2 10

7

Mental association (Association of Ideas) - is a term used to refer to explanations about the conditions under which representations arise in consciousness.

One idea was thought to follow another in consciousness if it were associated by some principle. The three most commonly asserted principles of association were:◦ association by contiguity◦ association by contrast◦ association by similarity

The principles of associationism were fertile for the progress of psychological investigation - new methods of studying:◦ memory (mechanical—H. Ebbinghaus, Germany;

and figurative— F. Galton, England)◦ emotions (C. Darwin, England)◦ motivation (S. Freud, Austria; K. Jung, Switzerland)

Association – The formation of a bond or connection between ideas or stimulus and response [Banerjee J. C. Encyclopaedic Dictionary of Psychological Terms, 1994].

Mind mapping (Buzan 1993) Cognitive mapping (Eden 1988, 1998, Ackermann et al. 1992)

Association – undefined or unknown semantic relationship [Quillian, semantic networks]

In UML models, an association is a relationship between two classifiers that describes the reasons for the relationship and the rules that govern the relationship.

Associative experiment (AE) or association test - one of the most widespread methods psycholinguistics and used to study the organization of mental life, with special reference to the cognitive connections that underlie perception and meaning, memory, language, reasoning, and motivation.

{S} {R} In the free-association test the subject

is told to state the first word (Response or Reaction) that comes to mind in response to a stated word, concept, or other Stimulus.

In controlled association test a relation may be prescribed between the Stimulus and the Response (e.g., the subject may be asked to give opposites).

Interface (S) {R} 7: graphique 5: lien 4: homme-machine 2: écran; imprimante 1:

acquisition; arrivée; barrière; candide; communication; connexion; conviviale; de liquides; double; driver; entre; icônes; informatique; limite, séparation; liquide; milieux; MIS; moyen; musicale; ordinateur; parallèle; Pentium III; périphérique; prisme; processeur; raccord; sortie; thermodynamique; traitement de surface; utilisateur; visible; win 95; Windows (o68, d38, v15)

Internet (S) {R} 11: web 7: réseau 5: mail 3: e-mail; liberté 2: chat; Netscape 1: AOL; base

de données interactive; bof; communication; du cul, du cul, du cul; échange; espace; explorer; fouillis; gratuit; inet; info; intranet; le boom des années 90; lent; loin; MIS; modem; océan; oui; pénible; réseau des réseaux; site; surf; surfons; toile; un www personnel; voyager; www (o68, d36, v6)

Associations in information technologies: russian-french experiment (1999-2000)http://philippovich.ru/Library/Books/Association_IT/CONTENTS.HTM

The Dictionary of associative norms of Russian of A.A. Leontev (25 thousand records, 1967-1973)

The Russian associative dictionary / thesaurus (1,3 million records, 1988-1995)

The Slavic associative dictionary (Russian part consist of 66 thousand records, 1998-1999)

Associations of information technologies: experiment in Russian and French languages (12,6 thousand records, 1998-2000).

Word Association, rhyme and fragment norms (750 thousand records,1999 ,The University of South Florida)

The Edinburgh Associative Thesaurus (840 thousand records,1969-1971)

Associative-verbal field (AVF) set of all mental associations (reactions) for word AVF(S) = {Ri}, Ri ∋ S {R}

Associative-verbal Network (AVN) network, where nodes – stimuli and reactions, arcs – associative links.AVN = <{S}∪{R}, {←,→,↔}, ℛ(S→R)>

Associative-verbal model (AVM) phenomenological model of AVN of persons represented by associative thesaurus or dictionary

Hypothesis:AVM – is cognitive model of language consciousness, main part of semiosis

Слово

Дело

ВремяНомер

Телефон

35 [a4]

2 [a12]

2 [a5]

3 [a10]

12 [a16]

Один

3 [a15]

Речь

9 [a3]

Разговор

3 [a8]

3 [a17]

Говорить

3 [a18]

3 [a6]

14 [a2]

4 [a7]

Деньги

14 [a13]8 [a14]

Фраза

6 [a1]

Жизни

5 [a11]

2 [a9]

3 [a18] Ассоциативная связь <частота>[<Идентификатор дуги>]

Один Слово

1. Analysis prerequisites, conditions and goals of AE

2. Creating list of word-stimulus3. Generating the set of forms

(questionnaires)4. Questionnaire design5. Design and filling linguistics database

of AE6. Explore and identification

relationships between language constructions

7. Creating thesaurus and linguistics database of AE

8. Estimating thesaurus for necessity of additional questionnaire design (iterational repeating steps 2-7).

Step Stimuli Stimuli per subject

Forms

Printed Entered

I (+extra)

1277 760

100 ~400

1500 3500

~1300 ~3200

II 2685 100 3000 ~2800

III 2935 100 3500 ~3150

Type: one free associationNumber of stimuli: 6,624Number of forms: 11,500Number of stimuli per form: 100Number of participants: 11,000Time for filling one form: 7-10 minutesNumber of S->R pairs: 1,032,522Number of different S->R pairs: 462,500Number of different reactions: 102,926

Desktop application (Delphi + Paradox) Frequency/Alphabet order Forward / Backward mode (SR / R S ) Gender comparative mode Normalization for comparatives Gender/Age/Profession filters

The project of Russian Foundation for Humanities (2006-8)"The Automated system of scientific researches of dynamics of associative-verbal model of language consciousness of Russians as indicator of an image of Russia in the newest history and the present"

Theoretical basis - hypothesis that results of AE allow :◦ to estimate features of perception of the person◦ to study its language consciousness by means of AVM

Integrated linguistic database based on :◦ The Russian associative dictionary◦ The Dictionary of associative norms of Russian of A.A. Leontev◦ The Slavic associative dictionary◦ Associations of information technologies: experiment in Russian

and French languages

Web-services, program tools and resources:◦ Hypertext versions of dictionaries◦ Specialized web-version of Russian associative thesaurus◦ System for organizing interactive associative experiment

in web◦ Program modules for searching associative chains

New dictionaries & publications◦ The comparative associative dictionary of Russian◦ Gender Russian associative thesaurus /dictionary◦ 20+ scientific publications◦ 2 master degree projects

http://philippovich.ru/Projects/ASIS

http://philippovich.ru/Projects/ASIS

http://philippovich.ru/Projects/ASIS

http://philippovich.ru/Projects/ASIS

$query="SELECT *FROM settingsattribute sa, personattribute paWHERE pa.Attribute_ID = sa.Attribute_IDAND sa.Settings_ID=(SELECT Settings_ID FROM experiment WHERE Experiment_ID='".$_GET['testid']."')AND sa.Ask_Status =1ORDER by sa.Position";

$result=mysql_query($query);$num=mysql_num_rows($result);for ($i=1;$i<=$num;$i++){

$row=mysql_fetch_array($result); $name=$row['Attribute_Name']; $desc=$row['Attribute_Description']; $must=$row['MustStatus'];$redstar='';if ($must==1) $redstar="<font color=red>*</font>";$ID=$row['Attribute_ID'];print "<tr><td>

<b>$name</b>$redstar<br><small>$desc</small></td><td><INPUT TYPE=TEXT NAME=SA$ID size=50 value=\"".$val."\"></td></tr>";

}

AE makes possible to find non-evident key phrases and collocations

It’s possible to check sets of stimuli and reactions for keywords

Best results – using domain-specific AE, Interactive AE

Associative-verbal Network (AVN) network, where nodes – stimuli and reactions, arcs – associative links.AVN = <{S}∪{R}, {←,→,↔}, ℛ(S→R)>

Associative-verbal chain (AVC) ordered set of nodes and arcs based on AVN AVC = S1 (R1=S2) … Rn

Hypothesis: Producing AVC is a cognitive

process, main part of semiosis This process is not simple,

because of spreading activationS1 {(Ri=Si)} … {R}

Слово

Дело

ВремяНомер

Телефон

35 [a4]

2 [a12]

2 [a5]

3 [a10]

12 [a16]

Один

3 [a15]

Речь

9 [a3]

Разговор

3 [a8]

3 [a17]

Говорить

3 [a18]

3 [a6]

14 [a2]

4 [a7]

Деньги

14 [a13]8 [a14]

Фраза

6 [a1]

Жизни

5 [a11]

2 [a9]

3 [a18] Ассоциативная связь <частота>[<Идентификатор дуги>]

Один Слово

Слово

Дело

ВремяНомер

Телефон

35 [a4]

2 [a12]

2 [a5]

3 [a10]

12 [a16]

Один

3 [a15]

Речь

9 [a3]

Разговор

3 [a8]

3 [a17]

Говорить

3 [a18]

3 [a6]

14 [a2]

4 [a7]

Деньги

14 [a13]8 [a14]

Фраза

6 [a1]

Жизни

5 [a11]

2 [a9]

3 [a18] Ассоциативная связь. <частота>[<Идентификатор дуги>]

Один Слово.

Ассоциативная цепочка Сhn1

Hebb’s associative mechanism

Study structure of AVN Revealing «metaassociations» Analysis derivative associations

S1 R1N

S2 R2N

S1 R1N

R1 S1N

R11 = S2

R21S11

S1n

... ...

R2k

∑=

→= n

1i1i1iотн

1111отнRS

)RS(f

K*)RS(f1111

ω

Alexander Panchenko, 2008The program makes possible multifarious kinds of search for chains (AVC) in the associative networks (Russian, English).

The program is written in C# and make use of SQL Server DBMS.

Alexander Sirenko, 2008-2010 Search chains (AVC) in Russian AVN

(120 000 associations). AVS is stored in MySQL, data

processing with PHP5. http://tesaurus.ru/aweb.php

In the conception of J. N. KaraulovCognem — the elementary unit of knowledge:◦ Sign (word)◦ Sense (semantics)◦ Methods (for define sense) 40+◦ Two type of knowledge

functions: Recipe Retouch

Two modes of language consciousness activity (Cognizer):◦ Active (Sign Sense)◦ Passive (Sense Sign)

Across1. A 1977 movie5. Actor Garcia9. City in the Texas panhandle14. 82nd element15. Carpooled16. Mucous eye discharge17. Quadrennial prize19. Ahead of schedule20. Irish form of Helen

Formula of sense Area Function MethodTrailer truck Machines(Cars) Recipe SynonymA male cat Animals Retouch Mark

Self-satisfiedThe feature of the

human Recipe DefinitionA city in east-central France Cities Retouch Mark

A place for wives and concubines Constructions Retouch DescriptionProtagonist Narration Retouch Synonym

Formerly (archaic) Indexes of time Retouch SynonymSporting venue Sports Recipe Perifraz

The boundary of a surface Objects Recipe Synonym

Rodent Rodents RecipeHyperonym-

hyponymBlue-green Color Recipe Synonym

What doors swing on Building designs Retouch DescriptionSwerved Travel Recipe Synonym

Long period of time Indexes of time Retouch Description

Can. Recipe. milk container. <milk; container>→<can> = <CAN 1→ CONTAINER 1→ MILK; CAN 1→ SOUP 1→ MILK; CAN 1→ TEA 2→ MILK; CAN 1→ CONTAINER; > 4=1+3+0+0+0

<MILK 1→ COCOA 1 0→ BEANS 1→ CAN; MILK 5→ BOTTLE 1 7→ BEER 1→ CAN; MILK 1 1→ DRINK 9→ BEER 1→ CAN; MILK 1→ HEALTH 1→ BEER 1→ CAN; MILK 1→ JAR 3→ BEER 1→ CAN; MILK 1→ OFF 1→ BEER 1→ CAN; MILK 1→ PUDDING 1→ BOWL 1→ CAN; MILK 1→ SICK 1→ BOWL 1→ CAN; MILK 1 1→ DRINK 2→ COCA-COLA 1→ CAN; MILK 1→ COOL 1→ COKE 1→ CAN; MILK 1→ JAR 4→ CONTAINER 2→ CAN; MILK 3→ BABY 1→ DON'T 1→ CAN; MILK 1→ JAR 3→ LID 3→ CAN; MILK 1→ TOP 1→ LID 3→ CAN; MILK 1→ CALCIUM 1→ METAL 1→ CAN; MILK 1→ BUTTER 1→ OIL 2→ CAN; MILK 1→ LARD 1→ OIL 2→ CAN; MILK 3→ MACHINE 2→ OIL 2→ CAN; MILK 5→ BOTTLE 1→ OPENER 3 4→ CAN; MILK 1→ COCOA 3→ TIN 3 1→ CAN; CONTAINER 2→ CAN; > 21=1+0+20+0+0

Can. Recipe. milk container. <milk; container>→<can> = <CAN 1→ CONTAINER 1→ MILK; CAN 1→ SOUP 1→ MILK; CAN 1→ TEA 2→ MILK; CAN 1→ CONTAINER; > 4=1+3+0+0+0

<MILK 1→ COCOA 1 0→ BEANS 1→ CAN; MILK 5→ BOTTLE 1 7→ BEER 1→ CAN; MILK 1 1→ DRINK 9→ BEER 1→ CAN; MILK 1→ HEALTH 1→ BEER 1→ CAN; MILK 1→ JAR 3→ BEER 1→ CAN; MILK 1→ OFF 1→ BEER 1→ CAN; MILK 1→ PUDDING 1→ BOWL 1→ CAN; MILK 1→ SICK 1→ BOWL 1→ CAN; MILK 1 1→ DRINK 2→ COCA-COLA 1→ CAN; MILK 1→ COOL 1→ COKE 1→ CAN; MILK 1→ JAR 4→ CONTAINER 2→ CAN; MILK 3→ BABY 1→ DON'T 1→ CAN; MILK 1→ JAR 3→ LID 3→ CAN; MILK 1→ TOP 1→ LID 3→ CAN; MILK 1→ CALCIUM 1→ METAL 1→ CAN; MILK 1→ BUTTER 1→ OIL 2→ CAN; MILK 1→ LARD 1→ OIL 2→ CAN; MILK 3→ MACHINE 2→ OIL 2→ CAN; MILK 5→ BOTTLE 1→ OPENER 3 4→ CAN; MILK 1→ COCOA 3→ TIN 3 1→ CAN; CONTAINER 2→ CAN; > 21=1+0+20+0+0

армия

1

стрельба

арбалет 1

охотавойна операция топор

кровь

оружие

ружье ствол

пушкаборьба

пистолет

орудие

перо

1 1 1

2

1

1

11

1121 5 3

1

1

1521

1

2 10

7

орудие

1

стрельба

арбалет 1

ружьествол спорт войска стрела

двор

старинный оружие форма лук

311

1

1 1

1 11 1 1

1

Minimal passive proposition graph

стрельба

11

орудие

арбалет

выстрелпистолетпушкаубийство работа труд топор лопата

солдат

милиция

юмор капустафигура мукастрелаптицапатронружьестволядростол

службаотличная

форма лукоружиестаринный

2

1

4

5

2141114

1

24 4

1

6

1

1

111111

1

1

1

1

1

1

218

2

17

10

2

7

1 1 4 2 2 2 1

1

1 1 1 2 6

1 1 1

Active proposition graph (dimension 4)