Semantic Memory Architecture for Knowledge Acquisition and Management

Julian Szymański & Włodzisław Duch
Department of Electronic, Telecommunication and Informatics, Gdańsk University of Technology;
Department of Informatics, Nicolaus Copernicus University, Toruń, Poland
Google: Duch
IMS Lhasa, 2007

TRANSCRIPT

Page 1: Semantic Memory Architecture for Knowledge Acquisition and Management

Page 2

Plan

1. Motivation.
2. Semantic memories.
3. Talking head with memory.
4. Implementing the 20 questions game.
5. Implementing word puzzles.
6. Data acquisition: active learning.
7. Future directions.

Page 3

Motivation

Language understanding is one of the key issues on the road to intelligent information management.

Chatterbots are based on 40-year-old template-matching technology, started by ELIZA (Joseph Weizenbaum, MIT 1966). Popular: the AIML language, used in Alice and many other bots. "What is your favorite * ?" with * = song, film, sport, team, thing, music … The "progress": use 10,000 templates (Alice) instead of 100. Mindless cheating may get you surprisingly far … No understanding, just pre-processed answers. Sufficient for business transactions, but not for dialogue.

Chatterbots are quite annoying because they do not understand anything, but they may try to sustain a dialogue to kill some time. Loebner Prize competitions still show poor performance.

Page 4

Talk to a bot …

Human: There was a new earthquake in Sumatra.
ALICE: What happened to it?
Human: Tsunami came.
ALICE: I like the way you talk.
Human: no
ALICE: Are you serious?
Human: no
ALICE: Are you serious?
Human: no
ALICE: Are you serious? (got into a loop … )

What is wrong and what is missing? Alice has only template memory, but there are infinitely many sentences. Understanding requires concept recognition: descriptions of objects and concepts, their properties, associations, relations and possible actions, held in an associative semantic memory that stores basic information about the world. Dialogue: building episodic relations among concepts.

Page 5

Ambitious approaches …

CYC, Douglas Lenat, started in 1984. Developed by Cycorp, with 2.5 million assertions linking over 150,000 concepts and using thousands of micro-theories (2004). Cyc-NL is still a "potential application"; knowledge representation in frames is quite complicated and thus difficult to use.

Open Mind Common Sense Project (MIT): a WWW collaboration with over 14,000 authors, who contributed 710,000 sentences; used to generate ConceptNet, a very large semantic network.

Some interesting projects are being developed now around this network but no systematic knowledge has been collected.

Other such projects: HowNet (Chinese Academy of Sciences), FrameNet (Berkeley), and various large-scale ontologies.

The focus of these projects is to understand all relations in text/dialogue.

Page 6

Realistic goals?

Different applications may require different knowledge representations.

Start from the simplest knowledge representation for semantic memory.

Find where such a representation is sufficient and understand its limitations. Drawing on such a semantic memory, an avatar may formulate and answer many questions that would require an exponentially large number of templates in AIML or another such language.

Adding intelligence to avatars involves two major tasks:

• build a semantic memory model;
• provide an interface for natural communication.

Goal: create a 3D human head model with speech synthesis & recognition, and use it to interact with Web pages & local programs: a Humanized InTerface (HIT).

Control HIT actions using the knowledge from its semantic memory.

Page 7

DREAM architecture

[Diagram: natural input modules, cognitive functions, affective functions, Web/text/database interface, behavior control, control of devices, talking head, text-to-speech, NLP functions, specialized agents.]

DREAM concentrates on the cognitive functions plus real-time control; we plan to adopt software from the HIT project for perception, NLP, and other functions.

Page 8

Types of memory

Neurocognitive approach to NLP: at least 4 types of memory. Long-term memory (LTM): recognition, semantic, episodic; plus working memory.

Input (text, speech) is pre-processed using a recognition memory model to correct spelling errors, expand acronyms, etc.

For dialogue/text understanding, episodic memory models are needed. Working memory: an active subset of semantic/episodic memory. All 3 LTM systems are mutually coupled, providing context for recognition.

Semantic memory is a permanent storage of conceptual data.

• "Permanent": data is collected throughout the whole lifetime of the system; old information is overridden/corrected by newer input.
• "Conceptual": contains semantic relations between words and uses them to create concept definitions.

Page 9

Semantic memory

Hierarchical model of semantic memory (Collins and Quillian, 1969), followed by most ontologies.

Connectionist spreading activation model (Collins and Loftus, 1975), with mostly lateral connections.

Our implementation is based on the connectionist model and uses a relational database with an object access layer API. The database stores three types of data:
• concepts, or objects being described;
• keywords (features of concepts extracted from data sources);
• relations between them.

The IS-A relation is used to build the ontology tree, which serves for activation spreading, i.e. feature inheritance down the ontology tree. Types of relations (like "x IS y" or "x CAN DO y") may be defined when input data is read from dictionaries and ontologies.

Page 10

Creating SM

The API serves as a data access layer providing logical operations between raw data and higher application layers. Data stored in the database is mapped into application objects, and the API allows retrieving specific concepts/keywords.

Two major types of data sources for semantic memory:

1. machine-readable structured dictionaries directly convertible into semantic memory data structures;

2. blocks of text, definitions of concepts from dictionaries/encyclopedias.

3 machine-readable data sources are used:

• The Suggested Upper Merged Ontology (SUMO) and the Mid-Level Ontology (MILO): over 20,000 terms and 60,000 axioms.

• WordNet lexicon: more than 200,000 word–sense pairs.
• ConceptNet: a concise knowledge base with 200,000 assertions.

Page 11

Creating SM – free text

WordNet hypernymic (a kind of …) IS-A relation, plus hyponym and meronym relations between synsets (converted into concept–concept relations), combined with ConceptNet relations such as CapableOf, PropertyOf, PartOf, MadeOf … Relations are added only if present in both WordNet and ConceptNet.

Free-text data: Merriam-Webster, WordNet and Tiscali. Whole word definitions are stored in SM, linked to concepts, together with a set of the most characteristic words from the definitions of a given concept. For each concept definition, one set of words per source dictionary is used; these are replaced with synset words, and the subset common to all 3 sources is mapped back to synsets – these are most likely related to the initial concept. They are stored as a separate relation type. Articles and prepositions are removed using a manually created stop-word list. Phrases are extracted using ApplePieParser; concept–phrase relations are compared with concept–keyword relations, and only phrases that matched keywords are used.

Page 12

Concept Description Vectors

Drastic simplification: for some applications SM is used more efficiently via a vector-based knowledge representation. Merging all types of relations into the most general one, "x IS RELATED TO y", defines a vector (semantic) space.

{Concept, relations} => Concept Description Vector (CDV): a binary vector showing which properties are related to, or make sense for, a given concept (not the same as a context vector). Semantic memory => CDV matrix, very sparse, allowing easy storage of large amounts of semantic data.

Search engines: {keywords} => concept descriptions (Web pages). CDVs enable efficient implementation of reversed queries: find a unique subset of properties for a given concept, or for a class of concepts (= a concept higher in the ontology).

What are the unique features of a sparrow? Proteoglycan? Neutrino?
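The reversed query described above can be sketched in a few lines; this is an illustrative toy (the class and method names `CDVStore`/`unique_features` and the data are made up, not the authors' implementation), with CDV rows stored as sparse sets of keywords:

```python
# Toy sketch of a Concept Description Vector (CDV) store as a sparse
# binary matrix: each concept maps to the set of keywords related to it.

class CDVStore:
    def __init__(self):
        self.cdv = {}  # concept -> set of keywords (a sparse binary row)

    def add(self, concept, keywords):
        self.cdv.setdefault(concept, set()).update(keywords)

    def unique_features(self, concept):
        """Reversed query: features of `concept` shared with no other concept."""
        others = set()
        for c, ks in self.cdv.items():
            if c != concept:
                others |= ks
        return self.cdv.get(concept, set()) - others

store = CDVStore()
store.add("sparrow", {"is_a bird", "can fly", "has feathers", "chirps"})
store.add("penguin", {"is_a bird", "has feathers", "can swim"})
print(sorted(store.unique_features("sparrow")))  # ['can fly', 'chirps']
```

Because the rows are sets rather than dense vectors, storage stays proportional to the number of stored relations, matching the sparsity of the CDV matrix.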

Page 13

wCRK representation

wCRK (weighted Concept – Relation type – Keyword): semantic memory theories are based on atomic logical elements: a typed relation (predicate) between a concept (object) and a keyword (feature), with a weight expressing the strength of this relation. wCRKs form the "word layer" of the system. Example: interactive exploration of the semantic space.

Page 14

Interactive semantic space

Page 15

Concept Description Vectors

Cobra:
is_a animal, is_a beast, is_a being, is_a brute, is_a creature, is_a entity, is_a fauna, is_a object, is_a organism, is_a reptile, is_a serpent, is_a snake, is_a vertebrate;
has belly, has body part, has cell, has chest, has costa, has digit, has face, has head, has rib, has tail, has thorax.

Page 16

Talking Head

SM is the brain; HIT needs a talking head and a voice interface. Haptek's PeoplePutty tools (inexpensive) have been used to create a 3-D talking head; only the simplest version is used. The Haptek player is a plugin for Windows browsers, or an embedded component in custom programs; both versions were used.

High-fidelity natural voice synthesis with lip synchronization may be added to Haptek characters. The free MS Speech Engine, i.e. MS Speech API (SAPI 5), has been used to add text-to-speech synthesis and speech-to-text voice recognition. Prerecorded OGG audio files may be played. Haptek movements, gestures, facial expressions and animation sequences may be programmed and coordinated with speech using JavaScript, Visual Basic, ActiveX controls, C++, or ToolBook.

Result: a HIT that can interact with web pages, listen and talk, sending information both ways and hiding the text pages from the user. Interaction with Web pages is based on the Microsoft .NET framework.

Page 17

HIT the Web

Haptek avatar as a plug-in in a WWW browser. Connect to web pages, read their contents, send queries and read answers from specific fields in web forms.

Access Q/A pages, like MIT START or Brainboost, that answer many questions reasonably.

"The HAL Nursery", "the world's first Child-Machine Nursery", Ai Research (www.a-i.com), hosts a collection of "Virtual Children", or HAL personalities, developed by many users through conversation.

HAL uses reinforcement learning techniques to acquire language, through a trial-and-error process similar to the one infants use. A child head with a child voice makes it much more interesting to play with.

Haptek heads may work with many chatterbots. There are many similar solutions, so we concentrated on the use of SM.

Page 18

Word games

Word games were popular before computer games took over. They are essential to the development of analytical thinking skills. Until recently, computer technology was not sufficient to play such games.

The 20 question game may be the next great challenge for AI, because it is more realistic than the unrestricted Turing test; perhaps a World Championship with human and software players in Singapore?

How good are people in playing 20Q game?

Performance of various models of semantic memory and episodic memory may be tested in this game in a realistic, difficult application.

Asking questions to understand precisely what the user has in mind is critical for search engines and many other applications.

Creating large-scale semantic memory with sufficient knowledge about all concepts is a great challenge.

Page 19

20Q

The goal of the 20 question game is to guess a concept that the opponent has in mind by asking appropriate questions.

www.20q.net has a version that is now implemented in some toys! It is based on a concepts × questions table T(C,Q) = usefulness of question Q for concept C. It learns the T(C,Q) values, increasing them after successful games and decreasing them after lost games. Guessing is distance-based.

SM does not assume fixed questions. Using CDVs admits only the simplest forms, "Is it related to X?" or "Can it be associated with X?", where X is a concept stored in the SM. The system needs only to select a concept, not to build the whole question.

Once the keyword has been selected it is possible to use the full power of semantic memory to analyze the type of relations and ask more sophisticated questions. How is the concept selected?
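The narrowing of the concept space by yes/no answers can be sketched as follows; this is an illustrative toy (names and data are made up, not the authors' code):

```python
# A "yes" to "Is it related to X?" keeps concepts whose CDV contains the
# keyword X; a "no" keeps those lacking it.

def narrow(candidates, cdv, keyword, answer_yes):
    if answer_yes:
        return [c for c in candidates if keyword in cdv[c]]
    return [c for c in candidates if keyword not in cdv[c]]

cdv = {
    "cobra":   {"is_a reptile", "has tail"},
    "sparrow": {"is_a bird", "has tail", "can fly"},
    "trout":   {"is_a fish", "has tail"},
}
cands = list(cdv)
cands = narrow(cands, cdv, "has tail", True)   # all three remain
cands = narrow(cands, cdv, "can fly", False)   # drops sparrow
print(cands)  # ['cobra', 'trout']
```

In the real system the retained answers also feed a distance measure, so a single wrong answer does not eliminate the target concept outright.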

Page 20

Semantic Space exploration

• Binary dictionary search for 20 questions with yes/no answers: 2^20 = 1,048,576 concepts can be distinguished in principle; in practice the search tree is never balanced.
• Binary search is not acceptable in complex semantic applications.
• The semantic space can be searched using a context-based algorithm, similar to a word game.
• The concept space is narrowed by the player's subsequent answers.

The information gained by using some keyword (feature) is:

I(keyword) = − Σ_{i=0..K} p(keyword = v_i) · log p(keyword = v_i)

where p(keyword = v_i) is the fraction of concepts for which this keyword has value v_i.

The subspace of candidate concepts O(A) is selected by looking at the distance in the subspace of answers/keywords A used so far.
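For binary keywords (present/absent in a concept's CDV), the entropy-based keyword score can be sketched like this; `keyword_information` is an illustrative name and the data is made up:

```python
import math

# p(keyword = v) is the fraction of remaining candidate concepts with
# value v for the keyword; here v is True/False (present or absent).

def keyword_information(candidates, cdv, keyword):
    n = len(candidates)
    p_yes = sum(1 for c in candidates if keyword in cdv[c]) / n
    info = 0.0
    for p in (p_yes, 1.0 - p_yes):
        if p > 0:
            info -= p * math.log2(p)
    return info

cdv = {"cobra": {"has tail"}, "sparrow": {"has tail", "can fly"},
       "trout": {"has tail"}, "ant": set()}
# "has tail" splits concepts 3/1 -> entropy below 1 bit;
# a keyword splitting the candidates evenly would score a full 1 bit.
print(round(keyword_information(list(cdv), cdv, "has tail"), 3))  # 0.811
```

Asking the keyword with the highest score halves the candidate set as fast as the available features allow.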

Page 21

Distance calculation

CDV_n is the weight of concept vector CDV for keyword n, ANS is the partial vector of retrieved answers, and ||ANS|| is the Euclidean length of the answer vector:

d(CDV, ANS) = (1/||ANS||) Σ_{n=1..N} dist(CDV_n, ANS_n)

where x and y represent weights from the CDV and ANS vectors and

dist(x, y) = 0 if y = NULL; y/2 if x = NULL; |x − y| otherwise.

O(A) is the subspace of candidate vectors close to the ANS subspace: O(A) = {CDV : ||CDV, ANS|| → min}.

We can deal with user mistakes by also including concepts with d greater than the minimal distance.
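One possible reading of this distance, sketched in Python (an illustrative toy, not the authors' code; unanswered keywords are `None` and contribute nothing, keywords absent from the concept's CDV contribute y/2):

```python
# dist handles the three cases: question not asked, concept has no value
# for the keyword, and a normal weight comparison.

def dist(x, y):
    if y is None:        # question not asked yet
        return 0.0
    if x is None:        # concept has no value for this keyword
        return y / 2.0
    return abs(x - y)

def d(cdv_vec, ans_vec):
    # Euclidean length of the (answered part of the) answer vector
    norm = sum(a * a for a in ans_vec if a is not None) ** 0.5
    total = sum(dist(x, y) for x, y in zip(cdv_vec, ans_vec))
    return total / norm if norm else 0.0

cdv_cobra = [1.0, None, 0.0]   # concept weights for three keywords
answers   = [1.0, 1.0, None]   # user said yes, yes; third not asked
print(round(d(cdv_cobra, answers), 3))  # 0.354
```

Keeping all concepts within some margin of the minimal d, rather than only the exact minimum, is what tolerates occasional wrong answers.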

Page 22

Humanized interface

[Diagram: online dictionaries → part-of-speech tagger & phrase extractor → parser → manual verification → semantic memory (store/query) → applications, e.g. the 20 questions game → humanized interface.]

Page 23

How to obtain semantic data?

• A CDV matrix for a single ontology, reduced to the animal kingdom, was initially used to avoid storage-size problems.
• The first few steps find keywords with IG ≈ 1. CDV vectors are too sparse, with 5–20 (average 8) out of ~5000 keywords set. In later stages IG is small and very few concepts are eliminated. More information is needed in the semantic memory!
• WordNet is one of the best lexical sources.
  – Relations for the semantic category "animal": 7543 objects and 1696 features.
  – Truncated using word popularity rank:
    IC – information content: the number of appearances of the particular word in WordNet descriptions;
    GR – GoogleRank: the number of web pages returned by the Google search engine for a given word;
    BNC – word statistics taken from the British National Corpus.

    Rank(word) = max(Rank_IC, Rank_GR, Rank_BNC)

  – Semantic space reduced to 889 objects and 420 features.
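The popularity-based truncation can be sketched as follows, assuming (the slide's formula is partially garbled) that the combined rank is the maximum, i.e. the worst, of the three per-source ranks; names and data are illustrative:

```python
# Each word carries three popularity ranks (IC from WordNet descriptions,
# GR from Google hit counts, BNC frequency); lower rank = more popular.

def combined_rank(ranks):
    return max(ranks)  # a word is only as popular as its worst source says

words = {"cat": (10, 5, 8), "ocelot": (900, 1200, 1500), "dog": (12, 7, 6)}
keep = [w for w, r in words.items() if combined_rank(r) <= 100]
print(sorted(keep))  # ['cat', 'dog']
```

Dropping words whose combined rank exceeds a threshold is what reduced the space from 7543 × 1696 to 889 × 420.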

Page 24

Active learning

• Data from WordNet are:
  – not complete;
  – not common sense;
  – sometimes specialized concepts that few people understand.
• Basic dialogues for obtaining new relations:
  – I give up. Tell me, what did you think of?
  – Tell me, what is characteristic for <concept>?
• Knowledge correction:

  w = (β·w₀ + ANS·N) / (β + N)

  where:
  w₀ – initial weight, initial knowledge;
  ANS – average answer given by users;
  N – number of answers;
  β – parameter scaling the importance of the initial knowledge.

Page 25

The game

• Giraffe:
[is vertebrate] Y, [is mammal] Y, [has hoof] Y, [is equine] N, [is bovine] N, [is deer] N, [is swine] N, [has horn] N, [is sheep] N, [is antelope] N, [is bison] N. The system correctly guesses the concept giraffe.

- Yuppi, I've won! Let's talk about giraffe. Tell me, what is characteristic for giraffe?

After entering a keyword, the semantic memory is reorganised and ready to play new games.

• Lion:
[is vertebrate] Y, [is mammal] Y, [has hoof] N, [has paw] Y, [is canine] N, [is cat] Y, [is wildcat] Y

The different organization of the concept lion in the WordNet taxonomy makes the game go the wrong way, and the system fails to guess this concept:

[is leopard] N, [is painter] N, [is puma] N, [is lynx] N. I give up. What was it? Lion …

After being given the right answer the system reorganizes its knowledge, and the next game searching for the concept lion finishes with success:

[is vertebrate] Y, [is mammal] Y, [has hoof] N, [has paw] Y, [is canine] N, [is cat] Y, [is wildcat] Y, [is leopard] N, [has mane] Y. I guess it is lion.

Page 26

Experimental results

• How many games are needed to clarify the semantic space?
• Number of failed games N_f performed before achieving the first success: on average 2.39 per concept.
• The semantic memory error:

  E = 1 − N_s / N

  where N_s is the number of games finished with success and N is the total number of games; for searching the first 10 concepts, E ≈ 0.22.
• How does it change during the learning process?

[Plot: average CDV density (features per object, CDVdens) over the first 10 games (lp = game number), growing from about 23 to 30.]

Page 27

Puzzle generator

Semantic memory may be used to automatically invent a large number of word puzzles that the avatar presents. This application selects a random concept from all concepts in the memory and searches for a minimal set of features necessary to uniquely define it; if many subsets are sufficient for a unique definition, one of them is selected randomly.

It has charm, it has spin, and it has charge. What is it?

It is an amphibian, it is orange and has black spots. What do you call this animal?

A Salamander.

If you do not know, ask Google!Quark page comes at the top …
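The minimal-feature search behind such puzzles can be sketched like this; an illustrative toy (the name `minimal_unique_subsets` and the data are made up, not the authors' code):

```python
from itertools import combinations
import random

# Find the smallest feature subsets of `concept` that no other concept's
# CDV fully contains; any one of them uniquely defines the concept.

def minimal_unique_subsets(concept, cdv):
    feats = sorted(cdv[concept])
    for size in range(1, len(feats) + 1):
        hits = [set(s) for s in combinations(feats, size)
                if not any(set(s) <= cdv[c] for c in cdv if c != concept)]
        if hits:
            return hits  # all minimal subsets; caller picks one at random
    return []

cdv = {"quark":    {"has charm", "has spin", "has charge"},
       "electron": {"has spin", "has charge"},
       "proton":   {"has spin", "has charge", "has mass"}}
subsets = minimal_unique_subsets("quark", cdv)
print(subsets)  # [{'has charm'}] -- only quarks have charm here
puzzle = random.choice(subsets)
```

The brute-force subset search is exponential in the number of features per concept, but CDVs are sparse (a handful of features each), so it stays tractable in practice.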

Page 28

HIT – larger view …

[Mind map: HIT projects draw on T-T-S synthesis, speech recognition, talking heads, behavioral models, graphics, cognitive architectures, cognitive science, AI, A-Minds, Lingu-bots, knowledge modeling, info-retrieval, VR avatars, robotics, brain models, affective computing, episodic memory, semantic memory, working memory, and learning.]

Page 29

Future

Language understanding is one of the key issues on the road to intelligent information management.

Key problem identified: building a good semantic memory! Many hierarchical ontologies exist, but there are no descriptions of even simple objects.

WordNet definition of a cow: "mature female of mammals of which the male is called 'bull'". Other dictionaries/info sources are not that useful either. One solution is an active search for correlations between possible properties and objects.

Building a good SM will enable many applications. Talking heads with artificial minds are the future of cyberspace!

“A roadmap to human level intelligence” and many other initiatives discussing how to get there ...