Post on 22-Feb-2016

The Cooperative Web
A Step towards Web Intelligence

Daniel Gayo Avello
University of Oviedo

Web Intelligence?

• Multidisciplinary effort
– Artificial Intelligence
– Information Retrieval
– Software Agents
– ...
• Early stages
• Goal: The Wisdom Web
– New web.
– More useful.
– Truly “intelligent”

The Semantic Web (in a nutshell)

• Standardized conventions (ontologies)
– objects
– attributes
– relations
• Semantic tags
– Document authors add the markup
– Software agents perform (basic) reasoning

So...

• Semantic Web ~ Web Intelligence Approach
• Cooperative Web ~ Web Intelligence Approach

Is the Cooperative Web just-another-proposal?

• Not really...
• Semantic Web
– beginning...
– human made (ontologies - at this moment)
– time to reach the whole Web (5-10 years?)
• “I know what I want and I want it now!”
• The Web ~ Legacy System
• Something...
– fully automatic
– simple
– built on top of the current web (legacy)
– between the current web (legacy) and The Wisdom Web (future)
• ...wouldn’t it be nice?

Cooperative Web proposal (in a nutshell)

• Simple, cheap, automatic
• Intermediate: Web → ¿? → Wisdom Web
• “Squeeze out” the current Web a little more...
• Main ideas:
– Concept extraction
– Automatic document taxonomies
– Computational biology

Concepts

• Let’s study these samples...

...Betelgeuse, a red supergiant star about 600 light years distant, is seen in this Hubble Space Telescope image - the first direct picture of the surface of a star other than the Sun...

...Designer Jim Wallace, who is developing the PlayStation 2 fighting title "Rise to Honor" with martial-arts star Jet Li, said celebrity involvement boosts the reputation of gaming in general...

...His reputation as one of America's greatest actors secured, Hoffman proceeded to star in a series of films that disappointed at the box office...

...The actor Arnold Schwarzenegger has signed for a record-setting $30 million to star in "Terminator 3"...

Concepts

• They’re results from the Google query “star”...
• But they talk about different kinds of “stars”...

Concepts

• From those (and other) documents we could extract something like these “word bags”...
0: {red supergiant, star, Sun, ...}
1: {actor, actors, celebrity, films, star, ...}
• There are plenty of techniques to obtain these “word bags” or “concepts”, for instance:
– Latent Semantic Indexing (Foltz, 1990)
– Concept Indexing (Karypis and Han, 2000)

Conceptually related documents

• The documents shown before...
• ...could be transformed by dropping the “stop words”...

Conceptually related documents

• And then into this...

?00???????00????????1?1??????1??11??1???1?

• The last three documents are closely related, while the first one has nothing to do with them...
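The encoding step can be sketched roughly as follows. This is only an illustration, not the author's method: the stop-word list, the tokenizer, the splitting of "red supergiant" into two words, and the majority-vote rule for words that belong to both bags (such as "star") are all assumptions. Only the two "word bags" come from the slides.

```python
import re
from collections import Counter

# Hypothetical stop-word list (an assumption, deliberately tiny).
STOP_WORDS = {"a", "about", "as", "at", "for", "has", "in", "is",
              "of", "the", "this", "to", "who", "with"}

# The two "word bags" from the slides, keyed by concept id.
# "red supergiant" is split into single words here (an assumption).
CONCEPTS = {
    "0": {"red", "supergiant", "star", "sun"},
    "1": {"actor", "actors", "celebrity", "films", "star"},
}


def encode(text):
    """Map each non-stop word to a concept id, or '?' if it is unknown."""
    words = [w for w in re.findall(r"[a-z]+", text.lower())
             if w not in STOP_WORDS]
    # For every word, list the concepts whose bag contains it.
    candidates = [sorted(c for c, bag in CONCEPTS.items() if w in bag)
                  for w in words]
    # Words found in exactly one bag vote for that concept.
    votes = Counter(c[0] for c in candidates if len(c) == 1)
    encoded = []
    for c in candidates:
        if len(c) == 1:
            encoded.append(c[0])
        elif len(c) > 1 and any(votes[x] for x in c):
            # Ambiguous word: take the concept with the most votes.
            encoded.append(max(c, key=lambda x: votes[x]))
        else:
            encoded.append("?")
    return "".join(encoded)


print(encode("Betelgeuse, a red supergiant star"))
print(encode("The actor has signed to star in films"))
```

With these toy inputs the ambiguous word "star" is resolved by its neighbours, so the astronomy snippet and the cinema snippet end up with different symbols for the same word.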

Text strings...

• This way of representing free text...

?00???????00????????1?1??????1??11??1???1?

• ...could be well-suited to determine the distance between documents.

• Let’s see a simpler technique to get the distance between text strings...

Text strings...

• Three simple strings:
– BENJI
– DANI
– HENRY
• How closely are they related?
• Let’s define the distance between two strings as the number of letters to delete + the number of letters to change + the number of letters to insert...
• ...to transform one string into the other.

Text strings...

• Distance between BENJI and DANI: 3
BENJI → DENJI (1), DENJI → DANJI (2), DANJI → DANI (3)
• Distance between DANI and HENRY: 4
DANI → HANI (1), HANI → HENI (2), HENI → HENRI (3), HENRI → HENRY (4)
• Distance between BENJI and HENRY: 3
BENJI → HENJI (1), HENJI → HENRI (2), HENRI → HENRY (3)
• This is known as the Levenshtein distance, and it will help us better understand the next step...
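The distance defined above (deletions + changes + insertions) can be computed with the usual dynamic-programming scheme. The following is a minimal sketch; the function name `levenshtein` is just an illustrative choice.

```python
def levenshtein(source: str, target: str) -> int:
    """Minimum number of deletions, changes and insertions
    needed to turn `source` into `target`."""
    # previous[j] = distance between the processed prefix of `source`
    # and the first j letters of `target`.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i letters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(
                previous[j] + 1,         # delete s_char
                current[j - 1] + 1,      # insert t_char
                previous[j - 1] + cost,  # change s_char (or keep it)
            ))
        previous = current
    return previous[-1]


for a, b in [("BENJI", "DANI"), ("DANI", "HENRY"), ("BENJI", "HENRY")]:
    print(a, b, levenshtein(a, b))
# prints 3, 4 and 3, matching the distances on the slide
```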

Someone’s in the kitchen with DNA

• DNA is a highly complex molecule made from only 4 different kinds of components:
– Adenine - A
– Cytosine - C
– Guanine - G
– Thymine - T
• So, DNA molecules ~ simple (but huge) text strings
– CCAAGGA...
– CCAAGGAAACTCACTA...
– GATTACA...

Someone’s in the kitchen with DNA

• If DNA ~ text string then distances between two or more strings can be easily computed...

(Ursing and Arnason, 1998)
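As a toy illustration of treating DNA as text, the sketch below computes all pairwise edit distances between the three truncated fragments shown earlier, used exactly as they appear on the slide. It is not the method of the cited paper, and real sequence analyses rely on more elaborate alignment and tree-building techniques. The labels "a", "b", "c" and the helper names are assumptions.

```python
from itertools import combinations


def levenshtein(source, target):
    """Edit distance: deletions + changes + insertions."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def pairwise_distances(sequences):
    """Distance for every unordered pair of labelled sequences."""
    return {(x, y): levenshtein(sequences[x], sequences[y])
            for x, y in combinations(sorted(sequences), 2)}


# Truncated fragments from the slide, with hypothetical labels.
fragments = {"a": "CCAAGGA", "b": "CCAAGGAAACTCACTA", "c": "GATTACA"}

distances = pairwise_distances(fragments)
closest = min(distances, key=distances.get)
print(distances)
print("closest pair:", closest)
```

A table of pairwise distances like this one is the kind of input from which a grouping of sequences (or, by analogy, of documents) could then be built.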

What if...

Could it be possible to adapt computational biology algorithms to distill semantics from the web in an automatic fashion?

Cooperative Web architecture

[Architecture diagram: User, Software agent, Browsing history, Document taxonomy]

So, the Cooperative Web would be...

A layer over the Web to provide semantics in an automatic fashion, “inspired” by computational biology

Work in progress...

• The Cooperative Web is just a proposal (at this moment)
• Some prototypes soon (I hope...)

Thank you! Any questions?

References

• Foltz, P.W. (1990), "Using Latent Semantic Indexing for Information Filtering", Proceedings of the ACM Conference on Office Information Systems, Boston, USA, pp. 40-47.

• Karypis, G., and Han, E. (2000), "Concept indexing: A fast dimensionality reduction algorithm with applications to document retrieval and categorization", Technical Report TR-00-0016, University of Minnesota.

• Ursing, B.M., and Arnason, U. (1998), "Analyses of mitochondrial genomes strongly support a hippopotamus-whale clade", Proceedings of the Royal Society of London. Series B, Biological Sciences, 265:2251-2255.
