REYKJAVIK UNIVERSITY

The Future Of The Web The Semantic Web

Bergur Páll Gylfason

[email protected]

March 25, 2010

TABLE OF CONTENTS

Introduction
History of the world wide web
History of the Semantic Web
The Concept Behind The Semantic Web
What is the Semantic web?
How does the Semantic web work?
What is Semantic Web doing for you?
Disruption
What happens after web 3.0?
The Technology behind the Semantic Web
Linked Data
The Linked Open Data Project
URIs
URI Reference - URIref
XML
Resource Description Framework RDF
Vocabularies and Ontologies
Friend of a Friend FOAF
Queries SPARQL
Examples of Semantic Webs
Linked Data Search Engine
Medical – HealthBase
DBPedia
BBC Music
eTourism
EveryBlock – Mashup Site
Conclusion
Bibliography

INTRODUCTION

The World Wide Web turns 22 this year, and it has gone through many changes. At first it was just documents with text and hyperlinks; then came photos, videos, search engines and social networks. So what comes next? I think the next step is that the Web becomes much more semantic. The Semantic Web is all about connecting data and making it readable by machines. We have useful information in different databases all over the world (medical, pharmaceutical, government and personal information, and much more), and none of it is connected. What if we could connect all that information? How would we do it? We use semantic technologies such as RDF, OWL and SPARQL to connect the data and make machines understand it. This is going to change how people use and interact with the World Wide Web: how they shop, how they make travel arrangements and how they search. In this paper I try to find out what the Semantic Web is, how it works, where it stands and what opportunities it will bring for people and organizations.

First I will look at the history of the Web, then give a quick overview of what the Semantic Web is, how it works and what possibilities it brings, and then go into the technical details of the Semantic Web.

HISTORY OF THE WORLD WIDE WEB

If it had not been for Tim Berners-Lee, a scientist at CERN, the Web would not be where it is today. He saw that scientists had no easy way of sharing information, so he had the idea of creating “a large hypertext database with typed links” (1), also known as the World Wide Web (WWW), referred to as the Web in this paper. People had little belief in the idea at the time, and it was not until 1990, when he had developed all the tools required for a working Web (HTTP, HTML, a Web browser, a Web server and the first Web pages), that people saw the huge potential the Web had. From then on Internet use grew every day. But Berners-Lee’s browser was very primitive, and in 1993 new browsers appeared, such as Cello, from Cornell Law School’s Legal Information Institute, and Mosaic, which was the foundation for Netscape, the browser that would dominate the market in the following years. (1)

Picture 1: The first photograph on the Web (43)

In 1994 Berners-Lee founded the World Wide Web Consortium (W3C), which is the main international standards organization for the Web (2). That was absolutely necessary because, as happens with all new technologies, the people developing them have different ideas about how to implement the same thing and therefore come up with different formats, so there is often inconsistency in how things are done, e.g. VHS and Betamax, or Blu-ray and HD DVD. In the Web’s case the inconsistency was in HTML. At that time the Gopher protocol (3) ruled the market and was in a format war with the Web. Gopher lost because its owners started to charge people for using it, while Berners-Lee made his Web protocol free for everyone to use, and, as always when something is free and “good enough”, users switched to HTTP. As we have seen over time, when things are open and free, more people use them, and the technology becomes more varied and evolves faster.

In 1996 mainstream companies started to get involved with the Web; they saw the potential in advertising their products for free where everybody could see them. That was the start of the so-called dot-com boom (4), a stock market bubble that burst in 2001. The bubble was caused mostly by companies with daring business policies that put growth over profit, hoping that if they built up their customer base, their profits would also rise. Investors pumped money into the Internet market with false and hyped-up hopes of profit, mostly in e-commerce. When the bubble burst in 2001 there was a short downturn in the Web business and many companies went bankrupt. But not all of them: this period also produced some of today’s biggest Internet companies, such as Google, eBay and Amazon. It was also the start of the social network mania that would leave its mark on the next ten years. First came MySpace, which was the most popular social network in 2006 (5), and then Facebook, which was the most used social network in the world in 2009 (6).

The era after the dot-com boom has been called Web 2.0 by many people because the way content was distributed changed dramatically compared with Web 1.0 (the Web from its birth to the dot-com boom). Web 1.0 was very primitive and linear. A webmaster controlled the webpage, and he and a few experts generated its content. So there were millions of users, and their number was growing, but only thousands of webmasters and experts. All the users could do was browse what the webmasters generated; the Web was read-only for them, and content was therefore very limited. Web 2.0 is all about gathering information: the main thing that changed is that users started to generate content themselves. Sites like Wikipedia trusted their users to put in accurate information, and it has grown incredibly, to over 3,000,000 articles (7). Blogs also played an important role: people could set up their own blog with only three clicks and say whatever they wanted. And on YouTube users have uploaded more than 80 million videos; it would take you more than 600 years to watch them all (8).

Picture 2: The Dot-Com Bubble (44)

Picture 3: Difference between Web 1.0 and Web 2.0 (45)

The huge amount of data being generated is stretching the limits of the Web. In 2007 the total volume of digital information created and replicated globally was 281 billion gigabytes (281 exabytes) (9), and it is expected to grow to an astonishing 667 exabytes by 2013 (10). That means that more data will be created in those five years than in all the years since the beginning of the Web. It also means that it is getting harder to find the information you need with keyword-based search engines and by browsing.

HISTORY OF THE SEMANTIC WEB

As most people know, Tim Berners-Lee invented the Web, but the Web as it is today is not his original vision. He thought of the Web as a Semantic Web from the beginning, and it did not quite evolve the way he envisioned. Even so, he continued to fight for his vision by publishing material and making statements about the evolution of the Semantic Web. In 1998 he started defining a roadmap for the Semantic Web, an attempt to give a high-level plan of its architecture (11), and in 1999 he said:

“I have a dream for the Web in which computers become capable of analyzing all the data on

the Web – the content, links, and transactions between people and computers. A ‘Semantic

Web’, which should make this possible, has yet to emerge, but when it does, the day-to-day

mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to

machines. The ‘intelligent agents’ people have touted for ages will finally materialize.” (12)

THE CONCEPT BEHIND THE SEMANTIC WEB

WHAT IS THE SEMANTIC WEB?

The first thing to know about the Semantic Web is that it is a Web of data, where the data becomes part of the Web. This is in contrast to the Web as we know it today, which is full of information in the form of documents. We use search engines to search these documents, but humans still have to read and interpret them to get the information out. So while computers present you with the information, they cannot understand it. For example, when you search Google for the band Oasis, the first page shows the Oasis clothing store, OASIS (Organization for the Advancement of Structured Information Standards), the Oasis gift shop and Oasis Hong Kong Airlines. That is not at all what I was looking for. With a semantic search I would search for Oasis where Oasis is of the type rock band, and I would only get results about Oasis the band. This would save a lot of time.

Let’s take another example. You have decided to take your wife on a romantic date: a movie and a restaurant. You know she loves romantic comedies and a good steak. What you would do is go to Google to find out what movies are showing at theaters close to you, and then spend some time trying to find a good movie by reading descriptions and reviews. Then you check out the steakhouses close to the theaters. Of course you want to make sure the steakhouse is good, so you try to find reviews of them. This is going to take a lot of searches, a lot of visits to webpages and, worst of all, a lot of time.

The Semantic Web will make tasks like these much faster and easier. Instead of running multiple searches, you would give the Semantic Web one complex sentence like: “I want to see a romantic comedy and then go to a steakhouse in close proximity.” The Semantic Web will analyze your request, search the Web for possible combinations, and then organize the results to suit you best. This will only take you a few minutes, as the Semantic Web will do all the heavy lifting for you.

All this will become possible by linking data together, turning the Web into an open database and making it understandable by machines. That is only going to happen if people release their raw data on the Internet. This is what the Linking Open Data Project (13) is working on.

HOW DOES THE SEMANTIC WEB WORK?

The W3C (World Wide Web Consortium) is an international standards community led by the inventor of the Web, Tim Berners-Lee. It develops standards for the Web to ensure its long-term growth. It has made standards for HTML, HTTP, XML and much more (2), and it has already made standards for Linked Data, ontologies and queries.

To make it possible to link data together, the W3C made the RDF standard, which is the core of the Semantic Web. It makes it possible to create metadata (data about data) about resources on the Web so that machines can understand them: what a resource is and how it connects to other resources. For example, you want to make RDF about yourself, and your properties are that you are of the type Person, with the full name John Doe and the mailbox [email protected]. Now the machine knows that you are a Person with a specific name and a specific mailbox.

The W3C has also made standards for ontologies, which define the concepts and relations used to describe and represent an area of concern. They are used to classify the terms that can be used in a particular application, to characterize possible relationships, and to help data integration when ambiguities may exist between terms in different data sets or when an extra bit of knowledge may lead to the discovery of new relationships. This is best described by an example: take the person above, John Doe, and suppose someone else describes a Person but uses the term email instead of mailbox. Then an extra definition should be added, stating that the relationship “email” is the same as “mailbox”. This extra piece of information is an extremely simple ontology. Ontologies can also be extremely complicated, with many ontologies connected together to describe, e.g., the aquatic world. For this purpose the W3C has made standards like RDF and RDF Schema, SKOS (Simple Knowledge Organization System), OWL (Web Ontology Language) and RIF (Rule Interchange Format). (14) We will take a closer look at the most common one, OWL, later.
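The “email is the same as mailbox” idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the property names, the records and the mailbox address are made up, and a real ontology would express the equivalence in a language such as OWL rather than in a Python dictionary.

```python
# The one-entry "ontology": the term "email" means the same as "mailbox".
# All names and values below are hypothetical and used only for illustration.
SAME_AS = {"email": "mailbox"}

def normalize(record, same_as=SAME_AS):
    """Rename every property to its canonical term so records can be compared."""
    return {same_as.get(prop, prop): value for prop, value in record.items()}

site_a = {"type": "Person", "name": "John Doe", "mailbox": "mailto:john@example.org"}
site_b = {"type": "Person", "name": "John Doe", "email": "mailto:john@example.org"}

# Without the extra definition the two descriptions look different;
# with it, a machine can see that they say the same thing.
print(site_a == site_b)                        # False
print(normalize(site_a) == normalize(site_b))  # True
```

The design choice here is to map every synonym onto one canonical term before comparing, which is roughly what data integration with an ontology achieves on a much larger scale.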

Picture 4: The Technology Stack for The Semantic Web (42)


Finally, the W3C has made the SPARQL standard so that people can query the Web of Data, that is, retrieve information from it using standard technologies and protocols. The Web of Data is represented in RDF, so we need an RDF-specific query language. SPARQL provides this, and it makes it possible to get results through HTTP or SOAP. With SPARQL people can extract complex information, which is returned in a table format. This table can be incorporated into another Web page; in this way SPARQL provides a powerful tool for building, for example, complex mash-up sites or search engines that include data from the Semantic Web. (15)

WHAT IS SEMANTIC WEB DOING FOR YOU?

The Semantic Web is going to change how you use the Web. It is going to make the Web much simpler to use and thus make your life simpler, which matters because people’s Internet use is growing fast: it has grown 380% since the year 2000 (16), and people are starting to think of Web access as a fundamental right (17). The purpose of the computer is to do the repetitive and boring things for you, and doing the boring, complex and hard work is exactly what the Semantic Web is supposed to do. It is going to take fewer clicks to get to the data you are looking for. You can collect your interests and share them with others: with an application like Twine (18) you share your interests with other people and they share theirs with you. You can organize your travel plans better with applications like TripIt (19), which lets you combine bookings made on different web pages into a single site. Finally, you can pinpoint the exact news you want to see. (20, p. 22)

There are many opportunities for companies and governments in getting involved in the Semantic Web. International banks could drill into accounts, transactions and financial histories without needing many months of expensive IT projects to do so. Financial institutions could assess risk with greater accuracy because more relevant data would be available. Pharmaceutical companies could lower the cost of drug development if they could easily combine open-source web data with their own data, saving money by using information other people have found instead of producing it themselves. The Semantic Web could also have a huge impact on things like national security, disaster preparedness and military operations. The government collects enormous volumes of data every day, and by linking them together more effectively it can see national security threats forming before they become a reality. Disasters rarely happen when you have planned for them, so being able to access all data quickly and mash it up on the fly to get more out of the information could save a lot of lives when there is little time. (20, p. 55)

DISRUPTION

The Semantic Web will no doubt bring some disruption to the market, making web companies with bad business models go bankrupt. Companies that do nothing instead of investing in innovation might fall behind, and so could companies that have inconvenient resources, processes or values, as Western Union had. But there is really no accurate way to tell whether there will be any disruption, because I think the Semantic Web is still early in the Early Adopters stage of the Technology Life Cycle. It is still waiting for its killer app and for more people to join. There are so many possibilities we do not see yet that might cause disruption. I think only time will tell.

Picture 5: The Technology Life Cycle (41)

WHAT HAPPENS AFTER WEB 3.0?

If Web 2.0 is about web applications and social networking, and Web 3.0 is about incorporating the semantics of data so that it can be interpreted by machines, what happens next? Nova Spivack, a technology visionary, entrepreneur in web development and developer of Twine, thinks that the Web develops in 10-year cycles: 2010-2020 is Web 3.0, which is supposed to lay the groundwork for Web 4.0, scheduled for 2020-2030, just as Web 1.0 laid the groundwork for Web 2.0. He thinks that Web 4.0 will be something like a WebOS and will work like middleware, where the Web starts functioning like an operating system, or what he calls “The Intelligent Web”. This might work as your own personal assistant. (21)

Nova Spivack is not the only one with a vision of the future. Raymond Kurzweil, an inventor and a pioneer in text-to-speech synthesis, speech recognition technology and more, also predicts that there will be a WebOS by 2029. But he thinks it will parallel the human brain, and he said:

“Intelligent machines will combine the subtle and supple skills that humans now excel in

(essentially our powers of pattern recognition) with ways in which machines are already

superior, such as remembering trillions of facts accurately, searching quickly through vast

databases, and downloading skills and knowledge.” (21)

Picture 6: The evolution of the web according to Nova Spivack (21)

THE TECHNOLOGY BEHIND THE SEMANTIC WEB

LINKED DATA

We have already talked about the Semantic Web as a Web of Data: data of all possible types, from personal data to financial data to pharmaceutical data and just about any kind of data you can think of. Linked Data is about using the Web to create typed links between data from different sources with a collection of Semantic Web technologies, e.g. RDF and OWL, which we will examine later. Berners-Lee made a set of rules (22) known as the Linked Data principles. They are about publishing data on the Web in such a way that all published data becomes part of a single global data space. They provide a basic recipe for publishing and connecting data:

1. Use URIs as names for things

2. Use HTTP URIs so that people can look up those names

3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)

4. Include links to other URIs, so that they can discover more things
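The four principles above can be illustrated with a small Python sketch. It does not touch the network: a dictionary stands in for the Web, and every URI and property name in it is invented for the example, so it should be read as a toy model of the idea rather than as how Linked Data clients are actually built.

```python
from collections import deque

# A tiny in-memory stand-in for the Web: each HTTP URI "returns" a list of
# (predicate, object) pairs. All URIs and data are made up for illustration.
WEB_OF_DATA = {
    "http://example.org/people/john": [
        ("name", "John Doe"),
        ("livesIn", "http://example.org/places/reykjavik"),
    ],
    "http://example.org/places/reykjavik": [
        ("name", "Reykjavik"),
        ("capitalOf", "http://example.org/places/iceland"),
    ],
    "http://example.org/places/iceland": [("name", "Iceland")],
}

def look_up(uri):
    """Principles 1 to 3: a thing is named by an HTTP URI, and looking the
    URI up yields useful information about it."""
    return WEB_OF_DATA.get(uri, [])

def discover(start_uri):
    """Principle 4: follow links to other URIs to discover more things."""
    seen, queue = [], deque([start_uri])
    while queue:
        uri = queue.popleft()
        if uri in seen:
            continue
        seen.append(uri)
        for _predicate, obj in look_up(uri):
            if obj.startswith("http://"):  # the object is itself a URI
                queue.append(obj)
    return seen

for uri in discover("http://example.org/people/john"):
    print(uri)
```

Starting from a single URI, the program reaches every resource that is linked to it, which is the sense in which linked data sets form one global data space.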

To be able to build the Web of Data we need a huge amount of raw data openly available on the Web. The amount of raw data on the Web has been growing rapidly over the years, coming from universities, corporations and governments. Berners-Lee gave a talk at a TED conference in March 2009 asking people to put raw data out on the Web. (23) People did what he asked and started to make more data freely available, and what happened? Other people reused the data and did some interesting things with it, as Berners-Lee showed at the TED conference in March 2010 (24). One example is a lawyer in Zanesville, Ohio, who used government information to check which houses had water and in which houses black people lived. He found that only the houses of white people had water. The county had to pay $10.9 million in compensation.

Another example is the earthquake in Haiti. When it happened, the Google map of the area was not very good, so GeoEye, a satellite image company, put its images of Haiti on the Web for free. People then started to update the map according to the GeoEye images. It even showed blocked roads, refugee camps, a hospital ship and other very important things. This became the best map to use if you were involved in relief work in Haiti, and it probably saved some lives. (24)

Picture 7: Water available for white people (24)

Picture 8: Haiti map after people on the web updated it (24)


THE LINKED OPEN DATA PROJECT

The objective of the project is to identify and connect all the existing data sets that are available under open licenses, converting them to linked data with RDF and the Linked Data principles and publishing them on the Web. When the project was founded in January 2007, mostly researchers in university research labs and small companies were involved, but since then it has grown fast. Several big organizations have become involved, like the US government, the BBC, Flickr and the New York Times. The reason it is growing so fast is that it is open: anyone can publish a dataset and connect it to other datasets. (25) Picture 9 shows the Linked Open Data Project in July 2009, where each cloud is a dataset and the lines show the connections between them; the darker the line, the more connections there are between the datasets.

Picture 9: The Linked Open Data Project (13)

URIS

When we surf the Web we use Uniform Resource Identifiers (URIs): identifiers, because we use them to identify items on the Web; uniform, because they follow one uniform system; and resource, because each item is a resource. The URI is the foundation of the Web; it holds it together. Almost everyone knows one type of URI, although they may not know it is a URI: the URL (Uniform Resource Locator), which is an address that lets you visit a webpage. (26) A URL is a character string that identifies a Web resource by representing its network location. However, it is also important to be able to record information about many things that, unlike Web pages, do not have a network location or URL.

That is where the URI comes in, because it is a more general form of identifier. All URIs share the property that different persons or organizations can independently create them and use them to identify things. A URI can be created to refer to anything that needs to be referred to in a statement, for example: network-accessible things (e.g. a document, an image, a service or another resource); things that are not network-accessible, such as human beings or books in a library; and abstract concepts that do not physically exist, such as the concept of an “author” (27).

URI REFERENCE - URIREF

A URI reference, or URIref, is a string that represents a URI and thereby the resource identified by that URI. It is a URI together with an optional fragment identifier at the end, separated from it by #. For example, the URI reference http://www.example.org/index.html#section2 consists of the URI http://www.example.org/index.html and the fragment identifier section2. URIrefs can contain Unicode characters, allowing many languages to be reflected in URIrefs (27).
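The split between the URI and its fragment identifier can be tried out with Python’s standard library. The snippet below is a small sketch that uses the example URIref from the text and the `urldefrag` function from `urllib.parse`.

```python
from urllib.parse import urldefrag

uriref = "http://www.example.org/index.html#section2"

# urldefrag separates a URI reference into the URI and the fragment identifier.
uri, fragment = urldefrag(uriref)

print(uri)       # http://www.example.org/index.html
print(fragment)  # section2
```

A URIref without a `#` part simply yields an empty fragment, which matches the fragment identifier being optional.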

XML

XML was designed to be a simple way to send documents across the Web. It allows anyone to design their own document format and then write documents in that format. These document formats can include markup to enhance the meaning of the document’s content. This markup is “machine-readable”, that is, programs can read and process it, which is the whole idea of the Semantic Web: to make the Web readable by machines. Instead of being usable by only one application, such documents can be used by many applications, each interpreting the markup in the way that suits it best. For example, if words are marked as “emphasized”, a normal web browser might display them in bold and a voice browser might read them at a higher volume. (26) XML is used in combination with other semantic technologies like RDF to connect data.
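The “emphasized” example can be sketched in Python with the standard `xml.etree.ElementTree` module. The document format (a `note` element containing `emphasized` elements) and the two renderings are assumptions made up for this illustration; they only show how two applications can interpret the same markup differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical document format with one kind of markup: <emphasized>.
DOCUMENT = "<note>Please <emphasized>read</emphasized> this.</note>"

def render(xml_text, emphasize):
    """Flatten the note to text, passing emphasized words through `emphasize`."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        text = child.text or ""
        parts.append(emphasize(text) if child.tag == "emphasized" else text)
        parts.append(child.tail or "")
    return "".join(parts)

# A visual browser might show emphasized words in bold ...
visual = render(DOCUMENT, lambda word: f"<b>{word}</b>")
# ... while a voice browser might raise the volume (made-up notation).
voice = render(DOCUMENT, lambda word: f"[louder]{word}[/louder]")

print(visual)  # Please <b>read</b> this.
print(voice)   # Please [louder]read[/louder] this.
```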

RESOURCE DESCRIPTION FRAMEWORK RDF

RDF, the Resource Description Framework, does exactly what its name indicates: it is a language that provides a simple way to describe resources on the Web. RDF is at the core of the Semantic Web because it makes statements about things. For example, suppose you state in English that John Doe created a particular webpage: “http://www.example.org/index.html has a creator whose value is John Doe”. This statement has three parts. First there is the thing the statement is about, called the subject; in this case it is the webpage. Next there is the part that identifies a property or characteristic of the subject, called the predicate; in this case it is the creator. Last there is the value of that property, called the object; in this case it is the name John Doe. (27)

Picture 10: A Simple RDF Statement (18)
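The three parts of the statement can be written down as a simple data structure. The following Python sketch is not an RDF library; the `Triple` type is introduced here only for illustration, and the predicate URI (the Dublin Core “creator” term) is an assumption, since the text only speaks of “a creator”.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str    # the thing the statement is about
    predicate: str  # the property or characteristic of the subject
    object: str     # the value of that property

# Assumed predicate URI: the Dublin Core "creator" term.
statement = Triple(
    subject="http://www.example.org/index.html",
    predicate="http://purl.org/dc/elements/1.1/creator",
    object="John Doe",
)

print(statement.subject, "has a creator whose value is", statement.object)
```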

While English is good for communication between humans, it is not good for communication between machines. RDF is about making statements that a machine can process. To do so we need two things. First, we need a system of machine-processable identifiers for identifying the subject, predicate or object in a statement without any possibility of confusion with a similar-looking identifier that might be used by someone else on the Web. Secondly, we need a machine-processable language for representing these statements and exchanging them between machines. Both are in place on the Web today. Because the Uniform Resource Identifier (URI) is so generic, RDF uses it to identify the subjects, predicates and objects in statements. RDF defines a resource as anything that is identifiable by a URI reference, so using URIrefs allows RDF to describe practically anything and to state relationships between those things as well. The object of a statement, but not the subject or predicate, can also be a string instead of a URIref; such values are called literals. To represent RDF statements in a machine-processable way, RDF uses XML: it defines a specific XML markup language, referred to as RDF/XML, for representing RDF information and for exchanging it between machines. (27)
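To give an impression of what RDF/XML looks like, the following Python sketch builds the “John Doe created index.html” statement as XML using only the standard library. It is a minimal sketch under stated assumptions: the Dublin Core namespace is assumed as the vocabulary for “creator”, the helper `to_rdf_xml` is hypothetical, and a real application would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"  # assumed vocabulary for "creator"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)

def to_rdf_xml(subject_uri, creator_name):
    """Serialize 'subject has a creator whose value is creator_name' as RDF/XML."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    # The subject is identified by a URIref in the rdf:about attribute.
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject_uri}
    )
    # The predicate becomes an element; the literal object is its text.
    creator = ET.SubElement(description, f"{{{DC_NS}}}creator")
    creator.text = creator_name
    return ET.tostring(root, encoding="unicode")

rdf_xml = to_rdf_xml("http://www.example.org/index.html", "John Doe")
print(rdf_xml)
```

The subject and predicate are URIrefs, while the object is a literal, which is why it appears as plain element text.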

VOCABULARIES AND ONTOLOGIES

Computers don’t understand our language; they don’t see the connections between words and things as we do. For example, if you have an uncle, you know that he is the brother of one of your parents. But a machine does not know that, to have an uncle, you must have a parent who has a brother, and making such things known is exactly what ontologies do. They make the machine understand and know the things that are so natural for us humans. For example, if we had the ontology in Picture 12 and someone asked a Star Wars semantic website, “Who is the father of Leia Organa and where is she from?”, the website would know from its ontology and linked data that Leia is the daughter of Anakin Skywalker and that she is from Alderaan.
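The uncle example can be turned into a tiny rule that a program applies to known facts. The Python sketch below is only an illustration: the people, the property names `parentOf`, `brotherOf` and `uncleOf`, and the function `infer_uncles` are all made up, and real ontologies state such rules declaratively rather than in program code.

```python
# Hypothetical family facts, stored as (subject, predicate, object) triples.
facts = {
    ("Anna", "parentOf", "Bjorn"),
    ("Carl", "brotherOf", "Anna"),
}

def infer_uncles(triples):
    """Rule: if X is a brother of Y and Y is a parent of Z, X is an uncle of Z."""
    brothers = [(s, o) for s, p, o in triples if p == "brotherOf"]
    parents = [(s, o) for s, p, o in triples if p == "parentOf"]
    return {
        (brother, "uncleOf", child)
        for brother, sibling in brothers
        for parent, child in parents
        if sibling == parent
    }

print(infer_uncles(facts))  # {('Carl', 'uncleOf', 'Bjorn')}
```

Nobody stated that Carl is an uncle; the program derives it from the two facts and the rule, which is the kind of extra knowledge an ontology gives a machine.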

Vocabularies and ontologies are used to express extra constraints and logical relationships among resources. They help data integration, for example, when different words are used to describe the same thing in different data sets, or when a bit of extra knowledge may lead to the discovery of new relationships. (28)

Picture 11: A more Complicated RDF Statement (18)

Picture 12: Small Ontology about Star Wars (37)

The best way to dig deeper into ontologies is to take a look at one of the standards the W3C has been working on, the Web Ontology Language, or OWL. OWL is a language for expressing ontologies. It is a Semantic Web language designed to represent rich and complex knowledge about things, groups of things, and relations between things. OWL is a computational logic-based language, so knowledge expressed in OWL can be reasoned with by computer programs. OWL documents, known as ontologies, can be published on the Web and may refer to or be referred to from other OWL ontologies. (29)

An ontology is a set of precise descriptive statements about some part of the world, usually called the domain of interest. Precise descriptions prevent misunderstandings in human communication, and they ensure that software behaves in a uniform, predictable way and works well with other software. To describe a domain of interest precisely, you come up with a set of central terms, often called a vocabulary, and describe what they mean, both with a natural-language definition and by stating how each term is connected to other terms. This is called a terminology, and combined with the vocabulary it is an essential part of a typical OWL document. (29)

To understand how knowledge is represented in OWL we first must look at some fundamental notions. Axioms are the basic statements that an OWL ontology expresses, for example “Star Wars is a Movie” or “Kenny is Spenny’s Uncle”. Entities are the elements used to refer to real-world objects, such as “Star Wars”, “Movie” and “Uncle” in those statements. Expressions are combinations of entities that form complex descriptions from basic ones. For example, the classes “female” and “professor” could be combined to describe the class of female professors. (29)

Each OWL ontology is a collection of basic statements like “Coca Cola is a Drink” or “it is Cloudy”, and these statements are the axioms. Each such statement can be either true or false, which distinguishes axioms from entities and expressions. An important feature of OWL is that a statement can follow from other statements: a set of statements A entails a statement b if, in any state of affairs in which all statements from A are true, b is also true. (29)
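The idea of entailment can be sketched in a few lines of Python. This is only a toy illustration and not a real OWL reasoner: it assumes just two kinds of axioms ("is an instance of" and "is a subclass of"), and the class names SoftDrink and Beverage are made up for the example.

```python
# Toy illustration of entailment; NOT a real OWL reasoner.
# Axioms are plain tuples, and all names are hypothetical examples.

def entailed_types(axioms):
    """Return every ("type", individual, cls) fact that follows from the axioms.

    Supported axiom shapes (an assumption of this sketch):
      ("type", individual, cls)  - individual is an instance of cls
      ("subclass", sub, sup)     - every instance of sub is an instance of sup
    """
    facts = {a for a in axioms if a[0] == "type"}
    subclass = [(a[1], a[2]) for a in axioms if a[0] == "subclass"]
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for _, individual, cls in list(facts):
            for sub, sup in subclass:
                new_fact = ("type", individual, sup)
                if cls == sub and new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


def entails(axioms, statement):
    """True if the statement follows from the set of axioms."""
    return statement in entailed_types(axioms)


axioms = {
    ("type", "CocaCola", "SoftDrink"),
    ("subclass", "SoftDrink", "Drink"),
    ("subclass", "Drink", "Beverage"),
}

# "Coca Cola is a Drink" is never stated directly, but it follows from the axioms.
print(entails(axioms, ("type", "CocaCola", "Drink")))
```

Whenever all three axioms are true, "Coca Cola is a Drink" must be true as well, which is exactly what it means for the axioms to entail it.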

FRIEND OF A FRIEND – FOAF

FOAF is a machine-readable ontology describing persons, their activities and their relations to other people and objects, and it is part of the Linked Open Data Project. It is considered to be the first Social Semantic Web application. Anyone can use FOAF to describe themselves by creating their own FOAF profile. FOAF allows groups of people to describe social networks without the need for a centralized database. Computers may use these FOAF profiles to, for example, list all the people both you and a friend of yours know.
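As a rough sketch of that last use case, the Python below finds the people two persons both know. Real FOAF profiles are RDF documents; here plain tuples stand in for them, and all the names are invented for illustration.

```python
# Hypothetical FOAF-style data: each triple says that one person knows another.
# Plain tuples stand in for real RDF documents; the names are invented.
KNOWS = "foaf:knows"

triples = [
    ("alice", KNOWS, "bob"),
    ("alice", KNOWS, "carol"),
    ("alice", KNOWS, "dave"),
    ("bob", KNOWS, "carol"),
    ("bob", KNOWS, "dave"),
    ("bob", KNOWS, "erin"),
]


def known_by(person, triples):
    """Everyone that `person` declares to know."""
    return {obj for subj, pred, obj in triples if subj == person and pred == KNOWS}


def mutual_acquaintances(a, b, triples):
    """People that both `a` and `b` know: the intersection of the two sets."""
    return known_by(a, triples) & known_by(b, triples)


print(sorted(mutual_acquaintances("alice", "bob", triples)))  # ['carol', 'dave']
```

Because every profile lives wherever its owner publishes it, the triples could just as well be collected from many different documents before the intersection is computed.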


QUERIES – SPARQL

We have this huge Web of data, which is growing very fast, and we need some way of getting information out of it. To do that we use queries, just as relational databases use SQL and XML uses XQuery. In the Semantic Web we use the RDF-specific query language SPARQL, which makes it possible to send queries and receive results through HTTP or SOAP. SPARQL queries are based on triple patterns that are matched against RDF triples. These triple patterns are similar to RDF triples, except that one or more of the constituent resource references are variables. A SPARQL engine returns the resources for all triples that match these patterns. (15)

SPARQL allows users to write queries that consist of triple patterns combined with conjunctions (and) and disjunctions (or). The query specifies a pattern in the data that should be matched in the result set. Given a particular triple pattern in a query, a SPARQL processor looks for the sets of triples in the target RDF model that match the pattern. Let’s take an example:

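PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX books: <http://www.dummies.com/books#>

SELECT ?title
WHERE {
  # every resource typed as a book
  ?book rdf:type books:Books .
  # ... whose author is the resource identified by this FOAF URI
  ?book books:author <http://me.jtpollock.us/foaf.rdf#me> .
  # ... together with its title
  ?book dc:title ?title .
}
ORDER BY ?title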

This query looks for books written by Jeff Pollock and returns their titles as an ordered list. In the WHERE clause we specify triple patterns. The first pattern matches all RDF instances that are of rdf:type books:Books. The second pattern matches all RDF instances that have a books:author relationship to Jeff Pollock. These two patterns form a conjunction (and). The “?” in front of the word book indicates a variable, the thing we are looking for. The third pattern fetches the title of each matching book, and finally we order the results by title. This gives us the titles of all the books by this specific author. (20, pp. 230-232)
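To make the matching step more concrete, here is a minimal Python sketch of how a conjunction of triple patterns could be evaluated against a small set of triples. It is not how a real SPARQL engine is implemented. The identifiers book1 and book2, the second author URI and the second title are invented sample data; the first title is taken from the bibliography entry for the book.

```python
# Minimal sketch of triple-pattern matching; not a real SPARQL engine.

def is_var(term):
    """Variables are written like SPARQL variables, with a leading '?'."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; None if impossible."""
    result = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against an earlier binding.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def query(patterns, triples):
    """Return all variable bindings that satisfy every pattern (a conjunction)."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


AUTHOR = "http://me.jtpollock.us/foaf.rdf#me"

# Invented sample data, apart from the first title.
triples = [
    ("book1", "rdf:type", "books:Books"),
    ("book1", "books:author", AUTHOR),
    ("book1", "dc:title", "Semantic Web For Dummies"),
    ("book2", "rdf:type", "books:Books"),
    ("book2", "books:author", "http://example.org/someone-else"),
    ("book2", "dc:title", "Another Title"),
]

patterns = [
    ("?book", "rdf:type", "books:Books"),
    ("?book", "books:author", AUTHOR),
    ("?book", "dc:title", "?title"),
]

titles = sorted(b["?title"] for b in query(patterns, triples))
print(titles)  # ['Semantic Web For Dummies']
```

Each pattern narrows the set of possible bindings for ?book, which is why the three patterns together behave like a conjunction.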


EXAMPLES OF SEMANTIC WEBS

LINKED DATA SEARCH ENGINE

Semantic search engines can be divided into two groups: human-oriented search engines and application-oriented search engines. The human-oriented search engines are similar to Google, with keyword-based search, but rather than simply providing links to pages, a semantic search engine provides a more detailed interface where the user can exploit the underlying structure of the data. It offers the option to search for objects, concepts and documents, each of which leads to different results. The application-oriented search engines were developed to serve the needs of applications built on top of linked data. They provide APIs through which linked data applications can discover RDF documents on the web. (25)

MEDICAL – HEALTHBASE

HealthBase is like your own personal doctor. It is a new semantic web page that lets you search for conditions, treatments and drugs, and it performs a semantic search across other health-related sites on the web. That means it doesn’t just look at the titles of the articles and give you the results; it reads the actual text to deliver useful information without all the distracting advertisements and junk that come with it. The page is very simple and useful. The HealthBase database drills into over 10 million health documents from sources such as WebMD and Yahoo Health, so you get the same information as you would in those places, but categorized and without the ads, animations and other extra garbage. Because HealthBase doesn’t build its own resources but reuses those of other sites, it is only as good as the sites it searches. (30)

Picture 13: HealthBase (30)

DBPEDIA

DBPedia is a project developed by OpenLink Software and the universities of Leipzig and Berlin. Its objective is to extract structured information from Wikipedia and make that structured data available on the Web. It is one of the most important datasets in the Linked Open Data Project, as it describes more than 2.9 million things and has more than 3.7 million links to other datasets such as Freebase, GeoNames and the CIA World Factbook. DBPedia includes at least 292,000 persons, 339,000 places, 8,000 music albums, 44,000 films, 15,000 video games, 119,000 organizations, 130,000 species and 4,400 diseases (31), and it grows as Wikipedia grows. Wikipedia itself only offers keyword search, but with DBPedia you can ask queries like “Give me all Italian musicians from the 18th century”. You can also explore Wikipedia content through a faceted browser where you choose your search criteria (32). For example, you can search for all films directed by Martin Scorsese that starred Leonardo DiCaprio and Alec Baldwin, which gives you The Departed and The Aviator. These are just two examples of what DBPedia makes possible; there is a lot more.

BBC MUSIC

The BBC has been developing a new project that semantically links web pages about the artists and singers whose songs are played on BBC radio stations. Within these pages, collections of data are enhanced and interconnected with semantic metadata, letting music fans explore connections between artists that they may not have known existed. The semantic technology adds additional context to the data about the artists, which can include anything from previous bands and venues played to instruments played and more. Most of the information comes from MusicBrainz (33) and DBPedia (31). MusicBrainz is an open content metadatabase that lists information for over 400,000 artists. The BBC takes the information about the artists from MusicBrainz and then adds additional information from DBPedia, such as the biography. Reusing content that is already on the open web saves money, time and energy. (34)
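The reuse described above boils down to combining what two sources say about the same thing. The Python below is a minimal sketch of that idea, assuming both sources refer to an artist by the same identifier. The identifiers, field names and values are invented and do not reflect the real MusicBrainz or DBPedia data formats.

```python
# Hypothetical records keyed by a shared artist identifier.
# Field names and values are invented; they are not the real data formats.
musicbrainz_like = {
    "artist:001": {"name": "Example Artist", "previous_bands": ["Example Band"]},
    "artist:002": {"name": "Another Artist", "previous_bands": []},
}

dbpedia_like = {
    "artist:001": {"biography": "A short invented biography."},
}


def merge_artist(artist_id, primary, secondary):
    """Combine what two sources say about one identifier.

    Fields from `primary` win if both sources define the same field.
    """
    merged = dict(secondary.get(artist_id, {}))
    merged.update(primary.get(artist_id, {}))
    return merged


page = merge_artist("artist:001", musicbrainz_like, dbpedia_like)
print(page["name"], "-", page["biography"])
```

The merge only works because both sources agree on the identifier, which is the role that shared URIs play in linked data.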

ETOURISM

Today the web is a big showcase for cities that want to build and expand their tourism industry. With all the information available online, people are starting to plan their trips on the web in advance, so cities are competing against each other to offer the best tourism information and services on their web sites. While most cities just offer a few options like “a Weekend in Vestmannaeyjar” or “Sport Week in Manchester”, the City of Zaragoza in Spain has developed a web application called CRUZAR. It uses expert knowledge (in the form of rules and ontologies) and a comprehensive repository of relevant data, drawn from databases about events and places of interest in the city, to build a unique route for each visitor based on his or her hobbies and interests. Let’s say you want to see a football game, have some spicy food, look at some museums and go to a nice, hot beach: CRUZAR will make the best route for you. (35)

Picture 14: BBC Music Covering David Bowie (40)

Picture 15: Ontology for CRUZAR (35)

EVERYBLOCK – MASHUP SITE

A mashup is a web page or application that uses or combines data or functionality from two or more external sources to create a new service. With semantic technology this is becoming much more common. A good example is EveryBlock (36), which shows civic information like building permits, crimes and restaurant inspections, as well as news articles from major newspapers, TV and radio stations, and blogs. It even includes fun stuff like pictures from Flickr, user reviews of businesses and lost-and-found listings. It shows all this information on a map, where you can choose a town, district, ZIP code or even street address to see what is going on in your neighborhood. It is pretty awesome to be able to see how much crime there is in your area, which restaurants didn’t pass health inspections and even whether there is some Hollywood filming in your neighborhood. They use semantic search engines to crawl the web and government data for information. (36)

CONCLUSION

It is becoming clear that the Semantic Web is the next step in the evolution of the Web. Finally, Berners-Lee’s original idea of the Web is coming to life. It has taken over ten years to design and build good standards like RDF and OWL for the Semantic Web, but the work is not over yet. There are still no standards for Logic, Truth and Proof. I did not find much information about those aspects of the Semantic Web in my research, so I think it will still be some years before they are covered. In the meantime we just work on linking more data.

The Linked Open Data Project is doing a good job of connecting datasets, which matters because it is fundamental for the Semantic Web “to have lots and lots and lots of Data”, as Berners-Lee said. The amount of connected data is growing each year, so people are catching on and putting their data out there. This is a sign that the Web is already changing into a more open, more accessible and better Web.

I found some very interesting things that people are doing with linked data, as I showed in the examples: search engines, a self-diagnosing medical site, a music site connecting information from different places, a tourism site making a special route for each person, and a site that can show how much crime there is in your neighborhood! Those are some pretty cool sites, and I think this is just the tip of the iceberg. There are probably lots of other ideas in development out there in basements or universities. It will be very interesting to follow this technology in the next few years, because it is going to be huge. Just imagine having all this linked data out there in one big database. The possibilities would be endless.


BIBLIOGRAPHY

1. The History Of World Wide Web. Wikipedia. [Online] [Cited: March 2, 2010.]

http://en.wikipedia.org/wiki/History_of_the_World_Wide_Web.

2. W3. W3 All standards. W3. [Online] 2010. [Cited: March 18, 2010.] http://www.w3.org/TR/.

3. Gopher Protocol . Wikipedia. [Online] June 2009. [Cited: March 23, 2010.]

http://en.wikipedia.org/wiki/Gopher_(protocol).

4. Dot.Com Bubble. Wikipedia. [Online] [Cited: March 23, 2010.] http://en.wikipedia.org/wiki/Dot-com_boom.

5. MySpace. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/Myspace.

6. Facebook. Wikipedia. [Online] [Cited: March 2, 2010.] http://en.wikipedia.org/wiki/Facebook.

7. Wikipedia Size Comparison. Wikipedia. [Online] [Cited: March 2, 2010.]

http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons.

8. White, Phillip. How many videos are on youtube. associatedcontent. [Online] July 9, 2009. [Cited: March 2, 2010.]

http://www.associatedcontent.com/article/1927414/how_many_videos_are_on_youtube.html.

9. Paul, Ryan. Study: amount of digital info > global storage capacity. Ars Technica. [Online] March 12, 2008. [Cited: March 25, 2010.]

http://arstechnica.com/old/content/2008/03/study-amount-of-digital-info-global-storage-capacity.ars.

10. Data, data everywhere. Economist. [Online] February 25, 2010. [Cited: March 25, 2010.]

http://www.economist.com/specialreports/displaystory.cfm?story_id=15557443.

11. Berners-Lee, Tim. Semantic Web Road Map. w3. [Online] October 14, 1998. [Cited: March 25, 2010.]

http://www.w3.org/DesignIssues/Semantic.html.

12. Questioning Semantic Web History. Zimbio. [Online] Jan 22, 2009. [Cited: March 25, 2010.]

http://www.zimbio.com/Semantic+Web+-+Web+3.0/articles/3/Questioning+Semantic+Web+history.

13. Linking Open Data. W3. [Online]

http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData.

14. Ontologies. W3. [Online] 2010. [Cited: March 18, 2010.]

http://www.w3.org/standards/semanticweb/ontology.

15. Query. W3. [Online] 2010. [Cited: March 18, 2010.]

http://www.w3.org/standards/semanticweb/query.html.


16. Internet World Statistics. Internet World Statistics. [Online] September 30, 2009. [Cited: March

18, 2010.] http://www.internetworldstats.com/stats.htm.

17. Asay, Matt. Is Internet access a fundamental right. news.cnet.com. [Online] May 6, 2009. [Cited: March 18, 2010.] http://news.cnet.com/8301-13505_3-10234555-16.html.

18. Twine. Twine. [Online] [Cited: March 24, 2010.] http://www.twine.com.

19. TripIt. TripIt. [Online] [Cited: March 25, 2010.] http://www.tripit.com/.

20. Pollock, Jeffrey T. Semantic Web For Dummies. Indianapolis, Indiana : Wiley Publishing, Inc, 2009.

21. Callari, Ron. Web 4.0 Trip Down The Rabbit Hole Or Brave New World. zmogo. [Online] June 3, 2009. [Cited: March 25, 2010.]

http://www.zmogo.com/web/web-40trip-down-the-rabbit-hole-or-brave-new-world/.

22. Linked Data. w3.org. [Online] June 18, 2009. [Cited: March 17, 2010.]

http://www.w3.org/DesignIssues/LinkedData.html.

23. Berners-Lee, Tim. On The Next Web. Ted.com. [Online] March 2009. [Cited: March 17, 2010.]

http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html.

24. Berners-Lee, Tim. The Year Open Data Went Worldwide. Ted.com. [Online] March 2010.

http://www.ted.com/talks/tim_berners_lee_the_year_open_data_went_worldwide.html.

25. Christian Bizer, Tom Heath, Tim Berners-Lee. Linked Data - The Story So Far. s.l. : Special Issue on

Linked Data, International Journal on Semantic Web and Information Systems, 2009.

26. Swartz, Aaron. The Semantic Web In Breadth. Logicerror. [Online] [Cited: March 2, 2010.]

http://logicerror.com/semanticWeb-long.

27. RDF Primer. W3. [Online] February 10, 2004. [Cited: March 2, 2010.] http://www.w3.org/TR/rdf-primer/.

28. W3C Semantic Web Frequently Asked Questions. W3C. [Online] November 12, 2009. [Cited:

March 22, 2010.] http://www.w3.org/2001/sw/SW-FAQ#whont.

29. OWL 2 Primer. W3. [Online] October 27, 2009. [Cited: March 22, 2010.]

http://www.w3.org/TR/owl2-primer/.

30. Dannen, Chris. Self Diagnosing Web. FastCompany. [Online] FastCompany, September 2, 2009. [Cited: March 14, 2010.]

http://www.fastcompany.com/blog/chris-dannen/techwatch/self-diagnosing-web.

31. DBPedia. Wikipedia. [Online] [Cited: March 24, 2010.] http://en.wikipedia.org/wiki/DBpedia.

32. DBpedia Faceted Browser. DBPedia. [Online] Open Link Software, November 16, 2009. [Cited:

March 24, 2010.] http://dbpedia.neofonie.de/browse/.

33. musicbrainz. musicbrainz. [Online] [Cited: March 24, 2010.] http://musicbrainz.org/.


34. Perez, Sarah. BBCs Semantic Music Project. Read Write Web. [Online] January 21, 2009. [Cited:

March 24, 2010.] http://www.readwriteweb.com/archives/bbcs_semantic_music_project.php.

35. Case Study: CRUZAR — An application of semantic matchmaking for eTourism in the city of

Zaragoza. w3 case studies. [Online] August 2008. [Cited: March 24, 2010.]

http://www.w3.org/2001/sw/sweo/public/UseCases/Zaragoza-2/.

36. EveryBlock. EveryBlock. [Online] [Cited: March 24, 2010.] http://www.everyblock.com/.

37. Wilson, Tracy V. How Semantic Web Works. How Stuff Works. [Online] [Cited: March 22, 2010.]

http://computer.howstuffworks.com/semantic-web4.htm.

38. Data Set Sizes. w3.com. [Online] March 9, 2010. [Cited: March 24, 2010.]

http://esw.w3.org/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics.

39. http://www.pharmasurveyor.com/. http://www.pharmasurveyor.com/. [Online]

http://www.pharmasurveyor.com/.

40. BBC Music. BBC. [Online] [Cited: March 24, 2010.] www.bbc.co.uk/music.

41. Technology Life Cycle. Wikipedia. [Online] [Cited: March 24, 2010.]

http://en.wikipedia.org/wiki/Technology_lifecycle.

42. Semantic Technology Primer. Semantic-Conference. [Online] 2006, 2007. [Cited: March 18, 2010.]

http://www.semantic-conference.com/primer.html.

43. Les Horribles Cernettes. Wikipedia. [Online] [Cited: March 2, 2010.]

http://en.wikipedia.org/wiki/Les_Horribles_Cernettes.

44. Dot Com Bubble. Wikipedia. [Online] [Cited: March 2, 2010.]

http://en.wikipedia.org/wiki/Dot_Com_Bubble.

45. MsJosay. The Difference between Web 2.0 and Web 1.0. hubpages. [Online] [Cited: March 2,

2010.] http://hubpages.com/hub/The-Difference-between-Web-20-and-Web-10.
