seo for the semantic web

How do the machines know what Tasty Wheat tasted like? Mouse – The Matrix

Upload: mihai-gheza

Post on 12-Jan-2015


Category:

Technology



DESCRIPTION

A brief history of SEO from WWW to RDF, Microformats and SPARQL. First presented at GeekMeet #2 in Cluj Napoca on Mar 1st 2008

TRANSCRIPT

Page 1: SEO for the Semantic Web

How do the machines know what Tasty Wheat tasted like?

Mouse – The Matrix

Page 2: SEO for the Semantic Web

Short SEO History

• Web1.0

• Web2.0

• Web3.0

Page 3: SEO for the Semantic Web

Genesis

• A story of the Internet, by

• Solving the most important problems

• Greatly influenced by one man…

Page 4: SEO for the Semantic Web

Tim Berners‐Lee

“the World Wide Web is Berners-Lee's alone. He designed it. He loosed it on the world. And he more than anyone else has fought to keep it open, nonproprietary and free.”

Time Magazine, 1999

Page 5: SEO for the Semantic Web

The Problem

• Where can I find the information?

“Our ineptitude in getting at the record is largely caused by the artificiality of the systems of indexing.”

The Atlantic Monthly, 1945

Page 6: SEO for the Semantic Web

Archie, 1990

• Indexed file names and

• Returned results based on pattern matching
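
To make the idea concrete, here is a minimal sketch of searching an index of file names by pattern. It is not Archie's actual code; the `search_index` helper and the sample file names are made up for illustration, and shell-style wildcards are just one possible kind of pattern.

```python
from fnmatch import fnmatch


def search_index(index, pattern):
    """Return the indexed file names matching a shell-style pattern.

    Both sides are lowercased so the match is case-insensitive
    regardless of the operating system.
    """
    return sorted(name for name in index if fnmatch(name.lower(), pattern.lower()))


# Hypothetical index: only file names are known, never file contents.
file_index = ["README.txt", "kernel.tar.gz", "notes.TXT"]

print(search_index(file_index, "*.txt"))
```

Only the names are searched, so a file whose name says nothing about its content is effectively invisible.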

Page 7: SEO for the Semantic Web

Web1.0

Page 8: SEO for the Semantic Web

Web1.0

• Means HTML

• Is born in 1991, with the help of

• Tim Berners‐Lee (TBL), who also founded

• WWW Consortium (W3C) at MIT, and also

• Created WWW Virtual Library – the 1st catalog

Page 9: SEO for the Semantic Web

Yahoo Directory, 1994

• Vertical = categories... is like

• “Show me all the stuff and I’ll handle it”

• Manually indexed stuff, which was

• OK for starters, but…

• Websites quickly grew in number and

• Y! started charging money for one listing

• Increasingly more money...

Page 10: SEO for the Semantic Web
Page 11: SEO for the Semantic Web

WebCrawler, 1994

• First SE to fully search text

• Bought by AOL, then

• Sold to Excite, which

• Excite went bankrupt and

• WebCrawler ends up bought by InfoSpace

Page 12: SEO for the Semantic Web

Other “Search Engines”

• 1994, reaches 60mil pages in ‘96

• 1995, bought by Overture, bought by Y!

• 1996, meta search, bought by Lycos

• 1997, bought by IAC/InterActiveCorp

• 1999, bought by Overture, meaning Y!

Page 13: SEO for the Semantic Web

Shopping fun, right?

Page 14: SEO for the Semantic Web

DMOZ, 1998

• Open Directory Project

• Each listing is checked and certified by a volunteer

• The main source for Google Directory

Page 15: SEO for the Semantic Web

Current State of Search Industry

Page 16: SEO for the Semantic Web

Web1.0 Problems

• SE couldn’t understand text, so

• They said “why don’t you implement some meta tags (description & keywords) so we can get a glimpse of what you’re saying”

• The relevancy of a page with respect to a keyword was determined by a few factors, so

• It was very easy to abuse and spam, therefore

• Search Results had poor quality
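
A rough sketch of why meta tags were so easy to abuse, assuming a toy engine that scores a page purely by how often a keyword appears in its keywords tag. The `naive_relevancy` score and both sample pages are hypothetical; no real search engine is claimed to have worked exactly like this.

```python
from html.parser import HTMLParser


class MetaTagReader(HTMLParser):
    """Collects the content of <meta name="description|keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("description", "keywords"):
            self.meta[name] = attributes.get("content") or ""


def read_meta(html):
    """Return a dict with the description/keywords meta tags found in html."""
    reader = MetaTagReader()
    reader.feed(html)
    return reader.meta


def naive_relevancy(html, keyword):
    """Hypothetical score: occurrences of the keyword in the keywords tag."""
    keywords = read_meta(html).get("keywords", "")
    terms = [term.strip().lower() for term in keywords.split(",")]
    return terms.count(keyword.lower())


honest_page = '<html><head><meta name="keywords" content="pizza, restaurant"></head></html>'
spam_page = '<html><head><meta name="keywords" content="pizza, pizza, pizza, pizza"></head></html>'

print(naive_relevancy(honest_page, "pizza"))
print(naive_relevancy(spam_page, "pizza"))
```

The spam page wins simply by repeating itself, because the page author controls every signal the score looks at.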

Page 17: SEO for the Semantic Web

Web2.0

Page 18: SEO for the Semantic Web

Web2.0

• Is coined by... Tim O’Reilly, yet

• TBL later said that “web2.0” is a stupid, meaningless term and that he thought of it first in ’96 anyway

Page 19: SEO for the Semantic Web

Web2.0 means

• Google, which grew apart because of

• PageRank (1998) invented by

• Larry & Sergey, who adapted the algo from

• An MIT professor who had developed

• A nasty mathematical formula for positioning keywords in a 3d space model based on the relevancy that one kw holds … whatever

Page 20: SEO for the Semantic Web

PageRank actually means

• That a link is a vote and

• Not all links are created equal, so

• It matters who links to you

• Just like in our real life society
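
The "link is a vote" idea can be sketched with a simplified PageRank computed by repeated redistribution of rank. This is a textbook-style approximation, not Google's production algorithm; the damping factor of 0.85, the iteration count and the four-page `web` graph are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page keeps a small baseline rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over everyone.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical mini web: "hub" is linked from everywhere, "c" from nowhere.
web = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
}
ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

Pages "a" and "b" both get a vote from "hub", but "a" ranks slightly higher because "c" also links to it: it matters who links to you, and how many others they link to.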

Page 21: SEO for the Semantic Web

• Read the content of pages really well, just that

• Pages were crappy:

– Non‐standard coding

– Ugly tech (like applets)

– Senseless IA

• So Google said: “don’t do evil and try to nicely format the info, according to W3C standards” (remember TBL)

Page 22: SEO for the Semantic Web

Enter the SEO

Page 23: SEO for the Semantic Web

SEO

• Is a multitude of practices aimed at facilitating the indexing of pages by search engines

• Evolves as the ranking algorithm changes, and

• Of course, the algorithm is kept secret.

Page 24: SEO for the Semantic Web

SEO actually means

Courtesy of Kelly Ishikawa

Page 25: SEO for the Semantic Web

SEO actually means

• An on‐going battle between bots & SEO guys

• Now 100+ factors influence ranking

• And I’d like to take the time to talk about each one of them in the following…

Page 26: SEO for the Semantic Web

Just kidding

Page 27: SEO for the Semantic Web

My SEO Cheat Sheet

• Consider:

1. Page Titles

2. URLs (mod_rewrite)

3. Anchor Text

4. Website Architecture (IA)

5. Link Title & Alt Images

6. Relevant content (text)

7. Sitemap.xml

8. Hosting

9. Freshness
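
As an illustration of the Sitemap.xml item, here is a minimal sketch that builds a sitemap with Python's standard library. The `build_sitemap` helper, the example.com URLs and the dates are hypothetical; only the `urlset`/`url`/`loc`/`lastmod` elements and the namespace come from the public sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build a sitemap document from (url, last-modified date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical pages; a real site would list its own URLs here.
sitemap_xml = build_sitemap([
    ("http://www.example.com/", "2008-03-01"),
    ("http://www.example.com/seo-cheat-sheet", "2008-02-15"),
])
print(sitemap_xml)
```

The `lastmod` dates also touch on the Freshness item: they tell a crawler which pages changed recently.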

Page 28: SEO for the Semantic Web

Resources

Matt Cutts Blog

Mihai’s SEO Cheat Sheet :D

Page 29: SEO for the Semantic Web

Web2.0 Problems

• © for pictures, articles, books, etc

• PPC fraud

• Privacy

• Search Engine SPAM

• Link bombing

• Paid links

• But more important...

Page 30: SEO for the Semantic Web

Web2.0 Problems

• SE still don’t understand what the $#%@ you’re talking about

• Crawling a website’s interface to extract info is almost insane

Page 31: SEO for the Semantic Web

Web3.0

Page 32: SEO for the Semantic Web

Web3.0

• Means semantic web

• Attention migrates from syntax/formatting to semantics and

• Meta Data (data about the data) becomes...

Page 33: SEO for the Semantic Web

Web3.0

Resource Description Framework & Microformats

Page 34: SEO for the Semantic Web

Resource Description Framework

• A data model, usually written as XML (RDF/XML)

• RDF = Subject + Predicate + Object

• S + P + O creates a Triple which

• Can describe almost anything in the universe

• Triples are connectable (eg: FOAF)

• RDFa = XHTML + RDF (W3C compliant)
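
A small sketch of the triple idea, using plain Python tuples rather than a real RDF library. The `ex:` identifiers, the people and the `match` and `friends_of_friends` helpers are made up for illustration; `foaf:name` and `foaf:knows` are real FOAF properties, written here as plain prefixed strings.

```python
# Each triple is (subject, predicate, object).
triples = [
    ("ex:mihai", "foaf:name", "Mihai"),
    ("ex:mihai", "foaf:knows", "ex:ana"),
    ("ex:ana", "foaf:name", "Ana"),
    ("ex:ana", "foaf:knows", "ex:dan"),
]


def match(store, subject=None, predicate=None, obj=None):
    """Return the triples matching the given parts; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


def friends_of_friends(store, person):
    """Follow foaf:knows twice: triples connect when an object is reused as a subject."""
    direct = [o for _, _, o in match(store, person, "foaf:knows")]
    return [o for friend in direct for _, _, o in match(store, friend, "foaf:knows")]


print(match(triples, predicate="foaf:name"))
print(friends_of_friends(triples, "ex:mihai"))
```

Because "ex:ana" appears as the object of one triple and the subject of another, the two statements link up without either author knowing about the other.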

Page 35: SEO for the Semantic Web

Microformats

• hCalendar

• hCard

• rel‐tag

• VoteLinks

• XFN

• Geo

• hResume

• hReview

• etc
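
To show what a microformat buys a machine reader, here is a simplified hCard extractor. It is a sketch under stated assumptions: it only handles the `fn`, `org` and `tel` properties, takes their visible text as the value, and expects well-formed markup where every opened element is closed. The sample HTML and the `HCardReader` class are hypothetical.

```python
from html.parser import HTMLParser

HCARD_PROPERTIES = {"fn", "org", "tel"}


class HCardReader(HTMLParser):
    """Collects a few hCard properties from elements inside class="vcard"."""

    def __init__(self):
        super().__init__()
        self.cards = []
        self._stack = []  # one marker per open element: "vcard", a property, or None

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        marker = None
        if "vcard" in classes:
            self.cards.append({})
            marker = "vcard"
        else:
            for name in classes:
                if name in HCARD_PROPERTIES:
                    marker = name
        self._stack.append(marker)

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        inside_card = "vcard" in self._stack
        if inside_card and self._stack[-1] in HCARD_PROPERTIES:
            card = self.cards[-1]
            prop = self._stack[-1]
            card[prop] = card.get(prop, "") + data.strip()


page = """
<div class="vcard">
  <span class="fn">Mihai Gheza</span>
  <span class="org">Example Org</span>
</div>
<p class="fn">Not part of any card</p>
"""

reader = HCardReader()
reader.feed(page)
print(reader.cards)
```

The class names are the whole trick: the page looks the same to a human, but a parser no longer has to guess which text is a name and which is an organization.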

Page 36: SEO for the Semantic Web

Case Study

Page 37: SEO for the Semantic Web

SPARQL

• SPARQL Protocol and RDF Query Language

• Standardized on 15th Jan 08 (1 month ago) and

• Endorsed by?... TBL

“Trying to use the Semantic Web without SPARQL is like trying to use a relational database without SQL”

TBL

Page 38: SEO for the Semantic Web

Potential

• With SPARQL you skip the presentation layer

• You can query ad‐hoc any API, so

• You don’t need to crawl in advance, therefore

• Information will be as fresh as it gets

Page 39: SEO for the Semantic Web

And possibilities

• Query: “I can has pizza?”

• Returns:

– A friend of yours (XFN ‐ Facebook)

– has a colleague (FOAF ‐ LinkedIn) who

– said that they make good pizza (hReview ‐ Yelp) at

– a restaurant nearby (geo – Gmaps)

– Tip: U2 in concert today (hCalendar ‐ upcoming)
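
The pizza query can be sketched as a chain of triple patterns joined on shared variables, which is roughly what a SPARQL basic graph pattern does. This is plain Python, not SPARQL and not a real endpoint: the `store` data, the `review:praises`, `serves` and `geo:near` predicate names and the `query` helper are all invented for illustration.

```python
def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Extend bindings so the pattern equals the triple, or return None."""
    result = dict(bindings)
    for term, value in zip(pattern, triple):
        if is_var(term):
            # Bind the variable, or reject if it is already bound differently.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def query(store, patterns):
    """Join the patterns one by one, keeping only consistent bindings."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for bindings in solutions
            for triple in store
            for extended in [match_pattern(pattern, triple, bindings)]
            if extended is not None
        ]
    return solutions


# Hypothetical facts, each imagined as coming from a different site.
store = [
    ("me", "xfn:friend", "ana"),            # XFN link on a social network
    ("ana", "foaf:knows", "dan"),           # FOAF link on a professional network
    ("dan", "review:praises", "luigis"),    # stands in for an hReview
    ("luigis", "serves", "pizza"),
    ("luigis", "geo:near", "me"),           # stands in for geo data
    ("bob", "review:praises", "farplace"),  # a stranger's review is ignored
    ("farplace", "serves", "pizza"),
]

patterns = [
    ("me", "xfn:friend", "?friend"),
    ("?friend", "foaf:knows", "?colleague"),
    ("?colleague", "review:praises", "?place"),
    ("?place", "serves", "pizza"),
    ("?place", "geo:near", "me"),
]

answers = query(store, patterns)
print(answers)
```

No page is crawled or rendered here: the answer falls out of joining structured statements, which is the point of skipping the presentation layer.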

Page 40: SEO for the Semantic Web

Perhaps now we can see

• Why Social Networking Communities are worth so much, even though most of them don’t have a revenue model

– Facebook

– LinkedIn

– Meebo

– Beebo

– Pipu...

• They/We are the databases of the future

Page 41: SEO for the Semantic Web
Page 42: SEO for the Semantic Web

Thanks!

“Most of the right choices in SEO come from asking: What’s the best thing for the user?”

Matt Cutts

Mihai Gheza

Creative Commons Attribution‐Noncommercial‐Share Alike 3.0 Unported License.