SMX West 2013, Barbara Starr: Schema 201
Schema 201: Real World Success
Real World Markup for Success: From a Search Engine Perspective.
Schema 201
By: Barbara Starr
Twitter: @BarbaraStarr
Email: [email protected]
• Pursued a doctorate in Artificial Intelligence in South Africa in the '80s.
• Recruited to build intelligent/predictive trading systems on Wall Street.
• Migrated to government-based contracts, several of which turned into real-world products like:
– SIRI (PAL from DARPA)
– WATSON (Acquaint; IBM Watson Labs was a team member)
• From the vantage of a semantic technologist, I keenly watched the evolution of the Semantic Web.
• “Shocked into the real world” when working as a consultant @ Overstock.
• Today: Educator, Consultant, Developer, Strategist
Meta Information: ME
LinkedIn: http://www.linkedin.com/in/barbarastarr
My favorite author: Isaac Asimov
Favorite book: I, Robot
Favorite character: MULTIVAC
Additional Metainformation
For the purpose of this talk (OWL same-as):
MY ROBOT same-as Artificially Intelligent Entity same-as Search Engine
Metadata same-as Structured Markup same-as Semantic Markup
SEARCH ENGINE POINT OF VIEW
Structured markup for real world success! Just some background first!
Sorry guys, but it’s back to me. He is a headless browser (like Googlebot and Bingbot) and does not talk to humans!
When listening to me, bear in mind that if you make me happy, you will be too!
There are many means by which I can exploit structured markup!
SearchMonkey 2008
Rich Snippets 2009
Tiles
I can directly extract information from structured markup to enhance SERP displays.
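To make "directly extract information" concrete, here is a minimal Python sketch of reading microdata itemprop values out of a page. It is an illustration only, not how any real search engine does it: the sample page, the class name ItempropParser and the function extract_properties are all hypothetical, and the parser deliberately ignores nested items.

```python
from html.parser import HTMLParser

# Hypothetical sample page using schema.org microdata.
SAMPLE = """
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Blue Widget</span>
  <span itemprop="color">Blue</span>
  <meta itemprop="sku" content="W-100">
</div>
"""

class ItempropParser(HTMLParser):
    """Collect flat itemprop name/value pairs (nested items are ignored)."""

    def __init__(self):
        super().__init__()
        self.properties = {}
        self._current = None  # itemprop whose text content is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("itemprop")
        if name is None:
            return
        if "content" in attrs:
            # Value carried in an attribute, e.g. <meta itemprop=... content=...>
            self.properties[name] = attrs["content"]
        else:
            # Value is the element's text; start collecting it.
            self._current = name
            self.properties[name] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current property.
        self._current = None

def extract_properties(html):
    """Return a dict of itemprop values found in the given HTML string."""
    parser = ItempropParser()
    parser.feed(html)
    return {name: value.strip() for name, value in parser.properties.items()}

print(extract_properties(SAMPLE))
```

Values pulled out this way (name, color, sku) are the kind of fields that could feed an enhanced SERP display.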
I can search directly on consumed metadata!
I can provide direct answers to queries by searching on consumed, verified and validated information.
I can even aggregate answers or deduce them (like a timeline of events).
I can also leverage it to expose more relevant answers in the long tail of search.
I can even use it in conjunction with machine learning techniques, e.g. to train other components.
I can detect relevancy signals, i.e. what content to show to what audience.
I can use it to assist in interpreting a user query.
Penn Treebank tagset
I can leverage metadata for better image search.
SIRI
I can combine it with computer vision techniques.
I can enhance the user’s shopping experience.
I’m a Search Engine Robot
I could really use this stuff. And it is like the Tower of Babel out there!
Microdata, Microformats, RDFa
Multiple conflicting vocabularies that I will have to align internally, and multiple syntax formats as well.
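As a rough picture of what "aligning internally" could mean, the Python sketch below maps terms from several vocabularies onto one internal term. The ALIGNMENT table and the align function are hypothetical and tiny; the talk does not describe how any search engine actually does this.

```python
# Hypothetical alignment table: (vocabulary, term) -> internal canonical term.
ALIGNMENT = {
    ("schema.org", "name"): "name",
    ("goodrelations", "gr:name"): "name",
    ("opengraph", "og:title"): "name",
    ("schema.org", "description"): "description",
    ("opengraph", "og:description"): "description",
}

def align(vocabulary, record):
    """Translate one record into canonical terms; unknown terms are dropped."""
    aligned = {}
    for term, value in record.items():
        canonical = ALIGNMENT.get((vocabulary, term))
        if canonical is not None:
            aligned[canonical] = value
    return aligned

# The same product described in two vocabularies ends up in the same shape.
print(align("opengraph", {"og:title": "Blue Widget", "og:type": "product"}))
print(align("goodrelations", {"gr:name": "Blue Widget"}))
```

Every additional vocabulary means more rows in a table like this, which is the maintenance burden the slide is pointing at.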
Prior to Schema.org
GoodRelations for e-commerce
Timeline of RDFa and Semantic Web Adoption
As of SemTech 2011, June 2: Schema.org announced
Inevitable passage of Semantic Web adoption, culminating in schema.org
Prolific growth of the LOD Cloud
Align and consume many vocabularies that may not be of interest to search engines?
Rather mandate vocabulary and syntax: microdata.
A search engine alliance has the power to MANDATE vocabulary and syntax!
Bringing Order from Chaos
On subjects search engines are interested in!
With great:
• Tools
• Mappings
• And more
• From the W3C
Symbolic reasoning vs. probabilistic reasoning!
INTRODUCING THE KNOWLEDGE GRAPH
“Know” rather than “recognize”
Folks finding answers on my page never even have to click through to yours!
And speaking of the knowledge graph or knowledge carousel!
I can even now start to derive associations or relationships between entities.
I find it so helpful that I would really like to be able to keep all that validated, verified information to myself!
Check out this great Data Highlighter. The information is available only to me and not to any other search bots! Can you believe I have been accused of hijacking structured markup?
How do I make this information findable and visible to users?
I could use your assistance as follows!
Ensure the following match:
• on-page markup
• data in any feeds you submit
• information visible to the user/human!
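The three-way match above can be sketched as a simple check in Python. This is only an illustration under stated assumptions: the function name find_mismatches, the dictionary inputs and the plain substring test for "visible to the user" are all hypothetical simplifications.

```python
def find_mismatches(markup, feed, visible_text):
    """Return (property, problem) pairs where the three sources disagree.

    markup and feed are dicts of property -> string value;
    visible_text is the text a human would see on the page.
    """
    problems = []
    for prop in sorted(set(markup) | set(feed)):
        in_markup = markup.get(prop)
        in_feed = feed.get(prop)
        # Markup and feed must agree when both provide the property.
        if in_markup is not None and in_feed is not None and in_markup != in_feed:
            problems.append((prop, "markup and feed disagree"))
        # Whatever is declared must also be visible to the human reader.
        value = in_markup if in_markup is not None else in_feed
        if value not in visible_text:
            problems.append((prop, "value not visible on page"))
    return problems

markup = {"name": "Blue Widget", "color": "Blue"}
feed = {"name": "Blue Widget", "color": "Navy"}
page_text = "Blue Widget, available in Blue"
print(find_mismatches(markup, feed, page_text))
```

An empty result means the markup, the feed and the visible page tell the same story.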
Enrich your content/data.
Rich markup sends rich signals to search engines.
Mark up as much information as you can.
As an example, look at the filters that show up on the left-hand side in Google Shopping.
Clearly, if you do not populate the “color” attribute, it is not possible for your product to show up in that filter.
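The effect of a missing attribute on a filter can be shown with a few lines of Python. The product records and the functions apply_filter and facet_values are hypothetical; this is a sketch of the general idea, not of Google Shopping itself.

```python
# Hypothetical product records; the second one has no "color" populated.
PRODUCTS = [
    {"name": "Blue Widget", "color": "Blue"},
    {"name": "Mystery Widget"},
    {"name": "Red Widget", "color": "Red"},
]

def apply_filter(products, attribute, value):
    """Keep only products whose attribute is populated and equals value."""
    return [p for p in products if p.get(attribute) == value]

def facet_values(products, attribute):
    """List the distinct values a filter could offer for an attribute."""
    return sorted({p[attribute] for p in products if attribute in p})

# "Mystery Widget" cannot be returned by any color filter.
for color in facet_values(PRODUCTS, "color"):
    print(color, [p["name"] for p in apply_filter(PRODUCTS, "color", color)])
```

Whatever color the user picks, the product without a "color" value is never in the result.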
This same type of logic also applies to the various verticals (however, at a higher level in the “search taxonomy”, so to speak).
For example, when searching in the recipe vertical, if you have not entered recipe information, your results will be “filtered out” of that SERP result set.
Adding context in search verticals really helps me serve up relevant information (seriously increases my recall), as does geospatial information.
Consumed information: Structured Data Dashboard
Google’s “Search Verticals”
Notice any correlations? I would advise you to!
Consequently, drilling down into a query using more and more filters enables me to better refine my understanding of the intent of your search.
Reward vs. RISK
Visibility and misperceived information exposure outweigh “Risk”, as the exposure is controllable.
Visibility overpowers Risk
In fact, if correctly done, Risk is completely controllable!
Determining what data to expose is optional, controllable and a business-dependent decision.
Quantifiable, Measurable, Avoidable
• Fine line between visibility & exposing information?
• Completely controllable
• Completely avoidable
• Business-dependent solution
My social counterparts have been leveraging structured markup (RDFa) for their Open Graph protocol for quite some time.
The Open Graph Protocol enables you to integrate your Web pages into the social graph.
They are also leveraging it in their newly released graph search! Not only that, they are even building an entity graph not dissimilar from my knowledge graph!
Example of a crowdsourced entity graph info source: places
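For readers who have not seen Open Graph markup, the Python sketch below reads og: properties from meta tags in a page. The sample page, the class OpenGraphParser and the function extract_open_graph are hypothetical; the sketch only assumes the protocol's convention of meta elements with property and content attributes.

```python
from html.parser import HTMLParser

# Hypothetical page head carrying Open Graph meta tags.
SAMPLE_PAGE = """
<html><head>
<meta property="og:title" content="Blue Widget">
<meta property="og:type" content="product">
<meta property="og:url" content="http://example.com/blue-widget">
<meta name="description" content="Not an Open Graph tag">
</head><body></body></html>
"""

class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        if prop.startswith("og:") and "content" in attrs:
            self.tags[prop] = attrs["content"]

def extract_open_graph(html):
    """Return a dict of Open Graph properties found in the HTML string."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags

print(extract_open_graph(SAMPLE_PAGE))
```

The ordinary description meta tag is skipped because it has no og: property.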
Advice Summary!
Running Out of time
Make sure that everything you mark up is also visible to the human end user. If not, you are cloaking!
Mark up everything you can (within reason and your business priorities).
Don’t try to spam me. You will not only run the risk of a penalty, but you will also lose my trust (the latter is an important signal in and of itself!).
Make sure your information is fresh and there are no stale links.
Ensure your data is of the highest possible quality (cleaned and scrubbed) and richly attributed. That will ensure your maximum visibility in my verticals and search filters.
Check the list to see what is coming out next! Schema.org is dynamic and is growing!
Mark up information not yet consumed by search engines to get the advantage of extra lift when it is adopted.
Stay tuned for way more to come in the not too distant future!
Ensure your images are enhanced and also marked up.
By Barbara Starr
Twitter: @BarbaraStarr
LinkedIn: http://www.linkedin.com/in/barbarastarr
E-mail: [email protected]
Bye for now