Diagnosing Technical Issues with Search Engine Optimization

Post on 14-Sep-2014

Category:

Technology

DESCRIPTION

If your site is having trouble ranking well in search engines such as Google, you've lost ranking, or you're having trouble with a site move or migration, the trouble could be with the site's technical architecture. View checklists to help diagnose issues with crawling, indexing, and ranking your site's content.

TRANSCRIPT

Tools and Tactics for Diagnosing Technical Search Issues

Vanessa Fox

Diagnostic Checklists and Resources

• Search Accessibility Checklist
• Search Discoverability Checklist
• Diagnostic Tools

janeandrobot.com

Search Engine Tools

Created by NineByBlue.com

• Google Webmaster Central: http://www.google.com/webmasters
• Microsoft Live Search Webmaster Center: http://webmaster.live.com
• Yahoo! Site Explorer: http://siteexplorer.search.yahoo.com
• Google Analytics: http://www.google.com/analytics
• Google Search: http://www.google.com

Ranking and Diagnostic Tools

• SEOBook Rank Checker: http://tools.seobook.com/firefox/rank-checker/
• Firefox Web Developer Toolbar: https://addons.mozilla.org/en-US/firefox/addon/60
• Firefox Firebug: http://getfirebug.com/
• Firefox Live HTTP Headers: https://addons.mozilla.org/en-US/firefox/addon/3829

• Microsoft adCenter Labs Keyword Forecast: http://adlab.msn.com/Keyword-Forecast/default.aspx

http://janeandrobot.com/resources

How Search Engines Work

Crawling

• Discover links
• Check robots rules
• Bandwidth considerations
• URLs

Indexing

• Canonicalization
• Context extraction
• Topic association
• Web-wide value

Ranking

• Relevance
• Value
• Uniqueness
• Display

Search Engine Crawlers Haven’t Quite Grown Up Yet

Crawling

• Lack of discovery
• Crawl inefficiency
• URL issues (infinite, redirects, dynamic)
• Inaccessible links

Indexing

• Duplication
• Extraction issues
• Lack of exposed content
• Non-optimized media

Ranking

• Display issues
• Lack of quality links
• Guidelines violations
• Non-focused content

Step 1: Get the Data

• Pages crawled
• Pages indexed
• Web traffic
• Key ranking metrics

Crawling

Indexing

Ranking

Which pages have the search engines crawled?

What kind of pages are they?

Has the search engine indexed all of the crawled pages?

How’s the search engine traffic?

Benchmarking

• Top ten queries that bring search traffic
• Search results position
• URL that ranks

Crawl Issues

Crawl Log Example: Apache Log Analyzer 2 Feed

    /**
     * @see ApacheLogAnalyzer2Feed
     */
    require_once 'ApacheLogAnalyzer2Feed.php';

    // create a new instance, parse access.log and
    // write test.xml
    $tool = new ApacheLogAnalyzer2Feed('access.log', 'test.xml');
    // select entries matching Googlebot useragent
    $tool->addFilter('User-Agent', 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)');
    // run
    $tool->run();

http://code.simonecarletti.com/wiki/apachelog2feed

    /**
     * @see ApacheLogAnalyzer2Feed
     */
    require_once 'ApacheLogAnalyzer2Feed.php';

    // create a new instance, parse access.log and write test.xml
    $tool = new ApacheLogAnalyzer2Feed('access.log', 'test.xml');
    // select entries matching Googlebot useragent with a regular
    // expression pattern
    $tool->addFilter('User-Agent', 'regexp:Googlebot');
    // select entries with Request matching a regular expression
    // pattern
    $tool->addFilter('Request', 'regexp:/site/profile\.php');
    // run
    $tool->run();

All Pages Google’s Crawled

All Profile Pages Google’s Crawled
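For readers without the PHP tool at hand, the same two filters can be sketched in Python. This is only an illustration: it assumes the server writes the Apache "combined" log format, and the function name, field labels, and sample log lines are invented for the example. Matching on the User-Agent string alone does not prove a request really came from Googlebot.

```python
import re

# Assumes the Apache "combined" log format; group names are our own labels.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def crawled_paths(lines, agent_pattern="Googlebot", path_pattern=None):
    """Return request paths whose User-Agent (and optionally path) match."""
    agent_re = re.compile(agent_pattern)
    path_re = re.compile(path_pattern) if path_pattern else None
    paths = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match or not agent_re.search(match["agent"]):
            continue  # unparseable line or a different user agent
        if path_re and not path_re.search(match["path"]):
            continue  # crawler request, but not for the pages of interest
        paths.append(match["path"])
    return paths


# Invented sample entries, for illustration only.
GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    '66.249.66.1 - - [10/Jun/2009:10:00:00 +0000] '
    '"GET /site/profile.php?id=7 HTTP/1.1" 200 5120 "-" "%s"' % GOOGLEBOT,
    '66.249.66.1 - - [10/Jun/2009:10:00:05 +0000] '
    '"GET /about.html HTTP/1.1" 200 2048 "-" "%s"' % GOOGLEBOT,
    '192.0.2.10 - - [10/Jun/2009:10:00:09 +0000] '
    '"GET /site/profile.php?id=9 HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows; U) Firefox/3.0"',
]

all_pages = crawled_paths(SAMPLE_LOG)
profile_pages = crawled_paths(SAMPLE_LOG, path_pattern=r"/site/profile\.php")
```

Here `all_pages` corresponds to the first filter (everything Googlebot requested) and `profile_pages` to the second (profile pages only).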

Communicating with Search Robots

Extractable Link Issues: Flash

Extractable Link Issues: Images

Extractable Link Issues: AJAX

Extractable Link Issues: URL Errors

Extractable Link Issues: URLs That Expire

URL Discovery Checklist

• Comprehensive external links
• At least one internal link to every page
• XML Sitemap referenced in robots.txt with the comprehensive list of canonical URLs
• Comprehensive HTML sitemap
• Ensure links load without JavaScript, images, or other rich media
• Ensure robots.txt and meta robots tag are used correctly

http://janeandrobot.com/library/managing-robots-access-to-your-website
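One way to sanity-check the robots.txt items on this checklist is Python's standard `urllib.robotparser`. The sketch below is a minimal illustration, not a full audit: the robots.txt content and the example.com URLs are made up, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Invented robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /search

Sitemap: http://www.example.com/sitemap.xml
"""


def load_rules(robots_txt):
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Is a page we want crawled accidentally blocked?
page_allowed = rules.can_fetch("Googlebot", "http://www.example.com/page1.php")
# Is the print version blocked, as intended?
print_allowed = rules.can_fetch("Googlebot", "http://www.example.com/print/page1.php")
# Is the XML Sitemap referenced in robots.txt?
sitemaps = rules.site_maps()
```

Note that the standard-library parser does simple prefix matching, so results for wildcard rules may differ from how a particular search engine interprets them.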

URL Structure Checklist

• Keep the number of parameters in dynamic URLs small
• Don’t use temporary URLs that expire
• Ensure redirects are 301 and are short
• Use dashes rather than underscores when separating words
• Use keywords in URLs for higher click through and better anchor text
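Two of these checks are easy to automate. The sketch below flags URLs with many query parameters or underscores in the path; the function name and the limit of two parameters are assumptions chosen for the example, not thresholds from the deck.

```python
from urllib.parse import urlsplit, parse_qsl


def audit_url(url, max_params=2):
    """Return a list of URL-structure warnings (max_params is an assumed limit)."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    issues = []
    if len(params) > max_params:
        issues.append(f"{len(params)} query parameters (limit {max_params})")
    if "_" in parts.path:
        issues.append("underscores in path; prefer dashes")
    return issues


clean = audit_url("http://www.example.com/snow-boards/arbor")
messy = audit_url("http://www.example.com/snow_boards?a=1&b=2&c=3")
```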

Canonicalization Checklist

• Have only one URL for each page
• Put all unneeded details in cookies, rather than URLs (session IDs, tracking parameters)
• Don’t allow infinite parameters
• Use 301 redirects for any URL changes
• 301 redirect www/non-www
• Use absolute URLs for internal links
• Ensure canonical version is in XML Sitemap
• Use rel=canonical attribute for optional parameters
• Block print and other versions with robots.txt

http://janeandrobot.com/library/url-referrer-tracking

http://searchengineland.com/canonical-tag-16537
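To see how many distinct pages a list of crawled URLs really represents, it can help to normalize each URL to a single canonical form. The following is a rough sketch: the parameter names in `NON_CANONICAL_PARAMS` and the preferred host are hypothetical and would need to match the actual site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session and tracking parameter names; substitute your own.
NON_CANONICAL_PARAMS = {"sessionid", "sid", "utm_source", "utm_medium",
                        "utm_campaign", "ref"}


def canonical_url(url, preferred_host="www.example.com"):
    """Map www/non-www and parameter variants of a URL to one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = preferred_host[4:] if preferred_host.startswith("www.") else preferred_host
    if host in (bare, "www." + bare):
        host = preferred_host  # collapse www and non-www onto one host
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in NON_CANONICAL_PARAMS]
    kept.sort()  # parameter order should not create a "new" URL
    return urlunsplit((parts.scheme.lower(), host, parts.path or "/",
                       urlencode(kept), ""))


variant_a = canonical_url("http://example.com/page1.php?sessionid=abc&color=red")
variant_b = canonical_url("http://www.example.com/page1.php?color=red&ref=partner")
```

Both variants collapse to the same string, which is the URL that should receive the 301 redirects and appear in the XML Sitemap.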

Crawl Efficiency Checklist

• Ensure page load times aren’t so slow as to reduce the number of pages crawled
• Ensure server is responsive
• Return a 304 for unchanged content
• Use compression
• Return a 404 for not found content
• Ensure each page has at least one link
• Avoid infinite redirects and redirect loops
• Ensure most important pages are linked from home page
• No JavaScript redirects or meta refresh redirects (if possible)
• Reasonable crawl-delay setting (if used at all)
• Reasonable use of Google Webmaster Tools crawl setting
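Redirect loops and long chains can be found offline once the redirect rules are exported as a source-to-target map. The sketch below assumes such a map already exists; the paths and the five-hop limit are invented for illustration.

```python
def trace_redirects(start, redirects, max_hops=5):
    """Follow a {source: target} redirect map and classify the chain.

    Returns ("ok" | "loop" | "too_long", list_of_urls_visited).
    max_hops is an assumed limit, not a documented crawler threshold.
    """
    seen = [start]
    current = start
    while current in redirects:
        current = redirects[current]
        if current in seen:
            return "loop", seen + [current]
        seen.append(current)
        if len(seen) - 1 > max_hops:
            return "too_long", seen
    return "ok", seen


# Invented redirect rules, for illustration only.
REDIRECTS = {"/old": "/interim", "/interim": "/new", "/a": "/b", "/b": "/a"}

chain = trace_redirects("/old", REDIRECTS)
loop = trace_redirects("/a", REDIRECTS)
```

A chain such as `/old` to `/interim` to `/new` is reported as "ok" but is still a candidate for shortening to a single 301.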

Indexing Issues

Indexing Example: XML Sitemaps

http://sitemaps.org

XML Sitemap

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
   <url>
      <loc>http://www.example.com/</loc>
   </url>
   <url>
      <loc>http://www.example.com/page1.php</loc>
   </url>
   <url>
      <loc>http://www.example.com/page2.php</loc>
   </url>
</urlset>
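A Sitemap like this one can also be compared against the list of URLs the crawler has actually requested. The sketch below parses the `<loc>` entries with the standard library and reports which ones are missing from a crawled set; the function names and the crawled set are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

SITEMAP_XML = """\
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/page1.php</loc></url>
  <url><loc>http://www.example.com/page2.php</loc></url>
</urlset>
"""


def sitemap_locs(xml_text):
    """Return every <loc> URL listed in a Sitemap document."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [loc.text.strip() for loc in root.iter("{%s}loc" % SITEMAP_NS)]


def not_yet_crawled(xml_text, crawled_urls):
    """Return Sitemap URLs that do not appear in the crawled set."""
    crawled = set(crawled_urls)
    return [url for url in sitemap_locs(xml_text) if url not in crawled]


# Invented crawl data, for illustration only (for example, from the access log).
crawled = {"http://www.example.com/", "http://www.example.com/page1.php"}
missing = not_yet_crawled(SITEMAP_XML, crawled)
```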

http://www.google.com/webmasters

Pages Indexed From Sitemap

Duplicate Content Issues

Partner Content

http://www.google.co.uk/search?q=%22The+Radisson+Edwardian+Vanderbilt+Hotel+stands+among+a+row+of+Victorian+townhouses+located+in+the+fashionable+Kensington+district+of+London,+England%22&hs=cN0&filter=0
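The URL above is an exact-phrase search for one sentence of partner-supplied copy. A small helper can build the same kind of query for any sentence. This is a sketch only: the function name is invented, the extra `hs` parameter from the slide is left out, and `filter=0` is understood here to ask Google not to collapse similar results, which is an assumption about that parameter rather than something the deck states.

```python
from urllib.parse import urlencode


def exact_phrase_search_url(sentence, host="www.google.co.uk"):
    """Build a quoted-phrase search URL for spotting duplicated copy."""
    query = urlencode({"q": '"%s"' % sentence, "filter": "0"})
    return "http://%s/search?%s" % (host, query)


url = exact_phrase_search_url("stands among a row of Victorian townhouses")
```

Pasting the resulting URL into a browser shows which other sites are indexed with the same sentence.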

Indexing Diagnostic Checklist

Have the pages ever been indexed?

If deindexed, are you sure they are no longer in the index?

Is the indexing loss across all engines?

What was percentage of loss?

Is there a pattern?

Check Google Webmaster Tools for errors/blocking

Did you change infrastructure/CMS/implement redirects?

What’s the linking pattern?
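The "percentage of loss" and "is there a pattern?" questions can be answered together by grouping indexed URLs by site section and comparing two snapshots. The sketch below assumes you have a before and an after list of indexed URLs; the grouping by first path segment and the sample URLs are choices made for this example.

```python
from collections import Counter
from urllib.parse import urlsplit


def section(url):
    """Group a URL by its first path segment ("/" for top-level pages)."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    return "/" + segments[0] if len(segments) > 1 else "/"


def index_loss_by_section(before, after):
    """Return {section: percent of previously indexed URLs now missing}."""
    before, after = set(before), set(after)
    totals = Counter(section(url) for url in before)
    losses = Counter(section(url) for url in before - after)
    return {sec: round(100.0 * losses[sec] / totals[sec], 1) for sec in totals}


# Invented snapshots, for illustration only.
BEFORE = [
    "http://www.example.com/",
    "http://www.example.com/products/a",
    "http://www.example.com/products/b",
    "http://www.example.com/blog/x",
    "http://www.example.com/blog/y",
]
AFTER = [
    "http://www.example.com/",
    "http://www.example.com/products/a",
    "http://www.example.com/products/b",
    "http://www.example.com/blog/x",
]

loss = index_loss_by_section(BEFORE, AFTER)
```

A loss concentrated in one section usually points to a template, redirect, or robots rule affecting that section rather than a site-wide problem.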

Indexing Checklist: Content Extraction

• Ensure content is in text wherever possible
• Ensure text isn’t hidden in:
  – JavaScript/AJAX
  – Flash
  – Video
  – Images

Avoid multiple URLs for the same page and very similar pages

Indexing Checklist: Semantic Markup

• Use keywords in title tag
• Ensure each page has a unique meta description tag
• Use keywords in (single) H1 tag
• Appropriate use of H2 – H6 tags
• Relevant anchor text in a href tags
• Put JavaScript in .js file (except onclick event functions) and style details in .css
• Validate HTML to ensure it renders
• Provide focus for each page
• Ensure pages provide unique and valuable content beyond boilerplate template and reused content
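Several of the markup items can be checked mechanically. The sketch below uses Python's standard `html.parser` to look for a title, exactly one H1, and a meta description; the class name, function name, and sample pages are invented, and checking that descriptions are unique across pages would need a second pass over the whole site.

```python
from html.parser import HTMLParser


class MarkupAudit(HTMLParser):
    """Collect the title text, H1 count, and meta description of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1_count = 0
        self.meta_description = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "h1":
            self.h1_count += 1
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_markup(html):
    """Return a list of warnings for the checklist items covered here."""
    audit = MarkupAudit()
    audit.feed(html)
    issues = []
    if not audit.title.strip():
        issues.append("missing title")
    if audit.h1_count != 1:
        issues.append("expected one h1, found %d" % audit.h1_count)
    if not audit.meta_description:
        issues.append("missing meta description")
    return issues


# Invented pages, for illustration only.
GOOD_PAGE = ('<html><head><title>Arbor Snowboards</title>'
             '<meta name="description" content="Bamboo snowboards."></head>'
             '<body><h1>Arbor Snowboards</h1></body></html>')
BAD_PAGE = "<html><head></head><body><h1>A</h1><h1>B</h1></body></html>"

good_issues = audit_markup(GOOD_PAGE)
bad_issues = audit_markup(BAD_PAGE)
```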

Optimizing Images

• Don’t put text in images
• Use descriptive ALT text
• Use descriptive filenames
• Provide caption and surrounding text
• Be cautious about logo images
• Consider blocking non-useful images with robots.txt
• Don’t provide alternate text using CSS that styles the text off the page (such as -9999)

http://janeandrobot.com/post/Effectively-Using-Images.aspx
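The ALT text item lends itself to a quick scan. The sketch below lists images whose `alt` attribute is missing or empty; the names and sample markup are invented, and an empty `alt` can be legitimate for purely decorative images, so the output is a list to review rather than a list of errors.

```python
from html.parser import HTMLParser


class ImageAudit(HTMLParser):
    """Record the src of every <img> that lacks non-empty alt text."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attrs = dict(attrs)
        if not (attrs.get("alt") or "").strip():
            self.missing_alt.append(attrs.get("src") or "")


def images_missing_alt(html):
    audit = ImageAudit()
    audit.feed(html)
    return audit.missing_alt


# Invented markup, for illustration only.
PAGE = ('<img src="/img/arbor-snowboard.jpg" alt="Arbor snowboard">'
        '<img src="/img/IMG_0042.jpg">'
        '<img src="/img/logo.gif" alt="">')

to_review = images_missing_alt(PAGE)
```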

Ranking Issues

How’s the Search Engine Traffic?

• Overall Percentage
• Percentage Non-Branded
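The non-branded percentage can be computed from an analytics export of search queries and visits. The sketch below is illustrative only: the function name is invented, the visit counts are made-up numbers, and a query is treated as branded if it contains any of the given brand terms.

```python
def non_branded_share(query_visits, brand_terms):
    """Return the percentage of search visits from non-branded queries."""
    total = sum(query_visits.values())
    if total == 0:
        return 0.0
    non_branded = sum(
        visits for query, visits in query_visits.items()
        if not any(term in query.lower() for term in brand_terms)
    )
    return 100.0 * non_branded / total


# Made-up visit counts, for illustration only.
VISITS = {
    "arbor snowboards": 600,
    "arbor": 200,
    "snowboard reviews": 150,
    "bamboo snowboard": 50,
}

share = non_branded_share(VISITS, brand_terms=["arbor"])
```

A low share suggests the site is found mostly by people who already know the brand name.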

Do You Rank For the Right Things?

arbor snowboards snowboard

Google 1 49 500+

Yahoo 1 80 500+

Live Search 3 128 500+

If ranking loss…

Drop For All Keywords

Does the site rank for different queries than before?

Did you substantially change the site content?

Did you change the underlying site infrastructure?

Was there a large change in linking behavior?

Could there be a penalty?

Drop For Only Some Keywords

Are different pages ranking highest than before?

Are the pages that used to rank still indexed?

Ranking Checklist

Relevance
• What is the page about?
• Are the pages ranking for the desired query more relevant?
• Do the pages use the language of the searcher?

Value
• How many relevant links are there (and how authoritative are they)?
• What’s the value of the page? (Do more useful pages rank above yours?)

SERP display
• Are the title and snippet compelling?
• Do Sitelinks appear for navigational queries?
• What universal elements appear on the page?

Does the site rank for non-branded queries?

The Webmaster Guidelines

Common Definition of Spam

On page schemes
• Keyword stuffing
• Fake/stolen content
• Hidden text
• Hidden links
• Cloaking

Linking schemes
• Paid Links
• Link exchanges
• Doorway pages
• Deceptive redirects

http://google.com/support/webmasters/bin/answer.py?answer=35769

Getting Out of the Penalty Box

1. Check if you’ve been penalized
   – Live Search: http://webmaster.live.com
   – Google: http://google.com/webmasters ** maybe **
2. Review the webmaster guidelines
   – Google, Live Search, Yahoo
3. Identify the issue
4. Fix it!
5. Request re-evaluation
   – Google: http://google.com/webmasters
   – Live Search: http://webmaster.live.com

Traffic Issues

Traffic Drop

Display Issues

Would you click this link?

Does the Result Inspire Clicks?

First step in diagnosis: find the root

Ninebyblue.com
Twitter.com/vanessafox

Jane and Robot Developer Summit
June 12th, 2009 – San Francisco
FREE for SMX attendees!

janeandrobot.com
Twitter.com/janeandrobot
