Search Engine Optimization (SEO) from Endeca & ATG

Search Engine Optimization (SEO) implementation in Endeca and ATG, Vigneswaran Sitaraman ([email protected])

Upload: vignesh-sitaraman

Post on 12-Aug-2015

Search Engine Optimization (SEO) implementation in Endeca and ATG

Vigneswaran Sitaraman ([email protected])

Agenda

• Introduction to SEO

• SEO Techniques

• Implementation of SEO in Endeca

• Implementation of SEO in ATG

SEO Introduction

• Search Engine Optimization (SEO) is a term used to describe a variety of techniques for making pages more accessible to web spiders (also known as web crawlers or robots), the scripts used by Internet search engines to crawl the Web to gather pages for indexing. The goal of SEO is to increase the ranking of the indexed pages in 'natural' search results.

SEO Introduction (contd)

• Google’s search process:

• http://static.googleusercontent.com/media/www.google.ca/en/ca/insidesearch/howsearchworks/assets/searchInfographic.pdf

SEO Introduction (contd)

• How crawlers find pages:

• Web crawlers use sophisticated programs to decide which URLs to fetch, how many pages to fetch from each site, and how often, across hundreds and thousands of web servers.

• Crawling begins with a list of URLs generated from past crawls and augmented with sitemap URLs. As the crawler visits pages, it detects new links (href and image src attributes) and adds them to the list of URLs to crawl.
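The crawl loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PAGES` dictionary is a hypothetical in-memory stand-in for real web servers, and the names `LinkExtractor` and `crawl` are made up for this example. Real crawlers also schedule revisits and respect crawl limits, which are not modelled here.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects href and src attribute values found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(seed_urls, fetch):
    """Breadth-first crawl; `fetch` maps a URL to its HTML, or None."""
    frontier = deque(seed_urls)   # list of URLs still to crawl
    seen = set(seed_urls)         # URLs already queued, to avoid repeats
    order = []
    while frontier:
        url = frontier.popleft()
        order.append(url)
        html = fetch(url)
        if html is None:
            continue
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urljoin(url, link)
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return order


# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "http://www.example.com/": '<a href="/phones">Phones</a><img src="/logo.png">',
    "http://www.example.com/phones": '<a href="/">Home</a><a href="/contact/">Contact</a>',
}

visited = crawl(["http://www.example.com/"], PAGES.get)
```

Starting from a single seed URL, the list grows as new href and src values are discovered, which is the behaviour the slide describes.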

SEO Introduction (contd)

• Indexing & Search

• As a web crawler visits pages, it gathers information from them; keywords and their locations are indexed so that search and lookup are faster.

• This is just like the index of a book, with keywords and page numbers.

• When you search using keywords, the search engine uses sophisticated ranking algorithms to determine the best possible matches and retrieves the search results.
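The book-index analogy corresponds to an inverted index: a map from each keyword to the pages that contain it. The sketch below is a minimal illustration, assuming hypothetical page paths and text; it only finds pages containing all query words and does not attempt the ranking step that real search engines perform.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each keyword to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages (path -> extracted text).
PAGES = {
    "/phones/iphone6-gold": "iPhone 6 Gold smartphone",
    "/phones/iphone6-gray": "iPhone 6 Gray smartphone",
    "/contact": "Contact us",
}

index = build_index(PAGES)
gold_hits = search(index, "gold iphone")
```

Looking up a keyword is a dictionary access rather than a scan of every page, which is why indexing makes search faster.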

SEO Introduction (contd)

• Disallow crawlers from indexing your pages

• Using robots.txt:

• A file used to specify the URLs of the site that should not be crawled (e.g., service agreement, terms and conditions).

• Also used to specify the location of the sitemap XML files.

• Should be placed in the root of the web site folder.

SEO Introduction (contd)

• robots.txt format:

User-agent: *
Disallow: /terms.html
Sitemap: http://www.example.com/sitemap.xml

• Exclude individual pages or links from indexing:

<meta name="robots" content="noindex"/>
<a href="http://www.example.com/" rel="nofollow">link text</a>
<meta name="robots" content="nofollow"/>
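To see how a well-behaved crawler interprets the robots.txt rules shown above, the sketch below feeds the same three directives to Python's standard `urllib.robotparser` module. The `/phones` URL is an assumption added for contrast, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# The same directives as in the robots.txt example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /terms.html
Sitemap: http://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The disallowed page must not be crawled; other pages may be.
terms_allowed = parser.can_fetch("*", "http://www.example.com/terms.html")
phones_allowed = parser.can_fetch("*", "http://www.example.com/phones")

# Sitemap locations declared in the file.
sitemaps = parser.site_maps()
```

Here `terms_allowed` is False and `phones_allowed` is True, and `sitemaps` holds the declared sitemap URL.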

SEO Techniques

• URL Recoding

• Canonical links

• SEO Tagging

• SiteMaps

URL Recoding

• A way of increasing the page ranking in search engine results.

• Make the URL look more like a static URL.

• Use short, friendly URLs (under 2048 characters) with a minimum of query parameters.

• Include as much information as possible in the URL in the form of keywords to increase the ranking of the indexed page.

URL Recoding (contd)

• Examples (bad SEO links for product pages, Rogers.com):

http://www.rogers.com/web/link/wirelessBuyFlow?forwardTo=PhoneThenPlan&productType=normal&productId_Detailed=IP6PL64GRY

http://www.rogers.com/web/link/wirelessBuyFlow?forwardTo=PhoneThenPlan&productType=normal&productId_Detailed=IP6PL64GLD

URL Recoding (contd)

The customer searched for a specific gold iPhone, but the search result does not match the content.

URL Recoding (contd)

• Search results do not differentiate between the Gold and Gray models, because both phones share the same dynamic URL and differ only in query parameters.

• Because the URLs are not SEO friendly, customers do not get what they are looking for.

URL Recoding (contd)

• Good examples (the dynamic product URLs are to be recoded to the URLs below):

http://www.rogers.com/wireless/phones/IPhone6-36GB-Gray

http://www.rogers.com/wireless/phones/IPhone6-36GB-Gold

• Benefits:

• Improved page ranking.

• Customers get what they are looking for in the very first search result.
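As a rough illustration of URL recoding, the sketch below builds keyword-rich static paths of the same shape as the good examples. The helper names `slug` and `seo_product_url` are hypothetical and are not part of Endeca or ATG; in those products the recoding is done by the configured URL formatters and templates.

```python
import re


def slug(value):
    """Turn a display value into a URL-safe path segment."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "-", value.strip())
    return cleaned.strip("-")


def seo_product_url(base, category_path, name, *attributes):
    """Build a static-looking URL: base/category/.../Name-Attr1-Attr2."""
    parts = [slug(p) for p in category_path]
    leaf = "-".join(slug(v) for v in (name, *attributes))
    return base.rstrip("/") + "/" + "/".join(parts + [leaf])


gold = seo_product_url(
    "http://www.rogers.com", ["wireless", "phones"], "IPhone6", "36GB", "Gold"
)
gray = seo_product_url(
    "http://www.rogers.com", ["wireless", "phones"], "IPhone6", "36GB", "Gray"
)
```

Each colour variant now has its own distinct, keyword-bearing URL instead of differing only in a query parameter.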

URL Recoding (contd)

Customers get what they are looking for in the very first search result.

Canonical Links

• Different URLs pointing to the same page will reduce the ranking of that page. E.g.:

www.rogers.com
www.rogers.com/web/Rogers.portal

www.example.com/phones
www.example.com/phones.jsp

• Using the link tag with proper URLs: place a link tag under the <head> tag in the HTML to declare a single, consistent URL for the page.

<link rel="canonical" href="http://www.example.com/phones"/>
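One way to keep the canonical URL consistent is to derive it from the requested URL with a fixed set of rules. The sketch below is an assumption-laden illustration: the rules (dropping a `.jsp` suffix and a trailing slash) and the function names are hypothetical, chosen only to match the example.com variants listed above.

```python
from html import escape
from urllib.parse import urlsplit


def canonical_url(url):
    """Map URL variants of a page onto one canonical URL (assumed rules)."""
    # Accept URLs written without a scheme, as on the slide.
    parts = urlsplit(url if "://" in url else "http://" + url)
    path = parts.path
    # Assumed rule: /phones.jsp and /phones are the same page.
    if path.endswith(".jsp"):
        path = path[: -len(".jsp")]
    # Assumed rule: ignore a trailing slash, but keep the site root as "/".
    path = path.rstrip("/") or "/"
    return f"http://{parts.netloc.lower()}{path}"


def canonical_link_tag(url):
    """Render the <link> element to place inside <head>."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}"/>'


tag = canonical_link_tag("www.example.com/phones.jsp")
```

Both `www.example.com/phones` and `www.example.com/phones.jsp` yield the same tag, so crawlers attribute ranking to a single URL.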

SEO Tagging

• Use semantic HTML markup.

• Avoid Flash and JavaScript-generated output, as crawlers are best at parsing HTML.

• Use a proper <title>.

• Use <meta> tags, and the alt attribute for images.

• These help include keywords, which ultimately increases page ranking.

SiteMap

• A sitemap helps web crawlers access our site pages for indexing. It lists the URL paths of the site pages in the application to index.

• Specified using an XML file that follows the schema defined by www.sitemaps.org.

• Can consist of multiple XML files linked using a sitemap index file.

• Usually stored in the root of the web application. Can be specified in robots.txt.

• Sitemap XML can be submitted to search engines to validate.

Example sitemap XML file:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
  </url>
  <url>
    <loc>http://www.example.com/contact/</loc>
  </url>
</urlset>
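A sitemap file of the shape shown in the example can be produced with a few lines of code. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `build_sitemap` is an assumption for illustration, and optional elements such as `lastmod` are left out.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemaps.org urlset document listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page_url in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page_url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


sitemap_xml = build_sitemap(
    ["http://www.example.com/", "http://www.example.com/contact/"]
)
```

The resulting text can be written to `sitemap.xml` in the site root and referenced from robots.txt.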

Implementing SEO techniques in Endeca

• Used in category pages and faceted navigation pages; can also be used in product detail pages.

• URL Recoding: non-SEO (traditional) Endeca URLs, constructed using the BasicUrlFormatter Endeca Assembler API.

E.g., rogers.com (category page, faceted navigation page):

http://www.rogers.com/web/link/wirelessBuyFlow?forwardTo=PhoneThenPlan&productType=normal&N=11+52+4294948709&Nr=AND%28Language%3AEN%2CProvince%3AON%29

Implementing SEO techniques in Endeca (contd)

• Optimized, SEO-friendly Endeca URLs (using the SeoUrlFormatter Endeca Assembler API):

• Include keywords in the URLs to make them SEO friendly.

http://www.rogers.com/wireless/smartphones/_/N-11+52+4294948709?Nr=AND%28Language%3AEN%2CProvince%3AON%29

http://www.rogers.com/wireless/smartphones/Android/_/N-11+52+4294948709+277?Nr=AND%28Language%3AEN%2CProvince%3AON%29

Implementing SEO techniques in Endeca (contd)

Parts of the optimized Endeca URLs:

misc-path - /wireless/smartphones/Android

path-param-separator - _

path-params - N-11+52+4294948709

query string - ?Nr=AND%28Language%3AEN%2CProvince%3AON%29
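To make the four parts concrete, the sketch below splits an optimized URL into misc-path, path-param-separator, path-params and query string using only Python's standard library. It is an illustration of the URL layout, not the Endeca API; the function name `split_endeca_url` is hypothetical, and the default separator `_` is taken from the example URLs.

```python
from urllib.parse import urlsplit


def split_endeca_url(url, separator="_"):
    """Split an SEO-style Endeca URL into its named parts."""
    parts = urlsplit(url)
    segments = parts.path.split("/")
    # The separator segment divides keyword path from navigation state.
    idx = segments.index(separator)  # raises ValueError if absent
    return {
        "misc_path": "/".join(segments[:idx]),
        "path_param_separator": separator,
        "path_params": "/".join(segments[idx + 1:]),
        "query_string": parts.query,
    }


parsed = split_endeca_url(
    "http://www.rogers.com/wireless/smartphones/_/N-11+52+4294948709"
    "?Nr=AND%28Language%3AEN%2CProvince%3AON%29"
)
```

Everything before the separator is descriptive text for search engines; the navigation state that the MDEX engine needs sits after it.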

Implementing SEO techniques in Endeca (contd)

Configuring SEO-friendly URLs in Endeca

• XML configuration file, e.g. endeca-seo-Config.xml

• Easily configured using the Spring framework.

• Core API classes (Endeca Assembler API SEO classes): BasicQueryBuilder, SeoUrlFormatter, SeoNavStateFormatter, SeoNavStateCanonicalizer, along with the Endeca Experience Manager Cartridge Handler core APIs.

• Every aspect of the URLs can be configured, such as formatting, canonicalizing and encoding.

• Sample reference XML configuration in the appendix.

Implementing SEO techniques in Endeca (contd)

• Site Map:

• Sitemap XML files are generated using the standalone batch command RunSitemapGen.bat.

• Configured using an XML file specifying the MDEX host, port and URL format.

• Uses the same XML file that is used for configuring the application:

<URL_FORMAT_FILE>C:\Endeca\ToolsAndFrameworks\...\WEB-INF\endeca-seo-config.xml</URL_FORMAT_FILE>

* Sample configuration file included in the appendix.

Implementing SEO techniques in ATG

ATG-driven pages (usually product detail pages)

URL Recoding:

• The ATG SEO module detects whether the visitor is a human or a crawler using the User-Agent header of the request, and generates the page links accordingly.

• The core API provided by ATG is in the atg.repository.seo.* packages.

• Core ATG APIs:

atg.repository.seo.ItemLink servlet bean - translates links to static URLs based on the user agent.

atg.repository.seo.JumpServlet - receives incoming static URLs (for example, if a user clicks a link returned by a Google search) and translates these URLs into their dynamic equivalents.
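The idea of choosing a link format from the User-Agent header can be sketched as follows. This is not the ATG implementation: the marker list, the function names and both URL shapes are assumptions made for illustration, and ATG drives the same decision from its own browser-type and URL template configuration.

```python
from urllib.parse import quote_plus

# Hypothetical substrings that identify crawler user agents.
ROBOT_MARKERS = ("googlebot", "bingbot", "slurp")


def is_crawler(user_agent):
    """Guess whether the request comes from a web crawler."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in ROBOT_MARKERS)


def product_link(user_agent, product_id, display_name):
    """Static, keyword-rich link for crawlers; dynamic link for humans."""
    if is_crawler(user_agent):
        return f"/jump/{quote_plus(display_name)}/productDetail/{product_id}"
    return f"/productDetail.jsp?productId={product_id}"


crawler_link = product_link(
    "Mozilla/5.0 (compatible; Googlebot/2.1)", "xprod1001", "Dotted Repp Tie"
)
human_link = product_link("Mozilla/5.0 (Windows NT 10.0)", "xprod1001", "Dotted Repp Tie")
```

When a visitor later arrives on the static form, a component in the role of JumpServlet has to translate it back to the dynamic URL.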

Implementing SEO techniques in ATG (contd)

URL Configuration: done using the template classes provided by core ATG SEO.

URL templates:

atg.repository.seo.DirectUrlTemplate
atg.repository.seo.IndirectUrlTemplate

# Url template format
urlTemplateFormat=/jump/{item.displayName}/productDetail/{item.parentCategory.displayName}/{item.id}/{item.parentCategory.id}/{locale}
# Forward Url template
forwardUrlTemplateFormat={item.template.url,encode=false}?productId\={item.id}\&categoryId\={item.parentCategory.id}\&locale\={locale}\&productPage\=true

Implementing SEO techniques in ATG (contd)

/atg/repository/seo/CatalogItemLink

$class=atg.repository.seo.ItemLink
# Map of UrlTemplateMapper components by item descriptor name for this droplet
itemDescriptorNameToMapperMap=\
    product=/atg/repository/seo/ProductTemplateMapper
# Default parameter values
defaultRepository=/atg/commerce/catalog/ProductCatalog
defaultItemDescriptorName=product

Implementing SEO techniques in ATG (contd)

/atg/repository/seo/ProductTemplateMapper

$class=atg.repository.seo.IndirectUrlTemplate
# Url template format
urlTemplateFormat=/jump/{item.displayName}/productDetail/{item.parentCategory.displayName}/{item.id}/{item.parentCategory.id}/{locale}
# Regex that matches above format
indirectRegex=.*/jump/[^/]*?/productDetail/[^/]*?/([^/].*?)/[^/]*?/([^/]*)(/.*?)*$
# Regex elements
regexElementList=\
    item | id | /atg/commerce/catalog/ProductCatalog\:product,\
    locale | string
# Forward Url template
forwardUrlTemplateFormat={item.template.url,encode=false}?productId\={item.id}\&categoryId\={item.parentCategory.id}\&locale\={locale}\&productPage\=true
# Supported Browser Types
supportedBrowserTypes=\
    robot
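To show how the indirect regex and the forward URL template relate, the sketch below applies the same regular expression to a static URL and rebuilds the dynamic URL from the captured groups. It only mimics the translation step: the `CATALOG` dictionary, the `/browse/productDetail.jsp` template URL and the `en_US` locale segment are hypothetical values, and in ATG the item lookup goes through the product catalog repository.

```python
import re

# The indirectRegex from the configuration; group 1 is the item id,
# group 2 is the locale, matching the order of regexElementList.
INDIRECT_REGEX = re.compile(
    r".*/jump/[^/]*?/productDetail/[^/]*?/([^/].*?)/[^/]*?/([^/]*)(/.*?)*$"
)

# Hypothetical stand-in for the product catalog lookup.
CATALOG = {
    "xprod1001": {
        "template_url": "/browse/productDetail.jsp",
        "parent_category_id": "cat50067",
    }
}


def forward_url(static_url):
    """Translate a static /jump/ URL into its dynamic equivalent."""
    match = INDIRECT_REGEX.match(static_url)
    if match is None:
        return None
    product_id, locale = match.group(1), match.group(2)
    item = CATALOG.get(product_id)
    if item is None:
        return None
    return (
        f"{item['template_url']}?productId={product_id}"
        f"&categoryId={item['parent_category_id']}"
        f"&locale={locale}&productPage=true"
    )


dynamic = forward_url(
    "/crs/storeus/jump/Dotted+Repp+Tie/productDetail/For+Him/xprod1001/cat50067/en_US"
)
```

The display names in the static URL exist only for search engines; the translation relies solely on the captured id and locale.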

Implementing SEO techniques in ATG (contd)

ATG – Endeca Integration:

• Used in Experience Manager Cartridge Handlers.

• ATG Nucleus components access the Endeca SEO configuration Spring beans using the ATG Spring integration feature:

• atg.nucleus.spring.NucleusPublisher

<bean name="/NucleusPublisher" class="atg.nucleus.spring.NucleusPublisher" scope="singleton">
  <property name="nucleusPath">
    <value>/atg/spring/FromSpring</value>
  </property>
</bean>

<import resource="endeca-seo-url-config.xml"/>

• You can now use /atg/spring/FromSpring/[spring Bean Id] in your Nucleus component.

Implementing SEO techniques in ATG (contd)

ATG – Endeca Integration:

• A key bean in Endeca is com.endeca.soleng.urlformatter.seo.SeoUrlFormatter.

• Configure the /atg/endeca/assembler/cartridge/manager/NavigationStateBuilder component using the property:

urlFormatter=/atg/spring/FromSpring/seoUrlFormatter

Implementing SEO techniques in ATG (contd)

Canonical Links

Use the OOTB ATG droplet /atg/repository/seo/CanonicalItemLink to render a canonical link such as:

<link rel="canonical" href="http://www.example.com:80/crs/storeus/jump/Dotted+Repp+Tie/productDetail/For+Him/xprod1001/cat50067"/>

<dsp:droplet name="/atg/repository/seo/CanonicalItemLink">
  <dsp:param name="id" param="productId"/>
  <dsp:param name="itemDescriptorName" value="product"/>
  <dsp:param name="repositoryName" value="/atg/commerce/catalog/ProductCatalog"/>
  <dsp:oparam name="output">
    <dsp:getvalueof var="pageUrl" param="url" vartype="java.lang.String"/>
    <link rel="canonical" href="${pageUrl}"/>
  </dsp:oparam>
</dsp:droplet>

Implementing SEO techniques in ATG (contd)

SiteMap Generation:

• Sitemap files are XML documents that contain URLs for the pages of your site. These can be generated either manually using the Dynamo Admin console or in a scheduled manner.

• Sitemap XML files are kept in the root of the web application.

• ATG uses the following OOTB components for sitemap generation. The components are in the folder /atg/sitemap/*

Implementing SEO techniques in ATG (contd)

SiteMap Generation:

• StaticSitemapGenerator - generates sitemap XML files for static URLs.

• DynamicSitemapGenerator - generates sitemap XML files for dynamic URLs.

• SitemapIndexGenerator - generates index files for the various sitemap files generated by the SiteMapGenerator components.

• SitemapGeneratorService - used for scheduled generation of sitemap XML files and for inserting entries in SiteMapRepository.

• SitemapWriterService - writes sitemap XML files using contents from SiteMapRepository; should be run on every ATG instance.
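The index file that ties several sitemap files together follows the sitemaps.org `sitemapindex` format. The sketch below shows that format using Python's standard library; it does not reproduce the ATG SitemapIndexGenerator, and the two sitemap file names are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(sitemap_urls):
    """Return a sitemapindex document that links the given sitemap files."""
    root = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        sitemap = ET.SubElement(root, "sitemap")
        ET.SubElement(sitemap, "loc").text = sitemap_url
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical file names for the static and dynamic sitemaps.
index_xml = build_sitemap_index(
    [
        "http://www.example.com/staticSitemap.xml",
        "http://www.example.com/dynamicSitemap.xml",
    ]
)
```

Only the index file then needs to be listed in robots.txt or submitted to search engines.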

Implementing SEO techniques in ATG (contd)

SEO Tagging:

• ATG uses the SEO tag repository for storing the content of the title tag and the keywords and description meta tags.

• Register the SEO tag repository using the initialRepositories property of the /atg/repository/ContentRepositories component.

<dsp:droplet name="/atg/dynamo/droplet/RQLQueryRange">
  <dsp:param name="repository" value="/atg/seo/SEORepository"/>
  <dsp:param name="itemDescriptor" value="SEOTags"/>
  <dsp:param name="howMany" value="1"/>
  <dsp:param name="mykey" value="featured"/>
  <dsp:param name="queryRQL" value="key = :mykey"/>
  <dsp:oparam name="output">
    <title><dsp:valueof param="element.title"/></title>
    <dsp:getvalueof var="description" param="element.description"/>
    <dsp:getvalueof var="keywords" param="element.keywords"/>
    <meta name="description" content="${description}"/>
    <meta name="keywords" content="${keywords}"/>
  </dsp:oparam>
</dsp:droplet>

Appendix

endeca-seo-url-config.xml

Sitemap-conf.xml

Sitemap-XML-template.xml