Structured Data and Metadata Evaluation Methodology for
Organizations Looking to Improve Image Findability on the Web
School of Library and Information Studies
LIS 5733, taught by Dr. Susan Burke
Research proposal written by Emily Kolvitz
Research Setting: Primarily geared towards online e-commerce/business organizations, but the methodology could easily translate to galleries, libraries, archives, and museums (GLAMs) or any institution looking to evaluate their structured data and metadata practices on the World Wide Web in an effort to improve the findability of product offerings, general information, or services.
Introduction
The current state of findability on the web for many organizations is incipient. Search Engine Optimization (SEO) techniques change frequently and remain largely a mystery to many companies. The one variable in the equation of web findability that remains a staple is good-quality metadata under the hood of the website.
This research methodology will allow for:
An assessment of findability maturity on the web from an image-centric viewpoint.
Improved findability on the web, by establishing a baseline for where your organization stands in terms of structured data content and by visualizing gaps or areas for improvement from a search-engine-neutral perspective.
Introduction
Most searches now start with Google (Holman 2011; Lippincott 2005).
Search algorithms shape what is most easily accessible (Connaway, Dickey, & Radford 2011), and they are subject to frequent change (Kritzinger 2013).
Search algorithms look for your structured data and, in the future, possibly your embedded metadata (Cazier 2014; Beall 2010).
Literature Review
Marshall Breeding (2013) assesses the limitations of the major search engine algorithms:
“But even with the most sophisticated relevancy algorithms index-based search and retrieval lacks the ability to lead users to the potential related content. Semantic web technologies in conjunction with repositories of open linked data promise to deliver significant new capabilities in exploring and exploiting information resources on the web.”
Literature Review
The semantic web is founded on good, high-quality structured data.
Future technologies could potentially utilize embedded metadata in search (Cazier 2014; Beall 2010), but embedded metadata already has authenticity, provenance, and “breadcrumbs” value now (Reicks 2013).
Literature Review
Most users don’t go past the first page of search results (Paz 2013).
Structured data practices can help your organization stay relevant (and findable) in the age of information overload.
Keeping it search-engine neutral is advisable (Paz 2013).
Topic/Proposed Research
A methodology for establishing a baseline or benchmark of where an organization stands in terms of structured data pertaining to image records, which ultimately helps findability on the web.
By using the proposed methodology to gather this data for an organization, data-informed decisions can be made about structured data strategy going forward to maintain relevancy on the web.
Many structured data elements can affect online findability: file-naming standards, the presence of alt text in HTML markup, the HTML markup itself, embedded metadata, schema.org markup and rich snippets, text description at or near images, and more. IEEE uses metadata or full text for search (IEEE Xplore offers this; see next slide).
Full Text Search & Metadata Search
Topic/Proposed Research
It is also noteworthy that additional factors affect findability on the web that do not involve structured data, but this research focuses solely on structured data techniques within the control of individual organizations.
All of these structured data techniques pertaining to image records will be examined in conjunction with the relevancy of onsite and offsite search results.
Image search and information retrieval is a more difficult area than text search and retrieval, because accessibility to the image content is largely dependent on side-car text (or metadata, if you will) that describes the aboutness and (hopefully) the context of the image record.
Questions
Research Questions Addressed in this Study
1. What methods of search are available on the organization’s website?
2. What is the file-naming structure for images on the website?
3. What is the quality of search engine (onsite and offsite) results?
4. What kinds of search results appear in Image Search when searching by the organization’s name and product description, both with onsite search and offsite search?
5. What kinds of search results appear in Google Image Search when searching by images taken from the organization’s website?
6. What kinds of search results come up when looking for specific products (a measure of structured data) through onsite search and offsite search?
7. What are the results when looking for specific products on the offsite search engine?
8. What kinds of structured data are near or around the images on the organization’s website (alt text, other)?
9. What file types appear on the organization’s website (JPEG, TIFF, PNG)?
10. What embedded metadata is available in images on the website?
11. What does the XMP/XML/RDF for these images look like, and how robust is it? What does the graph look like?
Variables
These measures are operationalized through Likert scales applied by the human researcher. For example, when rating the level of description for the filename, a researcher could conclude that the filename sp_18379847923.jpg is not very descriptive for a human, let alone for a search engine (unless, of course, this is a product SKU). The researcher would then assign it a low value for descriptiveness on a 1-5 Likert scale.
Type of page the image was on
The image file-naming convention/filename
Level of description for the filename
Quality and number of alt text tags
Quality and number of embedded metadata tags
Quality and number of structured data tags pertaining to the images
Quality and number of search results for onsite search utilizing filename or alt text
Quality and number of relevant search results utilizing offsite image search
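One way to keep these variables consistent from image to image is to record each image as a fixed row before it goes into the spreadsheet. The following is a minimal sketch only: the class name `ImageObservation`, the field names, and the sample ratings are assumptions made for illustration, not part of the proposal.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class ImageObservation:
    """One spreadsheet row: the variables recorded for a single image."""

    page_type: str
    filename: str
    filename_description: int        # Likert, 1-5
    alt_text_quality: int            # Likert, 1-5
    alt_text_count: int
    embedded_metadata_quality: int   # Likert, 1-5
    embedded_metadata_count: int
    structured_data_quality: int     # Likert, 1-5
    structured_data_count: int
    onsite_results_quality: int      # Likert, 1-5
    onsite_results_count: int
    offsite_results_quality: int     # Likert, 1-5
    offsite_results_count: int

    def __post_init__(self):
        # Reject any rating that falls outside the 1-5 scale.
        for field in fields(self):
            if field.name.endswith(("_quality", "_description")):
                value = getattr(self, field.name)
                if value not in range(1, 6):
                    raise ValueError(f"{field.name} must be 1-5, got {value!r}")


def to_csv(observations):
    """Serialize observations to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=[field.name for field in fields(ImageObservation)]
    )
    writer.writeheader()
    for observation in observations:
        writer.writerow(asdict(observation))
    return buffer.getvalue()


# Hypothetical ratings for the example filename discussed above.
row = ImageObservation(
    page_type="product",
    filename="sp_18379847923.jpg",
    filename_description=1,
    alt_text_quality=1,
    alt_text_count=0,
    embedded_metadata_quality=2,
    embedded_metadata_count=3,
    structured_data_quality=1,
    structured_data_count=0,
    onsite_results_quality=2,
    onsite_results_count=4,
    offsite_results_quality=1,
    offsite_results_count=0,
)
print(to_csv([row]))
```

Validating the scale at entry time keeps out-of-range values from skewing the later descriptive statistics.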
Data Collection Methods
Participants: Participants will include a single institution, anonymized for the protection of its business. The sample of image records used in this study will be limited to image assets appearing on the organization’s website domain. Most data collection can take place on the organization’s website itself. Some procedures will take place on external sites, services, or programs.
Randomization of Sample: The sample of images used in this study can be randomized by extracting a site map of the organization of interest using xsitemap.com. After the site map is constructed, the list of URLs should be entered into a spreadsheet program and a record number assigned to each URL. From there, the researcher can use a randomizer program to select the order of pages to use in the study (e.g., Research Randomizer, available at http://www.randomizer.org/form.htm). This method will be used to take a random sample of pages from the organization of interest.
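The numbering and random selection described above can also be scripted. This is a minimal sketch and not the tooling the proposal names: the function name `randomize_pages`, the example URLs, and the fixed seed are all assumptions made for illustration.

```python
import random


def randomize_pages(urls, sample_size, seed=None):
    """Number each URL, then draw a random sample of (record_number, url) pairs."""
    # Record numbers start at 1, mirroring rows numbered in a spreadsheet.
    records = list(enumerate(urls, start=1))
    # A seeded generator makes the draw repeatable for later audit.
    rng = random.Random(seed)
    # random.sample draws without replacement, so no page is picked twice.
    return rng.sample(records, k=min(sample_size, len(records)))


# Hypothetical site map entries; a real list would come from the site map export.
site_map = [
    "https://example.com/",
    "https://example.com/products/chair",
    "https://example.com/products/table",
    "https://example.com/about",
]

for record_number, url in randomize_pages(site_map, sample_size=2, seed=42):
    print(record_number, url)
```

Recording the seed alongside the results lets another researcher reproduce the same sample of pages.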
Consent: All data collected in this study are publicly and freely available on the web.
Data Collection Methods
Obtaining Data on the Website
Navigate to the URL.
Right-click the image(s) and choose “Save As.”
Right-click the page and choose “View Source”; save as a .txt file.
Collect raw data from the image either by opening it in Photoshop and navigating to the Raw Data column or by using Phil Harvey’s ExifTool.
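Once the page source has been saved, the filename, file type, and alt text of each image can be pulled out of it programmatically. The sketch below uses only the Python standard library; the class name `ImageTagCollector` and the sample markup are assumptions for illustration, and the manual steps above remain the procedure the proposal describes.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse


class ImageTagCollector(HTMLParser):
    """Collect the src, file name, file type, and alt text of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        path = PurePosixPath(urlparse(src).path)
        self.images.append({
            "src": src,
            "filename": path.name,
            "file_type": path.suffix.lower().lstrip("."),
            # None means the alt attribute is missing altogether.
            "alt": attributes.get("alt"),
        })


# Hypothetical saved page source with one undescribed and one described image.
saved_source = """
<html><body>
  <img src="/images/sp_18379847923.jpg">
  <img src="/images/red-oak-dining-chair.png" alt="Red oak dining chair">
</body></html>
"""

collector = ImageTagCollector()
collector.feed(saved_source)
for image in collector.images:
    print(image)
```

The output gives the raw material for the filename, file type, and alt text questions; the quality ratings themselves still come from the human researcher.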
Obtaining Data through the Structured Data Linter
Navigate to the Linter website.
Enter the URL.
Screenshot the structured data results, or save them as a webpage.
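For a rough local cross-check of what the Linter reports, the saved page source can be scanned for microdata attributes. This is a simplified sketch, not a substitute for the Linter: it looks only at `itemtype` and `itemprop` attributes and ignores RDFa and JSON-LD. The class name `MicrodataCollector` and the sample markup are assumptions for illustration.

```python
from html.parser import HTMLParser


class MicrodataCollector(HTMLParser):
    """List every element carrying schema.org-style microdata attributes."""

    def __init__(self):
        super().__init__()
        self.items = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "itemtype" in attributes or "itemprop" in attributes:
            self.items.append({
                "tag": tag,
                "itemtype": attributes.get("itemtype"),
                "itemprop": attributes.get("itemprop"),
                # When present, the value usually sits in content, src, or href.
                "value": attributes.get("content")
                or attributes.get("src")
                or attributes.get("href"),
            })


# Hypothetical product markup with an image property.
page_source = """
<div itemscope itemtype="https://schema.org/Product">
  <span itemprop="name">Red oak dining chair</span>
  <img itemprop="image" src="/images/red-oak-dining-chair.png"
       alt="Red oak dining chair">
</div>
"""

microdata = MicrodataCollector()
microdata.feed(page_source)
for item in microdata.items:
    print(item)
```

Counting the collected entries that refer to images gives a starting number for the structured data variable.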
Obtaining Data through the W3C RDF Validator
Copy the raw XML data extracted earlier and paste it into the RDF Validator.
Select “Graph Only” in the options.
Parse the RDF.
Save or screenshot the graph.
Store it in a folder with the other data.
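To get a quick textual view of the same embedded data before graphing it, the XMP packet can be located in the image bytes and its RDF properties listed. This is a simplified sketch under stated assumptions: it scans for a single `<x:xmpmeta>` packet, so it will miss metadata stored in other forms (for example extended XMP), and the function names and sample bytes are hypothetical. ExifTool and the RDF Validator remain the tools the proposal names.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def extract_xmp_packet(image_bytes):
    """Return the embedded XMP packet as text, or None when none is found."""
    start = image_bytes.find(b"<x:xmpmeta")
    end = image_bytes.find(b"</x:xmpmeta>")
    if start == -1 or end == -1:
        return None
    end += len(b"</x:xmpmeta>")
    return image_bytes[start:end].decode("utf-8", errors="replace")


def list_rdf_properties(xmp_text):
    """Return (property, value) pairs from each rdf:Description element."""
    root = ET.fromstring(xmp_text)
    pairs = []
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        # Simple properties may be written as attributes ...
        for name, value in description.attrib.items():
            if not name.startswith(f"{{{RDF_NS}}}"):
                pairs.append((name, value))
        # ... or as child elements; nested text (e.g. rdf:Alt/rdf:li) is joined.
        for child in description:
            text = " ".join(t.strip() for t in child.itertext() if t.strip())
            pairs.append((child.tag, text))
    return pairs


# Hypothetical image bytes: binary data surrounding a small XMP packet.
sample_bytes = (
    b"\xff\xd8 binary image data "
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b'<dc:title><rdf:Alt><rdf:li xml:lang="x-default">Red oak dining chair'
    b"</rdf:li></rdf:Alt></dc:title>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>"
    b" more binary data \xff\xd9"
)

packet = extract_xmp_packet(sample_bytes)
for prop, value in list_rdf_properties(packet):
    print(prop, "=", value)
```

The number of properties found, and how meaningful their values are, feed the ratings for embedded metadata.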
Answer Research Questions
Systematically go through the collected data and input findings into a spreadsheet.
Data Analysis Methods
Descriptive Statistics
Bell curve: measures of central tendency using Likert scale data
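The descriptive statistics for one rated variable can be computed with Python's standard library. This is a minimal sketch: the function name `summarize_likert` and the sample ratings are assumptions for illustration. Because Likert ratings are ordinal, the median, mode, and frequency counts are reported alongside the mean.

```python
from collections import Counter
from statistics import mean, median, mode, pstdev


def summarize_likert(ratings):
    """Return central tendency and spread for a list of 1-5 Likert ratings."""
    if not ratings or any(r not in range(1, 6) for r in ratings):
        raise ValueError("ratings must be non-empty integers from 1 to 5")
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "mean": mean(ratings),
        "median": median(ratings),
        "mode": mode(ratings),
        "std_dev": pstdev(ratings),
        # Frequency of each scale point, including points nobody chose.
        "frequencies": {point: counts.get(point, 0) for point in range(1, 6)},
    }


# Hypothetical filename-descriptiveness ratings for ten sampled images.
filename_ratings = [1, 2, 2, 3, 1, 2, 4, 2, 1, 2]
summary = summarize_likert(filename_ratings)
for key, value in summary.items():
    print(key, value)
```

The frequency table is also the natural input for the charts in the graphical analysis.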
Bell curve image by Vierge Marie (own work) [public domain], via Wikimedia Commons: http://upload.wikimedia.org/wikipedia/commons/f/f6/Gaussian_Filter.svg
Data Analysis Methods
Graphical Analysis (Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter, utilizing URLs to display structured data around the images. Available at http://linter.structured-data.org
W3C RDF Validator graph visualization, utilizing the raw data markup extracted from the image. Available at http://www.w3.org/RDF/Validator
A summary analysis will be crafted from all of these data points to show what we are able to understand about an image versus what a machine or search engine is able to know about an image.
Structured Data Linter: shows all structured data tags around the images and in the page markup.
RDF Validator: visualization of embedded data for images and their subsequent relationships to other data.
Summary Report: complete picture of structured data, metadata, and analysis of the study.
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific organization stands in terms of structured data in the online environment and a methodology for other organizations looking to assess their structured data maturity in the digital space. These results will be used to create a roadmap for improving resource findability both on the web and within websites. Other organizations may also wish to reuse this methodology to assess their own current state of structured data. Future areas of research could include using metadata/RDF-driven search engines in conjunction with Vector Space Models to assess the findability of image records on the web and within websites.
References (Slides & Full Paper)
Algebraix Data Corporation. 0005. “Algebraix Data Launches Industry’s First Cost-Effective Automated Implementation of Schema.org.” Business Wire (English) 5.
Beall, Jeffrey. 2010. “How Google Uses Metadata to Improve Search Results.” Serials Librarian 59, no. 1: 40-53.
Breeding, Marshall. 2013. “Linked Data: The Next Big Wave or Another Tech Fad?” Computers in Libraries 33, no. 3: 20-22.
Cafarella, M. J., A. Y. Halevy, Y. Zhang, D. Z. Wang, and E. Wu. 2008. “Uncovering the Relational Web.” In Proceedings of the 11th International Workshop on the Web and Databases (Vancouver, BC, June 13, 2008). http://web.eecs.umich.edu/~michjc/papers/webtables_webdb08.pdf
Connaway, Lynn Silipigni, Timothy J. Dickey, and Marie L. Radford. 2011. “‘If It Is Too Inconvenient I’m Not Going After It’: Convenience as a Critical Factor in Information-Seeking Behaviors.” Library & Information Science Research (07408188) 33, no. 3: 179-190.
Cazier, Clay. 2014. “The Future of Exif Image Data.” PM Digital Marketing Blog. Last accessed November 20, 2014. http://www.pmdigital.com/blog/2014/04/future-exif-image-data
Diagram Center (Digital Image and Graphic Resources for Accessible Materials). 2014. “Content Model.” Last accessed November 23, 2014. http://diagramcenter.org/standards-and-practices/content-model.html
Google. 2014. “Image Publishing Guidelines.” Last accessed November 21, 2014. https://support.google.com/webmasters/answer/114016?hl=en
Holman, Lucy. 2011. “Millennial Students’ Mental Models of Search: Implications for Academic Librarians and Database Developers.” Journal of Academic Librarianship 37, no. 1: 19-27.
International Business Times. 0006. “Bing, Google and Yahoo Merge to Make Search Easier with Schema.org.” International Business Times, April.
IPTC (International Press Telecommunications Council). 2014. “Embedded Metadata Manifesto.” Last accessed November 20, 2014. http://www.embeddedmetadata.org/social-media-test-results.php (Embedded Metadata Manifesto 2014)
Kritzinger, W. T. 2013. “Search Engine Optimization and Pay-per-Click Marketing Strategies.” Journal of Organizational Computing and Electronic Commerce, no. 3: 273-86.
Lippincott, Joan K. 2005. “Net Generation Students and Libraries.” EDUCAUSE. Accessed November 19, 2014. http://www.educause.edu/research-and-publications/books/educating-net-generation/net-generation-students-and-libraries
Nakanishi, T. 2014. “Semantic Context-Dependent Weighting for Vector Space Model.” In Semantic Computing (ICSC), 2014 IEEE International Conference on, pp. 262-266, 16-18 June 2014. doi: 10.1109/ICSC.2014.49
Paz, Anita. 2013. “In Search of Meaning: The Written Word in the Age of Google.” Italian Journal of Library & Information Science 4, no. 2: 255-266.
Priebe, T., C. Schlager, and G. Pernul. 2004. “A Search Engine for RDF Metadata.” In Database and Expert Systems Applications, 2004. Proceedings. 15th International Workshop on, pp. 168-172. doi: 10.1109/DEXA.2004.1333468
Reicks, David. 2010. “Why Embedded Metadata Won’t Help Your SEO.” Last updated December 30, 2013. Last accessed November 23, 2014. http://www.controlledvocabulary.com/blog/embedded-metadata-wont-help-seo.html
Introduction
Most Searches Start with Google now (Holman 2011) (Lippincott 2013)
Search Algorithms Shaping what is most Easily Accessible (Connaway Dickey amp
Radford 2011) and they are subject to change frequently (Kritzinger 2013)
Search Algorithms Look for Your Structured Data and in the future and possibly
your embedded metadata (Cazier 2014) (Beall 2010)
Literature Review
Marshall Breeding (2013) assesses the limitations of the major search engine algorithms
ldquoBut even with the most sophisticated relevancy
algorithms index-based search and retrieval lacks the
ability to lead users to the potential related content
Semantic web technologies in conjunction with
repositories of open linked data promise to deliver
significant new capabilities in exploring and exploiting
information resources on the webrdquo
Literature Review
Semantic web is founded on good high-
quality structured data
Future technologies could potentially utilize
embedded metadata in search (Cazier 2014)
(Beall 2010) but there is authenticity
provenance and ldquobreadcrumbsrdquo value now
(Reicks 2013)
Literature Review
Most users donrsquot go past the first page of
search results (Paz 2013)
Structured Data Practices can help your
organization stay relevant (and findable) in
the age of information overload
Keeping it Search Engine Neutral is
advisable (Paz 2013)
TopicProposed Research
Methodology for establishing a baseline or benchmark of where an organization is at
in terms of structured data pertaining to image records that ultimately helps findability
on the web
By utilizing the proposed methodology for gathering this data for an organization
data-informed decisions can be made about structured data strategy going forward to
maintain relevancy on the web
Many structured data elements can affect online findability from file-naming
standards presence of alt text tags in html markup html markup itself embedded
metadata schemaorg markup and rich snippets text description at or nearby images
and more IEEE uses metadata or full-text for search (IEEE Xplore offers this--see
next slide)
Full Text Search amp Metadata Search
TopicProposed Research
It is also noteworthy that there are additional factors that affect findability on
the web that do not involve structured data but this research focuses solely on
structured data techniques within the control of individual organizations
All of these structured data techniques pertaining to image records will be
utilized in conjunction with the relevancy of onsite and offsite search results
Image search and information retrieval is a more difficult area than text search
and retrieval because accessibility to the image content is largely dependent on
side-car text (or metadata if you will) that describes the aboutness and
(hopefully) the context for the image record
Questions
Research Questions Addressed in this Study
1 What methods of search are available on the organizationrsquos online website
1 What is the file-naming structure for images on the website
1 What is the quality of search engine (onsite and offsite) results
1 What kinds of search results appear in Image Search when searching by the
organizationrsquos name and product description both with onsite search and offsite
search
Questions
Research Questions Addressed in this Study
5 What kinds of search results appear in Google Image Search when searching
by images taken from the organizationrsquos website
5 What kinds of search results come up when looking for specific products
(measure of structured data) through onsite search and offsite search
5 What are the results when looking for specific products on the offsite search
engine
Questions
Research Questions Addressed in this Study
8 What kinds of structured data are near or around the images on the organizationrsquos
website Alt Text Other
9 What file types appear on the organizationrsquos website (JPEG TIFF PNG)
9 What embedded metadata is available in images on the website
11 What does the XMPXMLRDF for these images look like and how robust is it
What does the graph look like
Variables
These measures are operationalized by utilization of likert scales applied by the human researcher For
example when rating the level of description for the file-name a research could conclude that the
filename sp_18379847923jpg is not very descriptive filename for a human let alone for a search engine
(unless of course this is a product sku) The researcher would then choose to assign it a low value on
descriptiveness on a 1-5 likert scale
Type of page
the image was
on
The image file naming
conventionfilename
Level of description for the
filename
Quality and number
of alt text tags
Quality and number
of embedded
metadata tags
Quality and number of structured
data tags pertaining to the images
Quality and number of search
results for onsite search
utilizing filename or alt text
Quality and number
of relevant search
results utilizing
offsite image search
Data Collection Methods
ParticipantsParticipants will include a single institution anonymized for the protection of their business The sample of image records utilized
in this study will be limited to image assets appearing on the organizationrsquos website domain Most data collection can take place
from the organizationrsquos website itself Some procedures will take place on external sites services or programs
Randomization of SampleThe sample of images utilized in this study can be randomized by extracting a site map of the particular organization of interest
using xsitemapcom After the site map is constructed the list of URLs should be inputted into a spreadsheet program and a record
number should be assigned to each URL From there the researcher can use a randomizer program to select the order of pages to
utilize in the study (ie Research Randomizer Available at httpwwwrandomizerorgformhtm) This method will be utilized for
taking a random sample of pages from the organization of interest
ConsentAll data collected in this study are publicly available and freely available on the web
Data Collection Methods
Obtaining Data on the website
Navigate to the URL
Right Click Image(s) and ldquoSave Asrdquo
Right Click Page and ldquoView Sourcerdquo Save as
txt file
Collect raw data from image by either
opening in Photoshop and Navigating to Raw
Data Column or utilize Phil Harveyrsquos
ExifTool
Obtaining data through the Structured Data Linter:
Navigate to the Linter website
Enter the URL
Screenshot the structured data results, or save as a webpage
Obtaining data through the W3C RDF Validator:
Copy the raw data XML extracted earlier and input it into the RDF Validator
Select "Graph Only" in the options
Parse RDF
Save the graph or screenshot the graph
Store in a folder with the other data
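The raw XML pasted into the validator is the XMP packet embedded in the image file. As a rough alternative to copying it out of Photoshop or ExifTool, the packet can often be located by searching the file's bytes for the `x:xmpmeta` element. This is a simplified sketch under that assumption, not a full XMP reader: it ignores extended XMP split across several segments, and the function name `extract_xmp_packet` and the sample bytes are hypothetical.

```python
def extract_xmp_packet(image_bytes):
    """Return the first embedded XMP packet as text, or None if absent."""
    start = image_bytes.find(b"<x:xmpmeta")
    if start == -1:
        return None
    end_marker = b"</x:xmpmeta>"
    end = image_bytes.find(end_marker, start)
    if end == -1:
        return None
    packet = image_bytes[start:end + len(end_marker)]
    # XMP is normally UTF-8; replace any undecodable bytes rather than fail.
    return packet.decode("utf-8", errors="replace")


# Hypothetical bytes standing in for a downloaded image file.
fake_image = (
    b"\xff\xd8 binary header "
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
    b"</x:xmpmeta>"
    b" binary image data \xff\xd9"
)

xmp = extract_xmp_packet(fake_image)
print(xmp)
```

For a real file, the bytes would come from `open(path, "rb").read()`; an image with no embedded metadata returns None, which is itself a finding for the study.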
Answer research questions:
Systematically go through the collected data and input findings into a spreadsheet
Data Analysis Methods
Descriptive Statistics: bell curve, measures of central tendency using Likert scale data
Bell curve image by Vierge Marie (own work) [public domain], via Wikimedia Commons:
http://upload.wikimedia.org/wikipedia/commons/f/f6/Gaussian_Filter.svg
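The descriptive statistics could be computed along the following lines. This is a minimal sketch using Python's standard library, with hypothetical ratings and a hypothetical function name `summarize_likert`. Because Likert ratings are ordinal, the median, mode, and frequency counts are usually the safer summaries; the mean is included only for reference.

```python
from collections import Counter
from statistics import mean, median, mode


def summarize_likert(ratings, scale=(1, 2, 3, 4, 5)):
    """Summarize 1-5 Likert ratings with basic descriptive statistics."""
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "median": median(ratings),
        "mode": mode(ratings),
        "mean": mean(ratings),
        # Include every scale point so that empty categories show as 0.
        "frequencies": {point: counts.get(point, 0) for point in scale},
    }


# Hypothetical descriptiveness ratings for eight image filenames.
filename_ratings = [1, 2, 2, 3, 1, 2, 4, 2]

summary = summarize_likert(filename_ratings)
for name, value in summary.items():
    print(name, value)
```

The frequency table is also the input a charting tool would need for the graphical analysis.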
Data Analysis Methods
Graphical Analysis (Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter, utilizing URLs to display structured data around the images.
Available at http://linter.structured-data.org
Summary analysis will be crafted utilizing all of these data points to show what we are able
to understand about an image versus what a machine or search engine is able to know about an image.
W3C RDF Validator: graph visualization utilizing the raw data markup extracted from the image.
Available at http://www.w3.org/RDF/Validator
Structured Data Linter: shows all structured data tags around the images and in the page markup
RDF Validator: visualization of embedded data for images and their subsequent relationships to other data
Summary Report: complete picture of structured data, metadata, and analysis of study
Expected Outcomes
The anticipated results of this project include a benchmark of where this specific
organization stands in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space. These results will be used to create a roadmap for improving resource
findability both on the web and within websites. Other organizations may also wish to
reuse this methodology to assess their own current state of structured data. Future
areas of research could include utilizing metadata/RDF-driven search engines in
conjunction with Vector Space Models to assess the findability of image records on the
web and within websites.
References (Slides & Full Paper)
Algebraix Data Corporation. 0005. Algebraix Data Launches Industry's First Cost-Effective Automated Implementation
of Schema.org. Business Wire (English), 5.
Beall, Jeffrey. 2010. How Google Uses Metadata to Improve Search Results. Serials Librarian 59, no. 1: 40-53.
Breeding, Marshall. 2013. Linked Data: The Next Big Wave or Another Tech Fad? Computers In Libraries 33, no. 3:
20-22.
Cafarella, M.J., Halevy, A.Y., Zhang, Y., Wang, D.Z., and Wu, E. Uncovering the relational Web. In Proceedings of the
11th International Workshop on the Web and Databases (Vancouver, BC, June 13, 2008).
http://web.eecs.umich.edu/~michjc/papers/webtables_webdb08.pdf
Connaway, Lynn Silipigni, Timothy J. Dickey, and Marie L. Radford. 2011. "If it is too inconvenient I'm not going after it:"
Convenience as a critical factor in information-seeking behaviors. Library & Information Science Research (07408188) 33,
no. 3: 179-190.
Cazier, Clay. 2014. PM Digital Marketing Blog. "The Future of Exif Image Data." Last accessed November 20, 2014.
http://www.pmdigital.com/blog/2014/04/future-exif-image-data
Diagram Center: Digital Image and Graphic Resources for Accessible Materials. 2014. "Content Model." Last accessed
November 23, 2014. http://diagramcenter.org/standards-and-practices/content-model.html
Google. 2014. "Image Publishing Guidelines." Last accessed November 21, 2014.
https://support.google.com/webmasters/answer/114016?hl=en
Holman, Lucy. 2011. Millennial Students' Mental Models of Search: Implications for Academic Librarians and Database
Developers. Journal Of Academic Librarianship 37, no. 1: 19-27.
International Business Times. 0006. Bing, Google and Yahoo merge to make search easier with schema.org.
International Business Times, April.
IPTC: International Press Telecommunications Council. 2014. "Embedded Metadata Manifesto." Last accessed November
20, 2014. http://www.embeddedmetadata.org/social-media-test-results.php (Embedded Metadata Manifesto 2014)
Kritzinger, W. T. Search Engine Optimization and Pay-per-Click Marketing Strategies. Journal of Organizational
Computing and Electronic Commerce, no. 3 (2013): 273-86.
Lippincott, Joan K. "Net Generation Students and Libraries." EDUCAUSE (2005), accessed November 19, 2014.
http://www.educause.edu/research-and-publications/books/educating-net-generation/net-generation-students-and-libraries
Nakanishi, T. Semantic Context-Dependent Weighting for Vector Space Model. Semantic Computing (ICSC), 2014
IEEE International Conference on, pp. 262-266, 16-18 June 2014. doi: 10.1109/ICSC.2014.49
Paz, Anita. 2013. In search of Meaning: The Written Word in the Age of Google. Italian Journal Of Library &
Information Science 4, no. 2: 255-266.
Priebe, T., Schlager, C., Pernul, G. A search engine for RDF metadata. Database and Expert Systems Applications,
2004. Proceedings. 15th International Workshop on, pp. 168-172, 2004. doi: 10.1109/DEXA.2004.1333468
Reicks, David. 2010. "Why Embedded Metadata Won't Help Your SEO." Last updated December 30, 2013. Last
accessed November 23, 2014. http://www.controlledvocabulary.com/blog/embedded-metadata-wont-help-seo.html
Literature Review
Most users donrsquot go past the first page of
search results (Paz 2013)
Structured Data Practices can help your
organization stay relevant (and findable) in
the age of information overload
Keeping it Search Engine Neutral is
advisable (Paz 2013)
TopicProposed Research
Methodology for establishing a baseline or benchmark of where an organization is at
in terms of structured data pertaining to image records that ultimately helps findability
on the web
By utilizing the proposed methodology for gathering this data for an organization
data-informed decisions can be made about structured data strategy going forward to
maintain relevancy on the web
Many structured data elements can affect online findability from file-naming
standards presence of alt text tags in html markup html markup itself embedded
metadata schemaorg markup and rich snippets text description at or nearby images
and more IEEE uses metadata or full-text for search (IEEE Xplore offers this--see
next slide)
Full Text Search amp Metadata Search
TopicProposed Research
It is also noteworthy that there are additional factors that affect findability on
the web that do not involve structured data but this research focuses solely on
structured data techniques within the control of individual organizations
All of these structured data techniques pertaining to image records will be
utilized in conjunction with the relevancy of onsite and offsite search results
Image search and information retrieval is a more difficult area than text search
and retrieval because accessibility to the image content is largely dependent on
side-car text (or metadata if you will) that describes the aboutness and
(hopefully) the context for the image record
Questions
Research Questions Addressed in this Study
1 What methods of search are available on the organizationrsquos online website
1 What is the file-naming structure for images on the website
1 What is the quality of search engine (onsite and offsite) results
1 What kinds of search results appear in Image Search when searching by the
organizationrsquos name and product description both with onsite search and offsite
search
Questions
Research Questions Addressed in this Study
5 What kinds of search results appear in Google Image Search when searching
by images taken from the organizationrsquos website
5 What kinds of search results come up when looking for specific products
(measure of structured data) through onsite search and offsite search
5 What are the results when looking for specific products on the offsite search
engine
Questions
Research Questions Addressed in this Study
8 What kinds of structured data are near or around the images on the organizationrsquos
website Alt Text Other
9 What file types appear on the organizationrsquos website (JPEG TIFF PNG)
9 What embedded metadata is available in images on the website
11 What does the XMPXMLRDF for these images look like and how robust is it
What does the graph look like
Variables
These measures are operationalized by utilization of likert scales applied by the human researcher For
example when rating the level of description for the file-name a research could conclude that the
filename sp_18379847923jpg is not very descriptive filename for a human let alone for a search engine
(unless of course this is a product sku) The researcher would then choose to assign it a low value on
descriptiveness on a 1-5 likert scale
Type of page
the image was
on
The image file naming
conventionfilename
Level of description for the
filename
Quality and number
of alt text tags
Quality and number
of embedded
metadata tags
Quality and number of structured
data tags pertaining to the images
Quality and number of search
results for onsite search
utilizing filename or alt text
Quality and number
of relevant search
results utilizing
offsite image search
Data Collection Methods
ParticipantsParticipants will include a single institution anonymized for the protection of their business The sample of image records utilized
in this study will be limited to image assets appearing on the organizationrsquos website domain Most data collection can take place
from the organizationrsquos website itself Some procedures will take place on external sites services or programs
Randomization of Sample
The sample of images utilized in this study can be randomized by extracting a site map of the organization of interest
using xsitemap.com. After the site map is constructed, the list of URLs should be entered into a spreadsheet program and a record
number assigned to each URL. From there, the researcher can use a randomizer program to select the order of pages to
use in the study (e.g., Research Randomizer, available at http://www.randomizer.org/form.htm). This method will be used to
take a random sample of pages from the organization of interest.
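The numbering and random selection described above can also be done with a short script instead of a spreadsheet and an online randomizer. This is only a sketch under that assumption: the function name, the example URLs, and the use of a fixed seed for repeatability are illustrative choices, not part of the proposal.

```python
import random


def number_and_sample(urls, sample_size, seed=None):
    """Assign a record number to each URL, then draw a random sample of pages.

    Returns a list of (record_number, url) pairs in randomized order.
    """
    # Record numbers start at 1, mirroring rows in a spreadsheet.
    records = list(enumerate(urls, start=1))
    # A seeded generator makes the draw repeatable for the study log.
    rng = random.Random(seed)
    return rng.sample(records, k=min(sample_size, len(records)))


# Hypothetical site-map URLs standing in for the exported list.
site_map = [f"https://example.com/page{i}" for i in range(1, 11)]
for number, url in number_and_sample(site_map, sample_size=3, seed=2014):
    print(number, url)
```

Recording the seed alongside the sample lets another researcher reproduce the same selection from the same site map.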
Consent
All data collected in this study are publicly and freely available on the web.
Data Collection Methods
Obtaining Data on the Website
Navigate to the URL
Right-click image(s) and "Save As"
Right-click page and "View Source"; save as .txt file
Collect raw data from the image by either opening it in Photoshop and navigating to the Raw Data column or using Phil Harvey's ExifTool
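Once the page source has been saved, the image filenames and alt text it contains can be pulled out programmatically for rating. The sketch below assumes the saved source is ordinary HTML and uses only Python's standard library; the class and function names are hypothetical, and it does not read embedded image metadata (that step still relies on Photoshop or ExifTool as described above).

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse


class ImageTagCollector(HTMLParser):
    """Collect the filename and alt text of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src") or ""
        self.images.append({
            # Keep only the last path segment, e.g. "sp_18379847923.jpg".
            "filename": basename(urlparse(src).path),
            # None means the tag has no alt attribute (or no value for it).
            "alt": attr.get("alt"),
        })


def collect_images(page_source):
    collector = ImageTagCollector()
    collector.feed(page_source)
    return collector.images


# Hypothetical saved page source.
source = '<img src="/images/sp_18379847923.jpg"><img src="/images/red-shoe.png" alt="Red running shoe">'
for image in collect_images(source):
    print(image["filename"], "|", image["alt"])
```

The resulting list gives the researcher one row per image to rate for filename descriptiveness and alt text quality.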
Obtaining Data through Structured Data Linter
Navigate to the Linter website
Enter URL
Screenshot the structured data results or save as webpage
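As a rough local cross-check on what the Linter reports, the saved page source can be scanned for schema.org microdata attributes. This sketch is an assumption-laden stand-in, not a description of how the Structured Data Linter works: it only tallies HTML microdata (itemtype and itemprop attributes) and ignores RDFa and JSON-LD, and its class and function names are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser


class MicrodataCounter(HTMLParser):
    """Tally microdata item types and property names found in a page."""

    def __init__(self):
        super().__init__()
        self.item_types = []
        self.properties = Counter()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if attr.get("itemtype"):
            self.item_types.append(attr["itemtype"])
        if attr.get("itemprop"):
            # itemprop may hold several space-separated property names.
            self.properties.update(attr["itemprop"].split())


def summarize_microdata(page_source):
    counter = MicrodataCounter()
    counter.feed(page_source)
    return counter.item_types, dict(counter.properties)


# Hypothetical product markup.
source = (
    '<div itemscope itemtype="http://schema.org/Product">'
    '<span itemprop="name">Red running shoe</span>'
    '<img itemprop="image" src="red-shoe.png" alt="Red running shoe">'
    '</div>'
)
print(summarize_microdata(source))
```

A count of this kind can feed the "quality and number of structured data tags" variable, with the Linter output remaining the reference for quality judgments.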
Obtaining Data through W3C RDF Validator
Copy the raw data XML extracted earlier and paste it into the RDF Validator
Select "Graph Only" in the options
Parse RDF
Save graph or screenshot graph
Store in folder with other data
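Alongside the validator graph, the extracted XMP packet can be inspected locally to count how many properties it carries, which gives one crude indicator of how robust the embedded metadata is. The sketch below assumes the raw data is a well-formed XMP packet in RDF/XML; the function name and the sample packet are hypothetical, and a full RDF parser would be needed for anything beyond listing top-level properties.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_xmp_properties(xmp_packet):
    """Return the qualified names of properties found in rdf:Description blocks."""
    root = ET.fromstring(xmp_packet)
    found = []
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        # Properties written as XML attributes (skipping rdf:about and similar).
        for name in description.attrib:
            if not name.startswith(f"{{{RDF_NS}}}"):
                found.append(name)
        # Properties written as child elements.
        for child in description:
            found.append(child.tag)
    return found


# Hypothetical packet with one attribute property and one element property.
packet = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about=""
      xmlns:dc="http://purl.org/dc/elements/1.1/"
      xmlns:photoshop="http://ns.adobe.com/photoshop/1.0/"
      photoshop:Credit="Example Co">
   <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Red running shoe</rdf:li></rdf:Alt></dc:title>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>"""
for name in list_xmp_properties(packet):
    print(name)
```

An image whose packet yields no properties at all would rate at the bottom of the embedded metadata scale.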
Answer Research Questions
Systematically go through the collected data and enter findings into a spreadsheet
Data Analysis Methods
Descriptive Statistics
Bell curve: measures of central tendency using Likert scale data
[Figure: bell curve. Image by Vierge Marie (own work) [public domain], via Wikimedia Commons, http://upload.wikimedia.org/wikipedia/commons/f/f6/Gaussian_Filter.svg]
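The central tendency of the Likert ratings can be summarized with a few standard-library calls. This is a sketch with hypothetical names and made-up example ratings. Because Likert ratings are ordinal, the median and mode are usually the safer summaries; the mean is included only for comparison.

```python
from collections import Counter
from statistics import mean, median, multimode


def summarize_likert(ratings):
    """Summarize a list of 1-5 Likert ratings for one variable."""
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        "median": median(ratings),
        "modes": multimode(ratings),
        "mean": mean(ratings),
        # Frequency of each scale point, including points nobody chose.
        "distribution": {score: counts.get(score, 0) for score in range(1, 6)},
    }


# Made-up ratings for filename descriptiveness across five sampled images.
print(summarize_likert([1, 2, 2, 3, 5]))
```

The per-score distribution is what a bar chart of the ratings would plot, so the same summary can drive the graphical analysis.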
Data Analysis Methods
Graphical Analysis (Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter, utilizing URLs to display structured data around the images. Available at http://linter.structured-data.org
Summary analysis will be crafted utilizing all of these data points to show what we are able to understand about an image versus what a machine or search engine is able to know about an image.
W3C RDF Validator graph visualization, utilizing the raw data markup extracted from the image. Available at http://www.w3.org/RDF/Validator
Structured Data Linter: shows all structured data tags around the images and in the page markup
RDF Validator: visualization of embedded data for images and their subsequent relationships to other data
Summary Report: complete picture of structured data, metadata, and analysis of study
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific
organization stands in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space. These results will be used to create a roadmap for improving resource
findability both on the web and within websites. Other organizations may also
reuse this methodology to assess their own current state of structured data. Future
areas of research could include utilizing metadata/RDF-driven search engines in
conjunction with Vector Space Models to assess findability of image records on the
web and within websites.
Topic/Proposed Research
Methodology for establishing a baseline or benchmark of where an organization stands
in terms of structured data pertaining to image records, which ultimately helps findability
on the web.
By utilizing the proposed methodology for gathering this data for an organization,
data-informed decisions can be made about structured data strategy going forward to
maintain relevancy on the web.
Many structured data elements can affect online findability: file-naming
standards, presence of alt text tags in HTML markup, the HTML markup itself, embedded
metadata, schema.org markup and rich snippets, text descriptions at or near images,
and more. IEEE uses metadata or full text for search (IEEE Xplore offers this; see
next slide).
Full Text Search & Metadata Search
Topic/Proposed Research
It is also noteworthy that there are additional factors that affect findability on
the web that do not involve structured data, but this research focuses solely on
structured data techniques within the control of individual organizations.
All of these structured data techniques pertaining to image records will be
utilized in conjunction with the relevancy of onsite and offsite search results.
Image search and information retrieval is a more difficult area than text search
and retrieval because accessibility to the image content is largely dependent on
side-car text (or metadata, if you will) that describes the aboutness and
(hopefully) the context for the image record.
Questions
Research Questions Addressed in this Study
1. What methods of search are available on the organization's website?
2. What is the file-naming structure for images on the website?
3. What is the quality of search engine (onsite and offsite) results?
4. What kinds of search results appear in Image Search when searching by the organization's name and product description, both with onsite search and offsite search?
TopicProposed Research
It is also noteworthy that there are additional factors that affect findability on
the web that do not involve structured data but this research focuses solely on
structured data techniques within the control of individual organizations
All of these structured data techniques pertaining to image records will be
utilized in conjunction with the relevancy of onsite and offsite search results
Image search and information retrieval is a more difficult area than text search
and retrieval because accessibility to the image content is largely dependent on
side-car text (or metadata if you will) that describes the aboutness and
(hopefully) the context for the image record
Questions
Research Questions Addressed in this Study
1 What methods of search are available on the organizationrsquos online website
1 What is the file-naming structure for images on the website
1 What is the quality of search engine (onsite and offsite) results
1 What kinds of search results appear in Image Search when searching by the
organizationrsquos name and product description both with onsite search and offsite
search
Questions
Research Questions Addressed in this Study
5 What kinds of search results appear in Google Image Search when searching
by images taken from the organizationrsquos website
5 What kinds of search results come up when looking for specific products
(measure of structured data) through onsite search and offsite search
5 What are the results when looking for specific products on the offsite search
engine
Questions
Research Questions Addressed in this Study
8 What kinds of structured data are near or around the images on the organizationrsquos
website Alt Text Other
9 What file types appear on the organizationrsquos website (JPEG TIFF PNG)
9 What embedded metadata is available in images on the website
11 What does the XMPXMLRDF for these images look like and how robust is it
What does the graph look like
Variables
These measures are operationalized by utilization of likert scales applied by the human researcher For
example when rating the level of description for the file-name a research could conclude that the
filename sp_18379847923jpg is not very descriptive filename for a human let alone for a search engine
(unless of course this is a product sku) The researcher would then choose to assign it a low value on
descriptiveness on a 1-5 likert scale
Type of page
the image was
on
The image file naming
conventionfilename
Level of description for the
filename
Quality and number
of alt text tags
Quality and number
of embedded
metadata tags
Quality and number of structured
data tags pertaining to the images
Quality and number of search
results for onsite search
utilizing filename or alt text
Quality and number
of relevant search
results utilizing
offsite image search
Data Collection Methods
ParticipantsParticipants will include a single institution anonymized for the protection of their business The sample of image records utilized
in this study will be limited to image assets appearing on the organizationrsquos website domain Most data collection can take place
from the organizationrsquos website itself Some procedures will take place on external sites services or programs
Randomization of SampleThe sample of images utilized in this study can be randomized by extracting a site map of the particular organization of interest
using xsitemapcom After the site map is constructed the list of URLs should be inputted into a spreadsheet program and a record
number should be assigned to each URL From there the researcher can use a randomizer program to select the order of pages to
utilize in the study (ie Research Randomizer Available at httpwwwrandomizerorgformhtm) This method will be utilized for
taking a random sample of pages from the organization of interest
ConsentAll data collected in this study are publicly available and freely available on the web
Data Collection Methods
Obtaining Data on the website
Navigate to the URL
Right Click Image(s) and ldquoSave Asrdquo
Right Click Page and ldquoView Sourcerdquo Save as
txt file
Collect raw data from image by either
opening in Photoshop and Navigating to Raw
Data Column or utilize Phil Harveyrsquos
ExifTool
Obtaining Data through Structured Data Linter
Navigate to the Linter website
Enter URL
Screenshot Structured Data Results -or- save
as webpage
Obtaining Data through W3C RDF validator
Copy raw data xml extracted earlier and input
into RDF Validator
Select Graph Only on the Options
Parse RDF
Save Graph or Screenshot Graph
Store in Folder with other Data
Answer Research Questions
Systematically go through the collected data
and input findings into spreadsheet
Data Analysis Methods
Descriptive Statisticso Bell Curve - measures
towards a central tendency
using likert scale data
Bell Curve Image By Vierge Marie
(Own work) [Public domain] via
Wikimedia Commons
httpuploadwikimediaorgwikipe
diacommonsff6Gaussian_Filter
svg
Data Analysis Methods
Graphical Analysis
(Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter: utilizes URLs to display structured data around the images. Available at http://linter.structured-data.org
W3C RDF Validator: graph visualization utilizing the raw data markup extracted from the image. Available at http://www.w3.org/RDF/Validator
A summary analysis will be crafted using all of these data points to show what we are able to understand about an image versus what a machine or search engine is able to know about an image.
Structured Data Linter: shows all structured data tags around the images and in the page markup.
RDF Validator: visualization of embedded data for images and their subsequent relationships to other data.
Summary Report: complete picture of structured data, metadata, and analysis of the study.
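Alongside the graph produced by the RDF Validator, the same embedded markup can be listed as plain property and value pairs, which gives a rough sense of how much a machine can read from an image. This is a simplified sketch, assuming the XMP packet has already been extracted as RDF/XML text. It only reads rdf:Description elements and is not a full RDF parser; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def list_xmp_properties(xmp_xml):
    """Return (property, value) pairs found in the rdf:Description elements."""
    root = ET.fromstring(xmp_xml)
    pairs = []
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        # Simple properties may be written as attributes on rdf:Description.
        for name, value in description.attrib.items():
            if not name.startswith(f"{{{RDF_NS}}}"):
                pairs.append((name, value))
        # Other properties are child elements; collect their text values,
        # including text nested in rdf:Alt, rdf:Bag, or rdf:Seq containers.
        for child in description:
            values = [t.strip() for t in child.itertext() if t.strip()]
            pairs.append((child.tag, "; ".join(values)))
    return pairs
```

Property names come back in ElementTree's {namespace}name form, so a Dublin Core title appears as {http://purl.org/dc/elements/1.1/}title. An image with no pairs at all has nothing for the validator to graph.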
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific
organization stands in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space. These results will be used to create a roadmap for improving resource
findability both on the web and within websites. Other organizations may also aspire to
reuse this methodology for assessing their own current state of structured data. Future
areas of research could include utilizing metadata/RDF-driven search engines in
conjunction with Vector Space Models to assess findability of image records on the
web and within websites.
References (Slides & Full Paper)
Algebraix Data Corporation. 0005. "Algebraix Data Launches Industry's First Cost-Effective Automated Implementation of Schema.org." Business Wire (English), 5.
Beall, Jeffrey. 2010. "How Google Uses Metadata to Improve Search Results." Serials Librarian 59, no. 1: 40-53.
Breeding, Marshall. 2013. "Linked Data: The Next Big Wave or Another Tech Fad?" Computers in Libraries 33, no. 3: 20-22.
Cafarella, M.J., Halevy, A.Y., Zhang, Y., Wang, D.Z., and Wu, E. "Uncovering the Relational Web." In Proceedings of the 11th International Workshop on the Web and Databases (Vancouver, B.C., June 13, 2008). http://web.eecs.umich.edu/~michjc/papers/webtables_webdb08.pdf
Connaway, Lynn Silipigni, Timothy J. Dickey, and Marie L. Radford. 2011. "'If it is too inconvenient I'm not going after it': Convenience as a critical factor in information-seeking behaviors." Library & Information Science Research (07408188) 33, no. 3: 179-190.
Cazier, Clay. 2014. PM Digital Marketing Blog. "The Future of Exif Image Data." Last accessed November 20, 2014. http://www.pmdigital.com/blog/2014/04/future-exif-image-data
Diagram Center: Digital Image and Graphic Resources for Accessible Materials. 2014. "Content Model." Last accessed November 23, 2014. http://diagramcenter.org/standards-and-practices/content-model.html
Google. 2014. "Image Publishing Guidelines." Last accessed November 21, 2014. https://support.google.com/webmasters/answer/114016?hl=en
Holman, Lucy. 2011. "Millennial Students' Mental Models of Search: Implications for Academic Librarians and Database Developers." Journal of Academic Librarianship 37, no. 1: 19-27.
International Business Times. 0006. "Bing, Google and Yahoo merge to make search easier with schema.org." International Business Times, April.
IPTC: International Press Telecommunications Council. 2014. "Embedded Metadata Manifesto." Last accessed November 20, 2014. http://www.embeddedmetadata.org/social-media-test-results.php (Embedded Metadata Manifesto 2014)
Kritzinger, W. T. "Search Engine Optimization and Pay-per-Click Marketing Strategies." Journal of Organizational Computing and Electronic Commerce, no. 3 (2013): 273-86.
Lippincott, Joan K. "Net Generation Students and Libraries." EDUCAUSE (2005). Accessed November 19, 2014. http://www.educause.edu/research-and-publications/books/educating-net-generation/net-generation-students-and-libraries
Nakanishi, T. "Semantic Context-Dependent Weighting for Vector Space Model." Semantic Computing (ICSC), 2014 IEEE International Conference on, pp. 262-266, 16-18 June 2014. doi: 10.1109/ICSC.2014.49
Paz, Anita. 2013. "In Search of Meaning: The Written Word in the Age of Google." Italian Journal of Library & Information Science 4, no. 2: 255-266.
Priebe, T., Schlager, C., and Pernul, G. "A search engine for RDF metadata." Database and Expert Systems Applications, 2004. Proceedings. 15th International Workshop on, pp. 168-172, 2004. doi: 10.1109/DEXA.2004.1333468
Reicks, David. 2010. "Why Embedded Metadata Won't Help Your SEO." Last updated December 30, 2013. Last accessed November 23, 2014. http://www.controlledvocabulary.com/blog/embedded-metadata-wont-help-seo.html
Questions
Research Questions Addressed in this Study
1. What methods of search are available on the organization's website?
2. What is the file-naming structure for images on the website?
3. What is the quality of search engine (onsite and offsite) results?
4. What kinds of search results appear in Image Search when searching by the organization's name and product description, both with onsite search and offsite search?
5. What kinds of search results appear in Google Image Search when searching by images taken from the organization's website?
6. What kinds of search results come up when looking for specific products (measure of structured data) through onsite search and offsite search?
7. What are the results when looking for specific products on the offsite search engine?
8. What kinds of structured data are near or around the images on the organization's website (alt text, other)?
9. What file types appear on the organization's website (JPEG, TIFF, PNG)?
10. What embedded metadata is available in images on the website?
11. What does the XMP/XML/RDF for these images look like, and how robust is it? What does the graph look like?
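The questions about file naming, file types, and alt text can be answered from the saved page source. The sketch below uses only the Python standard library and assumes the source has been saved as text; the class and function names are hypothetical, and it looks only at img tags, not at CSS background images or other structured data.

```python
from html.parser import HTMLParser
from pathlib import PurePosixPath
from urllib.parse import urlparse


class ImageTagCollector(HTMLParser):
    """Collect filename, file type, and alt text for every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        # Drop any query string, then split the path into name and extension.
        path = PurePosixPath(urlparse(attributes.get("src") or "").path)
        self.images.append({
            "filename": path.name,
            "file_type": path.suffix.lstrip(".").lower(),
            "alt": attributes.get("alt"),  # None means no alt attribute at all
        })


def collect_images(page_source):
    """Parse saved page source and return one dictionary per image."""
    collector = ImageTagCollector()
    collector.feed(page_source)
    return collector.images
```

Keeping a missing alt attribute (None) distinct from an empty one ("") matters when rating alt text quality, since the two cases describe different markup practices.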
Variables
These measures are operationalized through Likert scales applied by the human researcher. For
example, when rating the level of description for the filename, a researcher could conclude that the
filename sp_18379847923.jpg is not very descriptive for a human, let alone for a search engine
(unless, of course, this is a product SKU). The researcher would then assign it a low value for
descriptiveness on a 1-5 Likert scale.
Type of page the image was on
The image file-naming convention/filename
Level of description for the filename
Quality and number of alt text tags
Quality and number of embedded metadata tags
Quality and number of structured data tags pertaining to the images
Quality and number of search results for onsite search utilizing filename or alt text
Quality and number of relevant search results utilizing offsite image search
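One way to keep the ratings consistent is to record each image as a fixed row and reject values outside the 1-5 scale before they reach the spreadsheet. The sketch below is illustrative only: the field names are assumptions that cover a subset of the variables listed above, and the CSV layout is not prescribed by the study.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

# Fields that hold 1-5 Likert ratings (names are illustrative).
RATING_FIELDS = (
    "filename_descriptiveness",
    "alt_text_quality",
    "embedded_metadata_quality",
    "structured_data_quality",
)


@dataclass
class ImageRating:
    """One spreadsheet row for one image."""
    page_type: str
    filename: str
    filename_descriptiveness: int
    alt_text_quality: int
    embedded_metadata_quality: int
    structured_data_quality: int

    def __post_init__(self):
        for name in RATING_FIELDS:
            if not 1 <= getattr(self, name) <= 5:
                raise ValueError(f"{name} must be between 1 and 5")


def to_csv(ratings):
    """Serialize rated images as CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ImageRating)])
    writer.writeheader()
    for rating in ratings:
        writer.writerow(asdict(rating))
    return buffer.getvalue()
```

For the filename example above, a researcher might record ImageRating("product", "sp_18379847923.jpg", 1, 2, 1, 2), where every value other than the filename is a made-up rating.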
Data Collection Methods
Participants: Participants will include a single institution, anonymized for the protection of its business. The sample of image records used
in this study will be limited to image assets appearing on the organization's website domain. Most data collection can take place
from the organization's website itself. Some procedures will take place on external sites, services, or programs.
Randomization of Sample: The sample of images used in this study can be randomized by extracting a site map of the organization of interest
using xsitemap.com. After the site map is constructed, the list of URLs should be entered into a spreadsheet program and a record
number should be assigned to each URL. From there, the researcher can use a randomizer program to select the order of pages to
use in the study (e.g., Research Randomizer, available at http://www.randomizer.org/form.htm). This method will be used for
taking a random sample of pages from the organization of interest.
Consent: All data collected in this study are publicly and freely available on the web.
Data Collection Methods
Obtaining Data on the website
Navigate to the URL
Right-click the image(s) and "Save As"
Right-click the page and "View Source"; save as a .txt file
Collect raw data from the image by either opening it in Photoshop and navigating to the Raw Data Column or using Phil Harvey's ExifTool
Obtaining Data through Structured Data Linter
Navigate to the Linter website
Enter URL
Screenshot the structured data results, or save as a webpage
Obtaining Data through W3C RDF Validator
Copy the raw data XML extracted earlier and input it into the RDF Validator
Select Graph Only in the options
Parse RDF
Save or screenshot the graph
Store in folder with other data
Answer Research Questions
Systematically go through the collected data and input findings into spreadsheet
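Part of the manual review of the saved page source can be illustrated with a short script. This is a minimal sketch, assuming the page source has been saved as text. The class name `ImageCollector`, the helper `collect_images`, and the sample markup (including the second filename) are hypothetical. It uses only Python's standard library to list each image's filename and alt text, which the researcher would still need to rate by hand.

```python
from html.parser import HTMLParser
from posixpath import basename
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the filename and alt text of every <img> tag in a page."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        self.images.append({
            # Keep only the last path segment, e.g. "sp_18379847923.jpg".
            "filename": basename(urlparse(src).path),
            # A missing alt attribute is recorded as an empty string.
            "alt": attributes.get("alt") or "",
        })


def collect_images(page_source):
    """Return one dict per <img> tag found in the given HTML text."""
    parser = ImageCollector()
    parser.feed(page_source)
    parser.close()
    return parser.images


# Hypothetical page source, for illustration only.
sample_source = """
<html><body>
  <img src="/media/sp_18379847923.jpg">
  <img src="/media/red-oak-dining-chair.jpg" alt="Red oak dining chair">
</body></html>
"""

for row in collect_images(sample_source):
    print(row["filename"], "|", row["alt"] or "(no alt text)")
```

Each resulting row could be pasted into the spreadsheet next to the page's record number before the Likert ratings are assigned.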
Data Analysis Methods
Descriptive Statistics
Bell curve: measures of central tendency using Likert scale data
Bell curve image: By Vierge Marie (Own work) [Public domain], via Wikimedia Commons, http://upload.wikimedia.org/wikipedia/commons/f/f6/Gaussian_Filter.svg
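The descriptive statistics mentioned above can be computed directly from the Likert ratings recorded in the spreadsheet. This is a minimal sketch, assuming the ratings for one criterion are available as a list of integers from 1 to 5. The function name `describe_likert` and the sample ratings are hypothetical. Because Likert data are ordinal, the median and mode are often easier to defend than the mean, so all three are reported.

```python
from collections import Counter
from statistics import mean, median, multimode


def describe_likert(ratings, scale=(1, 2, 3, 4, 5)):
    """Summarize Likert ratings: frequency per scale point plus central tendency."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(rating not in scale for rating in ratings):
        raise ValueError("rating outside the Likert scale")
    counts = Counter(ratings)
    return {
        "n": len(ratings),
        # Include scale points that were never chosen, with a count of 0.
        "frequencies": {point: counts.get(point, 0) for point in scale},
        "mean": mean(ratings),
        "median": median(ratings),
        "modes": multimode(ratings),  # list, since ties are possible
    }


# Hypothetical filename-descriptiveness ratings for ten sampled images.
filename_ratings = [1, 2, 2, 3, 1, 2, 4, 2, 1, 3]

summary = describe_likert(filename_ratings)
for key, value in summary.items():
    print(key, value)
```

The frequency table is what a bar chart of the ratings would plot, which connects this step to the graphical analysis.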
Data Analysis Methods
Graphical Analysis (Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter: utilizing URLs to display structured data around the images. Available at http://linter.structured-data.org
Summary analysis will be crafted utilizing all of these data points to show what we are able to understand about an image versus what a machine or search engine is able to know about an image.
W3C RDF Validator: graph visualization utilizing the raw data markup extracted from the image. Available at http://www.w3.org/RDF/Validator
Structured Data Linter: Shows all structured data tags around the images and in the page markup
RDF Validator: Visualization of embedded data for images and their subsequent relationships to other data
Summary Report: Complete picture of structured data, metadata, and analysis of study
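To give a rough sense of what the Structured Data Linter reports about a page, the following sketch tallies microdata attributes in saved page source. It is not the Linter's own method and covers only HTML microdata (`itemtype` and `itemprop`), not RDFa or JSON-LD. The class name `MicrodataTally`, the helper `tally_microdata`, and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class MicrodataTally(HTMLParser):
    """Record every itemtype and itemprop attribute found in the markup."""

    def __init__(self):
        super().__init__()
        self.item_types = []
        self.properties = []  # (tag, itemprop) pairs in document order

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("itemtype"):
            self.item_types.append(attributes["itemtype"])
        if attributes.get("itemprop"):
            self.properties.append((tag, attributes["itemprop"]))


def tally_microdata(page_source):
    """Summarize microdata in HTML text, singling out properties on <img> tags."""
    parser = MicrodataTally()
    parser.feed(page_source)
    parser.close()
    return {
        "item_types": parser.item_types,
        "properties": parser.properties,
        "image_properties": [
            prop for tag, prop in parser.properties if tag == "img"
        ],
    }


# Hypothetical product markup, for illustration only.
sample_markup = """
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Red oak dining chair</span>
  <img itemprop="image" src="/media/red-oak-dining-chair.jpg"
       alt="Red oak dining chair">
</div>
"""

print(tally_microdata(sample_markup))
```

An empty `image_properties` list for a page would suggest that its images carry no microdata of their own, which is one of the gaps the summary report is meant to surface.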
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific organization stands in terms of structured data in the online environment and a methodology for other organizations looking to assess their structured data maturity in the digital space. These results will be used to create a roadmap for improving resource findability both on the web and within websites. Other organizations may also reuse this methodology to assess their own current state of structured data. Future areas of research could include utilizing metadata/RDF-driven search engines in conjunction with Vector Space Models to assess findability of image records on the web and within websites.
References (Slides & Full Paper)
Algebraix Data Corporation. 0005. "Algebraix Data Launches Industry's First Cost-Effective Automated Implementation of Schema.org." Business Wire (English), 5.
Beall, Jeffrey. 2010. "How Google Uses Metadata to Improve Search Results." Serials Librarian 59, no. 1: 40-53.
Breeding, Marshall. 2013. "Linked Data: The Next Big Wave or Another Tech Fad?" Computers in Libraries 33, no. 3: 20-22.
Cafarella, M.J., Halevy, A.Y., Zhang, Y., Wang, D.Z., and Wu, E. "Uncovering the Relational Web." In Proceedings of the 11th International Workshop on the Web and Databases (Vancouver, BC, June 13, 2008). http://web.eecs.umich.edu/~michjc/papers/webtables_webdb08.pdf
Connaway, Lynn Sillipigni, Timothy J. Dickey, and Marie L. Radford. 2011. "'If it is too inconvenient I'm not going after it': Convenience as a critical factor in information-seeking behaviors." Library & Information Science Research (07408188) 33, no. 3: 179-190.
Cazier, Clay. 2014. PM Digital Marketing Blog. "The Future of Exif Image Data." Last accessed November 20, 2014. http://www.pmdigital.com/blog/2014/04/future-exif-image-data
Diagram Center: Digital Image and Graphic Resources for Accessible Materials. 2014. "Content Model." Last accessed November 23, 2014. http://diagramcenter.org/standards-and-practices/content-model.html
Google. 2014. "Image Publishing Guidelines." Last accessed November 21, 2014. https://support.google.com/webmasters/answer/114016?hl=en
Holman, Lucy. 2011. "Millennial Students' Mental Models of Search: Implications for Academic Librarians and Database Developers." Journal of Academic Librarianship 37, no. 1: 19-27.
International Business Times. 0006. "Bing, Google and Yahoo merge to make search easier with schema.org." International Business Times, April.
IPTC: International Press Telecommunications Council. 2014. "Embedded Metadata Manifesto." Last accessed November 20, 2014. http://www.embeddedmetadata.org/social-media-test-results.php (Embedded Metadata Manifesto 2014)
Kritzinger, W. T. "Search Engine Optimization and Pay-per-Click Marketing Strategies." Journal of Organizational Computing and Electronic Commerce, no. 3 (2013): 273-86.
Lippincott, Joan K. "Net Generation Students and Libraries." EDUCAUSE (2005). Accessed November 19, 2014. http://www.educause.edu/research-and-publications/books/educating-net-generation/net-generation-students-and-libraries
Nakanishi, T. "Semantic Context-Dependent Weighting for Vector Space Model." Semantic Computing (ICSC), 2014 IEEE International Conference on, pp. 262-266, 16-18 June 2014. doi: 10.1109/ICSC.2014.49
Paz, Anita. 2013. "In search of Meaning: The Written Word in the Age of Google." Italian Journal of Library & Information Science 4, no. 2: 255-266.
Priebe, T., Schlager, C., Pernul, G. "A search engine for RDF metadata." Database and Expert Systems Applications, 2004. Proceedings. 15th International Workshop on, pp. 168-172, 2004. doi: 10.1109/DEXA.2004.1333468
Reicks, David. 2010. "Why Embedded Metadata Won't Help Your SEO." Last updated December 30, 2013. Last accessed November 23, 2014. http://www.controlledvocabulary.com/blog/embedded-metadata-wont-help-seo.html
Data Collection Methods
ParticipantsParticipants will include a single institution anonymized for the protection of their business The sample of image records utilized
in this study will be limited to image assets appearing on the organizationrsquos website domain Most data collection can take place
from the organizationrsquos website itself Some procedures will take place on external sites services or programs
Randomization of SampleThe sample of images utilized in this study can be randomized by extracting a site map of the particular organization of interest
using xsitemapcom After the site map is constructed the list of URLs should be inputted into a spreadsheet program and a record
number should be assigned to each URL From there the researcher can use a randomizer program to select the order of pages to
utilize in the study (ie Research Randomizer Available at httpwwwrandomizerorgformhtm) This method will be utilized for
taking a random sample of pages from the organization of interest
ConsentAll data collected in this study are publicly available and freely available on the web
Data Collection Methods
Obtaining Data on the website
Navigate to the URL
Right Click Image(s) and ldquoSave Asrdquo
Right Click Page and ldquoView Sourcerdquo Save as
txt file
Collect raw data from image by either
opening in Photoshop and Navigating to Raw
Data Column or utilize Phil Harveyrsquos
ExifTool
Obtaining Data through Structured Data Linter
Navigate to the Linter website
Enter URL
Screenshot Structured Data Results -or- save
as webpage
Obtaining Data through W3C RDF validator
Copy raw data xml extracted earlier and input
into RDF Validator
Select Graph Only on the Options
Parse RDF
Save Graph or Screenshot Graph
Store in Folder with other Data
Answer Research Questions
Systematically go through the collected data
and input findings into spreadsheet
Data Analysis Methods
Descriptive Statisticso Bell Curve - measures
towards a central tendency
using likert scale data
Bell Curve Image By Vierge Marie
(Own work) [Public domain] via
Wikimedia Commons
httpuploadwikimediaorgwikipe
diacommonsff6Gaussian_Filter
svg
Data Analysis Methods
Graphical Analysis
(Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter
utilizing URLs to display
structured data around the images
Available at
httplinterstructured-dataorg
Summary analysis will be
crafted utilizing all of these data
points to show what we are able
to understand about an image
versus what a machine or search
engine is able to know about an
image
W3C RDF Validator Graph
Visualization utilizing the raw
data markup extracted from the
image
Available at
httpwwww3orgRDFValidator
Structured Data Linter
Shows all
structured Data
Tags around the
images and in
the page markup
RDF Validator
Visualization of
embedded data
for images and
their subsequent
relationships to
other data
Summary Report
Complete Picture of Structured
Data Metadata and Analysis
of Study
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific
organization is at in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space These results will be used to create a roadmap for improving resource
findability both on the web and within websites Other organizations may also aspire to
reuse this methodology for assessing their own current state of structured data Future
areas of research could include utilizing metadataRDF-driven search engines in
conjuncture with Vector Space Models to assess findability of image records on the
web and within websites
References (Slides amp Full Paper)
Algebraix Data Corporation 0005 Algebraix Data Launches Industryrsquos First Cost-Effective Automated Implementation
of Schemaorg Business Wire (English) 5
Beall Jeffrey 2010 How Google Uses Metadata to Improve Search Results Serials Librarian 59 no 1 40-53
Breeding Marshall 2013 Linked Data The Next Big Wave or Another Tech Fad Computers In Libraries 33 no 3
20-22
Cafarella MJ Halevy AY Zhang Y Wang DZ and Wu E Uncovering the relational Web In Proceedings of the
11th International Workshop on the Web and Databases (Vancouver BC June 13 2008)
httpwebeecsumichedu~michjcpaperswebtables_webdb08pdf
Connaway Lynn Sillipigni Timothy J Dickey and Marie L Radford 2011 ldquoIf it is too inconvenient Im not going after itrdquo
Convenience as
a critical factor in information-seeking behaviors Library amp Information Science Research (07408188) 33 no 3 179-190
References (Slides amp Full Paper)
Cazier Clay 2014 PM Digital Marketing Blog ldquoThe Future of Exif Image Datardquo Last accessed November 20 2014
httpwwwpmdigitalcomblog201404future-exif-image-data
Diagram Center Digital Image and Graphic Resources for Accessible Materials 2014 ldquoContent Modelrdquo Last Accessed
November 23 2014 httpdiagramcenterorgstandards-and-practicescontent-modelhtml
Google 2014 ldquoImage Publishing Guidelinesrdquo Last accessed November 21 2014
httpssupportgooglecomwebmastersanswer114016hl=en
Holman Lucy 2011 Millennial Students Mental Models of Search Implications for Academic Librarians and Database
Developers Journal Of Academic Librarianship 37 no 1 19-27
References (Slides amp Full Paper)
International Business Times 0006 BingGoogle and Yahoo merge to make search easier with schemaorg
International Business Times April
IPTC International Press Telecommunications Council 2014 ldquoEmbedded Metadata Manifestordquo Last accessed November
20 2014 httpwwwembeddedmetadataorgsocial-media-test-resultsphp (Embedded Metadata Manifesto 2014)
Kritzinger W T Search Engine Optimization and Pay-per-Click Marketing Strategies Journal of Organizational
Computing and Electronic Commerce no 3 (2013) 273-86
Lippincott Joan K ldquoNet Generation Students and Librariesrdquo EDUCAUSE (2005) accessed November 19 2014
httpwwweducauseeduresearch-and-publicationsbookseducating-net-generationnet-generation-students-and-libraries
References (Slides amp Full Paper)
Nakanishi T Semantic Context-Dependent Weighting for Vector Space Model Semantic Computing (ICSC) 2014
IEEE International Conference on vol no pp262266 16-18 June 2014 doi 101109ICSC201449
Paz Anita 2013 In search of Meaning The Written Word in the Age of Google Italian Journal Of Library amp
Information Science 4 no 2 255-266
Priebe T Schlager C Pernul G A search engine for RDF metadata Database and Expert Systems Applications
2004 Proceedings 15th International Workshop on vol no pp168172 2004 doi 101109DEXA20041333468
Reicks David 2010 ldquoWhy Embedded Metadata Wonrsquot Help Your SEOrdquo Last Updated December 30 2013 Last
Accessed November 23 2014 httpwwwcontrolledvocabularycomblogembedded-metadata-wont-help-seohtml
Data Collection Methods
Obtaining Data on the website
Navigate to the URL
Right Click Image(s) and ldquoSave Asrdquo
Right Click Page and ldquoView Sourcerdquo Save as
txt file
Collect raw data from image by either
opening in Photoshop and Navigating to Raw
Data Column or utilize Phil Harveyrsquos
ExifTool
Obtaining Data through Structured Data Linter
Navigate to the Linter website
Enter URL
Screenshot Structured Data Results -or- save
as webpage
Obtaining Data through W3C RDF validator
Copy raw data xml extracted earlier and input
into RDF Validator
Select Graph Only on the Options
Parse RDF
Save Graph or Screenshot Graph
Store in Folder with other Data
Answer Research Questions
Systematically go through the collected data
and input findings into spreadsheet
Data Analysis Methods
Descriptive Statisticso Bell Curve - measures
towards a central tendency
using likert scale data
Bell Curve Image By Vierge Marie
(Own work) [Public domain] via
Wikimedia Commons
httpuploadwikimediaorgwikipe
diacommonsff6Gaussian_Filter
svg
Data Analysis Methods
Graphical Analysis
(Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter
utilizing URLs to display
structured data around the images
Available at
httplinterstructured-dataorg
Summary analysis will be
crafted utilizing all of these data
points to show what we are able
to understand about an image
versus what a machine or search
engine is able to know about an
image
W3C RDF Validator Graph
Visualization utilizing the raw
data markup extracted from the
image
Available at
httpwwww3orgRDFValidator
Structured Data Linter
Shows all
structured Data
Tags around the
images and in
the page markup
RDF Validator
Visualization of
embedded data
for images and
their subsequent
relationships to
other data
Summary Report
Complete Picture of Structured
Data Metadata and Analysis
of Study
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific
organization is at in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space These results will be used to create a roadmap for improving resource
findability both on the web and within websites Other organizations may also aspire to
reuse this methodology for assessing their own current state of structured data Future
areas of research could include utilizing metadataRDF-driven search engines in
conjuncture with Vector Space Models to assess findability of image records on the
web and within websites
References (Slides amp Full Paper)
Algebraix Data Corporation 0005 Algebraix Data Launches Industryrsquos First Cost-Effective Automated Implementation
of Schemaorg Business Wire (English) 5
Beall Jeffrey 2010 How Google Uses Metadata to Improve Search Results Serials Librarian 59 no 1 40-53
Breeding Marshall 2013 Linked Data The Next Big Wave or Another Tech Fad Computers In Libraries 33 no 3
20-22
Cafarella MJ Halevy AY Zhang Y Wang DZ and Wu E Uncovering the relational Web In Proceedings of the
11th International Workshop on the Web and Databases (Vancouver BC June 13 2008)
httpwebeecsumichedu~michjcpaperswebtables_webdb08pdf
Connaway Lynn Sillipigni Timothy J Dickey and Marie L Radford 2011 ldquoIf it is too inconvenient Im not going after itrdquo
Convenience as
a critical factor in information-seeking behaviors Library amp Information Science Research (07408188) 33 no 3 179-190
References (Slides amp Full Paper)
Cazier Clay 2014 PM Digital Marketing Blog ldquoThe Future of Exif Image Datardquo Last accessed November 20 2014
httpwwwpmdigitalcomblog201404future-exif-image-data
Diagram Center Digital Image and Graphic Resources for Accessible Materials 2014 ldquoContent Modelrdquo Last Accessed
November 23 2014 httpdiagramcenterorgstandards-and-practicescontent-modelhtml
Google 2014 ldquoImage Publishing Guidelinesrdquo Last accessed November 21 2014
httpssupportgooglecomwebmastersanswer114016hl=en
Holman Lucy 2011 Millennial Students Mental Models of Search Implications for Academic Librarians and Database
Developers Journal Of Academic Librarianship 37 no 1 19-27
References (Slides amp Full Paper)
International Business Times 0006 BingGoogle and Yahoo merge to make search easier with schemaorg
International Business Times April
IPTC International Press Telecommunications Council 2014 ldquoEmbedded Metadata Manifestordquo Last accessed November
20 2014 httpwwwembeddedmetadataorgsocial-media-test-resultsphp (Embedded Metadata Manifesto 2014)
Kritzinger W T Search Engine Optimization and Pay-per-Click Marketing Strategies Journal of Organizational
Computing and Electronic Commerce no 3 (2013) 273-86
Lippincott Joan K ldquoNet Generation Students and Librariesrdquo EDUCAUSE (2005) accessed November 19 2014
httpwwweducauseeduresearch-and-publicationsbookseducating-net-generationnet-generation-students-and-libraries
References (Slides amp Full Paper)
Nakanishi T Semantic Context-Dependent Weighting for Vector Space Model Semantic Computing (ICSC) 2014
IEEE International Conference on vol no pp262266 16-18 June 2014 doi 101109ICSC201449
Paz Anita 2013 In search of Meaning The Written Word in the Age of Google Italian Journal Of Library amp
Information Science 4 no 2 255-266
Priebe T Schlager C Pernul G A search engine for RDF metadata Database and Expert Systems Applications
2004 Proceedings 15th International Workshop on vol no pp168172 2004 doi 101109DEXA20041333468
Reicks David 2010 ldquoWhy Embedded Metadata Wonrsquot Help Your SEOrdquo Last Updated December 30 2013 Last
Accessed November 23 2014 httpwwwcontrolledvocabularycomblogembedded-metadata-wont-help-seohtml
Data Analysis Methods
Descriptive Statisticso Bell Curve - measures
towards a central tendency
using likert scale data
Bell Curve Image By Vierge Marie
(Own work) [Public domain] via
Wikimedia Commons
httpuploadwikimediaorgwikipe
diacommonsff6Gaussian_Filter
svg
Data Analysis Methods
Graphical Analysis
(Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter
utilizing URLs to display
structured data around the images
Available at
httplinterstructured-dataorg
Summary analysis will be
crafted utilizing all of these data
points to show what we are able
to understand about an image
versus what a machine or search
engine is able to know about an
image
W3C RDF Validator Graph
Visualization utilizing the raw
data markup extracted from the
image
Available at
httpwwww3orgRDFValidator
Structured Data Linter
Shows all
structured Data
Tags around the
images and in
the page markup
RDF Validator
Visualization of
embedded data
for images and
their subsequent
relationships to
other data
Summary Report
Complete Picture of Structured
Data Metadata and Analysis
of Study
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific
organization is at in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space These results will be used to create a roadmap for improving resource
findability both on the web and within websites Other organizations may also aspire to
reuse this methodology for assessing their own current state of structured data Future
areas of research could include utilizing metadataRDF-driven search engines in
conjuncture with Vector Space Models to assess findability of image records on the
web and within websites
References (Slides amp Full Paper)
Algebraix Data Corporation 0005 Algebraix Data Launches Industryrsquos First Cost-Effective Automated Implementation
of Schemaorg Business Wire (English) 5
Beall Jeffrey 2010 How Google Uses Metadata to Improve Search Results Serials Librarian 59 no 1 40-53
Breeding Marshall 2013 Linked Data The Next Big Wave or Another Tech Fad Computers In Libraries 33 no 3
20-22
Cafarella MJ Halevy AY Zhang Y Wang DZ and Wu E Uncovering the relational Web In Proceedings of the
11th International Workshop on the Web and Databases (Vancouver BC June 13 2008)
httpwebeecsumichedu~michjcpaperswebtables_webdb08pdf
Connaway Lynn Sillipigni Timothy J Dickey and Marie L Radford 2011 ldquoIf it is too inconvenient Im not going after itrdquo
Convenience as
a critical factor in information-seeking behaviors Library amp Information Science Research (07408188) 33 no 3 179-190
References (Slides amp Full Paper)
Cazier Clay 2014 PM Digital Marketing Blog ldquoThe Future of Exif Image Datardquo Last accessed November 20 2014
httpwwwpmdigitalcomblog201404future-exif-image-data
Diagram Center Digital Image and Graphic Resources for Accessible Materials 2014 ldquoContent Modelrdquo Last Accessed
November 23 2014 httpdiagramcenterorgstandards-and-practicescontent-modelhtml
Google 2014 ldquoImage Publishing Guidelinesrdquo Last accessed November 21 2014
httpssupportgooglecomwebmastersanswer114016hl=en
Holman Lucy 2011 Millennial Students Mental Models of Search Implications for Academic Librarians and Database
Developers Journal Of Academic Librarianship 37 no 1 19-27
References (Slides amp Full Paper)
International Business Times 0006 BingGoogle and Yahoo merge to make search easier with schemaorg
International Business Times April
IPTC International Press Telecommunications Council 2014 ldquoEmbedded Metadata Manifestordquo Last accessed November
20 2014 httpwwwembeddedmetadataorgsocial-media-test-resultsphp (Embedded Metadata Manifesto 2014)
Kritzinger W T Search Engine Optimization and Pay-per-Click Marketing Strategies Journal of Organizational
Computing and Electronic Commerce no 3 (2013) 273-86
Lippincott Joan K ldquoNet Generation Students and Librariesrdquo EDUCAUSE (2005) accessed November 19 2014
httpwwweducauseeduresearch-and-publicationsbookseducating-net-generationnet-generation-students-and-libraries
References (Slides amp Full Paper)
Nakanishi T Semantic Context-Dependent Weighting for Vector Space Model Semantic Computing (ICSC) 2014
IEEE International Conference on vol no pp262266 16-18 June 2014 doi 101109ICSC201449
Paz Anita 2013 In search of Meaning The Written Word in the Age of Google Italian Journal Of Library amp
Information Science 4 no 2 255-266
Priebe T Schlager C Pernul G A search engine for RDF metadata Database and Expert Systems Applications
2004 Proceedings 15th International Workshop on vol no pp168172 2004 doi 101109DEXA20041333468
Reicks David 2010 ldquoWhy Embedded Metadata Wonrsquot Help Your SEOrdquo Last Updated December 30 2013 Last
Accessed November 23 2014 httpwwwcontrolledvocabularycomblogembedded-metadata-wont-help-seohtml
Data Analysis Methods
Graphical Analysis
(Charts and Graphs)
Summary Report
Discussion of Findings
Visualizing the Results
The Structured Data Linter
utilizing URLs to display
structured data around the images
Available at
httplinterstructured-dataorg
Summary analysis will be
crafted utilizing all of these data
points to show what we are able
to understand about an image
versus what a machine or search
engine is able to know about an
image
W3C RDF Validator Graph
Visualization utilizing the raw
data markup extracted from the
image
Available at
httpwwww3orgRDFValidator
Structured Data Linter
Shows all
structured Data
Tags around the
images and in
the page markup
RDF Validator
Visualization of
embedded data
for images and
their subsequent
relationships to
other data
Summary Report
Complete Picture of Structured
Data Metadata and Analysis
of Study
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific
organization is at in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space These results will be used to create a roadmap for improving resource
findability both on the web and within websites Other organizations may also aspire to
reuse this methodology for assessing their own current state of structured data Future
areas of research could include utilizing metadataRDF-driven search engines in
conjuncture with Vector Space Models to assess findability of image records on the
web and within websites
References (Slides amp Full Paper)
Algebraix Data Corporation 0005 Algebraix Data Launches Industryrsquos First Cost-Effective Automated Implementation
of Schemaorg Business Wire (English) 5
Beall Jeffrey 2010 How Google Uses Metadata to Improve Search Results Serials Librarian 59 no 1 40-53
Breeding Marshall 2013 Linked Data The Next Big Wave or Another Tech Fad Computers In Libraries 33 no 3
20-22
Cafarella MJ Halevy AY Zhang Y Wang DZ and Wu E Uncovering the relational Web In Proceedings of the
11th International Workshop on the Web and Databases (Vancouver BC June 13 2008)
httpwebeecsumichedu~michjcpaperswebtables_webdb08pdf
Connaway Lynn Sillipigni Timothy J Dickey and Marie L Radford 2011 ldquoIf it is too inconvenient Im not going after itrdquo
Convenience as
a critical factor in information-seeking behaviors Library amp Information Science Research (07408188) 33 no 3 179-190
References (Slides amp Full Paper)
Cazier Clay 2014 PM Digital Marketing Blog ldquoThe Future of Exif Image Datardquo Last accessed November 20 2014
httpwwwpmdigitalcomblog201404future-exif-image-data
Diagram Center Digital Image and Graphic Resources for Accessible Materials 2014 ldquoContent Modelrdquo Last Accessed
November 23 2014 httpdiagramcenterorgstandards-and-practicescontent-modelhtml
Google 2014 ldquoImage Publishing Guidelinesrdquo Last accessed November 21 2014
httpssupportgooglecomwebmastersanswer114016hl=en
Holman Lucy 2011 Millennial Students Mental Models of Search Implications for Academic Librarians and Database
Developers Journal Of Academic Librarianship 37 no 1 19-27
References (Slides amp Full Paper)
International Business Times 0006 BingGoogle and Yahoo merge to make search easier with schemaorg
International Business Times April
IPTC International Press Telecommunications Council 2014 ldquoEmbedded Metadata Manifestordquo Last accessed November
20 2014 httpwwwembeddedmetadataorgsocial-media-test-resultsphp (Embedded Metadata Manifesto 2014)
Kritzinger W T Search Engine Optimization and Pay-per-Click Marketing Strategies Journal of Organizational
Computing and Electronic Commerce no 3 (2013) 273-86
Lippincott Joan K ldquoNet Generation Students and Librariesrdquo EDUCAUSE (2005) accessed November 19 2014
httpwwweducauseeduresearch-and-publicationsbookseducating-net-generationnet-generation-students-and-libraries
References (Slides amp Full Paper)
Nakanishi T Semantic Context-Dependent Weighting for Vector Space Model Semantic Computing (ICSC) 2014
IEEE International Conference on vol no pp262266 16-18 June 2014 doi 101109ICSC201449
Paz Anita 2013 In search of Meaning The Written Word in the Age of Google Italian Journal Of Library amp
Information Science 4 no 2 255-266
Priebe T Schlager C Pernul G A search engine for RDF metadata Database and Expert Systems Applications
2004 Proceedings 15th International Workshop on vol no pp168172 2004 doi 101109DEXA20041333468
Reicks David 2010 ldquoWhy Embedded Metadata Wonrsquot Help Your SEOrdquo Last Updated December 30 2013 Last
Accessed November 23 2014 httpwwwcontrolledvocabularycomblogembedded-metadata-wont-help-seohtml
Visualizing the Results
The Structured Data Linter
utilizing URLs to display
structured data around the images
Available at
httplinterstructured-dataorg
Summary analysis will be
crafted utilizing all of these data
points to show what we are able
to understand about an image
versus what a machine or search
engine is able to know about an
image
W3C RDF Validator Graph
Visualization utilizing the raw
data markup extracted from the
image
Available at
httpwwww3orgRDFValidator
Structured Data Linter
Shows all
structured Data
Tags around the
images and in
the page markup
RDF Validator
Visualization of
embedded data
for images and
their subsequent
relationships to
other data
Summary Report
Complete Picture of Structured
Data Metadata and Analysis
of Study
Expected Outcomes
The anticipated results of this project include a benchmark for where this specific
organization is at in terms of structured data in the online environment and a
methodology for other organizations looking to assess their structured data maturity in
the digital space These results will be used to create a roadmap for improving resource
findability both on the web and within websites Other organizations may also aspire to
reuse this methodology for assessing their own current state of structured data Future
areas of research could include utilizing metadataRDF-driven search engines in
conjuncture with Vector Space Models to assess findability of image records on the
web and within websites
References (Slides amp Full Paper)
Algebraix Data Corporation 0005 Algebraix Data Launches Industryrsquos First Cost-Effective Automated Implementation
of Schemaorg Business Wire (English) 5
Beall Jeffrey 2010 How Google Uses Metadata to Improve Search Results Serials Librarian 59 no 1 40-53
Breeding Marshall 2013 Linked Data The Next Big Wave or Another Tech Fad Computers In Libraries 33 no 3
20-22
Cafarella MJ Halevy AY Zhang Y Wang DZ and Wu E Uncovering the relational Web In Proceedings of the
11th International Workshop on the Web and Databases (Vancouver BC June 13 2008)
httpwebeecsumichedu~michjcpaperswebtables_webdb08pdf
Connaway Lynn Sillipigni Timothy J Dickey and Marie L Radford 2011 ldquoIf it is too inconvenient Im not going after itrdquo
Convenience as
a critical factor in information-seeking behaviors Library amp Information Science Research (07408188) 33 no 3 179-190
References (Slides amp Full Paper)
Cazier Clay 2014 PM Digital Marketing Blog ldquoThe Future of Exif Image Datardquo Last accessed November 20 2014
httpwwwpmdigitalcomblog201404future-exif-image-data
Diagram Center Digital Image and Graphic Resources for Accessible Materials 2014 ldquoContent Modelrdquo Last Accessed
November 23 2014 httpdiagramcenterorgstandards-and-practicescontent-modelhtml
Google 2014 ldquoImage Publishing Guidelinesrdquo Last accessed November 21 2014
httpssupportgooglecomwebmastersanswer114016hl=en
Holman Lucy 2011 Millennial Students Mental Models of Search Implications for Academic Librarians and Database
Developers Journal Of Academic Librarianship 37 no 1 19-27
References (Slides amp Full Paper)
International Business Times 0006 BingGoogle and Yahoo merge to make search easier with schemaorg
International Business Times April
IPTC International Press Telecommunications Council 2014 ldquoEmbedded Metadata Manifestordquo Last accessed November
20 2014 httpwwwembeddedmetadataorgsocial-media-test-resultsphp (Embedded Metadata Manifesto 2014)
Kritzinger W T Search Engine Optimization and Pay-per-Click Marketing Strategies Journal of Organizational
Computing and Electronic Commerce no 3 (2013) 273-86
Lippincott Joan K ldquoNet Generation Students and Librariesrdquo EDUCAUSE (2005) accessed November 19 2014
httpwwweducauseeduresearch-and-publicationsbookseducating-net-generationnet-generation-students-and-libraries
References (Slides amp Full Paper)
Nakanishi T Semantic Context-Dependent Weighting for Vector Space Model Semantic Computing (ICSC) 2014
IEEE International Conference on vol no pp262266 16-18 June 2014 doi 101109ICSC201449
Paz Anita 2013 In search of Meaning The Written Word in the Age of Google Italian Journal Of Library amp
Information Science 4 no 2 255-266
Priebe T Schlager C Pernul G A search engine for RDF metadata Database and Expert Systems Applications
2004 Proceedings 15th International Workshop on vol no pp168172 2004 doi 101109DEXA20041333468
Reicks David 2010 ldquoWhy Embedded Metadata Wonrsquot Help Your SEOrdquo Last Updated December 30 2013 Last
Accessed November 23 2014 httpwwwcontrolledvocabularycomblogembedded-metadata-wont-help-seohtml
Structured Data Linter: Shows all structured data tags around the images and in the page markup
RDF Validator: Visualization of embedded data for images and their subsequent relationships to other data
Summary Report: Complete picture of structured data, metadata, and analysis of study
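As a rough illustration of the kind of inventory the Structured Data Linter step produces, the following is a minimal sketch, not the linter itself. It assumes the page markup is available as a string and that the structured data of interest is expressed as microdata or RDFa attributes placed directly on the img tags. The names ImageMarkupAudit, audit_images, STRUCTURED_ATTRS and SAMPLE_PAGE are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser

# Attribute names used by schema.org microdata and RDFa markup (assumed scope).
STRUCTURED_ATTRS = ("itemscope", "itemtype", "itemprop", "property", "typeof", "vocab")


class ImageMarkupAudit(HTMLParser):
    """Collects every <img> tag and the structured-data attributes found on it."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        self.images.append({
            "src": attr_map.get("src"),
            "alt": attr_map.get("alt"),
            # Keep only the attributes that carry structured data.
            "structured": {k: v for k, v in attr_map.items() if k in STRUCTURED_ATTRS},
        })


def audit_images(html):
    """Return one record per image found in the given markup."""
    parser = ImageMarkupAudit()
    parser.feed(html)
    return parser.images


# Hypothetical product page fragment: one marked-up image, one bare image.
SAMPLE_PAGE = """
<div itemscope itemtype="http://schema.org/Product">
  <img itemprop="image" src="/img/red-shoe.jpg" alt="Red canvas shoe">
  <img src="/img/banner.jpg">
</div>
"""

report = audit_images(SAMPLE_PAGE)
for row in report:
    print(row)
```

Images whose "structured" entry is empty, or whose "alt" is missing, are the kind of gaps the summary report could surface. A real audit would also need to look at enclosing elements and JSON-LD blocks, which this sketch does not attempt.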
Expected Outcomes
The anticipated results of this project include a benchmark of where this specific
organization stands in terms of structured data in the online environment, and a
methodology for other organizations looking to assess their structured data maturity in
the digital space. These results will be used to create a roadmap for improving resource
findability both on the web and within websites. Other organizations may also
reuse this methodology to assess their own current state of structured data. Future
areas of research could include using metadata/RDF-driven search engines in
conjunction with Vector Space Models to assess the findability of image records on the
web and within websites.
References (Slides & Full Paper)
Algebraix Data Corporation. 0005. “Algebraix Data Launches Industry's First Cost-Effective Automated Implementation of Schema.org.” Business Wire (English) 5.
Beall, Jeffrey. 2010. “How Google Uses Metadata to Improve Search Results.” Serials Librarian 59, no. 1: 40-53.
Breeding, Marshall. 2013. “Linked Data: The Next Big Wave or Another Tech Fad?” Computers in Libraries 33, no. 3: 20-22.
Cafarella, M. J., A. Y. Halevy, Y. Zhang, D. Z. Wang, and E. Wu. 2008. “Uncovering the Relational Web.” In Proceedings of the 11th International Workshop on the Web and Databases (Vancouver, BC, June 13, 2008). http://web.eecs.umich.edu/~michjc/papers/webtables_webdb08.pdf
Connaway, Lynn Silipigni, Timothy J. Dickey, and Marie L. Radford. 2011. “‘If it is too inconvenient I'm not going after it’: Convenience as a critical factor in information-seeking behaviors.” Library & Information Science Research (07408188) 33, no. 3: 179-190.
Cazier, Clay. 2014. “The Future of Exif Image Data.” PM Digital Marketing Blog. Last accessed November 20, 2014. http://www.pmdigital.com/blog/2014/04/future-exif-image-data
Diagram Center (Digital Image and Graphic Resources for Accessible Materials). 2014. “Content Model.” Last accessed November 23, 2014. http://diagramcenter.org/standards-and-practices/content-model.html
Google. 2014. “Image Publishing Guidelines.” Last accessed November 21, 2014. https://support.google.com/webmasters/answer/114016?hl=en
Holman, Lucy. 2011. “Millennial Students' Mental Models of Search: Implications for Academic Librarians and Database Developers.” Journal of Academic Librarianship 37, no. 1: 19-27.
International Business Times. 0006. “Bing, Google and Yahoo Merge to Make Search Easier with schema.org.” International Business Times, April.
IPTC (International Press Telecommunications Council). 2014. “Embedded Metadata Manifesto.” Last accessed November 20, 2014. http://www.embeddedmetadata.org/social-media-test-results.php (Embedded Metadata Manifesto 2014)
Kritzinger, W. T. 2013. “Search Engine Optimization and Pay-per-Click Marketing Strategies.” Journal of Organizational Computing and Electronic Commerce, no. 3: 273-86.
Lippincott, Joan K. 2005. “Net Generation Students and Libraries.” EDUCAUSE. Accessed November 19, 2014. http://www.educause.edu/research-and-publications/books/educating-net-generation/net-generation-students-and-libraries
Nakanishi, T. 2014. “Semantic Context-Dependent Weighting for Vector Space Model.” Semantic Computing (ICSC), 2014 IEEE International Conference on, pp. 262-266, 16-18 June 2014. doi: 10.1109/ICSC.2014.49
Paz, Anita. 2013. “In Search of Meaning: The Written Word in the Age of Google.” Italian Journal of Library & Information Science 4, no. 2: 255-266.
Priebe, T., C. Schlager, and G. Pernul. 2004. “A Search Engine for RDF Metadata.” Database and Expert Systems Applications, 2004. Proceedings. 15th International Workshop on, pp. 168-172. doi: 10.1109/DEXA.2004.1333468
Reicks, David. 2010. “Why Embedded Metadata Won't Help Your SEO.” Last updated December 30, 2013. Last accessed November 23, 2014. http://www.controlledvocabulary.com/blog/embedded-metadata-wont-help-seo.html