Heuristic Role Detection of Visual Elements of Web Pages
Presented at ICWE 2013, Aalborg.
M. Elgin Akpınar (1), Yeliz Yeşilada (2)
(1) [email protected], Middle East Technical University, Ankara, Turkey
(2) [email protected], Middle East Technical University Northern Cyprus Campus, Kalkanlı, Güzelyurt, Mersin 10, Turkey
Outline
1 Introduction
2 Ontology Based Heuristic Role Detection
3 Evaluation
4 Conclusion
Problem Definition
Accessibility issues in interactive webpages
Problems with access in alternative forms, such as audio, with assistive technologies
Problems with mobile devices
Screen size problems
Limited resources
Problem Definition (cont.)
Compatibility issues
Development of new web technologies
Dynamic web content, HTML5, etc.
Flexible syntax of HTML and CSS
Ability to create the same visual layout with different underlying coding
Inability to fully describe web elements
Requirements
Propose a method to automatically identify visual elements in webpages;
Serving different purposes
  Providing better accessibility for disabled people and mobile devices
  Improving the accuracy of information retrieval and data mining applications
  Transcoding or reorganising web page structure for better presentation
Adapting to new technologies
Recent Application Fields
Web page adaptation for small screen devices [Yin & Lee, 2005, Ahmadi & Kong, 2008, Chen et al., 2005, Xiao et al., 2008, Chen et al., 2001]
Intelligent user interface creation [Xiang & Shi, 2006]
Information retrieval and web data mining [Kovacevic et al., 2002, Lin & Ho, 2002, Liu et al., 2003, Yi et al., 2003]
Web accessibility [Takagi et al., 2002]
Drawbacks
Simplistic sets of roles
Narrow understanding of web page elements
Inability to describe a web page semantically
Static definition of roles and attributes
Maintenance problems
System Architecture
Visual Element Identifier, Rule Generator, Role Detector
Vision Based Page Segmentation Algorithm (VIPS)
Aims to extract the block structure by using some visual cues and tag properties of the nodes.
Visual Cues: Tag, color, text and size of a node [Cai et al., 2003]
Knowledge Representation
Systematic characterisation of roles of visual elements
Definition of properties which affect how visual elements are used and presented
Visual styles, specific keywords, relation between parent and children elements
eMine Ontology
Based on WAfA Ontology [Harper & Yesilada, 2007]
Iterative knowledge base construction:
Comparison with ARIA Ontology [Craig & Cooper, 2010]
Factor annotations
Object property classification
An object property for Header role
...
<owl:Restriction>
  <owl:onProperty rdf:resource="emine#has_tag" />
  <owl:allValuesFrom>
    <owl:Class>
      <owl:oneOf rdf:parseType="Collection">
        <owl:Thing rdf:about="emine#Header" />
        <owl:Thing rdf:about="emine#Div" />
      </owl:oneOf>
    </owl:Class>
  </owl:allValuesFrom>
</owl:Restriction>
...
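The slides do not show how the Rule Generator turns such a restriction into Jess rules, so the following is only a minimal Python sketch of one way it could work. The function names, the namespace wrapper around the fragment, the fixed weight of 2 and the rule numbering are all assumptions made for illustration; only the property name, the two values and the shape of the generated rule text are taken from the slides.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The restriction from the slide. The xmlns declarations are an assumption,
# added only so that the fragment can be parsed on its own.
RESTRICTION = f"""
<owl:Restriction xmlns:owl="{OWL}" xmlns:rdf="{RDF}">
  <owl:onProperty rdf:resource="emine#has_tag" />
  <owl:allValuesFrom>
    <owl:Class>
      <owl:oneOf rdf:parseType="Collection">
        <owl:Thing rdf:about="emine#Header" />
        <owl:Thing rdf:about="emine#Div" />
      </owl:oneOf>
    </owl:Class>
  </owl:allValuesFrom>
</owl:Restriction>
"""


def local_name(iri):
    """Return the part of an IRI after the last '#'."""
    return iri.rsplit("#", 1)[-1]


def restriction_to_rules(xml_text, role, weight=2, start_index=1):
    """Emit one Jess rule (as text) per allowed value of the restricted property.

    Hypothetical helper: `weight` and `start_index` are illustrative parameters.
    """
    root = ET.fromstring(xml_text)
    on_property = root.find(f"{{{OWL}}}onProperty")
    prop = local_name(on_property.get(f"{{{RDF}}}resource"))
    rules = []
    for offset, thing in enumerate(root.iter(f"{{{OWL}}}Thing")):
        value = local_name(thing.get(f"{{{RDF}}}about")).lower()
        name = f"{role}{start_index + offset:02d}"
        rules.append(
            f"(defrule {name} (block ({prop} $? /.*{value}.*/ $?))\n"
            f"  =>\n"
            f"  (bind ?*{role}* (+ \"{weight}\" ?*{role}*)))"
        )
    return rules


if __name__ == "__main__":
    for rule in restriction_to_rules(RESTRICTION, "Header", start_index=6):
        print(rule)
```

Each allowed value of `has_tag` becomes its own rule, so adding a value to the ontology adds a rule without touching the detector code.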
Role Detector
Jess, a Java-based rule engine and scripting environment
Initial state: a set of rules, which are converted from eMine Ontology, and a tree of unlabeled visual elements
Process of role detection:
  1 Rule engine object construction
  2 Load of template definitions and initial variables
  3 Assertion of facts (properties of visual elements)
  4 Firing of predefined rules over visual elements
Final state: a tree of labeled visual elements
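The four steps can be sketched in plain Python rather than Jess, purely to show the flow from an unlabeled tree to a labeled one. Everything here is an assumption for illustration: the `Block` class, the rule tuples, the `Footer` rule, and in particular the choice of the highest-scoring role as the label, which the slides do not state. Only the two `has_tag` patterns for `Header` and their weight of 2 come from the Jess rules in this deck.

```python
import re
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Block:
    """A visual element; `facts` maps a property name to a list of values."""
    facts: dict
    children: list = field(default_factory=list)
    role: Optional[str] = None


# Hypothetical rule set: (role, property, regex, weight).
RULES = [
    ("Header", "has_tag", r".*header.*", 2),
    ("Header", "has_tag", r".*div.*", 2),
    ("Footer", "has_tag", r".*footer.*", 2),  # made-up rule, not from the slides
]


def detect_role(block, rules=RULES):
    """Score every role for one block and return the best one, or None."""
    scores = {}                                 # steps 1-2: fresh per-block state
    for role, prop, pattern, weight in rules:   # step 4: fire each rule
        values = block.facts.get(prop, [])      # step 3: the block's facts
        if any(re.fullmatch(pattern, value) for value in values):
            scores[role] = scores.get(role, 0) + weight
    # Assumption: the role with the highest accumulated score wins.
    return max(scores, key=scores.get) if scores else None


def label_tree(block, rules=RULES):
    """Label a block and all of its descendants in place."""
    block.role = detect_role(block, rules)
    for child in block.children:
        label_tree(child, rules)
    return block


if __name__ == "__main__":
    page = Block({"has_tag": ["div"]}, [
        Block({"has_tag": ["header", "div"]}),
        Block({"has_tag": ["footer"]}),
        Block({"has_tag": ["p"]}),
    ])
    label_tree(page)
    print(page.role, [child.role for child in page.children])
```

Keeping the rules as data rather than code mirrors the idea of a modifiable knowledge base: changing the rule list changes the labels without changing the detector.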
Jess rules for Header role
...
(defrule Header06
  (block (has_tag $? /.*header.*/ $?))
  =>
  (bind ?*Header* (+ "2" ?*Header*)))
...
(defrule Header07
  (block (has_tag $? /.*div.*/ $?))
  =>
  (bind ?*Header* (+ "2" ?*Header*)))
...
Labeled Block Structure
Evaluation
User Evaluation
Online survey-based evaluation
Given a list of roles, participants were asked to assign a role to each given visual block
Nine randomly chosen web pages from a group of 30 pages
Evaluated by 25 participants
Technical Evaluation
Technical feasibility of the proposed approach and its implementation
User Evaluation Results
Complexity Group   System-Expert Evaluation   Receptive Evaluation   Block Count
Low                79.82 %                    73.68 %                65
Medium             88.28 %                    79.77 %                237
High               88.47 %                    85.53 %                569
Overall            86.83 %                    80.82 %                298
Technical Evaluation Results
Complexity Group   Total Memory   Total Time   Avg. Memory per Block   Avg. Time per Block   Block Count
Low                8,369 KB       6,576 ms     244.29 KB               102.29 ms             65
Medium             7,013 KB       23,799 ms    36.44 KB                102.12 ms             237
High               9,165 KB       54,837 ms    34.28 KB                101.95 ms             569
Overall            8,176 KB       29,157 ms    100.20 KB               102.11 ms             298
Conclusion
Ontology based heuristic approach
Probabilistic model
Automatic identification and classification of web elements
  Visual element identifier
  Knowledge base
  Heuristic role detector
Adaptable to different domains, purposes and requirements
Modifiable knowledge base
Future Work
Improvements to our system
  Knowledge base improvement
  Web service implementation
Reengineering web pages
Thank you for listening!
For further information
Contact: [email protected]
Project Page: http://emine.ncc.metu.edu.tr/1
Thanks to
References
Ahmadi, H. & Kong, J. (2008). Efficient web browsing on small screens. In Proceedings of the working conference on Advanced visual interfaces (pp. 23–30). ACM.
Cai, D., Yu, S., Wen, J. R., & Ma, W. Y. (2003). VIPS: a vision based page segmentation algorithm. Technical Report MSR-TR-2003-79, Microsoft Research.
Chen, J., Zhou, B., Shi, J., Zhang, H., & Fengwu, Q. (2001). Function-based object model towards website adaptation. In WWW '01 (pp. 587–596). ACM.
Chen, Y., Xie, X., Ma, W.-Y., & Zhang, H.-J. (2005). Adapting web pages for small-screen devices. IEEE Internet Computing, 9(1), 50–56.
Craig, J. & Cooper, M. (2010). Accessible rich internet applications (WAI-ARIA) 1.0. http://www.w3.org/TR/2010/WD-wai-aria-20100916/complete (retrieved on 15.01.2013).
Harper, S. & Yesilada, Y. (2007). Web authoring for accessibility (WAfA). Journal of Web Semantics (JWS), 5(3), 175–179.
Kovacevic, M., Diligenti, M., Gori, M., & Milutinovic, V. (2002). Recognition of common areas in a web page using visual information: a possible application in a page classification. In Proceedings 2002 IEEE International Conference on Data Mining (pp. 250–257). Washington, DC, USA: IEEE Computer Society.
Lin, S.-H. & Ho, J.-M. (2002). Discovering informative content blocks from web documents. In KDD '02 (pp. 588–593). ACM.
Liu, B., Chin, C. W., & Ng, H. T. (2003). Mining topic-specific concepts and definitions on the web. In WWW '03 (pp. 251–260). ACM.
Takagi, H., Asakawa, C., Fukuda, K., & Maeda, J. (2002). Site-wide annotation: reconstructing existing pages to be accessible. In ASSETS '02 (pp. 81–88). ACM.
Xiang, P. & Shi, Y. (2006). Recovering semantic relations from web pages based on visual cues. In IUI '06 (pp. 342–344). ACM.
Xiao, Y., Tao, Y., & Li, W. (2008). A dynamic web page adaptation for mobile device based on web2.0. In Proceedings of the 2008 Advanced Software Engineering and Its Applications (pp. 119–122). USA: IEEE Computer Society.
Yi, L., Liu, B., & Li, X. (2003). Eliminating noisy information in web pages for data mining. In KDD '03 (pp. 296–305). ACM.
Yin, X. & Lee, W. S. (2005). Understanding the function of web elements for mobile content delivery using random walk models. In WWW '05 (pp. 1150–1151). ACM.