Heuristic Role Detection of Visual Elements of Web Pages

Upload: elgin1988

Post on 07-Jul-2015

DESCRIPTION

Presented in ICWE 2013, Aalborg.

TRANSCRIPT

Page 1: Heuristic Role Detection of Visual Elements of Web Pages

Introduction | Ontology Based Heuristic Role Detection | Evaluation | Conclusion

Heuristic Role Detection of Visual Elements of Web Pages

M. Elgin Akpınar¹, Yeliz Yeşilada²

¹ [email protected], Middle East Technical University, Ankara, Turkey

² [email protected], Middle East Technical University Northern Cyprus Campus, Kalkanlı, Güzelyurt, Mersin 10, Turkey

ICWE, 2013

M. Elgin Akpınar, Yeliz Yeşilada. Heuristic Role Detection of Visual Elements of Web Pages

Page 2: Heuristic Role Detection of Visual Elements of Web Pages

Outline

1 Introduction

2 Ontology Based Heuristic Role Detection

3 Evaluation

4 Conclusion

Page 3: Heuristic Role Detection of Visual Elements of Web Pages

Motivation | Related Work

Problem Definition

Accessibility issues in interactive web pages

Problems accessing content in alternative forms, such as audio, with assistive technologies

Problems with mobile devices

Screen size problems

Limited resources

Page 4: Heuristic Role Detection of Visual Elements of Web Pages

Problem Definition (cont.)

Compatibility issues

Development of new web technologies

Dynamic web content, HTML5, etc.

Flexible syntax of HTML and CSS

Ability to create the same visual layout with different underlying code

Inability to fully describe web elements

Page 5: Heuristic Role Detection of Visual Elements of Web Pages

Requirements

Propose a method to automatically identify visual elements in web pages

Serving different purposes

Providing better accessibility for disabled people and mobile devices

Improving the accuracy of information retrieval and data mining applications

Transcoding or reorganising web page structure for better presentation

Adapting to new technologies

Page 6: Heuristic Role Detection of Visual Elements of Web Pages

Recent Application Fields

Web page adaptation for small screen devices [Yin & Lee, 2005; Ahmadi & Kong, 2008; Chen et al., 2005; Xiao et al., 2008; Chen et al., 2001]

Intelligent user interface creation [Xiang & Shi, 2006]

Information retrieval and web data mining [Kovacevic et al., 2002; Lin & Ho, 2002; Liu et al., 2003; Yi et al., 2003]

Web accessibility [Takagi et al., 2002]

Page 7: Heuristic Role Detection of Visual Elements of Web Pages

Drawbacks

Simplistic sets of roles

Narrow understanding of web page elements

Inability to describe a web page semantically

Static definition of roles and attributes

Maintenance problems

Page 8: Heuristic Role Detection of Visual Elements of Web Pages

Visual Element Identifier | Rule Generator | Role Detector

System Architecture

Page 9: Heuristic Role Detection of Visual Elements of Web Pages

Vision Based Page Segmentation Algorithm (VIPS)

Aims to extract the block structure of a page by using visual cues and tag properties of the nodes.

Visual cues: tag, color, text and size of a node [Cai et al., 2003]
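
To make the idea of cue-driven segmentation concrete, here is a toy sketch in Python. It is not the VIPS algorithm itself: the `Node` class, the `INLINE_TAGS` set, the `MIN_AREA` threshold and the three division rules are all assumptions made for this example. It only shows how tag, color, text and size can decide whether a node is kept as one visual block or split into its children.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A rendered DOM node carrying the four cues: tag, color, text, size."""
    tag: str
    color: str                      # background colour
    text: str = ""
    width: int = 0
    height: int = 0
    children: List["Node"] = field(default_factory=list)


# Hypothetical values chosen for the example, not taken from VIPS.
INLINE_TAGS = {"a", "b", "i", "em", "strong", "span"}
MIN_AREA = 100 * 100                # pixels


def should_divide(node: Node) -> bool:
    """Toy heuristic: decide whether a node is split into its children."""
    if not node.children:
        return False                # tag cue: a leaf cannot be split
    if all(child.tag in INLINE_TAGS for child in node.children):
        return False                # tag cue: only inline children, keep whole
    if any(child.color != node.color for child in node.children):
        return True                 # color cue: a child stands out visually
    return node.width * node.height >= MIN_AREA   # size cue


def extract_blocks(node: Node) -> List[Node]:
    """Return the flat list of visual blocks found under `node`."""
    if should_divide(node):
        blocks: List[Node] = []
        for child in node.children:
            blocks.extend(extract_blocks(child))
        return blocks
    if not node.children and not node.text.strip() and node.tag != "img":
        return []                   # text cue: empty leaf, nothing visible
    return [node]


page = Node("body", "white", width=1000, height=800, children=[
    Node("div", "blue", "Site", 1000, 100, children=[
        Node("a", "blue", "Home", 80, 20),
        Node("a", "blue", "About", 80, 20),
    ]),
    Node("div", "white", "", 1000, 600, children=[
        Node("p", "white", "First paragraph", 1000, 300),
        Node("p", "white", "Second paragraph", 1000, 300),
    ]),
    Node("div", "white"),           # empty spacer
])

print([block.tag for block in extract_blocks(page)])  # ['div', 'p', 'p']
```

In this sample page the blue navigation `div` stays one block because it holds only inline links, the white content `div` is split into its two paragraphs, and the empty spacer is dropped.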

Page 10: Heuristic Role Detection of Visual Elements of Web Pages

System Architecture

Page 11: Heuristic Role Detection of Visual Elements of Web Pages

Knowledge Representation

Systematic characterisation of roles of visual elements

Definition of properties which affect how visual elements are used and presented

Visual styles, specific keywords, relation between parent and children elements

eMine Ontology

Based on WAfA Ontology [Harper & Yesilada, 2007]

Iterative knowledge base construction:

Comparison with ARIA Ontology [Craig & Cooper, 2010]

Factor annotations

Object property classification

Page 12: Heuristic Role Detection of Visual Elements of Web Pages

An object property for Header role

...
<owl:Restriction>
  <owl:onProperty rdf:resource="emine#has_tag" />
  <owl:allValuesFrom>
    <owl:Class>
      <owl:oneOf rdf:parseType="Collection">
        <owl:Thing rdf:about="emine#Header" />
        <owl:Thing rdf:about="emine#Div" />
      </owl:oneOf>
    </owl:Class>
  </owl:allValuesFrom>
</owl:Restriction>

...
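
The Rule Generator step, which turns ontology restrictions like this one into rules, might look roughly like the Python sketch below. This is an illustration under assumptions, not the eMine implementation. The function name `restriction_to_rules`, the `weight` and `start` parameters, and the wrapping `rdf:RDF` element with namespace declarations are all introduced here. The emitted rule text copies the shape of the Jess rules shown for the Header role, except that the weight is written as a plain number.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The restriction from the slide, wrapped in a root element so it parses.
SNIPPET = f"""
<rdf:RDF xmlns:owl="{OWL}" xmlns:rdf="{RDF}">
  <owl:Restriction>
    <owl:onProperty rdf:resource="emine#has_tag" />
    <owl:allValuesFrom>
      <owl:Class>
        <owl:oneOf rdf:parseType="Collection">
          <owl:Thing rdf:about="emine#Header" />
          <owl:Thing rdf:about="emine#Div" />
        </owl:oneOf>
      </owl:Class>
    </owl:allValuesFrom>
  </owl:Restriction>
</rdf:RDF>
"""


def local_name(iri: str) -> str:
    """Return the part of an IRI after '#', e.g. 'emine#Div' -> 'Div'."""
    return iri.rsplit("#", 1)[-1]


def restriction_to_rules(xml_text: str, role: str, weight: int = 2,
                         start: int = 1) -> list:
    """Emit one Jess-style rule per allowed value of each restriction.

    `weight` and `start` are hypothetical parameters: how weights and rule
    numbers are actually chosen is not stated in the slides.
    """
    root = ET.fromstring(xml_text)
    rules = []
    index = start
    for restriction in root.iter(f"{{{OWL}}}Restriction"):
        prop = restriction.find(f"{{{OWL}}}onProperty")
        slot = local_name(prop.get(f"{{{RDF}}}resource"))
        for thing in restriction.iter(f"{{{OWL}}}Thing"):
            value = local_name(thing.get(f"{{{RDF}}}about")).lower()
            rules.append(
                f"(defrule {role}{index:02d}\n"
                f"  (block ({slot} $? /.*{value}.*/ $?))\n"
                f"  =>\n"
                f"  (bind ?*{role}* (+ {weight} ?*{role}*)))"
            )
            index += 1
    return rules


for rule in restriction_to_rules(SNIPPET, "Header", start=6):
    print(rule)
```

Starting the numbering at 6 is only a convenience so that the generated names line up with Header06 and Header07.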

Page 13: Heuristic Role Detection of Visual Elements of Web Pages

System Architecture

Page 14: Heuristic Role Detection of Visual Elements of Web Pages

Role Detector

Jess, a Java based rule engine and scripting environment

Initial state: a set of rules converted from the eMine Ontology, and a tree of unlabeled visual elements

Process of role detection:

1 Rule engine object construction

2 Load of template definitions and initial variables

3 Assertion of facts (properties of visual elements)

4 Firing of predefined rules over visual elements

Final state: a tree of labeled visual elements
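
The flow from an unlabeled tree to a labeled one can be sketched in plain Python, without Jess. Everything here is an assumption made for illustration: the `Block` and `Rule` classes, the helpers, the Footer rules and their weights, and above all the final step of picking the role with the highest score, which the slides do not spell out.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Block:
    """A visual element; its fields play the part of asserted facts."""
    has_tag: List[str]
    keywords: List[str] = field(default_factory=list)
    children: List["Block"] = field(default_factory=list)
    role: Optional[str] = None      # filled in by label_tree


@dataclass
class Rule:
    role: str
    condition: Callable[[Block], bool]
    weight: int


def tag_matches(pattern: str) -> Callable[[Block], bool]:
    regex = re.compile(pattern)
    return lambda block: any(regex.fullmatch(t) for t in block.has_tag)


def has_keyword(word: str) -> Callable[[Block], bool]:
    return lambda block: word in block.keywords


RULES = [
    Rule("Header", tag_matches(r".*header.*"), 2),
    Rule("Header", tag_matches(r".*div.*"), 2),
    Rule("Footer", tag_matches(r".*div.*"), 2),      # hypothetical rule
    Rule("Footer", has_keyword("copyright"), 3),     # hypothetical rule
]


def detect_role(block: Block, rules: List[Rule]) -> Optional[str]:
    scores = {rule.role: 0 for rule in rules}   # initial variables
    for rule in rules:                          # fire every matching rule
        if rule.condition(block):
            scores[rule.role] += rule.weight
    best = max(scores, key=scores.get)          # ties go to the first role
    return best if scores[best] > 0 else None


def label_tree(block: Block, rules: List[Rule]) -> None:
    """Label a block and, recursively, all of its children."""
    block.role = detect_role(block, rules)
    for child in block.children:
        label_tree(child, rules)


tree = Block(["body"], children=[
    Block(["header"]),
    Block(["div"], keywords=["copyright"]),
    Block(["span"]),
])
label_tree(tree, RULES)
print([child.role for child in tree.children])  # ['Header', 'Footer', None]
```

A block that matches no rule keeps `role = None`, so unlabeled elements remain distinguishable from labeled ones.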

Page 15: Heuristic Role Detection of Visual Elements of Web Pages

Jess rules for Header role

...
(defrule Header06
  (block (has_tag $? /.*header.*/ $?))
  =>
  (bind ?*Header* (+ "2" ?*Header*)))
...
(defrule Header07
  (block (has_tag $? /.*div.*/ $?))
  =>
  (bind ?*Header* (+ "2" ?*Header*)))

...

Page 16: Heuristic Role Detection of Visual Elements of Web Pages

Labeled Block Structure

Page 17: Heuristic Role Detection of Visual Elements of Web Pages

Evaluation | Results

Evaluation

User Evaluation

Online survey based evaluation

Given a list of roles, participants were asked to assign a role to given visual blocks

Nine randomly chosen web pages from a group of 30 pages

25 participants evaluated

Technical Evaluation

Technical feasibility of the proposed approach and its implementation

Page 18: Heuristic Role Detection of Visual Elements of Web Pages

User Evaluation Results

Complexity Group   System-Expert Evaluation   Receptive Evaluation   Block Count
Low                79.82 %                    73.68 %                65
Medium             88.28 %                    79.77 %                237
High               88.47 %                    85.53 %                569
Overall            86.83 %                    80.82 %                298

Page 19: Heuristic Role Detection of Visual Elements of Web Pages

Technical Evaluation Results

Complexity Group   Total Memory   Total Time   Avg. Memory per Block   Avg. Time per Block   Block Count
Low                8,369 KB       6,576 ms     244.29 KB               102.29 ms             65
Medium             7,013 KB       23,799 ms    36.44 KB                102.12 ms             237
High               9,165 KB       54,837 ms    34.28 KB                101.95 ms             569
Overall            8,176 KB       29,157 ms    100.20 KB               102.11 ms             298

Page 20: Heuristic Role Detection of Visual Elements of Web Pages

Conclusion

Ontology based heuristic approach

Probabilistic model

Automatic identification and classification of web elements

Visual element identifier

Knowledge base

Heuristic role detector

Adaptable to different domains, purposes and requirements

Modifiable knowledge base

Page 21: Heuristic Role Detection of Visual Elements of Web Pages

Future Work

Improvements to our system

Knowledge base improvement

Web service implementation

Reengineering web pages

Page 22: Heuristic Role Detection of Visual Elements of Web Pages

Thank you for listening!

For further information

Contact: [email protected]

Project Page: http://emine.ncc.metu.edu.tr/1

Thanks to

Page 23: Heuristic Role Detection of Visual Elements of Web Pages

References

Ahmadi, H. & Kong, J. (2008). Efficient web browsing on small screens. In Proceedings of the working conference on Advanced visual interfaces (pp. 23–30). ACM.

Cai, D., Yu, S., Wen, J. R., & Ma, W. Y. (2003). VIPS: a vision based page segmentation algorithm. Technical Report MSR-TR-2003-79, Microsoft Research.

Chen, J., Zhou, B., Shi, J., Zhang, H., & Fengwu, Q. (2001). Function-based object model towards website adaptation. In WWW '01 (pp. 587–596). ACM.

Chen, Y., Xie, X., Ma, W.-Y., & Zhang, H.-J. (2005). Adapting web pages for small-screen devices. IEEE Internet Computing, 9(1), 50–56.

Page 24: Heuristic Role Detection of Visual Elements of Web Pages

References

Craig, J. & Cooper, M. (2010). Accessible rich internet applications (WAI-ARIA) 1.0. http://www.w3.org/TR/2010/WD-wai-aria-20100916/complete. Retrieved on 15.01.2013.

Harper, S. & Yesilada, Y. (2007). Web authoring for accessibility (WAfA). Journal of Web Semantics (JWS), 5(3), 175–179.

Kovacevic, M., Diligenti, M., Gori, M., & Milutinovic, V. (2002). Recognition of common areas in a web page using visual information: a possible application in a page classification. In Proceedings 2002 IEEE International Conference on Data Mining (pp. 250–257). Washington, DC, USA: IEEE Computer Society.

Page 25: Heuristic Role Detection of Visual Elements of Web Pages

References

Lin, S.-H. & Ho, J.-M. (2002). Discovering informative content blocks from web documents. In KDD '02 (pp. 588–593). ACM.

Liu, B., Chin, C. W., & Ng, H. T. (2003). Mining topic-specific concepts and definitions on the web. In WWW '03 (pp. 251–260). ACM.

Takagi, H., Asakawa, C., Fukuda, K., & Maeda, J. (2002). Site-wide annotation: reconstructing existing pages to be accessible. In ASSETS '02 (pp. 81–88). ACM.

Page 26: Heuristic Role Detection of Visual Elements of Web Pages

References

Xiang, P. & Shi, Y. (2006). Recovering semantic relations from web pages based on visual cues. In IUI '06 (pp. 342–344). ACM.

Xiao, Y., Tao, Y., & Li, W. (2008). A dynamic web page adaptation for mobile device based on web2.0. In Proceedings of the 2008 Advanced Software Engineering and Its Applications (pp. 119–122). USA: IEEE Computer Society.

Yi, L., Liu, B., & Li, X. (2003). Eliminating noisy information in web pages for data mining. In KDD '03 (pp. 296–305). ACM.

Page 27: Heuristic Role Detection of Visual Elements of Web Pages

References

Yin, X. & Lee, W. S. (2005). Understanding the function of web elements for mobile content delivery using random walk models. In WWW '05 (pp. 1150–1151). ACM.
