Software Engineering for the Web: The State of Practice. ICSE 2014

Software Engineering for the Web: The State of the Practice. Alex Nederlof (@alexnederlof), Arie van Deursen (@avandeursen), Ali Mesbah (@amesbah). http://bit.ly/sop_icse14

Upload: alex-nederlof

Post on 29-Nov-2014


DESCRIPTION

Today’s web applications increasingly rely on client-side code execution. HTML is not just created on the server, but manipulated extensively within the browser through JavaScript code. In this paper we seek to understand the software engineering implications of this. We look at deviations from many known best practices in areas such as network performance, accessibility, and correct structuring of HTML documents. Furthermore, we assess to what extent such deviations manifest themselves through client-side code manipulation only. To answer these questions, we conducted a large-scale experiment, involving automated client-enabled crawling of over 4000 web applications, resulting in over 100,000,000 pages analyzed, and close to 1,000,000 unique client-side user interface states. Our findings show that the majority of sites contain a substantial number of problems, making sites unnecessarily slow, inaccessible for the visually impaired, and with layout that is unpredictable due to errors in the dynamically modified DOM trees. http://salt.ece.ubc.ca/publications/docs/icse14-seip.pdf

TRANSCRIPT

Page 1: Software Engineering for the Web: The state of practice. ICSE 2014

Software Engineering for the Web
The State of the Practice

Alex Nederlof
Arie van Deursen
Ali Mesbah

@alexnederlof
@avandeursen
@amesbah

http://bit.ly/sop_icse14

Page 2:

TESTING WEB APPS IS A PAIN IN THE NECK

Can’t we fix that?

Page 3:

SPOILER: WE’RE NOT DOING WELL

Page 4:

Web

Applications?

Page 5:

The web was designed for document sharing

between researchers using

HTML

Page 6:

But then JavaScript

Happened

Page 7:
Page 8:
Page 9:

COMPLEXITY x DIVERSITY - TESTING

= BUGS

Page 10:

CRAWLJAX JavaScript-Enabled Crawling

Page 11:

Page 12:

<!DOCTYPE HTML>
<HTML>
  <HEAD>
    <TITLE>Computers Rule</TITLE>
  </HEAD>
  <BODY>
    <H1>Computer says:</H1>
    <p>NO</p>
  </BODY>
</HTML>

<!DOCTYPE HTML>
<HTML>
  <HEAD>
    <TITLE>Ultimate Answer</TITLE>
  </HEAD>
  <BODY>
    <H1>Computer says:</H1>
    <p>42</p>
  </BODY>
</HTML>

Page 13:

STATES ARE THE NEW

PAGES

Page 14:

4,221 APPLICATIONS

2,974,641 STATES

Page 15:

How dynamic is the web?

How bad is the web?

Page 16:

MEASURING DYNAMISM

Page 17:

How dynamic is the web?

States / URL: 1.9 states

State invisibility: 96%

Post-load DOM manipulations: 64% text, 89% DOM

Page 18:

ASSESSING THE DAMAGE

Page 19:

DEFINING AMBIGUOUS ID

ATTRIBUTES

Page 20:

<H1 class="title" id="first-title">Hello!</H1>

Page 21:

53% of the sites define ambiguous IDs, in 35% of the states
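
Ambiguous IDs of this kind can be detected mechanically in each crawled state. The following is a minimal sketch in Python using only the standard library. The names `IdCollector` and `duplicate_ids` are illustrative and are not taken from Crawljax or from the tooling used in the study.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collects the value of every id attribute in a document."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.append(value)


def duplicate_ids(html):
    """Return the sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html)
    collector.close()
    counts = Counter(collector.ids)
    return sorted(value for value, n in counts.items() if n > 1)
```

Running `duplicate_ids` on the serialized DOM of every state, rather than only on the initial server response, is what lets a check like this catch duplicates introduced by client-side code.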

Page 22:

DEFINE A DOCTYPE

Page 23:

<!DOCTYPE HTML>
<HTML>
  <HEAD>
    <TITLE>Hello World</TITLE>
  </HEAD>
  <BODY>
    <H1>Hello Msc Thesis!</H1>
    <A href="http://ns.nl">Go to NS</A>
  </BODY>
</HTML>

Page 24:

61.6% RENDER IT

90’s STYLE
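
A document without a doctype is rendered by browsers in quirks mode, which is the "90's style" rendering referred to here. As a rough sketch, the presence of a doctype can be checked with Python's standard `html.parser`. The names `DoctypeFinder` and `has_doctype` are made up for this example, and the check only tests that a doctype is declared, not whether it is one of the legacy doctypes that also trigger quirks mode.

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Remembers the first <!DOCTYPE ...> declaration it sees."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # decl is the text between "<!" and ">", e.g. "DOCTYPE HTML"
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def has_doctype(html):
    """Return True if the document declares any doctype."""
    finder = DoctypeFinder()
    finder.feed(html)
    finder.close()
    return finder.doctype is not None
```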

Page 25:

FORMULATE VALID HTML

Page 26:

<H1 class="title" id="first-title">Hello!</H1>

13% Forget this

9% go wrong here

20% misplace elements altogether

Page 27:

53% Contain Double IDs

61% Renders like the 90s

~ 20% Contains invalid HTML

Page 28:

SPEED

Page 29:

Errors in the web

Best practices

Page 30:

THOU SHALT CACHE THY RESOURCES

Page 31:

43% doesn’t

0% Used HTML-5 Caching
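
Whether a resource can be cached is decided by its HTTP response headers. The sketch below shows one simplified way to classify a response, assuming the headers are available as a plain dictionary. The function name `is_cacheable` is hypothetical, and the logic covers only `Cache-Control` and `Expires`, which is far less than the full HTTP caching rules.

```python
def is_cacheable(headers):
    """Rough check: does this response allow a browser to reuse it?"""
    normalized = {name.lower(): value for name, value in headers.items()}
    cache_control = normalized.get("cache-control", "")
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]

    # Explicit opt-outs win over everything else.
    if "no-store" in directives or "no-cache" in directives:
        return False

    # A positive max-age gives the response a freshness lifetime.
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return int(directive.split("=", 1)[1]) > 0
            except ValueError:
                return False

    # Fall back to the older Expires header (its date is not validated here).
    return "expires" in normalized
```

For example, `is_cacheable({"Cache-Control": "public, max-age=3600"})` is true, while a response with no caching headers at all is reported as not cacheable.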

Page 32:

THOU SHALT COMPRESS

THY RESOURCES

Page 33:

80% doesn’t
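
Compression shows up in the `Content-Encoding` response header, and its benefit for text resources is easy to estimate. The following Python sketch illustrates both. The function names are invented for this example, and the set of accepted encodings is an assumption rather than the criterion used in the study.

```python
import gzip


def is_compressed(headers):
    """Return True if the response declares a common compression encoding."""
    encoding = ""
    for name, value in headers.items():
        if name.lower() == "content-encoding":
            encoding = value.lower()
    tokens = [token.strip() for token in encoding.split(",")]
    return any(token in {"gzip", "deflate", "br"} for token in tokens)


def gzip_savings(body):
    """Fraction of bytes saved by gzip-compressing the given body."""
    if not body:
        return 0.0
    return 1 - len(gzip.compress(body)) / len(body)
```

Repetitive markup compresses well, so `gzip_savings` on a typical HTML or CSS payload gives a quick sense of how much transfer size an uncompressed site is giving away.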

Page 34:

THOU SHALT PUT STYLE SHEETS

ON TOP

Page 35:

56% doesn’t
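
One way to check the "style sheets on top" rule is to look for style sheets that appear after the `<body>` tag has been opened. This is a minimal sketch under that assumption, with the hypothetical names `LateStylesheetFinder` and `late_stylesheets`. It works on the raw token stream and does not build a real DOM tree.

```python
from html.parser import HTMLParser


class LateStylesheetFinder(HTMLParser):
    """Records style sheets that are declared inside <body>."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.late = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "body":
            self.in_body = True
        elif self.in_body and tag == "style":
            self.late.append("<style>")
        elif self.in_body and tag == "link":
            rel = (attributes.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.late.append(attributes.get("href") or "")


def late_stylesheets(html):
    """Return the style sheets that are not placed in the document head."""
    finder = LateStylesheetFinder()
    finder.feed(html)
    finder.close()
    return finder.late
```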

Page 36:

THOU SHALT ONLY BLOCK JS

WHEN NECESSARY

Page 37:

43% Does not cache

80% Is not compressed

56% Reloads CSS too often

Page 38:

ACCESSIBILITY

Page 39:

FEEL

LISTEN

Page 40:

<IMG src="lolcat.jpg" alt="Picture of a cat" />

<LABEL for="username"> Enter your username </LABEL>

Page 41:

36% Do not label input
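
Missing labels and missing `alt` texts can be found with a simple pass over the markup. The sketch below is one possible approximation in Python. The names `AccessibilityAudit` and `audit` are illustrative, and the heuristic is an assumption: an input counts as labeled only if a `<label for=...>` points at its `id` or it carries an `aria-label`, so inputs wrapped inside a `<label>` element are not recognized.

```python
from html.parser import HTMLParser

# Input types that do not need a visible label (an assumption of this sketch).
NO_LABEL_NEEDED = {"hidden", "submit", "button", "reset", "image"}


class AccessibilityAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.label_targets = set()
        self.inputs = []
        self.images_without_alt = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "label" and attributes.get("for"):
            self.label_targets.add(attributes["for"])
        elif tag == "input":
            if (attributes.get("type") or "text").lower() not in NO_LABEL_NEEDED:
                self.inputs.append(attributes)
        elif tag == "img" and "alt" not in attributes:
            self.images_without_alt.append(attributes.get("src") or "")

    def unlabeled_inputs(self):
        return [a.get("name") or a.get("id") or ""
                for a in self.inputs
                if a.get("id") not in self.label_targets and not a.get("aria-label")]


def audit(html):
    """Return (unlabeled input names, sources of images without alt text)."""
    parser = AccessibilityAudit()
    parser.feed(html)
    parser.close()
    return parser.unlabeled_inputs(), parser.images_without_alt
```

Labels are resolved only after the whole document has been parsed, so a `<label>` that appears after its input is still matched.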

Page 42:

<div role="navigation">

<HEADER> <ARTICLE>

NAVIGATION

<NAV>

Page 43:

25%

5% 11% 60%

No indicators, Just roles, Just semantic, Both

NAVIGATION ASSISTANCE
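
The four categories above can be assigned per page by looking for HTML5 semantic elements and ARIA landmark roles. The following Python sketch shows the idea. The tag and role sets are assumptions chosen for illustration, since the exact lists used in the study are not given here, and `classify_navigation` is a hypothetical name.

```python
from html.parser import HTMLParser

# Assumed sets; the study may have used different ones.
SEMANTIC_TAGS = {"nav", "header", "footer", "main", "article", "aside", "section"}
LANDMARK_ROLES = {"navigation", "banner", "contentinfo", "main", "complementary", "search"}


class LandmarkFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.has_semantic = False
        self.has_role = False

    def handle_starttag(self, tag, attrs):
        if tag in SEMANTIC_TAGS:
            self.has_semantic = True
        role = (dict(attrs).get("role") or "").lower()
        if role in LANDMARK_ROLES:
            self.has_role = True


def classify_navigation(html):
    """Classify a page into one of the four navigation-assistance categories."""
    finder = LandmarkFinder()
    finder.feed(html)
    finder.close()
    if finder.has_semantic and finder.has_role:
        return "Both"
    if finder.has_semantic:
        return "Just semantic"
    if finder.has_role:
        return "Just roles"
    return "No indicators"
```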

Page 44:

THE WEB IS:

• HIGHLY DYNAMIC

• RIDDLED WITH ERRORS

• NOT AS FAST AS IT COULD BE

• NOT NEARLY ACCESSIBLE ENOUGH

Page 45:

What to do?

Page 46:

Modern Web development

All pages are rendered

New pages are rendered client side

Page 47:

STATIC ANALYSIS + CRAWLER = SUPER POWERS

Page 48:

Generic invariants

Valid HTML, JavaScript, CSS

Accessibility support

Performance best practices

Do all images load?

Page 49:

Semi-Generic invariants

Am I using my framework correctly?

Are all my pages translated?

Are there any JS errors triggered?

Page 50:

App-specific invariants

Is my logo on every page?

Is the feedback button on every page?

Does every page link to the homepage?
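
App-specific invariants like these can be expressed as small predicates that a crawler evaluates on every state it discovers. The sketch below assumes the crawled states are available as a mapping from a state name to its HTML. The function `check_invariants`, the invariant names, and the `id="logo"` and `href="/"` conventions are all hypothetical and would differ per application.

```python
def check_invariants(states, invariants):
    """Return (state name, invariant name) for every violated invariant."""
    return [(state_name, invariant_name)
            for state_name, html in states.items()
            for invariant_name, holds in invariants.items()
            if not holds(html)]


# Example invariants using naive substring checks (an assumption of this sketch;
# a real check would inspect the DOM instead).
INVARIANTS = {
    "logo on every page": lambda html: 'id="logo"' in html,
    "links to homepage": lambda html: 'href="/"' in html,
}
```

A state that lacks the homepage link would then show up as a pair such as `("cart", "links to homepage")`, which points directly at the state to inspect.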

Page 51:

CRAWLING BONUS!

Code coverage

Performance testing

Random testing

Page 52:

CHALLENGES

State duplication detection is hard

Deployment seems hard
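
To illustrate why state duplication detection is hard, here is a deliberately simple approach: canonicalize the DOM and hash it. This is a sketch under stated assumptions, not the method Crawljax uses. The names `CanonicalDom` and `state_fingerprint` are invented, and the canonical form only ignores attribute order and whitespace differences.

```python
import hashlib
from html.parser import HTMLParser


class CanonicalDom(HTMLParser):
    """Flattens a document into an order-preserving list of normalized tokens."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        ordered = sorted((name, value or "") for name, value in attrs)
        self.parts.append(("open", tag, tuple(ordered)))

    def handle_endtag(self, tag):
        self.parts.append(("close", tag))

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text:
            self.parts.append(("text", text))


def state_fingerprint(html):
    """Hash of the canonicalized DOM; equal hashes are treated as the same state."""
    dom = CanonicalDom()
    dom.feed(html)
    dom.close()
    return hashlib.sha256(repr(dom.parts).encode("utf-8")).hexdigest()
```

With this fingerprint, two states that differ only in formatting collapse into one, while states with different text, such as "NO" versus "42", stay distinct. The hard part is everything in between: timestamps, ads, or counters change the hash even though a human would call the two states the same.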

Page 53:

Testing by crawling works and should be explored further.

Page 54:

Automated Error detection

Questions?

Find me on Twitter: @alexnederlof

http://crawljax.com