Big data. Little Brain. Little data. Big brain.

(with apologies to Dr. Seuss)

Gail C. Murphy

University of British Columbia

Tasktop Technologies

Unless otherwise indicated on a particular slide, this work is licensed under a Creative Commons Attribution-Share Alike 2.5 Canada License



2

frustration

3

problem

result

big data >> little data little brain transformational factors

situational factors

* big data refers to much, much more data than humans can consume

4

fact #1:

software engineering is about solving human problems

5

fact #2:

software developments generate big data over time

6

fact #3:

human cognition has limits

7

premise:

our “solutions” often exceed human cognition

<cartoon removed due to license restrictions>

8

big data. little brain.

big brain. little data.

duplicate bug examples to derive “formula”

big data >> little data little brain

stories about formula properties

transformational & situational factors

take away


9

duplicate bug problem

10

how does a developer realize/find a new bug is a duplicate?

11

1. duplicate bug search with keywords

big data big brain

bug #56162:

reported in 2000

16 comments

hundreds of words

bug #136422:

reported in 2000

32 comments

2x as long as #56162

12

2. duplicate bug search with natural language recommender

big data >> little data big brain

36-50% recall

[Runeson et al., ICSE 2007] [Hiew, MSc Thesis, 2006]
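
A natural language recommender of this kind can be pictured as a similarity search over report text. The sketch below is an assumption about the general technique, not the method of the cited studies: the tokenizer, the cosine scoring, the `recommend` and `recall` helpers, and the report ids are all hypothetical, and the cited work may define recall differently.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a report and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(new_report, repository, k=3):
    """Return the ids of the k stored reports most similar to new_report.

    repository maps a (hypothetical) bug id to the report text.
    Ties are broken by id so the output is deterministic.
    """
    query = Counter(tokenize(new_report))
    scored = [
        (cosine(query, Counter(tokenize(text))), bug_id)
        for bug_id, text in repository.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [bug_id for _, bug_id in scored[:k]]


def recall(recommended, true_duplicates):
    """Share of the known duplicates that appear in the recommended list."""
    if not true_duplicates:
        return 0.0
    return len(set(recommended) & set(true_duplicates)) / len(true_duplicates)
```

The point of the short list is the "big data >> little data" step: the developer reads k candidates instead of the whole repository, but still has to read each candidate report in full.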

13

3. duplicate bug search with natural language + execution recommender

big data >> little data big brain

67-93% recall

[Wang et al., ICSE 2008]

14

4. duplicate bug search with recommender & summarizer

big data >> little data little brain

Summary:

View source is broken.

The second time I use it, the View source window fails to open, but I get an hourglass-and-pointer cursor.

I can’t reproduce comment 0 with Mozilla/5.0

[Rastkar & Murphy, ICSE 2010]

15

interludes

big data >> little data: significant reduction in data a human considers

big brain versus little brain: amount of cognitive and physical activity required to perform task

16

big data >> little data little brain

17

big data. little brain.

big brain. little data.

duplicate bug examples to “derive formula”

big data >> little data little brain

stories about formula properties

transformational & situational factors

take away

18

stories

software reflexion model

hipikat

mylyn

spyglass

to get at formula properties


19

story #1: software reflexion model

three case studies, including one at Microsoft

[Murphy, Notkin, Sullivan, FSE 1995]

20

high-level model of part of Microsoft Excel

[Murphy & Notkin, Computer 1997]

21

example (not Excel)

low-level model (Excel: 15,000 functions / 77,746 calls)

Diagram from http://rixstep.com/1/1/20070206,00.shtml

22

mapping

[ file = ^shtreal\.c mapTo=Sheet ]

[ file = ^textfl1[ez]\.c$ mapTo=File ]

[ file = ^shtreal\.c function=foo mapTo=Graph ]

blue entry is illustrative
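
To make the mapping entries above concrete, here is a minimal sketch of how such entries could be applied to low-level entities. It rests on several assumptions that the slide does not state: the entries are held as Python tuples rather than parsed from the bracket syntax, `function=foo` is treated as an exact function name, and an entity is mapped to every entry that matches rather than only the first. The `map_entity` helper is hypothetical.

```python
import re

# Entries transcribed from the slide:
# (file pattern, optional function pattern, high-level entity).
MAPPING = [
    (r"^shtreal\.c", None, "Sheet"),
    (r"^textfl1[ez]\.c$", None, "File"),
    (r"^shtreal\.c", r"^foo$", "Graph"),  # assumption: foo is an exact name
]


def map_entity(file_name, function_name, mapping=MAPPING):
    """Return every high-level entity whose entry matches this function.

    Assumption: all matching entries apply. A first-match policy would be
    an equally plausible reading of the slide.
    """
    targets = []
    for file_pattern, function_pattern, target in mapping:
        if not re.search(file_pattern, file_name):
            continue
        if function_pattern is not None and not re.search(function_pattern, function_name):
            continue
        if target not in targets:
            targets.append(target)
    return targets
```

An empty result means the function is unmapped, which is what lets the mapping be written partially and extended a few entries at a time.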

23

reflexion model (partial)

24

big data >> little data little brain

big data = low-level model

little data = high-level model

transformational properties

low-cost (albeit manual)

assessable completeness

predictable

25

high-level model

mapping

low-level model

reflexion model

mapped low-level model

unmapped low-level model

low-cost because mapping can be specified simply, incrementally and partially

easy to assess completeness

predictable
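
The computation on this slide can be sketched in a few lines. The sketch assumes that the low-level model is a list of (caller, callee) pairs and that the mapping has already been reduced to a dictionary from function name to one high-level entity; both are simplifications. The outcome names convergences, divergences and absences follow the terminology of the cited reflexion model papers, while `reflexion_model` and `mapped_fraction` are hypothetical helper names.

```python
def reflexion_model(high_level_edges, low_level_calls, entity_of):
    """Compare expected high-level edges with calls lifted through the mapping.

    entity_of maps a low-level function name to a high-level entity;
    functions that are missing from it count as unmapped.
    """
    lifted = set()
    unmapped = set()
    for caller, callee in low_level_calls:
        for function in (caller, callee):
            if function not in entity_of:
                unmapped.add(function)
        if caller in entity_of and callee in entity_of:
            source, target = entity_of[caller], entity_of[callee]
            if source != target:  # simplification: ignore calls inside one entity
                lifted.add((source, target))
    expected = set(high_level_edges)
    return {
        "convergences": expected & lifted,  # expected and found in the code
        "divergences": lifted - expected,   # found in the code but not expected
        "absences": expected - lifted,      # expected but not found in the code
        "unmapped": unmapped,
    }


def mapped_fraction(low_level_calls, entity_of):
    """Share of low-level functions covered by the mapping."""
    functions = {function for call in low_level_calls for function in call}
    if not functions:
        return 1.0
    return sum(1 for function in functions if function in entity_of) / len(functions)
```

The unmapped set and the mapped fraction are one way to read "easy to assess completeness": the engineer can see exactly which part of the low-level model the result does not yet account for.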

26

big data >> little data little brain

story #1: software reflexion models

low-cost (manual)

assessable completeness

predictable

27

[Cubranic & Murphy, ICSE 2003]

story #2: hipikat

wizard-of-oz case study

multiple case study (long sessions)

manual precision/recall

28

[Cubranic, Murphy, Booth and Singer, TSE 2005]

29

30

+ text (e.g., exception trace)

31

big data >> little data big brain

big data = project repositories

little data = recommendations

transformational properties

low-cost (automatic)

situational factors

context (automatic)

32

low-cost because project memory auto-updates

multiple entry points provide auto-context

33

big data >> little data big brain

story #2: hipikat

low-cost (automatic)

context (automatic)

34

story #3: mylyn

three field studies

adoption in practice

[Kersten & Murphy, FSE 2006]

35

demo

36

big data >> little data little brain

big data = workspace information

little data = focused workspace information

transformational properties

low-cost (automatic)

assessable completeness

predictable

situational factors

pervasive

fits in workflow
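
One way to picture the step from workspace information to focused workspace information is an interest score per element that rises with interaction and decays over time. The sketch below is loosely modelled on the degree-of-interest idea associated with Mylyn; the scoring formula, the constants, and the names `degree_of_interest`, `focused_view`, `pinned` and `hidden` are assumptions for illustration, not Mylyn's actual implementation.

```python
def degree_of_interest(events, bump=1.0, decay=0.1):
    """Score each element from an ordered list of interaction events.

    Each interaction adds `bump`; every event since the element was first
    touched subtracts `decay`. Both constants are illustrative assumptions.
    """
    first_seen = {}
    hits = {}
    for index, element in enumerate(events):
        first_seen.setdefault(element, index)
        hits[element] = hits.get(element, 0) + 1
    total = len(events)
    return {
        element: hits[element] * bump - decay * (total - first_seen[element])
        for element in hits
    }


def focused_view(workspace, events, threshold=0.0, pinned=(), hidden=()):
    """Filter the workspace down to elements above the interest threshold.

    `pinned` and `hidden` stand in for a manual override of the automatic
    filtering: pinned elements always show, hidden elements never do.
    """
    scores = degree_of_interest(events)
    shown = []
    for element in workspace:
        if element in hidden:
            continue
        if element in pinned or scores.get(element, float("-inf")) > threshold:
            shown.append(element)
    return shown
```

Because the score is a simple function of the interaction history, a developer can in principle predict why an element is shown or filtered, and the override keeps the automatic behaviour from getting in the way.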

37

complete, predictable, automatic (with manual override)

pervasive

38

folding/ &

content assist

test

change sets

workflow

39

big data >> little data little brain

story #3: mylyn

pervasive

fits in workflow

low-cost (automatic)

assessable completeness

predictable

40

story #4: spyglass

longitudinal case study

controlled lab study

[Viriyakattiyaporn & Murphy, CASCON 2010]

41

42

big data >> little data big brain

big data = user interactions in workspace

little data = recommended commands

transformational properties

low-cost (automatic)

situational factors

(mostly) fits in workflow

43

automatic

(mostly) fits in workflow

44

big data >> little data big brain

story #4: spyglass

(mostly) fits in workflow

low-cost (automatic)

45

lessons

46

lessons about transformational factors

big data >> little data

low-cost (manual or automatic)

assessable completeness

predictable

47

reflexion model, hipikat, mylyn, spyglass

low-cost

assessable completeness

predictable

need sufficiently high precision/recall?

48

lessons about situational factors

little data little brain

context (automatic)

pervasive

workflow

49

reflexion model, hipikat, mylyn, spyglass

context (automatic)

~ (3.6)

pervasive

workflow

~

provide intent automatically?

how to handle specialized tools?

big brain

50

51

meghan allen

john anvik

elisa baniassad

wesley coelho

davor cubranic

brian de alwis

rob elves

thomas fritz

jan hannemann

lyndon hiew

reid holmes

mik kersten

seonah lee

shawn minto

sarah rastkar

martin robillard

izzet safer

david shepherd

ducky sherwood

apple viriyakattiyaporn

annie ying

trevor young

robert walker

and others!

52

take away

53

54

problem

result

big data >> little data little brain transformational factors

situational factors

www.cs.ubc.ca/~murphy

www.tasktop.com