1 core empirical concepts and skills for computer science grant braught dickinson college...

17
1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson .edu Craig Miller DePaul University [email protected] .edu David Reed Creighton University davereed@creighton. edu Talk outline: • Overview of the project • List of core competencies • Sample exercises/laboratories • Web repository

Upload: scott-beasley

Post on 26-Dec-2015

215 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

1

Core Empirical Concepts and Skills for Computer Science

Grant BraughtDickinson College

[email protected]

Craig MillerDePaul University

[email protected]

David ReedCreighton University

[email protected]

Talk outline:• Overview of the project• List of core competencies• Sample exercises/laboratories• Web repository

Page 2: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

2

Overview of the project CS demands basic competencies in experimentation and data analysis

e.g., choosing a data structure/algorithm, optimizing a server, comparing CPUs, GUI design

But little attention has been paid to empirical aspects in CS curriculacan find examples of experimentation in specific courses, but no systematic approachsee SICGSE 2001 paper by authors for more details/motivation

In 2003, we began work on an NSF CCLI project (NSF DUE—0230950) to

1. Formalize a set of core empirical concepts and skills needed by today's computer science graduate.

2. Develop and evaluate resources for integrating the presentation and development of those concepts and skills throughout the computer science curriculum.

3. Disseminate our approach for adoption at other institutions.

Page 3: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

3

Core empirical competencies KEY POINT 1: any list of core concepts & skills must be incremental

as with programming, must develop understanding over time introduce foundational concepts early, revisit often, build on existing knowledge as new

applications arise applications become more open-ended as skills develop

KEY POINT 2: it must also complement traditional CS curricula CS curricula are already "full", faculty are stretched to integrate concepts/skills with minimal effort, organize around traditional practice

Introductory level: augment programming with simulation/experimentation/analysis instill basic understanding that programs are used to solve problems, provide informal tools

Intermediate level: provide formal measures/techniques for core CS, software testing transition from following prescribed experiments to successfully designing experiments

Advanced level: build upon foundations, add domain-specific concepts where needed students are expected to design, conduct, and analyze experiments in various settings

Page 4: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

4

Level Concepts Skills

Introductory mean vs. median informal definition of probability chance error expected values Law of Large Numbers consistency vs. accuracy benefits/limitations of models

Be able to conduct a well-defined experiment, summarize the results, and compare with the expected results.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as the clarity of the hypothesis, the sample size, and data consistency.

Intermediate uniform vs. normal distribution standard deviation sampling large populations methods for obtaining a good sample curve-fitting (e.g., linear regression) data presentation (e.g., tables, scatter-

plots, histograms) software testing & debugging strategies validity threats (e.g., confounding variables,

non-generalizability)

Be able to plot and describe the relation between two variables, such as problem size vs. efficiency when analyzing algorithms.

Be able to use a sample to make predictions about a population.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as appropriate statistical measures, relevance of a model, and generality of the conclusions.

Be able to design a test suite for a software project and conduct systematic debugging.

Advanced standard error confidence intervals measures of goodness of fit (e.g.,

correlation coefficient) observational study vs. controlled

experiment correlation vs. causality significance tests (e.g., t-test, z-score)

Be able to apply experimental methods to the analysis of complex systems in different domains.

Be able to formulate testable hypotheses and design experiments for supporting or refuting those hypotheses.

Page 5: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

5

Introductory level examples

Three Monte-Carlo simulations Estimating the number of 4 letter words recognized by the spell checker Estimating the value of Comparing country drunkards to city drunkards

These simulation assignments: Incrementally develop consistency and accuracy concepts Emphasize computer simulation as a tool for scientific investigation Can be easily adapted to reinforce traditional programming topics

Page 6: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

6

Estimating the number of 4 letter words

The question How many English words can be formed using only the letters “etaoinshrd”?

The experiment Generate 1000 random 4 letter sequences using the ten letters “etaoinshrd” Count the number of sequences recognized as English words Estimate the total number of 4 letter words

The analysis Compare results from multiple trials for “consistency”

Intuitive binary definition of consistency: Within 10% of the mean Compare results to actual value using percent error

Total #of 4 letter words ≈#of recognized sequences

1000⋅104

Page 7: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

7

Refining consistency and accuracy

Estimating by throwing darts

The experiment Estimate using 100, 1000 and 10,000 darts

The analysis Relative consistency: Range as a percentage of the mean Relationship between consistency and accuracy

Effects of systematic error

darts in circle

total darts=area of circle

area of square=π

4

Page 8: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

8

Further refining consistency

Country drunkards vs. city drunkards Which has a greater expected net distance, a fully random walk or a Manhattan

random walk?

The experiment Simulate fully random and Manhattan random walks

The analysis Compare the mean net distances Compare the distribution and consistency of net distances

Formal definition of consistency using standard deviation

Fully random walk Manhattan walk

Page 9: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

9

Intermediate level example Optimization of the average case for quick sort

At what array size should quick sort switch to insertion sort?

Select the switch point (M) to maximize the expected time saved by using insertion sort

Quick Sort vs. Insertion Sort

0

0.002

0.004

0.006

0.008

0.01

0.012

0 5 10 15 20 25 30 35 40

Array Size (n)

Time (ms)

Quick Sort Insertion Sort Q(n) I(n)

Expected Time Savings ≈1

MQ(i) − I(i)( )

i=1

M

Page 10: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

10

Level Concepts Skills

Introductory mean vs. median informal definition of probability chance error expected values Law of Large Numbers consistency vs. accuracy benefits/limitations of models

Be able to conduct a well-defined experiment, summarize the results, and compare with the expected results.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as the clarity of the hypothesis, the sample size, and data consistency.

Intermediate uniform vs. normal distribution standard deviation sampling large populations methods for obtaining a good sample curve-fitting (e.g., linear regression) data presentation (e.g., tables, scatter-

plots, histograms) software testing & debugging strategies validity threats (e.g., confounding variables,

non-generalizability)

Be able to plot and describe the relation between two variables, such as problem size vs. efficiency when analyzing algorithms.

Be able to use a sample to make predictions about a population.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as appropriate statistical measures, relevance of a model, and generality of the conclusions.

Be able to design a test suite for a software project and conduct systematic debugging.

Advanced standard error confidence intervals measures of goodness of fit (e.g.,

correlation coefficient) observational study vs. controlled

experiment correlation vs. causality significance tests (e.g., t-test, z-score)

Be able to apply experimental methods to the analysis of complex systems in different domains.

Be able to formulate testable hypotheses and design experiments for supporting or refuting those hypotheses.

Page 11: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

11

Advanced level example: theoretical vs. empirical

For many domains, performance can be estimated in two ways: Theoretically --- analyze and sum component steps Empirically --- time a sample of actual performances

Example: Measuring the usability of an interface in terms of the time needed to complete the task Theoretically: analyze user actions needed to complete the task Empirically: time users performing the task

Page 12: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

12

Estimating task completion times

Theoretical Keystroke-level model (Card, Moran and Newell) Decompose task into the following operations:

Keying, includes key presses and mouse clicks Pointing: moving mouse pointer to position on screen Homing: moving hand from keyboard to mouse or mouse to keyboard

Model provides method for adding mental operations Sum time costs

Empirical Create task instructions Request participants to perform task after providing task instructions Time participants performing task Calculate average and confidence intervals to estimate mean task completion time for a

user population

Page 13: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

13

Comparing analytical and empirical methods

Limitations of analytical approach Parameter assumptions may be poor estimates Some analysis decisions require expert judgment Some sequences of actions may have hidden efficiencies or costs

Limitations of empirical approach Sample of trials may be too small to estimate mean for entire population Participants may not be representative of actual users Test environment may not simulate natural usage

Page 14: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

14

Domain (HCI) concepts and skills

Running a simple usability test Scripting natural instructions Creating natural test environment

Learning relative time costs of user controls Pointing actions are an order of magnitude more costly than keying actions Most interactions can be decomposed into a small set of physical actions

Variation among human behavior

Aggregate human behavior is predictable

Page 15: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

15

Level Concepts Skills

Introductory mean vs. median informal definition of probability chance error expected values Law of Large Numbers consistency vs. accuracy benefits/limitations of models

Be able to conduct a well-defined experiment, summarize the results, and compare with the expected results.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as the clarity of the hypothesis, the sample size, and data consistency.

Intermediate uniform vs. normal distribution standard deviation sampling large populations methods for obtaining a good sample curve-fitting (e.g., linear regression) data presentation (e.g., tables, scatter-

plots, histograms) software testing & debugging strategies validity threats (e.g., confounding variables,

non-generalizability)

Be able to plot and describe the relation between two variables, such as problem size vs. efficiency when analyzing algorithms.

Be able to use a sample to make predictions about a population.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as appropriate statistical measures, relevance of a model, and generality of the conclusions.

Be able to design a test suite for a software project and conduct systematic debugging.

Advanced standard error confidence intervals measures of goodness of fit (e.g.,

correlation coefficient) observational study vs. controlled

experiment correlation vs. causality significance tests (e.g., t-test, z-score)

Be able to apply experimental methods to the analysis of complex systems in different domains.

Be able to formulate testable hypotheses and design experiments for supporting or refuting those hypotheses.

Page 16: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

16

Web repository

Hopefully, this list of core empirical concepts & skills will raise awareness initiate discussion lead to greater integration of empirical methods

The list and example labs can be found at the project Web site:

http://empirical.cs.creighton.edu

A variety of labs are accessible, along with instructor advice & assessment tools More labs will be added over the course of the project

Feedback, ideas, and classroom examples & labs are always welcome (see the feedback form at the Web site)

Page 17: 1 Core Empirical Concepts and Skills for Computer Science Grant Braught Dickinson College braught@dickinson.edu Craig Miller DePaul University cmiller@cs.depaul.edu

17

Level Concepts Skills

Introductory mean vs. median informal definition of probability chance error expected values Law of Large Numbers consistency vs. accuracy benefits/limitations of models

Be able to conduct a well-defined experiment, summarize the results, and compare with the expected results.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as the clarity of the hypothesis, the sample size, and data consistency.

Intermediate uniform vs. normal distribution standard deviation sampling large populations methods for obtaining a good sample curve-fitting (e.g., linear regression) data presentation (e.g., tables, scatter-

plots, histograms) software testing & debugging strategies validity threats (e.g., confounding variables,

non-generalizability)

Be able to plot and describe the relation between two variables, such as problem size vs. efficiency when analyzing algorithms.

Be able to use a sample to make predictions about a population.

Be able to evaluate the persuasiveness of experimental conclusions, focusing on issues such as appropriate statistical measures, relevance of a model, and generality of the conclusions.

Be able to design a test suite for a software project and conduct systematic debugging.

Advanced standard error confidence intervals measures of goodness of fit (e.g.,

correlation coefficient) observational study vs. controlled

experiment correlation vs. causality significance tests (e.g., t-test, z-score)

Be able to apply experimental methods to the analysis of complex systems in different domains.

Be able to formulate testable hypotheses and design experiments for supporting or refuting those hypotheses.