s519: evaluation of information systems analyzing data: merit ch8

S519: Evaluation of Information Systems

Analyzing data:

Merit

Ch8

Last week

What are the six strategies for determining importance?

What are their pros and cons (Table 7.10)?

Merit determination

Keep in mind what evaluation is: the systematic determination of the quality or value of something (Scriven, 1991). Today we will discuss how to determine the merit of evaluand components or dimensions.

Merit determination

It is the process of setting "standards" (definitions of what performance should constitute "satisfactory", "good", etc.) and applying those standards to descriptive data to draw explicitly evaluative conclusions about performance on a particular dimension or component.

Two steps

Step 1: Define what constitutes poor, adequate, good, very good, and excellent performance on a particular dimension (or component).

Step 2: Use this definition to convert empirical evidence into evaluative conclusions (e.g., something explicit about quality or value).

Determining merit

Descriptive facts about performance + quality or value determination guide → evaluative conclusions

Using a single quantitative measure

In a simple case, performance is measured on a single quantitative dimension.

The quality or value determination guide is just a set of cutoffs.

E.g., >90% = A/excellent, 80%-89% = B/good, 70%-79% = C/adequate

E.g., satisfactory/unsatisfactory

Difficulty: where should the cutoff scores go, and how do they compare across different systems?
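A minimal sketch of such a determination guide, using the cutoffs from this slide. Two things are assumptions made only for illustration: each cutoff is treated as an inclusive lower bound (so exactly 90 counts as A), and the label for scores under 70, `below adequate`, is hypothetical because the slide does not name it.

```python
# Cutoffs from the slide, as (lower bound in percent, label).
# Assumption: each bound is inclusive, so a score of exactly 90 counts as A.
CUTOFFS = [(90, "A/excellent"), (80, "B/good"), (70, "C/adequate")]


def determine_merit(score, cutoffs=CUTOFFS, below="below adequate"):
    """Convert one descriptive score (0-100) into an evaluative label.

    `below` is a hypothetical label for scores under the lowest cutoff.
    """
    for lower_bound, label in cutoffs:  # cutoffs are ordered high to low
        if score >= lower_bound:
            return label
    return below


for score in (93, 85, 70, 42):
    print(score, "->", determine_merit(score))
```

Moving a single number in `CUTOFFS` changes the conclusion for every score near that boundary, which is why the placement of the cutoffs is the hard part.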

Exercise

School grading systems

USA: A (>90%), B (80-90%), C (70-79%), D (60-69%), F (<60%)

New Zealand: A (>80%), B (65-79%), C (50-64%), D (35-49%), F (<35%)

Does that mean it is easier to get an A in New Zealand?

Can this way of grading ensure objectivity and consistency of grading across courses? If yes, why? If no, why not?

What happens in your country?

Form a group to discuss.

Using qualitative or multiple measures

Using a single measure is not generally good practice.

When using multiple measures, it is tricky to merge them into a single conclusion.

See Table 8.1.

Experience

Do not try to go for high precision.

It is perfectly appropriate to give an answer that still has a certain amount of fuzziness or uncertainty associated with it.

Please do not oversell the precision of your work.

Providing a well-supported, broad-brush answer to an important question is not a bad idea.

Rubric

A rubric is a tool that provides an evaluative description of what performance or quality "looks like".

It comes in two types:

A grading rubric is used to determine absolute quality or value (e.g., Table 8.2).

A ranking rubric is used to determine relative quality or value.

Rubric for absolute value

A rubric for "grading" is based on:

Discussion with domain experts

Discussion with upstream stakeholders

Existing rules (scope of duties) or literature

Evaluand expectations (needs assessment)

Evaluation context (job market, current situation)

Sample grading rubric 1

Table 8.3

Use Table 8.3 to grade Table 8.1: what is the grade for this master's program? Why?

Do it by yourself first.

Then form a pair and discuss your points.

Sample grading rubric 2

Table 8.4 provides a better grading rubric:

Scope

Performance indicators

Ranking (1-5)

Discuss: how are the ranks developed from the performance indicators?
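One way to picture the three parts of such a rubric (scope, performance indicators, ranking from 1 to 5) is as a small data structure. The sketch below is not taken from Table 8.4: the class name `RubricRow`, the scope, the indicators, and every level description are hypothetical placeholders, and mapping ratings 1 to 5 onto poor through excellent is an assumption.

```python
from dataclasses import dataclass


@dataclass
class RubricRow:
    """One row of a grading rubric (all names here are illustrative)."""
    scope: str        # which part of the evaluand the row covers
    indicators: list  # what evidence is looked at
    levels: dict      # rating (1-5) -> evaluative description

    def describe(self, rating):
        """Return the evaluative description attached to a rating."""
        if rating not in self.levels:
            raise ValueError(f"rating must be one of {sorted(self.levels)}")
        return self.levels[rating]


# Hypothetical content, used only to show the shape of a row.
row = RubricRow(
    scope="Course content (hypothetical scope)",
    indicators=[
        "coverage of required topics (hypothetical indicator)",
        "student feedback on relevance (hypothetical indicator)",
    ],
    levels={
        1: "poor: most required topics missing",
        2: "adequate: required topics covered, with notable gaps",
        3: "good: required topics covered, minor gaps",
        4: "very good: all required topics covered and rated relevant",
        5: "excellent: all topics covered and consistently rated highly relevant",
    },
)

print(row.describe(4))
```

Writing each level as an explicit description tied to the indicators makes it easier to see how a rating follows from the evidence.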

Exercise

Refine Table 8.3:

Identify scopes

Identify performance indicators

Refine the ranking according to the identified indicators

Form a group and discuss.

Exercise

Develop a grading rubric for your evaluation project.

Take Table 8.4 as an example.

Please include scope, performance indicators, and ranking descriptions.

Using a rubric for determining "relative" merit

Relative merit is important for experiments that use a control or comparison group.

E.g., student scores are interpreted by comparison with other similar schools.

It simply tells us how the person or program did relative to peers or competitors.

Using a rubric for determining "relative" merit

Score falls in    Grade assigned
Top 10%           A
Next 20%          B
Next 50%          C
Next 15%          D
Next 5%           F

“Grading on the curve”: instructors rank students and assign grades by percentage band.

E.g., GRE, SAT, GMAT, IQ
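The percentage bands above can be sketched as a rank-based grading function. The function name `grade_on_curve` and the class of 20 students are hypothetical, and the sketch assumes there are no tied scores; a real grading policy would need a rule for ties and for very small groups, where the top band may be empty.

```python
from collections import Counter

# Cumulative upper bounds, in percent of the group, from the rubric above:
# top 10% A, next 20% B, next 50% C, next 15% D, last 5% F.
CURVE = [(10, "A"), (30, "B"), (80, "C"), (95, "D"), (100, "F")]


def grade_on_curve(scores):
    """Return {name: grade} using only each person's rank in the group.

    Assumes no tied scores (ties are ordered arbitrarily by the sort).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)  # best first
    n = len(ranked)
    grades = {}
    for position, name in enumerate(ranked, start=1):
        for upper_percent, letter in CURVE:
            # Integer form of: position / n <= upper_percent / 100
            if position * 100 <= upper_percent * n:
                grades[name] = letter
                break
    return grades


# Hypothetical class of 20 students with scores 61 to 80.
scores = {f"student_{i:02d}": 60 + i for i in range(1, 21)}
grades = grade_on_curve(scores)
counts = Counter(grades.values())
print(dict(counts))
```

Note that the grade depends only on rank: raising every score by 10 points leaves all grades unchanged, which is what separates relative from absolute merit.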

Significance

Statistical significance:

Any observed difference (or statistical relationship) is unlikely to be due to chance.

Practical significance:

Real impact on people's lives

E.g., the difference has a noticeable and nontrivial effect on functioning or performance.

When determining the merit of a particular outcome, we should take both kinds of significance into consideration.
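A rough sketch of checking both kinds of significance on the same data. The scores are made up, the exact permutation test is one possible choice of test rather than a prescribed one, and both thresholds are assumptions: `ALPHA` is the conventional 0.05 level, and `MIN_MEANINGFUL_DIFF` is a hypothetical value that in practice would come from stakeholders or a needs assessment.

```python
from itertools import combinations
from statistics import mean

ALPHA = 0.05               # conventional level, an assumption here
MIN_MEANINGFUL_DIFF = 5.0  # hypothetical smallest difference that matters


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Try every way of splitting the pooled scores into two groups
    # of the original sizes.
    for idx in combinations(range(len(pooled)), len(group_a)):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Made-up scores for a program group and a comparison group.
program = [78, 82, 85, 88, 90]
comparison = [70, 72, 75, 77, 80]

p_value = permutation_p_value(program, comparison)
difference = mean(program) - mean(comparison)

statistically_significant = p_value < ALPHA
practically_significant = abs(difference) >= MIN_MEANINGFUL_DIFF

print(f"difference={difference:.1f}, p={p_value:.4f}")
print(statistically_significant, practically_significant)
```

The two checks answer different questions, so either can be true without the other: a tiny difference can be statistically significant in a large sample, and a large difference can fail the test in a small one.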

Relative merit

Using comparison to determine relative merit:

Benchmark process, outcome, and cost criteria against what has been achieved elsewhere (e.g., by other evaluands in a similar setting).

Benchmarking

It is a systematic study of one or more other organizations' systems, processes, and outcomes to identify ideas for improving organizational effectiveness.

It refers to a process of gathering comparison data about what organizations in similar or related industries are achieving (e.g., about process, outcomes, and costs):

Quantitative data

Qualitative data (observation of processes)

Exercise

Grade Table 8.8 according to the rubric in Table 8.7.

What is your grade? Why?

How can you improve Table 8.7?

Form a group to discuss.

Exercise

Take the grantsmanship workshop and draw up the absolute and relative rubrics to grade this training program.

Absolute rubric (see Table 8.4)

Relative rubric (see Table 8.7)

Write down half a page.

Form a group to discuss.

Exercise

Draw up the absolute and relative rubrics for your evaluation project.

Work together with your project team.

Absolute rubric (see Table 8.4)

Relative rubric (see Table 8.7)

Some hints

Keep an open mind.

Use anything that can help you make good sense of the data.

Balance time and effort.

Balance time and level of detail.
