ppal 6200 research methods and info systems

Post on 12-Jan-2016


PPAL 6200 Research Methods and Info Systems

Intro and Chapter 1

Class Outline

• Intro to the Course

• Discussion of Software and Technical Issues

• Break

• Describing Data “Distributions” with Graphs

Introduction to the Course

• What can you expect to learn in this class
  – A Framework for Conducting and Evaluating Empirical Research
  – A Framework for Conducting and Evaluating Statistical Research
  – The challenges facing those who must deal with information systems as part of their jobs

• What you should not expect to learn in this class
  – A professional capacity to conduct statistical work. At best you will be prepared to learn more about how to undertake statistical work (if you choose to do so) and to be a “knowledgeable consumer” of research prepared using statistics.

• Some Key Concepts to Start Us Off (*source unless noted: Moore (2009))

  – Data
    • Numbers with a context (xxiv). The context, including how data is collected, can alter results.

  – Variable
    • An empirical property that can take on two or more values (Frankfort-Nachmias & Nachmias 1996:50)
    • Don’t get suckered in by small and rapid changes; look at the big picture (xxvii)

  – Case
    • An individual, event or other thing for which we have data

  – Measurement
    • The assignment of numbers to objects, events or variables according to rules (ibid: 156-157)

  – Levels of Measurement
    • Nominal, Ordinal, Interval, Ratio

  – Validity
    • Are you measuring what you thought you were measuring?

  – Reliability
    • Are you measuring it accurately?

  – Spuriousness
    • Is there something else involved? Beware the lurking variable (xxvii)

  – Statistics
    • The science of learning from data (xxiv)

The Book Title Says It All…

• This is a class in the “basic practice of statistics” with a little bit of practical advice thrown in regarding management of information systems

• Inside the front cover of the book is a wonderful set of flow-through figures that show how one can go about statistical thinking in a disciplined manner, along with three four-step plans to guide your work

Some software and technical issues

• For this portion of the class we will quickly review my website, then leave PowerPoint to look at the electronic resources available there to assist you

www.yorku.ca/dcohn/PPAL-6200.html

The Secure Website

Please Note: The secure website will look different for you as I have access to page design resources you will not see

• We will now leave PowerPoint to look at these resources

Describing Data Distributions with Graphs

• As the introductory sections of the book noted, you really cannot go wrong by beginning your work with a visualization of the individual variables that comprise your data (and on occasion plotting them against another variable such as time).

• The distribution tells you what values a variable takes and how often it does so
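To make the idea of a distribution concrete, here is a minimal Python sketch that tabulates what values a variable takes and how often. The list of majors is made-up illustrative data, and the function name `distribution` is an assumption chosen for this example, not something from the book.

```python
from collections import Counter

def distribution(values):
    """Return {value: (count, percent)} for a list of observations."""
    counts = Counter(values)
    total = len(values)
    return {v: (c, 100.0 * c / total) for v, c in counts.items()}

# Hypothetical data: the major chosen by each of eight students (one case per entry).
majors = ["Politics", "Law", "Politics", "Economics",
          "Law", "Politics", "History", "Politics"]

dist = distribution(majors)

# Print the most common values first.
for value, (count, pct) in sorted(dist.items(), key=lambda kv: -kv[1][0]):
    print(f"{value:<10} {count:>2}  {pct:5.1f}%")
```

The same table of counts and percentages is what a pie chart or bar chart of a nominal variable would display.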

Ways we can Visualize and Explore Data

• Exploratory analysis is not meant to allow us to reach any deep conclusions; it is meant to help us better understand the data set and the relationships within it

• We want to look both for an overall pattern (consistencies) and for deviations from it (often called outliers)

• Tables
  – Tables are effective tools for visualizing data, provided that we do not have too many variables, nor too many cases

• At a certain point we need to graphically depict our data to make it understandable as a snapshot

Which Graph?

• The graphic depictions we employ are dependent on:
  – The type of data we have
    • Level of Measurement
    • Whether Stationary or Chronological

Some Common Graphs

• Pie Chart (good for showing percentages when there are few categories of a nominal or ordinal variable)

Percentage of Students Picking a Given Major

• Bar Charts are equally useful for nominal and ordinal variables but have the benefit of allowing more flexibility

Foreign Born Population of US States by Percentage

Histograms

• Histograms can be confusing as they look like Bar Graphs sometimes. In fact you can make them by carefully specifying a Bar Graph. However they are really quite different.

• They are meant for use with Interval and Ratio data where there is a lot of variability among cases because there are so many possible values for the data

• Therefore we have to “group the data” to a certain extent to allow us to represent it

• What a histogram shows is the percentage of cases that have a score within the groups represented by the bars

• You will notice that this graph looks a bit different from the one in the book.

• This is because the scaling that my software used is a bit different from that used by the person who did the examples in the book.
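The grouping step behind a histogram can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made-up data, and the function name `histogram_percentages`, the starting point and the class width are choices made for the example, not the settings used by the book or by any particular software.

```python
def histogram_percentages(values, start, width):
    """Group values into classes [start + k*width, start + (k+1)*width)
    and return a list of (lower, upper, percent_of_cases) rows."""
    n = len(values)
    counts = {}
    for v in values:
        k = int((v - start) // width)      # index of the class that holds v
        counts[k] = counts.get(k, 0) + 1
    rows = []
    for k in range(min(counts), max(counts) + 1):
        lower = start + k * width
        rows.append((lower, lower + width, 100.0 * counts.get(k, 0) / n))
    return rows

# Hypothetical interval/ratio data with many distinct values (20 cases).
scores = [3, 7, 8, 12, 13, 14, 15, 18, 21, 22,
          24, 29, 31, 38, 45, 47, 52, 55, 61, 78]

rows = histogram_percentages(scores, start=0, width=20)

# Each bar's length is the percentage of cases falling in that class.
for lower, upper, pct in rows:
    print(f"{lower:>3} to <{upper:<3} {pct:5.1f}%  " + "*" * round(pct / 5))
```

Choosing a different `start` or `width` changes the bars without changing the data, which is why two packages can draw different-looking histograms of the same variable.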

This brings up a good point

• Be careful how you manipulate data. As you will see in the next section of the talk, these two graphs portray the same information but one will give us a more interesting result.

Describing a Distribution

• Once we get to developing histograms we can start to evaluate the shape of our data in a number of interesting ways (Shape, Centre, Spread)
  – What is the shape of the plot? Is it single peaked or multi-peaked?
  – Where is the peak? Is it at the centre or off-centre (skewed)? When the tail of a distribution heads off to one side unevenly we say it is skewed to that side (this is confusing, because the peak then sits on the opposite side)
  – What about outliers? Any unusually high or low scores?

As you can see below: Regrouping the data makes one figure more symmetrical than the other
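The effect of regrouping can be reproduced with a small Python sketch. The data below are hypothetical, and `class_counts` is a helper name invented for this example; the only point is that the same cases, grouped with a different starting point, give a differently shaped picture.

```python
def class_counts(values, start, width):
    """Count how many values fall in each class of the given width,
    beginning at `start`. Assumes every value is >= start."""
    n_classes = int((max(values) - start) // width) + 1
    counts = [0] * n_classes
    for v in values:
        counts[int((v - start) // width)] += 1
    return counts

# Hypothetical data: 14 cases.
data = [12, 14, 15, 16, 17, 18, 19, 21, 22, 23, 24, 26, 28, 33]

grouping_a = class_counts(data, start=10, width=10)  # classes 10-19, 20-29, 30-39
grouping_b = class_counts(data, start=5, width=10)   # classes 5-14, 15-24, 25-34

for label, counts in (("start at 10", grouping_a), ("start at 5", grouping_b)):
    print(label)
    for c in counts:
        print("  " + "*" * c)
```

With classes starting at 10 the counts fall away from the first bar, while with classes starting at 5 the same data show a single central peak with roughly balanced sides.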

A stemplot is not so elegant

• Granted, it is not so elegant, but it does allow us to figure out what is happening inside of those bars…


Stemplots(Stem-and-Leaf Plots)

• For quantitative variables

• Separate each observation into a stem (first part of the number) and a leaf (the remaining part of the number)

• Write the stems in a vertical column; draw a vertical line to the right of the stems

• Write each leaf in the row to the right of its stem; order leaves if desired
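The steps above can be sketched in Python. This is a minimal illustration that assumes non-negative whole-number observations, with the last digit as the leaf and everything before it as the stem; the function name `stemplot` and the short list of weights are hypothetical choices for the example.

```python
from collections import defaultdict

def stemplot(values):
    """Return the rows of a stemplot as strings.
    Stem = all digits but the last; leaf = the last digit."""
    leaves = defaultdict(list)
    for v in sorted(values):               # sorting puts the leaves in order
        stem, leaf = divmod(v, 10)         # e.g. 203 -> stem 20, leaf 3
        leaves[stem].append(leaf)
    rows = []
    # List every stem from smallest to largest, including empty ones,
    # so that gaps in the data stay visible.
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = "".join(str(d) for d in leaves.get(stem, []))
        rows.append(f"{stem:>3} | {leaf_text}".rstrip())
    return rows

# Hypothetical weights in pounds.
weights = [135, 130, 152, 150, 157, 192, 203]

for row in stemplot(weights):
    print(row)
```

Keeping the empty stems is a deliberate choice here: dropping them would hide gaps and make outliers look closer to the rest of the data than they are.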


Weight Data

BPS - 5th Ed. Chapter 1 30

Weight Data: Stemplot (Stem & Leaf Plot)

[Figure: stemplot under construction, with stems 10 through 26 listed and the first observations (192, 152, 135) entered as leaves 2, 2 and 5]

Key: 20|3 means 203 pounds

Stems = 10’s, Leaves = 1’s


Weight Data: Stemplot (Stem & Leaf Plot)

10 | 0166
11 | 009
12 | 0034578
13 | 00359
14 | 08
15 | 00257
16 | 555
17 | 000255
18 | 000055567
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0

Key: 20|3 means 203 pounds

Stems = 10’s, Leaves = 1’s


Extended Stem-and-Leaf Plots

If there are very few stems (when the data cover only a very small range of values), then we may want to create more stems by splitting the original stems. In other words, you can have more than one stem with the same base number.


Extended Stem-and-Leaf Plots

Example: if all of the data values were between 150 and 179, then we may choose to use the following stems:

15
15
16
16
17
17

Leaves 0-4 would go on each upper stem (first “15”), and leaves 5-9 would go on each lower stem (second “15”).
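A split-stem version can be sketched in Python as follows. It assumes non-negative whole numbers and follows the rule just described (leaves 0-4 on the first copy of a stem, 5-9 on the second); the function name `split_stemplot` and the values between 150 and 179 are hypothetical.

```python
def split_stemplot(values):
    """Stemplot in which every stem appears twice:
    leaves 0-4 on the first row, leaves 5-9 on the second."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        half = 0 if leaf <= 4 else 1       # 0 = upper row, 1 = lower row
        rows.setdefault((stem, half), []).append(leaf)
    first_stem = min(rows)[0]
    last_stem = max(rows)[0]
    lines = []
    for stem in range(first_stem, last_stem + 1):
        for half in (0, 1):
            leaf_text = "".join(str(d) for d in rows.get((stem, half), []))
            lines.append(f"{stem:>3} | {leaf_text}".rstrip())
    return lines

# Hypothetical data, all between 150 and 179.
values = [150, 152, 155, 157, 157, 161, 163, 165, 168, 170, 172, 175, 179]

for line in split_stemplot(values):
    print(line)
```

With only three original stems, an ordinary stemplot of these values would have three long rows; splitting gives six rows and shows more of the shape.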

Thinking about these Graphs

• When we look at these graphs we have to keep in mind the questions we started with
  – Shape
  – Centre (other than time-series)
  – Outliers
