data collection. what is data? data is facts and statistics that are collected together we use data...

Data Collection

Upload: shavonne-nelson

Post on 11-Jan-2016


TRANSCRIPT

Page 1: Data Collection. What is Data? Data is facts and statistics that are collected together We use data to be able to gather information for reference or

Data Collection

What is Data?

Data is facts and statistics that are collected together

We use data to be able to gather information for reference or analysis

We translate the information we find into a form that is more convenient to understand using charts and graphs

Types of Data

Numerical Data

Categorical Data

Numerical Data

Numerical data is quantitative

This means it can be measured using numbers and these numbers can be placed in ascending or descending order

We use scatter plots and line graphs to represent numerical data

There are two types of numerical data – discrete and continuous

Discrete Numerical Data

Discrete means the numbers used to measure the data have to be whole numbers

The numbers must be distinct and separate

Examples of discrete numerical data would be age in whole years, number of kittens, number of people, etc

Continuous Numerical Data

Continuous means the numbers used to measure the data can be any number including decimals

Examples of continuous data would be temperature, time, and height

Categorical Data

Categorical data is data that can be sorted into groups or categories

Categorical data is qualitative meaning it describes something

We use bar graphs and pie charts to display categorical data

There are two different types of categorical data – nominal data and ordinal data

Nominal Categorical Data

Nominal data can be counted but not put in ascending or descending order (sorted)

Nominal data makes sense regardless of the order it is presented

Examples of nominal data include gender, eye colour, hair colour, etc

Ordinal Categorical Data

Values or observations that are ordinal can be ranked or have a scale attached

You can count and order ordinal data, but it cannot be measured like numerical data

Examples of ordinal data include house numbers, dates, swimming level, etc
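The difference between nominal and ordinal data can be sketched in a few lines of Python. This is only an illustration: the eye colours, the swimming level names, and their ranking are made-up values, not data from the slides.

```python
from collections import Counter

# Nominal data: categories can be counted, but they have no natural order.
eye_colours = ["brown", "blue", "brown", "green", "blue", "brown"]  # made-up values
colour_counts = Counter(eye_colours)

# Ordinal data: categories have a rank, so they can be counted AND ordered.
# The level names and their order are hypothetical.
LEVEL_ORDER = ["Beginner", "Intermediate", "Advanced"]
rank = {level: position for position, level in enumerate(LEVEL_ORDER)}

swimmers = ["Advanced", "Beginner", "Intermediate", "Beginner"]
ordered = sorted(swimmers, key=rank.__getitem__)

print(colour_counts["brown"])  # 3
print(ordered)  # ['Beginner', 'Beginner', 'Intermediate', 'Advanced']
```

Sorting the eye colours alphabetically would be possible, but the resulting order would say nothing about the data, which is what makes them nominal.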

Data Collection

Data collection is separated into two types: primary data and secondary data

Primary data is collected first hand

Secondary data is data that was collected by somebody else

Primary Data - Examples

Surveys

Focus groups

Questionnaires

Personal interviews

Experiments and observational study

Primary Data - Limitations

Do you have the time and money for:

◦ Designing your collection instrument?
◦ Selecting your population or sample?
◦ Pretesting/piloting the instrument to work out sources of bias?
◦ Administration of the instrument?
◦ Entry/collation of data?

• Uniqueness
• May not be able to compare to other populations

• Researcher error
• Sample bias
• Other confounding factors

Secondary Data – Examples of Sources

County health departments

Vital Statistics – birth, death certificates

Hospital, clinic, school nurse records

City and county governments

Surveillance data from state government programs

Federal agency statistics - Census, NIH, etc.

Secondary Data – Limitations

When was it collected? For how long?

◦ May be out of date for what you want to analyze.
◦ May not have been collected for a long enough time

• Is the data set complete?
• There may be missing information on some observations
• Unless such missing information is caught and corrected for, analysis will be biased.
• Is the data consistent/reliable?
• Did variables drop out over time?
• Did variables change in definition over time?
• E.g. number of years of education versus highest degree obtained.

Secondary Data – Advantages

No need to reinvent the wheel.

◦ If someone has already found the data, take advantage of it.

It will save you money. Even if you have to pay for access, it is often cheaper than collecting your own data. (More on this later.)

It will save you time. Primary data collection is very time consuming. (More on this later, too!)

It may be very accurate. When a government agency in particular has collected the data, a great deal of time and money went into it, so it is probably highly accurate.

Data Collection

When collecting data from a group, we can do it in two ways:

Observational data or experimental data

Observational Data

Observational data is collected by grouping people into different categories and observing how something affects them, without imposing any treatment on them

An example of observational data collection would be to separate a group into adults vs children and compare the effects of sunlight on them

Experimental Data

Experimental data is collected by creating our own groups and imposing our own treatment on the groups to see the effects

An example of experimental data collection would be administering a placebo to one group and the real drug to another, then comparing the results

Data Collection

We use data collection to be able to obtain information on a smaller group and extend it to a larger population

The most important thing to remember is that the group we select must represent the population as a whole

It is very difficult to ensure this happens

Population vs Sample

Population – the entire group being studied. Example: How many families in Canada have internet?

Sample – the part of the population that is being studied. Example: We would not be able to ask every family in Canada if they have internet. Instead, we would select smaller groups from each province and territory and extend the results to the entire country

We select a sample from an entire population so that it is easier to get the information we need

We use various sampling techniques to select our sample. Example: Our survey would not be very valid if we selected only families in southern parts of Canada where internet is more easily accessible.
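A small simulation can show how a sample is used to estimate something about a whole population. Everything in this sketch is assumed for illustration: the population of 10,000 families and the 80% internet share are invented numbers, not Canadian statistics.

```python
import random

# Invented population: 10,000 families, 80% of which have internet (True).
population = [True] * 8000 + [False] * 2000
true_share = sum(population) / len(population)

# Ask only 500 randomly chosen families instead of all 10,000.
rng = random.Random(42)  # fixed seed so the run is repeatable
sample = rng.sample(population, 500)
estimated_share = sum(sample) / len(sample)

print(f"True share:      {true_share:.2f}")
print(f"Estimated share: {estimated_share:.2f}")
```

Because the sample is drawn at random from the whole population, the estimate tends to land close to the true share. Drawing only from one region, as in the southern Canada example, would remove that guarantee.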

Characteristics of a Good Sample

Each person must have an equal chance of being selected into the sample.

The sample must be large enough to represent the population

We use various sampling techniques to ensure this happens

Simple Random Sample

Every member of the population has an equal chance of being picked

Example: Putting names in a hat and drawing at random
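Drawing names from a hat can be sketched with Python's standard `random` module. The names and the helper `simple_random_sample` are hypothetical, chosen only for this illustration.

```python
import random

# Hypothetical population: these names are made up.
population = ["Ana", "Ben", "Chloe", "Dev", "Emma", "Farid", "Grace", "Hugo"]

def simple_random_sample(members, size, seed=None):
    """Draw `size` distinct members; each member is equally likely to be picked."""
    rng = random.Random(seed)
    return rng.sample(members, size)

sample = simple_random_sample(population, 3, seed=1)
print(sample)
```

`random.sample` draws without replacement, which matches a hat draw where a name is not put back after being picked.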

Systematic Random Sample

Go through a population sequentially and select members at even intervals

Example: Going through a phone book and selecting every 50th person
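The phone book example might look like the sketch below. The 500-entry phone book is invented, and choosing the starting position at random is an assumption added here; the slide only says to take every 50th person.

```python
import random

def systematic_sample(members, interval, seed=None):
    """Pick a random start within the first interval, then every `interval`-th member."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # assumption: random starting position
    return members[start::interval]

# Invented phone book with 500 entries.
phone_book = [f"Person {i}" for i in range(1, 501)]
chosen = systematic_sample(phone_book, 50, seed=0)

print(len(chosen))  # 10
```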

Stratified Sample

A stratum (plural: strata) is a group of subjects that share a common characteristic

Each stratum appears in the sample in the same proportion as it appears in the population

Example: If the population has both men and women, you ensure men and women are in the sample in the same proportions as in the population
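A proportionate stratified sample can be sketched as follows. The population of 60 men and 40 women, the 10% sampling fraction, and the helper `stratified_sample` are all assumptions made for this illustration.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(members, stratum_of, fraction, seed=None):
    """Sample the same fraction from every stratum so proportions are preserved."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for member in members:
        strata[stratum_of(member)].append(member)
    sample = []
    for group in strata.values():
        size = round(len(group) * fraction)
        sample.extend(rng.sample(group, size))
    return sample

# Invented population: 60 men ("M") and 40 women ("F").
population = [("M", i) for i in range(60)] + [("F", i) for i in range(40)]
sample = stratified_sample(population, lambda person: person[0], 0.10, seed=0)

counts = Counter(stratum for stratum, _ in sample)
print(counts["M"], counts["F"])  # 6 4
```

The 60:40 split in the population is kept in the sample, which is the point of sampling each stratum separately.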

Cluster Sample

One representative group of the population chosen at random

Example: Picking one floor of an office building and surveying them
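The office building example can be sketched like this. The floors and the people on them are invented, and `cluster_sample` is a hypothetical helper.

```python
import random

# Invented office building: each floor is one cluster.
floors = {
    "Floor 1": ["Ana", "Ben", "Chloe"],
    "Floor 2": ["Dev", "Emma"],
    "Floor 3": ["Farid", "Grace", "Hugo", "Ines"],
}

def cluster_sample(clusters, seed=None):
    """Choose one whole cluster at random and survey everyone in it."""
    rng = random.Random(seed)
    name = rng.choice(sorted(clusters))
    return name, list(clusters[name])

floor, people = cluster_sample(floors, seed=3)
print(floor, people)
```

Unlike a simple random sample, the randomness here applies to the group, not to the individuals: once a floor is chosen, everyone on it is surveyed.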

Multi-Stage Sampling

Using a combination of stages to obtain the sample
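One common way to combine stages, shown here as an assumption since the slide does not name the stages, is to pick clusters at random first and then take a simple random sample inside each chosen cluster. The floors and names are invented.

```python
import random

# Invented office building: each floor is one cluster.
floors = {
    "Floor 1": ["Ana", "Ben", "Chloe"],
    "Floor 2": ["Dev", "Emma", "Farid"],
    "Floor 3": ["Grace", "Hugo", "Ines"],
    "Floor 4": ["Jo", "Kai", "Lena"],
}

def multi_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick members within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

result = multi_stage_sample(floors, n_clusters=2, n_per_cluster=2, seed=5)
print(result)
```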

Convenience Sample

A type of sampling technique that is based on how easy responses are to obtain

Example: Surveying people stranded at an airport during a snowstorm about air travel

Voluntary Response Sampling

Inviting subjects to voluntarily be a part of the sample

Example: Receiving a survey in the mail and being asked to complete it, random phone surveys from businesses

Problems with Data Collection

Questions must be simple, clear, specific, ethical, free from bias, allow for honest response, and not infringe on anyone’s privacy

Questions must not contain slang, abbreviations, negatives, leading wording, or insensitive language

Good questions are often anonymous and require the subject to select from a list of possible responses

Survey bias can be unintentional, but can cause the data collected to be invalid. There are many different types of bias

Sampling Bias

The chosen sample does not accurately reflect the population

Example: Asking basketball players about issues with the math curriculum

Non-Response Bias

Particular groups are under-represented in the sample because they choose not to participate

When people selected for the sample don’t respond, the surveyor is forced to draw their own conclusions about the sample

Measurement Bias

When the data collection method consistently under- or overestimates a characteristic of the population

Leading questions can also cause measurement bias

Example: A police radar gun measuring the average speed on a particular road (drivers who see it tend to slow down, so the measured speeds are too low)

Response Bias

When participants in a survey give false or misleading answers

Question quality or topic might lead to response bias

Example: Teacher asks the class to raise hands if they completed their homework

Tally Charts

A tally chart is a table used to record values by hand as the data is collected.

One tally mark is used for each occurrence of a value

Tally marks are usually grouped into sets of five to allow for easier counting
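A tally chart can be imitated in code by counting each value and printing the marks in groups of five. The eye colour observations are invented, and writing a full group as `|||||` is just a plain-text stand-in for the usual four strokes with a fifth drawn across them.

```python
from collections import Counter

def tally_marks(count):
    """Render a count as tally marks, grouped into sets of five."""
    fives, rest = divmod(count, 5)
    groups = ["|||||"] * fives
    if rest:
        groups.append("|" * rest)
    return " ".join(groups)

# Invented observations, recorded one at a time as they are collected.
observations = ["brown", "blue", "brown", "green", "brown",
                "blue", "brown", "brown", "brown", "brown"]
tally = Counter(observations)

for value, count in tally.items():
    print(f"{value:<6} {tally_marks(count)}")

print(tally_marks(7))  # ||||| ||
```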

Frequency Tables

Tally charts are helpful during the collection of data

Once the data is collected, it is more useful to summarize the data into what we call a frequency table.

A frequency table shows the data numerically

Number of days with rain    Number of weeks
0                           2
1                           5
2                           5
3                           5
4                           19
5                           6
6                           6
7                           4
Total                       52
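The rainy-days table can be rebuilt from raw observations with a counter. This sketch assumes the weekly counts read as 2, 5, 5, 5, 19, 6, 6, 4 for 0 to 7 days of rain, which is the only way to split the digits so that they add up to the stated total of 52.

```python
from collections import Counter

# Assumed reading of the table: days of rain in a week -> number of weeks.
rain_days_to_weeks = {0: 2, 1: 5, 2: 5, 3: 5, 4: 19, 5: 6, 6: 6, 7: 4}

# Expand back into one observation per week, as it would have been collected.
weekly_observations = [
    days for days, weeks in rain_days_to_weeks.items() for _ in range(weeks)
]

# Summarize the raw observations into a frequency table.
frequency = Counter(weekly_observations)
total_weeks = sum(frequency.values())

print("Days with rain  Weeks")
for days in sorted(frequency):
    print(f"{days:<15} {frequency[days]}")
print(f"{'Total':<15} {total_weeks}")
```

Checking that the frequencies add up to the number of observations (52 weeks in a year) is a quick way to catch counting mistakes.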