Using Excel Biostatistics 212 Lecture 4

Upload: trevin-cordier

Post on 14-Dec-2015

TRANSCRIPT

Page 1: Using Excel Biostatistics 212 Lecture 4. Housekeeping Questions about Lab 3? –replace vs. recode Final Project Dataset! –“Housekeeping” commands vs. data

Using Excel

Biostatistics 212

Lecture 4

Housekeeping

• Questions about Lab 3?
  – replace vs. recode

• Final Project Dataset!
  – “Housekeeping” commands vs. data cleaning (don’t show data cleaning)

• A little short-handed today…

Today...

• Why are we talking about spreadsheets?
• Pros and cons of using a spreadsheet for:
  – Data management, Statistics, Calculating, Modeling, Tables, Figures
• Cells
• Formulas
• Cutting and pasting formulas
• Spreadsheet style
• Examples

Why spreadsheets?

• Excel is widely used, and for good reason
  – Store numbers and text
  – Calculations
  – Desktop graphics
  – Tables and Figures
  – Flexible creation of ledgers, models, other complex programs

Why spreadsheets?

• How is a spreadsheet different from Stata’s data editor?
  – Less structured
  – Formulas
  – Formatting

Why spreadsheets?

• How is a spreadsheet different from a database program like Access?
  – Less structured
  – Formula chains
  – Formatting

Pros and cons of spreadsheets

• For data management
  – Pros
    • Easy start – just name columns and start typing
  – Cons
    • No structure
    • Can’t sort, filter or query data as flexibly as a database
    • “Flat” file – no relational table structure allowed

Pros and cons of spreadsheets

• For statistical analysis
  – Pros
    • Easy start, if you know how to do formulas
  – Cons
    • Extremely limited range of options
    • Difficult to document

Pros and cons of spreadsheets

• For calculating, or “modeling”
  – Pros
    • Repetitive calculations easy
    • Complex calculations easy
  – Cons
    • Simple, 1-time calculations not as fast as a calculator
    • Sometimes hard to decipher in retrospect

Pros and cons of spreadsheets

• Tables and Figures – will discuss in Sessions 6 and 7

Cells

• The basic building block of a spreadsheet

• Can contain:
  – Numbers
  – Text
  – Dates, times, other special formats

  – “blanks”
    • Start with about 50 million blank cells! (256 cols x 65,536 rows x 3 worksheets in Excel 97–2003)

Cells, cont.

• Enter anything you like into each cell (numbers, text, symbols, etc.) using the keyboard

• Contents displayed on spreadsheet

• Organized and named by column/row

Formulas

• Use when you want the contents of one cell to depend on the contents of other cells

• ALWAYS starts with: = (an “equals sign”)

Formulas

• Can contain:
  – Numbers
  – Text
  – References to cells
  – The usual math operators (+ - * / ^ )
  – Built-in functions
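A minimal sketch of how these ingredients can be combined in one small calculation. The body-mass-index scenario, the cell addresses, the values, and the "cell: contents" listing with "<-" notes are all assumptions made for illustration; they are not from the lecture, and only the part after the colon would be typed into the cell.

```excel
B2: 70              <- number typed in (weight in kg, made-up value)
B3: 1.75            <- number typed in (height in m, made-up value)
B4: =B2/B3^2        <- cell references with the / and ^ operators
B5: =ROUND(B4,1)    <- built-in function applied to a cell reference
B6: ="BMI: "&B5     <- text joined to a cell value with &
```

Excel applies ^ before /, so B4 is weight divided by height squared (about 22.86). B5 then shows 22.9 and B6 shows the text BMI: 22.9.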

Formulas

• Cell contents update automatically when a referenced cell content changes

• “Chains” of formulas make for flexible calculating
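A small sketch of a formula chain, assuming a hypothetical three-cell layout (the addresses and values are made up, and the "<-" notes are not part of the cell contents):

```excel
A1: 120       <- input typed in (made-up value)
A2: =A1*2     <- depends on A1
A3: =A2+10    <- depends on A2, and so indirectly on A1
```

With 120 in A1, A2 shows 240 and A3 shows 250. Typing 100 into A1 changes A2 to 200 and A3 to 210 with no further editing.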

Formulas

• Contents of a cell displayed on spreadsheet

• The formula determining that content is displayed in the “formula box”

• Example

Formulas

• Types of formulas
  – Arithmetic
    • +, -, *, /, ^
  – Logic
    • IF(boolean, value 1, value 2)
      – Returns value 1 if TRUE, value 2 if FALSE
    • AND(boolean, boolean, boolean…)
      – Returns TRUE if all booleans are true, otherwise FALSE
    • OR(boolean, boolean, boolean…)
      – Returns TRUE if any boolean is true, otherwise FALSE
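The logic functions above might be used as in the following sketch. The age and blood-pressure columns, the cutoffs, and the cell addresses are hypothetical choices for illustration only; the "<-" notes are not part of the cell contents.

```excel
A2: 70                             <- age in years (made-up value)
B2: 130                            <- systolic BP in mmHg (made-up value)
C2: =IF(A2>=65,"older","younger")  <- picks one of two values
D2: =AND(A2>=65,B2>=140)           <- TRUE only if both conditions hold
E2: =OR(A2>=65,B2>=140)            <- TRUE if at least one condition holds
F2: =IF(AND(A2>=65,B2>=140),1,0)   <- logic nested inside IF to make a 0/1 flag
```

With these values, C2 shows older, D2 shows FALSE, E2 shows TRUE, and F2 shows 0.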

Formulas

• Types of formulas, cont.
  – Functions, for example:
    • SUM(range of cells)
      – Returns the sum of the values in the range
      – SUM(A5:A10)
    • AVERAGE(range of cells)
      – Returns the average of the values in the range
    • STDEV(range of cells)
      – Returns the (sample) standard deviation
    • NORMINV(probability, mean of dist, SD of dist)
      – Returns the value below which the given cumulative probability falls for that normal distribution (the z-value when mean = 0 and SD = 1)…
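A short sketch of these functions on a made-up column of six numbers. The data, the cell addresses, and the "<-" notes are assumptions for illustration and are not from the lecture.

```excel
A5:  2
A6:  4
A7:  4
A8:  4
A9:  5
A10: 5
B5: =SUM(A5:A10)          <- 24
B6: =AVERAGE(A5:A10)      <- 4
B7: =STDEV(A5:A10)        <- sample SD, about 1.10
B8: =NORMINV(0.975,0,1)   <- about 1.96, the familiar z-value for a two-sided 95% interval
B9: =NORMINV(0.975,100,15) <- same percentile on a scale with mean 100 and SD 15, about 129.4
```

B8 and B9 show why NORMINV only returns a z-value when the mean is 0 and the SD is 1; otherwise it returns the corresponding value on the distribution's own scale.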

Formulas

• Types of formulas, cont.
  – Functions, for example:
    • LN(number)
      – Returns the natural log of a number
    • ABS(number)
      – Returns the absolute value of a number
    • LEFT(text, number of characters=x)
      – Returns x number of characters from the text in the cell, starting at the left side…
    • NOW()
      – Returns the current date, time
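A brief sketch of these four functions, assuming hypothetical inputs in column A (the values, addresses, and "<-" notes are made up for illustration):

```excel
A1: 7.389          <- made-up number, close to e squared
A2: -3.2           <- made-up number
A3: Smith, John    <- made-up text
B1: =LN(A1)        <- approximately 2
B2: =ABS(A2)       <- 3.2
B3: =LEFT(A3,5)    <- Smith
B4: =NOW()         <- current date and time; recalculates whenever the sheet does
```

How B4 is displayed depends on the cell's date/time format.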

Formulas

• Tips
  – Use parentheses
    • IF(SUM(A5:A10)>5,1,IF(C9="y",2,3))
  – Or do in multiple steps
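One way the "multiple steps" tip could look for the nested IF above, assuming three hypothetical helper cells D1:D3 (the helper addresses are an illustration, not from the lecture; the "<-" notes are not part of the cell contents):

```excel
D1: =SUM(A5:A10)       <- step 1: the total
D2: =IF(C9="y",2,3)    <- step 2: the inner IF on its own
D3: =IF(D1>5,1,D2)     <- step 3: the outer IF, referring to the two helper cells
```

D3 gives the same result as the single nested formula, but each intermediate value is visible on the sheet, which makes the logic easier to check later. Note that Excel needs straight double quotes around text such as "y".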

Cutting/Copying and Pasting

• Cutting and Copying treat formulas differently!

Cutting and pasting formulas

• Excel assumes the cell references are ABSOLUTE, and you’re just moving the location of the formula cell

• Example
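A sketch of what cutting does, using a hypothetical three-cell layout (addresses and values are made up; the "<-" notes are not part of the cell contents):

```excel
A1: 10
B1: 5
C1: =A1+B1     <- shows 15

Cut C1 and paste into C5:

C5: =A1+B1     <- unchanged; still points at A1 and B1, still shows 15
```

Only the location of the formula moves; the cells it refers to stay the same.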

Copying and pasting formulas

• Excel assumes the cell references are RELATIVE

• Example

• Shortcut: drag little square in the corner…
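A sketch of what copying does, using a hypothetical layout (addresses are made up; the "<-" notes are not part of the cell contents):

```excel
C1: =A1+B1     <- original formula

Copy C1 and paste into C2 (one row down):

C2: =A2+B2     <- each reference shifts one row down

Copy C1 and paste into D1 (one column right):

D1: =B1+C1     <- each reference shifts one column right
```

Dragging the small square in the corner of C1 down the column performs the same copy for every row it covers.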

Copying and pasting formulas

• If you want to FIX the position of a referenced cell, use $’s
  = A5 + $B$6

• Example
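A sketch of a fixed reference, assuming a hypothetical sheet where one shared constant sits in B6 and a formula in column C is copied down (addresses and values are made up; the "<-" notes are not part of the cell contents):

```excel
B6: 1.1            <- shared constant (made-up value)
C5: =A5+$B$6       <- original formula

Copy C5 and paste into C6 and C7:

C6: =A6+$B$6       <- A5 became A6; $B$6 stayed put
C7: =A7+$B$6
```

A $ can also be placed on only one part of the reference: $B6 fixes the column but lets the row shift, and B$6 fixes the row but lets the column shift.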

Examples

• Repetitive calculations
  – Back-transforming linear regression coefficients
• Complex calculations
  – 2 x 2 template
• Modeling
  – Mortgage calculator
  – Risk integrator
  – Figure 2 for LDL-lowering paper
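As a rough sketch of the repetitive-calculation case, suppose the outcome was natural-log transformed before fitting, so each coefficient is back-transformed with EXP. That assumption, the column layout, the values, and the 1.96 multiplier for a 95% interval are illustrative choices here, not the lecture's actual spreadsheet; the "<-" notes are not part of the cell contents.

```excel
A2: 0.25               <- coefficient on the log scale (made-up value)
B2: 0.10               <- its standard error (made-up value)
C2: =EXP(A2)           <- back-transformed estimate (a ratio)
D2: =EXP(A2-1.96*B2)   <- lower 95% confidence limit
E2: =EXP(A2+1.96*B2)   <- upper 95% confidence limit
```

Because the references are relative, copying C2:E2 down the sheet repeats the same calculation for every coefficient listed in columns A and B.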

Spreadsheet style

• Formatting
  – Text
  – Column width
  – Borders
  – Placement of stuff on the page

Spreadsheet style

• For models:
  – Inputs on the left, in red
  – Outputs on the right, in blue, boxed, bolded, etc.
  – Calculations on other sheets
  – “Protect” all cells besides inputs
    • Format/Cells…/Protection
    • Tools/Protect

Take home points

• Understand cells and formulas

• Use copy/paste with and without fixed cells ($A$45)

• Good formatting adds significant value to your spreadsheet

Lab 4

• Practice with:
  – A repetitive calculation spreadsheet
  – A complex calculation spreadsheet
  – Introduction to making a figure with Excel

• Due before lecture next week

• Extra credit puzzle challenge – 2x2 Excel template
  – Due Sept 18th – email to [email protected]

To come…

• Next lecture
  – Epidemiologic analysis with Stata
    • 2 x 2 tables, confounding and interaction
    • Epitab commands
    • Logistic regression introduction