Tricks in Stata
Anke Huss
Generating "automatic" tables in a do-file
Why programming tables?
• It's much more writing in the do-file!
• BUT: once you have done it, the next one will be faster (copy & paste...)
• No more trouble when your data are updated
• No more copying mistakes, because Stata does it for you
Caerphilly castle
Data used: Caerphilly Prospective Study (CAPS)
download at: www.blackwellpublishing.com/essentialmedstats/datasets.htm
Basic idea
• Use the Stata data sheet for your table-to-be
illn       %
MI         19.48
diabetes   1.85
Stored results in r() and e()
• Use stored results, usually from:
r-class: results of general commands such as summarize are saved in r() and generally must be used before executing more commands. For an overview, type:
return list
e-class: results of estimation commands (regress/logistic…) are saved in e() until the next model is fitted. For an overview, type:
ereturn list
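A minimal sketch of how stored results can be inspected and reused. It uses the auto example dataset that ships with Stata instead of the CAPS data, so the variable names mpg and weight are assumptions that are not part of the slides.

```stata
* Example dataset shipped with Stata (not the CAPS data from the slides)
sysuse auto, clear

* r-class: results of a general command
summarize mpg
return list                 // lists r(N), r(mean), r(sd), ...
local m = r(mean)           // copy the value before another command overwrites r()
display "mean mpg = " `m'

* e-class: results of an estimation command
regress mpg weight
ereturn list                // lists e(N), e(r2), e(b), ...
display "N = " e(N)
display "slope = " _b[weight]
```

Copying r(mean) into a local macro right away is a cautious habit: the next r-class command replaces everything in r().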
Steps
1. DESIGN TABLE FIRST: what do I want my table to look like?
2. generate a new variable for each column
3. replace cell with number of interest
4. use "outsheet" to write your new variables to a text/Excel file
Example 1
1. DESIGN FIRST: what do I want my table to look like? E.g.:
Illness %
Myocardial inf 19.48
diabetes 1.85
Example 1
2. Generate a new variable for each column
gen str illness = ""
gen percent = .
Illness %
Example 1
3. Replace cell with contents/ number of interest: first column
sort id
replace illness = "myocardial inf" in 1
replace illness = "diabetes" in 2
Illness %
Myocardial inf
diabetes
Example 1
3. Replace cell with contents/ number of interest: second column
sum mi
sort id
replace percent = r(mean)*100 in 1
sum diabetes
sort id
replace percent = r(mean)*100 in 2
format percent %9.2f
Illness %
Myocardial inf 19.48
diabetes 1.85
Example 1
4. use "outsheet" to write your new variables to a text/Excel file
outsheet illness percent in 1/2 using textres/illns.txt
For further
*comment 1: this works only if you have set Stata's working directory to a specific folder, e.g.: cd "d:/Statistisches/automatic_tables/STATA"
*comment 2: you can also export as an Excel file (*.xls), but automatic import of a new text file lets graphics survive...
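In newer Stata versions the same export step can be sketched with export delimited (Stata 13 or later) or export excel (Stata 12 or later). The output file names below are illustrative assumptions. The folder textres is taken from the slide and must already exist under the working directory.

```stata
* Assumes the variables illness and percent were filled in rows 1 and 2 as above
* Comma-separated text file; replace overwrites an existing file
export delimited illness percent using "textres/illns.csv" in 1/2, replace

* Excel workbook with the variable names in the first row
export excel illness percent using "textres/illns.xlsx" in 1/2, firstrow(variables) replace
```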
Example 1
*Alternative way to do the same: program a small loop:
gen str name = ""
gen percent = .
local i = 1
foreach var of varlist mi diabetes {
    replace name = "`var'" in `i'
    sum `var'
    sort id
    replace percent = r(mean)*100 in `i'
    local i = `i' + 1
}
format percent %9.2f
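A possible variation on this loop: instead of writing the variable name into the first column, the variable label can be used, falling back to the name when no label is set. The column name rowlabel is a hypothetical choice for this sketch.

```stata
* rowlabel is an assumed name for the text column
gen str rowlabel = ""
local i = 1
foreach var of varlist mi diabetes {
    * extended macro function: fetch the variable label
    local lbl : variable label `var'
    * fall back to the variable name if the label is empty
    if "`lbl'" == "" local lbl "`var'"
    replace rowlabel = "`lbl'" in `i'
    local i = `i' + 1
}
```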
Example 2
1. DESIGN TABLE FIRST:
Category percent
underweight 4.20
normal 32.03
overweight 51.29
obese 12.49
Example 2
2. Generate a new variable for each column
gen str category = ""
gen percent = .
Category percent
Example 2
3. Replace cell with contents/ number of interest: first column
sort id
replace category = "underweight" in 1
replace category = "normal" in 2
replace category = "overweight" in 3
replace category = "obese" in 4
Category percent
underweight
normal
Overweight
obese
Example 2
3. Replace cell with numbers: second column
ta bmicat, gen(bminew)
*4 lines with percentages
*4 variables with names ending in numbers from 1 to 4 --- LOOP!
forvalues i = 1/4 {
    sum bminew`i'
    sort id
    replace percent = r(mean)*100 in `i'
}
format percent %9.2f
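The same percentages can also be sketched without creating indicator variables, by counting observations directly. This assumes bmicat is coded 1 to 4 and that the variable percent already exists, as in the steps above.

```stata
* Denominator: observations with a non-missing BMI category
count if !missing(bmicat)
local total = r(N)

forvalues i = 1/4 {
    * numerator: observations in category `i'
    count if bmicat == `i'
    replace percent = 100 * r(N) / `total' in `i'
}
format percent %9.2f
```

Both versions use non-missing values of bmicat as the denominator, because summarize ignores missing values in the indicator variables.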
Category percent
underweight 4.20
normal 32.03
Overweight 51.29
obese 12.49
Example 2
4. Outsheet
...same as in example 1
Less writing...
label list bmicat
capture drop percent category bminew*
ta bmicat, gen(bminew)
gen category = .
gen percent = .
forvalues i = 1/4 {
    sum bminew`i'
    sort id
    replace category = `i' in `i'
    replace percent = r(mean)*100 in `i'
}
label values category bmicat
format percent %9.2f
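If the exported file should contain the category names as text instead of the numeric codes, the label text can be copied into a string column. A minimal sketch, assuming the value label is named bmicat (as in the label values command above) and using the hypothetical column name category_txt:

```stata
* category_txt is an assumed name for the text column
gen str category_txt = ""
forvalues i = 1/4 {
    * extended macro function: text of value label bmicat for code `i'
    local lab : label bmicat `i'
    replace category_txt = "`lab'" in `i'
}
```

Once category carries the value label, decode category, gen(category_txt) would be a shorter route to the same column.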
Example 3
1. THINK FIRST: table after logistic regression
Myocardial infarction OR uci lci pval
Current smoking
Current smoking (+ age)
Current smoking (+ age + bmi)
Example 3
2. Generate a new variable for each column
gen str currentsm = ""
gen OR = .
gen uci = .
gen lci = .
gen pval = .
Example 3
3. Replace cell with contents/ number of interest: first column
sort id
replace currentsm = "current smoking" in 1
replace currentsm = "current smoking + age" in 2
replace currentsm = "current smoking + age + bmi" in 3
Example 3
3. Replace cell with numbers: remaining columns
logistic mi cursmoke
sort id
replace OR = exp(_b[cursmoke]) in 1
replace lci = exp(_b[cursmoke] - 1.96*_se[cursmoke]) in 1
replace uci = exp(_b[cursmoke] + 1.96*_se[cursmoke]) in 1
est store A
logistic mi
est store B
lrtest A B
sort id
replace pval = r(p) in 1
... and likewise for rows 2 and 3
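Rows 2 and 3 repeat the same commands with more covariates, so the three rows can be sketched as one loop. The covariate names age and bmi are assumptions (the slides do not show them), and the comparison model used for the likelihood-ratio test in rows 2 and 3 is a guess at what was intended: the same model without cursmoke, fitted on the same observations.

```stata
* Assumed covariate names: age and bmi are not shown on the slides
local adj1 ""
local adj2 "age"
local adj3 "age bmi"

sort id
forvalues r = 1/3 {
    * model with current smoking plus the adjustment set for row `r'
    logistic mi cursmoke `adj`r''
    replace OR  = exp(_b[cursmoke]) in `r'
    replace lci = exp(_b[cursmoke] - 1.96*_se[cursmoke]) in `r'
    replace uci = exp(_b[cursmoke] + 1.96*_se[cursmoke]) in `r'
    estimates store full

    * same model without cursmoke, restricted to the same observations,
    * so that lrtest compares nested models on one sample
    logistic mi `adj`r'' if e(sample)
    estimates store reduced

    lrtest full reduced
    replace pval = r(p) in `r'
}
```

Restricting the reduced model to e(sample) matters when covariates have missing values, because lrtest refuses to compare models fitted on different numbers of observations.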
Example 3
4. outsheet
...as in example 1
Resulting table
Myocardial infarction OR uci lci pval
Current smoking 1.74 2.22 1.36 6.76e-06
Current smoking (+ age) 1.67 2.18 1.28 0
Current smoking (+ age + bmi) 1.82 2.40 1.39 0
Other way to save results after estimation commands
• Use the statsby command, e.g.:
statsby "logistic mi diabetes smoking" _b _se, saving (D:\Statistisches\automatic_tables\STATA\data\caerphillystatsby.dta) replace
statsby collapses your dataset: unless the results are written to a file with saving(), they replace the data in memory!
Store the results in a new dataset and open the original file again. Rerun "statsby" with the next variables and append the new results to the first stored results.
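The command above uses the older statsby syntax with the command in quotes. From Stata 9 onward statsby is written as a prefix, and the store-and-append workflow might be sketched as follows. The file names results1, results2 and allresults are illustrative assumptions. The second model is only an example that reuses the variable cursmoke from Example 3.

```stata
* Prefix syntax: results go to a file, the data in memory stay untouched
statsby _b _se, saving(results1, replace): logistic mi diabetes smoking
statsby _b _se, saving(results2, replace): logistic mi cursmoke

* Combine the stored results without losing the original data
preserve
use results1, clear
append using results2
save allresults, replace
restore
```

Coefficients that appear in only one of the models end up as missing values in the other row of the appended file.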