Empirical Methods in Trade: Analyzing Trade Costs and Trade Facilitation
June 2015
Bangkok, Thailand
Cosimo Beverelli (ERSD/WTO) and Simon Neumueller (ERSD/WTO)
Content
a. Resources
b. Stata windows
c. Organization of the “Bangkok_June_2015\Stata” folder
d. The “directory_definition” do file
e. Datasets used in this introduction to Stata
f. Do files
g. Importing data into Stata
h. Basic commands
i. Merging datasets
j. Macros
k. Loops
l. Graphics
a. Resources
1. Stata help and Stata manual
2. A variety of books covering Stata exist
Web resources:
1. Germán Rodríguez’s webpage
  • Data management, graphics and programming
2. UCLA IDRE’s webpage
  • Very comprehensive, covering all sorts of topics (data management, analysis, …) with several examples
  • FAQ
3. Statalist
  • Typically accessed via a Google search
b. Stata windows
[Screenshot of the Stata interface with four labelled areas]
• Command window: type commands here
• Variables window: list of variables and labels
• Properties window: some properties of the variables
• Review window: actions taken in the session
c. Organization of the “Bangkok_June_2015\Stata” folder
• The “Bangkok_June_2015\Stata” folder contains the following sub-folders:
  • data
  • do_files
  • results
• Do not change the name of the folder or of the sub-folders
• The first thing to do is to define your own working directory (see slide d.)
d. The “directory_definition” do file
• To make things easy for all of us, we define a “path” for the working directory that can be changed easily and needs to be changed only once
• When you open Stata, the first thing to do is to open the “directory_definition” do file in the do-file editor
• Click the Do-file Editor button in the toolbar, or press Ctrl+9
• In the “directory_definition” do file, change the path to your own path
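The screenshot of the do file is not reproduced in this transcript, so here is a minimal sketch of what such a file might contain. The global name BKK is the one used in later slides ("$BKK\data"); the folder path is purely a placeholder and must be replaced with the location of your own “Bangkok_June_2015\Stata” folder.

```stata
* directory_definition.do (sketch; the path below is a placeholder, not the course's path)
global BKK "C:\Users\yourname\Documents\Bangkok_June_2015\Stata"

* Check that the macro points to the right place
display "$BKK"
cd "$BKK"
```

Once BKK is defined, every other do file can refer to "$BKK\data" or "$BKK\results" without hard-coding a user-specific path.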
e. Datasets and do files used in this introduction to Stata
• To apply some of the Stata commands described in this presentation, we will use two datasets:
• WDI.csv – a very small subset of the World Development Indicators
• WB_ES.xls – derived from the World Bank Enterprise Surveys
• You can find the datasets in the “data” directory: "$BKK\data\Introduction_Stata"
• There are also 4 do files
• 01_data_intro_stata
• 02_descriptive
• 03_reshape_merge
• 04_loops
• You can find the do files in the directory: "$BKK\do_files\Introduction_Stata"
f. Do files
• A do file is a set of Stata commands typed in a plain text file
• When you work with Stata, always use do files
• E.g. one do file for creating your master dataset and one do file for regressions
• Do files can also be used to set globals and directories, or to run a series of do files one after another
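As an illustration of running several do files in sequence, a "master" do file could look roughly like the sketch below. The file name master.do is hypothetical; the four do files and their folder are the ones listed in section e, and the global BKK is assumed to have been defined already.

```stata
* master.do (hypothetical name): runs the course do files in order
* Assumes the global BKK was defined by directory_definition.do
do "$BKK\do_files\Introduction_Stata\01_data_intro_stata.do"
do "$BKK\do_files\Introduction_Stata\02_descriptive.do"
do "$BKK\do_files\Introduction_Stata\03_reshape_merge.do"
do "$BKK\do_files\Introduction_Stata\04_loops.do"
```

Stata stops at the first do file that produces an error, which makes it easy to see which step of the workflow failed.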
Typical commands at the beginning of each do file:
clear all /* removes all data from the current Stata session */
set more off, perm /* prevents Stata from pausing while running a do file */
capture log close /* closes any open log file */
cd "directory" /* sets the working directory, e.g. "$BKK\data" */
log using "filename", replace /* useful for long do files, allows printing */
use "dataset.dta", clear /* opens a dataset in Stata format (.dta); the quotes are not strictly needed */
Notes
• “*” at the start of a line makes Stata treat the whole line as a comment (“//” does the same for the rest of a line)
• “/* text */” makes Stata treat “text” as a comment (“text” can span several lines)
g. Importing data into Stata
• insheet using filename.csv, clear
  • Typically used for text files that are either comma- or tab-separated
• import excel using filename.xls, sheet("Sheet1") firstrow clear
  • Reads Excel files directly into Stata
  • Lets you specify the variables, cell range and worksheet to import
• Copy and paste is strongly discouraged. The accuracy of copied numbers depends on:
  o How the data are formatted in Excel, i.e. how many digits are shown
  o Your settings in Stata (use set type double before copying)
• To save the dataset (in Stata format):
  • save filename, replace
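Applied to the two course datasets, the import step might look like the sketch below. It assumes that the global BKK is defined and that the first row of WB_ES.xls holds the variable names; the sheet() option is left out, so Stata reads the first worksheet. The output file names WDI.dta and WB_ES.dta match the names used in later exercises. In Stata 13 and later, import delimited is the newer replacement for insheet.

```stata
* Assumes the global BKK was set by directory_definition.do
cd "$BKK\data\Introduction_Stata"

* Comma-separated text file
insheet using "WDI.csv", clear
save "WDI.dta", replace

* Excel file; firstrow treats the first row as variable names (an assumption here)
import excel using "WB_ES.xls", firstrow clear
save "WB_ES.dta", replace
```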
h. Basic commands
• Identify missing values (represented by . (dot) for numeric variables or an empty cell for string variables)
  • inspect varlist
  • codebook varlist
• Identify duplicate observations
  • duplicates (report/drop/tag/list) varlist
• Identify the number of unique values
  • unique varlist (user-written command; install it with ssc install unique)
• Browse the dataset
  • browse varlist
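A short sketch of these checks on the enterprise survey data follows. It assumes that WB_ES.dta has been saved in the course data folder and contains the variables cou, sector and sales that appear in later slides.

```stata
* Assumes WB_ES.dta exists in this folder with variables cou, sector and sales
use "$BKK\data\Introduction_Stata\WB_ES.dta", clear

inspect sales                  // counts of negative, zero, positive and missing values
codebook cou sector            // type, number of unique values, missing counts
duplicates report              // are there fully identical observations?
duplicates report cou sector   // how many observations share a country-sector pair
count if missing(sales)        // number of observations with missing sales
```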
Other basic commands
• describe
• generate
• destring/tostring
• replace
• rename
• Alternative: renvars
• keep
• drop
• The list goes on; the important thing to remember is that, in case of doubt, you can always consult the help file:
  • help command
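The following sketch strings a few of these commands together. The new variable names (ln_sales, big, sector_str, country) and the threshold of 1000 are hypothetical choices for illustration; sales is assumed to be numeric, and sector is assumed to be a numeric code as in the loop examples later on.

```stata
* Assumes WB_ES.dta exists with numeric sales and numeric sector codes
use "$BKK\data\Introduction_Stata\WB_ES.dta", clear
describe

generate ln_sales = ln(sales)              // missing where sales <= 0 or missing
generate byte big = .                      // start with all missing
replace big = 1 if sales > 1000 & !missing(sales)   // arbitrary threshold
replace big = 0 if sales <= 1000

tostring sector, generate(sector_str)      // string copy of the sector code
rename cou country
keep country sector sector_str sales ln_sales big
```

Note the !missing(sales) condition: Stata treats missing values as larger than any number, so sales > 1000 alone would also flag observations with missing sales.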
List of useful operators commonly used in expressions
Arithmetic                 Logical              Relational
+  add                     !  not (also ~)      == equal
-  subtract                |  or                != not equal (also ~=)
*  multiply                &  and               <  less than
/  divide                                       <= less than or equal
^  raise to power                               >  greater than
+  string concatenation                         >= greater than or equal
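To see the operators in action without the course data, the sketch below uses the auto dataset that ships with Stata. The new variable names (price_per_lb, weight_sq, descr) are made up for the example.

```stata
sysuse auto, clear                      // example dataset shipped with Stata

* Arithmetic
generate price_per_lb = price / weight
generate weight_sq = weight^2

* Relational and logical
count if foreign == 1 & price > 10000
count if rep78 <= 2 | (rep78 >= 5 & !missing(rep78))   // missing counts as "large"
list make mpg if !(foreign == 1) & mpg >= 30

* String concatenation with +
generate descr = make + " (" + string(mpg) + " mpg)"
list descr in 1/5
```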
Commands for descriptive statistics
• summarize varlist
• tabulate var1 var2
• table rowvar [colvar], contents()
• tabstat varlist, statistics() by()
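A possible use of these commands on the enterprise survey data is sketched below, assuming WB_ES.dta is available with the variables cou, sector and sales. The table command is left out of the sketch because its syntax differs across Stata versions.

```stata
* Assumes WB_ES.dta exists with variables cou, sector and sales
use "$BKK\data\Introduction_Stata\WB_ES.dta", clear

summarize sales, detail                                // percentiles, skewness, etc.
tabulate cou sector                                    // two-way frequency table
tabstat sales, statistics(mean sd min max n) by(cou)   // summary statistics by country
```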
The egen command
• A frequently used command for creating new variables
• Commonly used egen functions (refer to WB_ES dataset):
• bysort cou sector: egen sales_sec=total(sales), missing
• bysort cou sector: egen avg_sales_sec=mean(sales)
• egen exp_tot=rowtotal(exp_intermediate exp_final)
• egen id_cluster=group(cou sector)
• egen cou_sec=concat(cou sector)
• See 02_descriptive.do
• Further functions include: max, min, count, tag,…
String functions
• generate newvar =function()
• Some useful functions are:
• abbrev() –> abbreviates the string to the indicated number of characters
• length() –> returns the length of the string, i.e. the number of characters
• subinstr() –> replaces or deletes particular substrings
• substr() –> extracts a substring based on its position
• upper() (lower()) –> changes the entire string to upper case (lower case)
• trim() –> removes leading and trailing blanks from the string
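The functions above can be tried on a tiny made-up dataset, as in this sketch. The variable names and the two country strings are invented for the example.

```stata
clear
input str20 name
"  Thailand "
"united states"
end

generate clean   = trim(name)                    // strip leading/trailing blanks
generate len     = length(clean)                 // number of characters
generate first3  = upper(substr(clean, 1, 3))    // first three characters, upper case
generate nospace = subinstr(clean, " ", "_", .)  // replace every blank with _
generate short   = abbrev(clean, 8)              // abbreviate to 8 characters
list
```

The last argument of subinstr() is the number of occurrences to replace; a dot means all of them.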
The collapse command
• collapse (mean) varlist (sum) varlist, by(varlist)
• Creates an aggregate dataset by e.g. averaging or summing variables across the dimension identified in by()
• All variables not included in the command are dropped
• Useful in analysis when moving to a higher level of aggregation, e.g. aggregating trade flows from HS 6-digit to HS 2-digit
• Useful for calculating descriptive statistics before exporting them to Excel using export excel
• If you do not want to collapse, bysort: egen followed by duplicates drop gives the same result as collapse. Example (see 02_descriptive.do):
  • collapse (mean) sales (sum) dummy_exp, by(cou sector)
is equivalent (up to variable names) to:
  • bysort cou sector: egen avg_sales = mean(sales)
  • bysort cou sector: egen number_exporters = total(dummy_exp)
  • keep cou sector avg_sales number_exporters
  • duplicates drop
The reshape command
• reshape wide (long) ‘stub’, i(var) j(var) options
• Reshapes dataset from long to wide format and vice versa
• Data dimensions such as country, year or sector are normally put in long format
• ‘stub’ are variables in reshape wide and stubs of variables in reshape long
• i(var) are identifying dimensions; j(var) dimension to change
• Exercise: Open WDI.dta and reshape it first long and then wide (see 03_reshape_merge.do)
Long format:
i  j  stub
1  1  4.1
1  2  4.5
2  1  3.3
2  2  3.0

Wide format (after reshape wide stub, i(i) j(j)):
i  stub1  stub2
1  4.1    4.5
2  3.3    3.0
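The small table from this slide can be typed in directly to try both directions of reshape. The sketch below uses only the values and variable names shown above.

```stata
clear
input i j stub
1 1 4.1
1 2 4.5
2 1 3.3
2 2 3.0
end

* Long to wide: one row per i, with variables stub1 and stub2
reshape wide stub, i(i) j(j)
list

* Wide back to long: one row per (i, j) pair
reshape long stub, i(i) j(j)
list
```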
i. Merging datasets
• To merge datasets you can use joinby or merge
• joinby varlist using filename, unmatched(both)
• The command forms all pairwise combinations of observations within the groups defined by varlist
• unmatched() can keep unmatched observations from the master dataset, the using dataset or both (see the generated variable _merge)
• Exercise: Open WB_ES.dta and merge it with WDI.dta (see 03_reshape_merge.do)
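The slide only spells out the joinby syntax, so here is a hedged sketch of the merge alternative. It assumes that both files share a country identifier called cou and that WDI.dta holds exactly one observation per country; if WDI.dta is still in country-year format, it would have to be reshaped or collapsed first. The actual exercise in 03_reshape_merge.do may proceed differently.

```stata
* Assumes both .dta files sit in this folder, share the key variable cou,
* and WDI.dta has one observation per country (hypothetical layout)
cd "$BKK\data\Introduction_Stata"
use "WB_ES.dta", clear

merge m:1 cou using "WDI.dta"   // many firms per country, one WDI row per country
tabulate _merge                 // 1 = master only, 2 = using only, 3 = matched
keep if _merge == 3             // keep matched observations only
drop _merge
```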
j. Macros
• Macros are names associated with some text
• The commands global and local assign strings to global and local macro names
• global mname [=exp | :extended_fcn | [`]"[string]"['] ]
  • Global macros, once defined, are available anywhere in Stata
  • Evaluate it using $mname
• local lclname [=exp | :extended_fcn | [`]"[string]"['] ]
  • Simplest example: local c USA JPN
  • Evaluate it using `lclname'
  • Local macros work only within the do file in which they are defined
• Globals and locals have a variety of uses
• To define the directories for this class, i.e. directory_definition.do
• They are used in loops (see next slides)
• A set of explanatory variables can be grouped under one macro name
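A minimal sketch of these uses is given below. The macro names datadir and xvars are hypothetical; the variables sales, exp_intermediate and exp_final are the WB_ES variables used in the egen examples.

```stata
* Global: a folder name that stays available for the whole session
global datadir "$BKK\data\Introduction_Stata"
use "$datadir\WB_ES.dta", clear

* Local: a group of variables under one name
local xvars sales exp_intermediate exp_final
summarize `xvars'

* Local: a list of country codes, as in the simplest example above
local c USA JPN
display "Countries: `c'"
display "Data folder: $datadir"
```

Because locals exist only while the do file (or the selected block) is running, run the lines that define and use a local together rather than one at a time.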
k. Loops
• See Stata help and Germán Rodríguez’s webpage
• Two main commands: foreach and forvalues
• foreach loops through strings of text, forvalues loops through numbers
• Syntax (3 alternatives):
foreach item in a-list-of-things (e.g. a b d) {
    commands referring to `item'
}
foreach varname of varlist list-of-variables {
    commands referring to `varname'
}
forvalues number = first(step)last {
    commands referring to `number'
}
Examples of loops using WB_ES.dta
Ex1
foreach k in USA JPN { /* Loop over any list */
    egen sales_`k'=total(sales) if cou=="`k'"
}
Ex2
vallist cou, local(c) /* vallist (user-written) shows the values of cou and stores them in the local c */
foreach k of local c { /* Loop over a local macro */
    capture drop sales_`k'
    egen sales_`k'=total(sales) if cou=="`k'"
}
Ex3
forvalues k=1(1)3 { /* Loop over sector codes; the range can be defined in different ways */
    egen total_`k'=total(sales) if sector==`k'
}
• foreach can also be used to loop over variables and numbers
  • foreach k of varlist varlist; foreach k of numlist numlist
l. Graphics
• Useful links: official Stata or UCLA IDRE
• Histograms and bar graphs:
  • histogram var
  • graph bar (stat) var, over(var)
More graphics
• Distribution plots:
  • histogram var, frequency kdensity
  • twoway kdensity var
  • twoway (kdensity var1 if var2=="") (kdensity var1 if var2==""), by(var)
More graphics
• Scatter plots:
  • scatter yvar xvar
  • graph twoway (scatter yvar xvar) (lfit yvar xvar)
More graphics
• Line plots:
  • line yvar year
  • graph twoway (line yvar year if …) (line yvar year if …)
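The graph commands from these slides can be tried on datasets that ship with Stata, as in the sketch below. The choice of the auto and uslifeexp datasets and of the plotted variables is an assumption made for illustration; the course material uses its own data.

```stata
* Histogram, bar graph, density and scatter plots with the auto dataset
sysuse auto, clear
histogram price, frequency
graph bar (mean) price, over(foreign)
twoway (kdensity price if foreign == 0) ///
       (kdensity price if foreign == 1), ///
       legend(order(1 "Domestic" 2 "Foreign"))
twoway (scatter price weight) (lfit price weight)

* Line plots with a yearly dataset
sysuse uslifeexp, clear
twoway (line le_male year) (line le_female year)
```

Each graph command replaces the previous graph window; add the name() option to a command if several graphs should stay open at once.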