Statistical Methods for the Social Sciences, Rahul Mukherjee: Review Session 01 (TA: Marcio Cruz, marcio.cruz@graduateinstitute.ch)

  • Slide 1
  • Statistical Methods for the Social Sciences, Rahul Mukherjee. REVIEW SESSION 01. TA: Marcio Cruz (marcio.cruz@graduateinstitute.ch). Office hours: Wednesdays 09:00-11:00, Rigot 10.
  • Slide 2
  • Basic data management
    1. Where do I find interesting data? GOOGLE!!! ;-)
    Some interesting links for MDEV and MIA students:
    MACRO (aggregate variables, different countries)
      • World Development Indicators (World Bank): http://data.worldbank.org/data-catalog/world-development-indicators
      • World Economic Outlook Database (IMF): http://www.imf.org/external/pubs/ft/weo/2011/02/weodata/index.aspx
      • WTO Statistics Database (WTO): http://stat.wto.org/Home/WSDBHome.aspx
  • Slide 3
  • Basic data management
    By region:
      • US economy, Federal Reserve Economic Data: http://research.stlouisfed.org/fred2/
      • European Union economy, ECB statistics: http://sdw.ecb.europa.eu/
      • China, National Bureau of Statistics of China: http://www.stats.gov.cn/english/statisticaldata/
      • Mexico, Banco de México: http://www.banxico.org.mx/estadisticas/index.html
  • Slide 4
  • Basic data management
    MACRO DATA: You can find macro datasets for most countries on the webpages of their central banks and national statistics bureaus.
      • Central banks: http://www.bis.org/cbanks.htm
      • Official national bureaus of statistics
  • Slide 5
  • Basic data management
    MICRO (household surveys, firm-level data, etc.)
    Official national bureaus of statistics:
      • http://www.census.gov/acs/www/data_documentation/public_use_microdata_sample/
      • http://epp.eurostat.ec.europa.eu/portal/page/portal/microdata/introduction
      • http://www.esds.ac.uk/international/access/micro.asp
    International organizations:
      • http://microdata.worldbank.org/index.php/home
    Some blogs provide good links:
      • https://sites.google.com/site/medevecon/development-economics/devecondata/micro
      • http://openmicrodata.wordpress.com/
    Faculty webpages:
      • http://dvn.iq.harvard.edu/dvn/dv/JAngrist
  • Slide 6
  • Basic Excel
    2. How should I download this data?
    Let us start with an example using MACRO data (from WDI).
      • .csv, .txt or .xls? What is the difference?
      • How to manage this data in Excel?
      • How to sort this data?
      • How to do basic math operations in Excel?
      • How to get basic descriptive statistics in Excel?
      • How to generate a graph?
  • Slide 7
  • Statistical packages
    Why should I manage data using a statistical package? It gives you more flexibility, and you can keep a record of what you did in your research!
    Some examples of statistical packages (http://en.wikipedia.org/wiki/List_of_statistical_packages):
      • SPSS: comprehensive statistics package
      • EViews: for econometric analysis
      • Stata: comprehensive statistics package
      • SAS: comprehensive statistical package
      • MATLAB: programming language with statistical features
      • R: a free implementation of the S language
      • S-PLUS: general statistics package
  • Slide 8
  • Basic STATA
    3. Where can I find resources and tips for learning STATA? GOOGLE!!! ;-) Stata webpage, university webpages, etc.
      • Resources for learning Stata: http://www.stata.com/links/resources1.html
      • Stata Starter Kit, Learning Modules: http://www.ats.ucla.edu/stat/stata/sk/modules_sk.htm
      • Getting Started in Data Analysis: http://dss.princeton.edu/training/
  • Slide 9
  • Basic STATA
    This link provides some exercises from the course's textbook, Statistical Methods for the Social Sciences, 3rd edition, by Alan Agresti & Barbara Finlay: http://www.ats.ucla.edu/stat/examples/smss/default.htm
    Textbook Examples, Introduction to the Practice of Statistics by David Moore and George McCabe: http://www.ats.ucla.edu/stat/examples/mm/default.htm
    How to start on STATA? .do, .dta, .log files? USE .do FILES!!! Why? You can keep a record of everything you have done! If you need to manage data, use a .do file!
  • Slide 10
  • .do FILE
    How to use a .do file?
      1. Open STATA.
      2. Open a new .do file editor.
      3. Set memory (you don't need to use this command). This can improve the performance of STATA, but it depends on the capacity of your computer, so if it does not work, you should request less memory. Ex: set memory 1200m
      4. Define the directory you will work in: cd "C:\Users\My Documents"
    See example: "rs01_example01.do"
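The steps above could look roughly like the following .do file. This is only a sketch, not the course's rs01_example01.do: the file name rs01_sketch and the log file are hypothetical, and the folder is the one shown on the slide. The set memory line is left out because, as far as I know, Stata 12 and later manage memory automatically.

```stata
* rs01_sketch.do: hypothetical skeleton of a .do file
clear all
set more off                      // do not pause output at each screenful
capture log close                 // close any log left open by an earlier run

* Working directory from the slide; adjust it to your own machine
cd "C:\Users\My Documents"

* Keep a text record of everything this run does (hypothetical file name)
log using "rs01_sketch.log", replace text

* Example dataset shipped with Stata, used only so the file runs as is
sysuse auto, clear
summarize price mpg

log close
```

Running the whole file again reproduces every step, which is the reason the slides insist on .do files.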
  • Slide 11
  • Importing data to STATA
    4. How to import data from Excel to STATA?
    Importing data from Excel. Source: http://www.stata.com/support/faqs/data/newexcel.html
      1. A rule to remember: Stata expects one matrix or table of data from one sheet, with at most one line of text at the start defining the contents of the columns.
      2. How to get information from Excel into Stata: Start Excel. Enter data in rows and columns or read in a previously saved file. Highlight the data of interest, and then select Edit and click Copy. Start Stata and open the Data Editor (type edit at the Stata dot prompt). Paste data into the editor by selecting Edit and clicking Paste.
    You can do this (2), but better avoid it! Why???
  • Slide 12
  • INSHEET COMMAND
    3.1 insheet command
      • Launch Excel and read in your Excel file.
      • Save as a text file (tab delimited or comma delimited) by selecting File and clicking Save As. If the original filename is filename.xls, then save the file under the name filename.txt or filename.csv. (Use the Save as type list; specifying an extension such as .txt is not sufficient to produce a text file.)
      • Quit Excel if you wish.
      • Launch Stata if it is not already running. (If Stata is already running, then either save or clear your current data.)
      • In Stata, type insheet using filename.ext, where filename.ext is the name of the file that you just saved in Excel. Give the complete filename, including the extension.
      • In Stata, type compress.
      • Save the data as a Stata dataset using the save command.
    THE BEST WAY TO IMPORT DATA FROM EXCEL!!!
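The Stata side of this procedure can be sketched as below, assuming a comma-delimited file called filename.csv (the placeholder name from the slide) sits in the current working directory and has variable names in its first row. In newer Stata versions the same job is usually done with import delimited, so treat insheet as the command that matches these slides rather than the only option.

```stata
* Read the comma-delimited text file saved from Excel.
* comma  : fields are separated by commas (use tab for a tab-delimited .txt)
* names  : the first row holds the variable names
* clear  : discard whatever data is currently in memory
insheet using "filename.csv", comma names clear

* Store each variable in the smallest type that holds its values
compress

* Check which variables arrived as numeric and which as string
describe

* Save as a Stata dataset
save "filename.dta", replace
```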
  • Slide 13
  • Importing data to STATA: common problems
      5.1 Nonnumeric characters: One cell containing a nonnumeric character, such as a letter, within a column of data is enough for Stata to make that variable a string variable.
      5.2 Spaces: What appear to be purely numeric data in Excel are often treated by Stata as string variables because they include spaces.
      5.3 Cell formats: Much formatting within Excel interferes with Stata's ability to interpret the data reasonably. Just before saving the data as a text file, make sure that all formatting is turned off, at least temporarily. You can do this by highlighting the entire spreadsheet, selecting Format, and then Cells, and clicking General.
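When a column has already been read as a string for the reasons in 5.1 and 5.2, one possible repair inside Stata is destring. The sketch below assumes a hypothetical variable income that contains values such as "1 200" or "n/a"; neither the variable nor the values come from the course data.

```stata
* income is a hypothetical variable that was imported as a string
describe income

* ignore(" ") strips embedded spaces before converting;
* force turns anything still nonnumeric (e.g. "n/a") into missing
destring income, generate(income_num) ignore(" ") force

* Count how many values could not be converted, so nothing is lost silently
count if missing(income_num)
```

Using generate() instead of replace keeps the original string variable, which makes it easy to inspect the cells that failed to convert.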
  • Slide 14
  • Importing data to STATA: common problems
      5.4 Variable names: Stata limits variable names to 32 characters and does not allow within such names any characters that it uses as operators or delimiters. Also, variable names should start with a letter.
      5.5 Missing rows and columns: Completely empty rows in a spreadsheet are ignored by Stata, but completely empty columns are not. A completely empty column gets read in as a variable with missing values for every observation.
      5.6 Leading zeros: With integer-like codes, such as ICD-9 codes or U.S. Social Security numbers, that do not contain a dash, leading zeros will get dropped when pasted into Stata from Excel. One solution is to flag within the first line that the variable is string: add a nonnumeric character in Excel on that line, and then remove it in Stata.
      5.7 Filename and folder: Confirm the filename and location of the file you are trying to read. Use Explorer or its equivalent to check.
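For 5.5 and 5.6, a small clean-up sketch follows. It uses a different technique from the flagging trick described on the slide: it rebuilds the zero-padded code from the numeric variable after import. The names id and v7 and the code width of 9 digits are assumptions for illustration only.

```stata
* id is a hypothetical 9-digit code that lost its leading zeros on import
* (for example, 012345678 arrived as 12345678).
* string() with a zero-padded format rebuilds the fixed-width code.
generate str9 id_str = string(id, "%09.0f")
list id id_str

* v7 is a hypothetical variable created from a completely empty
* spreadsheet column; it is missing for every observation, so drop it.
drop v7
```

This only works when every code is known to have the same length; otherwise the padding would invent zeros that were never there.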
  • Slide 15
  • STATA - data types
      • Numeric variables
      • String variables
    What is a STRING variable? How to deal with them?
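A quick way to see the two data types side by side is to inspect a dataset that ships with Stata. The sketch below uses sysuse auto, which is not part of the course material, and shows one common way of handling a string variable, namely encoding it as labeled numeric codes.

```stata
sysuse auto, clear

* make is a string variable, price is numeric; describe shows the storage types
describe make price

* List only the string variables in the dataset
ds, has(type string)

* Turn the string variable into numeric codes with value labels
* (make_id is a hypothetical name for the new variable)
encode make, generate(make_id)
describe make make_id
```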
  • Slide 16
  • Some basic commands
      • Summary: sum
      • Conditions: if, &, |
      • Sort observations by a variable: sort
      • Order variables (columns): order
      • Generate variables: gen var
      • Drop variables (columns): drop
      • Drop rows: drop in
      • Concatenate variables: concat() (used with egen)
      • Destring variables: destring var, replace
      • Generate numerical variables from string variables: tab var, gen(newvar)
      • Basic math operations: /, *, -, +, or rsum(var1, var2, ..., varn) (used with egen)
      • Replace: replace var
      • Collapse: collapse (sum) var, by(var); see help collapse
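To try these commands without any course data, the sketch below runs them on the auto dataset that ships with Stata. The new variable names (price_k, size, make_origin, origin) are made up for the example, and rowtotal() is used where older Stata versions used rsum().

```stata
sysuse auto, clear

* Summary with conditions: foreign cars with more than 25 miles per gallon
summarize price if foreign == 1 & mpg > 25

* Sort observations by price, then put three variables first in the dataset
sort price
order make price mpg

* Generate a variable with basic math, then change it with replace
generate price_k = price / 1000
replace price_k = round(price_k, 0.1)

* Row-wise sum of several variables (rsum() in older versions)
egen size = rowtotal(length turn)

* Concatenate a string variable and a labeled numeric variable
egen make_origin = concat(make foreign), decode punct(" ")

* One indicator variable per category: origin1, origin2
tabulate foreign, generate(origin)

* Drop the first five rows, then drop a column
drop in 1/5
drop size

* Collapse to one row per group with the sum of price
collapse (sum) price, by(foreign)
list
```

Note that collapse replaces the data in memory, so it comes last here.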
  • Slide 17
  • Linking with class notes
    How to generate a quantitative variable from a categorical variable? For example: favorite music type (rock, jazz, folk, classical).
    Command in STATA: tab var, gen(stub), where var is the categorical variable and stub is the prefix for the new variables. For example: tab music, gen(music)
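A minimal sketch of the music example, with five made-up observations typed in directly so that it runs on its own:

```stata
clear

* Five hypothetical respondents and their favorite music type
input str10 music
"rock"
"jazz"
"folk"
"classical"
"rock"
end

* Creates one 0/1 indicator per category: music1, music2, music3, music4.
* Categories are numbered in sorted order of their values.
tabulate music, generate(music)

list
```

Each new variable equals 1 when the respondent chose that category and 0 otherwise, which is what lets a categorical answer enter a quantitative analysis.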
  • Slide 18
  • EXERCISE The slide on page 30 of the first class notes is the following: www.stat.ufl.edu/~aa/social/data.html
  • Slide 19
  • EXERCISE
    Access this webpage (www.stat.ufl.edu/~aa/social/data.html) and do the following:
      1. Download the data in Excel.
      2. Plot a graph showing the age of students (on the x-axis) and the time they spend on TV (on the y-axis).
      3. Plot a pie graph showing the number of males and females.
      4. Save this data as .csv.
      5. Transfer this data to STATA.
      6. Identify which variables are numeric and which ones are string.
      7. Plot a graph showing the age of students (on the x-axis) and the time they spend on TV (on the y-axis).
      8. Plot a pie graph showing the number of males and females.
      9. How many of these students are D = Democrat, R = Republican, I = Independent?
      10. Generate a variable called average_gpa that is: average_gpa = (high school GPA (on a four-point scale) + college GPA)/2
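Steps 5 to 10 could be approached along the following lines. This is a hedged outline, not a model answer: the file name students.csv and the variable names age, tv, gender, party, hs_gpa and co_gpa are all assumptions, so check the names in the file you actually saved.

```stata
* Step 5: read the .csv saved from Excel (hypothetical file name)
insheet using "students.csv", comma names clear

* Step 6: storage types show which variables are numeric and which are string
describe

* Step 7: scatter plot, y variable first, then x variable
scatter tv age

* Step 8: pie chart with one slice per gender
graph pie, over(gender)

* Step 9: frequency table of political affiliation (D, R, I)
tabulate party

* Step 10: average of high school and college GPA
generate average_gpa = (hs_gpa + co_gpa) / 2
summarize average_gpa
```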
  • Slide 20
  • I have a problem on STATA
    If you have any doubt about how to use a specific procedure in STATA, how should you deal with it?
      1. Google!!! ;-) If this doesn't work:
      2. Google!!! Try again, maybe you haven't searched properly. But if this doesn't work:
      3. Google!!! Try once more, just in case.
      4. Command HELP in STATA.
      5. Send your questions to Statalist: http://www.stata.com/statalist/
      6. Talk to your TA.
      7. Talk to your Professor.
    You can talk to your TA whenever you want, but try at least the first 4 steps. This will be important for developing your skills to deal with Stata! ;-)
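For step 4, these are the built-in lookup commands I would try first; the topics searched for are only examples.

```stata
* Open the help file of a command whose name you already know
help collapse

* Search Stata's keyword database when you only know the topic
search leading zeros

* Broader search that also covers user-written material on the web
findit destring
```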