Introduction to EViews


1. Introduction to EViews

1.1 EViews

EViews is one of the most popular econometric packages around. As well as containing a host of up-to-date econometric features, it is incredibly easy to use. In addition to the menu-driven, object-oriented user interface, it is also possible to write simple programs in the EViews programming language without having to invest too much effort in the programming.

For a full product description and overview see http://www.eviews.com/eviews6/eviews6/eviews6.html

1.2 EViews basics

Assuming that you can open the program by double-clicking on the EViews icon, you will be confronted by the following view:

There are two important things to note about this view.

1. The white bar beneath the command menu is called the command line. You can either use the command line or the drop-down menus to carry out tasks in EViews. Because some actions are common and therefore repeated often, you might find it easier to type a short command in the command line. Once you learn the “EViews language” for these commands, it can actually be quicker than using the menus.

2. For additional information, open the EViews program and select Help/EViews Help Topics… to reveal a list of help categories. The Help system is often more useful than the manual because it contains updates to the documentation that were made after the manuals went to press. You can search for whatever you want to do in EViews and you will find detailed explanations and answers.

The most important object in EViews is the workfile, and your first step in any project will be to create a new workfile or to load an existing workfile into memory. Workfiles are the workhorses of EViews: they store your data and the results of your analysis. While each workfile will contain several data series, each data series is stored as its own object. For example, suppose you wish to create a new workfile to hold a sample of cross-sectional data with 20 observations. The steps are as follows.

1) Select File/New/Workfile on the EViews main menu bar. You will be faced with the following window:

2) Because this is cross-sectional data, set the Workfile structure type: to Unstructured/Undated.


3) Since there are 20 observations in the data set, enter the number 20 in the Data Range. (At this stage, don’t worry about the information EViews offers regarding irregular and undated panel workfiles; this is for more advanced users of EViews.)

4) You can also use this window to name the workfile if you wish. Enter myfirst in the Names/WF field. Click OK. EViews will now create a workfile, and will display the workfile window in the main work area of the EViews screen. The workfile window displays two pairs of numbers: one for the Range: of data contained in the workfile, and the second for the current workfile Sample. Both the workfile range and sample can be changed after the workfile has been created. Note that all new workfiles will contain two objects: the coefficient vector named c and the residual series named resid (see below).

5) You will also notice two tabs at the bottom of the workfile, one called “Untitled”, and one called “New Page”. When you are more proficient with EViews you will discover that this is a very useful feature of EViews 6, and you can find information on using the “New Page” by searching the Help menu. However, you can ignore it for now.


6) To save your workfile, select Save on the workfile menu bar or File/Save or File/Save As on the main menu bar. File/Save As will offer the following option:


Every time you save a workfile, you will be asked to specify whether you require single precision or double precision storage. Essentially, the difference is that single precision will use less space, but of course your data is saved with less precision. Compression of files will also save space, but it will not be possible to open your file with older versions of EViews.

At this stage, you can just select Double Precision and OK.
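The trade-off between the two storage options can be illustrated outside EViews with a short Python sketch. It assumes that single and double precision correspond to the usual 4-byte and 8-byte floating-point formats; the helper name round_trip and the number 0.1 are chosen only for illustration and do not come from any workfile.

```python
import struct

def round_trip(value, fmt):
    """Store a number in the given binary format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

value = 0.1                        # illustrative number, not from a workfile
single = round_trip(value, "<f")   # 4 bytes per observation (assumed format)
double = round_trip(value, "<d")   # 8 bytes per observation (assumed format)

print(struct.calcsize("<f"), repr(single))   # smaller, but slightly altered
print(struct.calcsize("<d"), repr(double))   # larger, value unchanged
```

The single precision copy takes half the space but no longer equals the number that was stored, which is the loss of precision referred to above.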

You are now ready to start entering data into the workfile.

Note that EViews allows for daily, weekly, monthly, quarterly, semi-annual or annual time series data. Note also that if you simply enter the starting year and the ending year of your range, and specify what kind of regular data you are dealing with (daily, monthly, annual etc.), EViews will automatically structure the workfile for you. The program will create the largest possible workfile for your range.

Alternatively, you can specify the format of the dates as follows:

• Annual – specify the year. (Example: 1980 & 1996).

• Semi-annual – specify the year, followed by a colon and 1 for the first half or 2 for the second half of the year. (Example: 1980:1 & 1996:2).

• Quarterly – specify the year, followed by a colon and the quarter number. (Example: 1980:1 & 1996:4).

• Monthly – specify the year, followed by a colon and the month number. (Example: 1980:1 & 1996:12).

• Daily & Weekly – the default format is month:day:year, and you will need to specify your range in this order. If you find this confusing, it is relatively easy to change the order by using Options/Dates and Frequency Conversion… and selecting Day/Month/Year.
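To see how large a workfile these date ranges imply, the following Python sketch counts the observations between a start and an end date written in the year:period style used in the examples above. The function names are hypothetical, and the sketch covers only the annual, semi-annual, quarterly and monthly cases; it is not how EViews itself parses dates.

```python
def parse_date(text):
    """Split 'year:period' into (year, period); a bare year counts as period 1."""
    year, _, period = text.partition(":")
    return int(year), (int(period) if period else 1)

def n_observations(start, end, periods_per_year):
    """Count observations between two dates, both end points included."""
    y0, p0 = parse_date(start)
    y1, p1 = parse_date(end)
    return (y1 - y0) * periods_per_year + (p1 - p0) + 1

print(n_observations("1980", "1996", 1))        # annual
print(n_observations("1980:1", "1996:2", 2))    # semi-annual
print(n_observations("1980:1", "1996:4", 4))    # quarterly
print(n_observations("1980:1", "1996:12", 12))  # monthly
```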

See Help/EViews Help Topics/Users Guide/EViews Fundamentals/Workfile Basics for more information about creating and using workfiles in EViews.

1.3 Data input

Once you have a workfile open, you can type data into EViews. Selecting Quick/Empty Group (Edit Series) from the menu will give you a spreadsheet window. At the top of the window you will see an Edit+/- button, which locks or unlocks the spreadsheet for editing. Once it is unlocked for editing, you simply type in the data.

As Excel is commonly used for data storage, it is useful to know how to import data from an Excel worksheet into EViews. The most straightforward way of doing this is to use Copy and Paste. To begin with, you will be working with relatively small datasets comprising only a few series and a few observations, so copy and paste will be the fastest way to load data into EViews.

First step: In Excel – copy the data from the spreadsheet into the clipboard (highlight the data you want, right click and select “Copy”).

Second step: In EViews – use Quick/Empty Group (Edit Series), place the cursor in the upper left cell and paste the data from the clipboard. EViews will assign default names to the series, usually ser01, ser02 etc. It is advisable to rename the series. This can be done by returning to the worksheet view, right clicking on ser01 and selecting Rename.

Alternatively, you can read data directly from files created by other programs. Data may be in standard ASCII form or in either Lotus (.WKS, .WK1 or .WK3) or Excel (.XLS) spreadsheet formats. This is useful if you are dealing with a larger dataset, for which copy and paste becomes more difficult to manage. It is best to become familiar with both methods, as your skills will increase every time you use EViews.

First make certain that you have an open workfile to receive the contents of the data import. Next, click on Procs/Import/Read Text-Lotus-Excel... You will see a standard File dialog box asking you to specify the type and name of the file. Select a file type, navigate to the directory containing the file, and double click on the name. Alternatively, type in the name of the file that you wish to read (with full path information, if appropriate); if possible, EViews will automatically set the file type, otherwise it will treat the file as an ASCII file. Click on Open. EViews will open a dialog prompting you for additional information about the import procedure. The dialog will differ greatly depending on whether the source file is a spreadsheet or an ASCII file.

Example: US Macroeconomic Data

Find the EViews folder in the Business documents directory (the exact location of the files is still to be confirmed). Using Excel, open the file called US Macroeconomic Data.xls. You will see that this file contains annual data for a number of US macroeconomic variables for the period 1963 to 1989. This file now needs to be imported into EViews. To import the file, take the following steps.

1. Open a new workfile and specify annual data with start date 1963 and end date 1989. Close the Excel file US Macroeconomic Data.xls

2. Click on Procs/Import/Read Text-Lotus-Excel… and provide the filename details in the usual way before clicking Open. Note that EViews will object if the file is still open in Excel, which is why it was closed in step 1.


3. At this point you will obtain the Excel Spreadsheet Import dialog box (shown below). There are now a number of important bits of information to specify.

a. As the column with “Year” in it is superfluous information, the upper-left data cell is specified as B2.

b. Make sure the radio button “By observation” is selected to tell EViews that the spreadsheet contains data series in columns.

c. There are 4 series, which I name Y, P, U and R.

4. Click OK and the data should then appear in your workfile. Save the workfile as US Macroeconomic Data.wf1.

Once you have imported the data, it is possible to go straight into running regressions. However, it is useful to go through a few processes to visualise the data before conducting any analysis. This allows you to get a feel for the data you are using. In addition, the EViews skills you acquire in the following exercise will help you in future tasks. It is always useful to plot your series as a group to see whether they are moving together.

2. Describing and Visualising Data

The calculation of economic statistics provides a way to condense and synthesise information on a range of economic variables such as national account variables, wages and prices, interest rates, exchange rates etc. There are two ways to describe data:


1. Graphical – plot the data using plots such as line plots, scatterplots, histograms and plots of the empirical distribution of the data.

2. Statistical – compute descriptive statistics of the distribution such as the mean, variance, skewness and kurtosis etc.

2.1 Transforming and plotting data

In analysing any economic data, whether time series or cross-sectional, it is necessary to plot the series so as to:

• identify any incorrectly recorded data points;

• reveal the key characteristics of the data (trends, outliers, seasonality etc.).

In some situations it is also useful to graph one series against another. This is known as a scatter plot. Its advantage is that it helps to identify the degree of association between two economic series.

For some problems it is of interest to compare an empirical distribution with a theoretical distribution, such as a normal distribution. An empirical distribution can be constructed by using a histogram. More formal empirical distributions based on nonparametric kernels can also be computed in EViews.

Often it is necessary to transform or filter the data before computing descriptive statistics or plotting. A number of important filtering procedures are now described.

• Log Filter
This smooths out large movements in economic series.

$$Y_t = \log(X_t)$$

• First Difference
This filter is often used to remove the trend from economic series.

$$Y_t = X_t - X_{t-1}$$

• Seasonal Difference
This is used to remove the seasonal factors from economic series. For example, if monthly data ($X_t$) are available, then the appropriate seasonal filter is

$$Y_t = X_t - X_{t-12}$$

• Moving Average
This filter smooths out supposed random movements in a series so as to highlight the underlying trends and cycles. It is commonly used to identify business cycle turning points. For example, a 3rd order moving average is given by

$$Y_t = \frac{X_{t-1} + X_t + X_{t+1}}{3}$$

• Growth Rates
For certain situations it is more important to compute the growth rate of a series. For example, in analysing inflation, if $X_t$ represents the level of the CPI, price inflation is computed using both the log and first difference filters

$$Y_t = \log(X_t) - \log(X_{t-1}) = \log(X_t / X_{t-1})$$
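The filters listed above can be tried out on a handful of numbers before using them in EViews. The Python sketch below is only an illustration: the function names and the small CPI-like series are made up, and observations that cannot be computed are marked with None, in the same spirit as the NA values EViews reports.

```python
import math

def log_filter(x):
    """Y_t = log(X_t)."""
    return [math.log(v) for v in x]

def difference(x, lag=1):
    """Y_t = X_t - X_{t-lag}; the first `lag` values are undefined (None)."""
    return [None] * lag + [x[t] - x[t - lag] for t in range(lag, len(x))]

def moving_average_3(x):
    """Centred 3rd order moving average; both end points are undefined."""
    middle = [(x[t - 1] + x[t] + x[t + 1]) / 3 for t in range(1, len(x) - 1)]
    return [None] + middle + [None]

def growth_rate(x):
    """Y_t = log(X_t) - log(X_{t-1}) = log(X_t / X_{t-1})."""
    return difference(log_filter(x), lag=1)

cpi = [100.0, 102.0, 105.0, 104.0, 108.0]   # made-up numbers for illustration
print(difference(cpi))          # first difference
print(difference(cpi, lag=4))   # a lag of 4 plays the role of 12 for monthly data
print(moving_average_3(cpi))
print(growth_rate(cpi))
```

Note how each filter costs observations: one for the first difference and the growth rate, as many as the lag for the seasonal difference, and one at each end for the centred moving average.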

Example: The Australian Business Cycle

Open the workfile gdp.wf1, which contains quarterly, real, seasonally adjusted GDP for the period September 1959 to June 1996. There are two ways to implement the filters discussed above in EViews. The first is to select Quick/Generate Series on the main menu bar and the second is to use the Genr button on the menu bar in the workfile. In either case the Generate Series by Equation dialogue box will appear, in which the commands to filter the series may be entered.

For example, to compute the annual percentage rate of growth of Australian GDP the relevant command is given below

You need to be a little careful to ensure that you get all the syntax correct. Note that GDP(-4) is the EViews syntax for $GDP_{t-4}$. Now compute the Australian business cycle by smoothing the annual percentage growth of real GDP using a seventh-order moving average. Once again using Genr, the appropriate dialogue box will be


Now plot G and BCYCLE on the same graph. To do this, highlight both variables by holding down the CTRL key and clicking on each of them. Now place the cursor anywhere in the highlighted area as shown and right click. You will receive some drop-down menus from which you choose Open/As Group.

At this stage you should obtain the following spreadsheet view of the two variables opened as a group.


Note that the NA observations are those lost in doing the transformations required by the filters (EViews automatically adjusts the sample size when generating G and BCYCLE). In order to plot the series on the same graph click on the View menu and then Graph and Line on the subsequent drop down menus. The end result is the following graph of the Australian business cycle.
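The loss of observations described above can be traced with a short Python sketch. It assumes, purely for illustration, that G is 100 times the four-quarter log difference of GDP and that the seventh-order moving average is centred in the same way as the 3rd order example in Section 2.1. The commands actually typed into the Genr dialogue box are not reproduced here, and the series below is made up rather than taken from gdp.wf1.

```python
import math

def annual_growth(gdp):
    """100 * (log(GDP_t) - log(GDP_{t-4})) for quarterly data (assumed definition)."""
    return [None] * 4 + [100 * (math.log(gdp[t]) - math.log(gdp[t - 4]))
                         for t in range(4, len(gdp))]

def centred_ma7(x):
    """Average of x[t-3..t+3]; None wherever any of the seven terms is missing."""
    out = []
    for t in range(len(x)):
        window = x[t - 3:t + 4] if t >= 3 else []
        ok = len(window) == 7 and all(v is not None for v in window)
        out.append(sum(window) / 7 if ok else None)
    return out

# Hypothetical quarterly series growing 1% per quarter (not the data in gdp.wf1).
gdp = [100 * 1.01 ** t for t in range(20)]
g = annual_growth(gdp)
bcycle = centred_ma7(g)
print(sum(v is None for v in g))        # observations lost in G
print(sum(v is None for v in bcycle))   # observations lost in BCYCLE
```

Under these assumptions G loses the first four observations, and BCYCLE loses a further three at the start and three at the end.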


2.2 Descriptive Statistics

Given sets of time-series or cross-section data, it is customary to summarise their characteristics by considering (i) a measure of the central tendency or location of the data, (ii) a measure of the spread, dispersion or scale of the data, and (iii) measures of association between different sets of data.

1. Mean

The mean of the distribution represents the “average” value of a variable. For a time series, the mean is computed as

$$\bar{X} = \frac{1}{T} \sum_{t=1}^{T} X_t$$

Two other commonly used measures of data location are the median and the mode. The median of the data is the value of the middle observation (or the average of the middle two if the number of observations is even) after the values have been ordered. The mode is the most frequently occurring value, if a variable is discrete.

2. Standard Deviation and Variance

The standard deviation measures the spread of $X_1, X_2, \dots, X_T$ around its mean $\bar{X}$. It is computed as

$$\sigma = \sqrt{\frac{1}{T} \sum_{t=1}^{T} (X_t - \bar{X})^2}$$

The variance is computed as

$$\sigma^2 = \frac{1}{T} \sum_{t=1}^{T} (X_t - \bar{X})^2$$

[Figure: line graph of G and BCYCLE, 1960 to 1995 (the Australian business cycle graph from the example in Section 2.1).]

3. Coefficient of Variation

The magnitude of the values of the variables and the units of measurement affect the statistical measurement of sample variability. In other words, comparing sample variability based on sample variance for random variables with vastly different magnitudes or measurement units is a futile exercise. The coefficient of variation is the standard deviation as a percentage of the sample mean and is given by

$$CV = \frac{\sigma}{\bar{X}} \times 100$$

Dividing the standard deviation by the sample mean accounts for the “size” of the variable’s values and removes the effect of units of measurement.

4. Skewness

For distributions that are symmetric, such as the normal distribution, there is no skewness. For some distributions, however, high (low) values can be more common than low (high) values. In this case the distribution is skewed. Skewness is computed as

$$S = \frac{1}{T} \sum_{t=1}^{T} \left( \frac{X_t - \bar{X}}{\sigma} \right)^3$$

5. Kurtosis

Some economic series have extreme observations in both tails of the distribution which are not consistent with the assumption of normality. This “fatness” in the tails of the distribution is known as excess kurtosis. Kurtosis is computed as

$$K = \frac{1}{T} \sum_{t=1}^{T} \left( \frac{X_t - \bar{X}}{\sigma} \right)^4$$
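The formulas for the mean, variance, standard deviation, coefficient of variation, skewness and kurtosis can be checked by hand on a small sample. The Python sketch below follows the definitions in this section, with T as the divisor throughout; the function name describe and the data are made up for illustration. Statistical packages often divide by T - 1 when reporting a standard deviation, so their output may differ slightly from these values.

```python
import math

def describe(x):
    """Mean, variance, standard deviation, CV, skewness and kurtosis (divisor T)."""
    T = len(x)
    mean = sum(x) / T
    var = sum((v - mean) ** 2 for v in x) / T
    sd = math.sqrt(var)
    return {
        "mean": mean,
        "variance": var,
        "std_dev": sd,
        "cv": 100 * sd / mean,
        "skewness": sum(((v - mean) / sd) ** 3 for v in x) / T,
        "kurtosis": sum(((v - mean) / sd) ** 4 for v in x) / T,
    }

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # made-up data
for name, value in describe(sample).items():
    print(f"{name:>9}: {value:.4f}")
```

For this sample the skewness is positive, reflecting the long right tail created by the values 7 and 9.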

Example: Comparing the variability of US GDP and interest rates

Load the workfile US Macroeconomic Data.wf1 that you created earlier from the Excel file. Open the variable named R, the prime rate, by double clicking on it in the workfile window. You should now have a spreadsheet view of the variable.


All the relevant descriptive statistics for this variable are now easily computed. Click on the View menu in this window and on the drop down menus click on Descriptive Stats/Histogram and Stats to obtain

If you now compute the same statistics for GNP you will find that the mean and standard deviation of the data are 2915.31 and 631.36 respectively. Comparing the variability of GNP and R based on sample variance is futile given their differing magnitudes. The coefficient of variation may now be computed to provide such a comparison. To do this computation, type the following instructions in the command line:

    scalar cv_r = (@stdev(r)/@mean(r))*100
    scalar cv_y = (@stdev(y)/@mean(y))*100

[Figure: histogram of R with descriptive statistics]

Series: R
Sample: 1963 1989
Observations: 27

Mean         8.833704
Median       8.030000
Maximum      18.87000
Minimum      4.500000
Std. Dev.    3.585711
Skewness     1.012432
Kurtosis     3.659293

Jarque-Bera  5.101584
Probability  0.078020

The scalar instruction tells EViews that we are not generating a series but rather a single value, and we then use the built-in EViews functions @stdev and @mean to compute the coefficient of variation. If the scalar is created successfully, a scalar icon with the name you have given will appear in the workfile, and a short message in the bottom left-hand corner will confirm that the variable has been created. To display the value of CV_R or CV_Y, double click on its icon and the value will also be displayed in the bottom left-hand corner.

In this case CV_R = 40.59% and CV_Y = 21.66%. It is clear that the prime rate has a larger coefficient of variation about its sample mean despite the fact that $\sigma_Y > \sigma_R$.
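These two figures can be reproduced by hand from the means and standard deviations reported above. The following Python lines are simply a check of that arithmetic; the variable names are chosen for this sketch.

```python
# Values reported in the text for R (the prime rate) and Y (GNP).
mean_r, sd_r = 8.833704, 3.585711
mean_y, sd_y = 2915.31, 631.36

cv_r = 100 * sd_r / mean_r
cv_y = 100 * sd_y / mean_y

print(f"CV_R = {cv_r:.2f}%")
print(f"CV_Y = {cv_y:.2f}%")
print(sd_y > sd_r, cv_r > cv_y)   # Y has the larger spread, R the larger CV
```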

6. Covariance

The covariance is a generalisation of the variance measure. It measures the strength of the interaction between two series. If $X_1, X_2, \dots, X_T$ and $Y_1, Y_2, \dots, Y_T$ represent two time series, the covariance is computed as

$$\sigma_{XY} = \frac{1}{T} \sum_{t=1}^{T} (X_t - \bar{X})(Y_t - \bar{Y})$$

• A positive (negative) value shows that $X_t$ and $Y_t$ are positively (negatively) related.

• A zero value shows that $X_t$ and $Y_t$ are linearly unrelated (uncorrelated).

7. Correlation

The correlation coefficient is derived by rescaling the covariance. The advantage of this rescaling is that it provides a dimensionless quantity which is thus unaffected by units of measurement. The correlation coefficient is estimated as

$$r_{XY} = \frac{\sum_{t=1}^{T} (X_t - \bar{X})(Y_t - \bar{Y})}{\sqrt{\sum_{t=1}^{T} (X_t - \bar{X})^2 \sum_{t=1}^{T} (Y_t - \bar{Y})^2}}$$

The correlation coefficient, as with the covariance, measures the strength or degree of linear association between two variables. It has the property that it falls between -1 and 1:

• $r_{XY} = 1$: perfect positive association
• $0 < r_{XY} < 1$: imperfect positive association
• $r_{XY} = 0$: no association
• $-1 < r_{XY} < 0$: imperfect negative association
• $r_{XY} = -1$: perfect negative association
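The covariance and correlation formulas can be verified on a few made-up series. In the Python sketch below the function names and data are hypothetical; the correlation is written as the covariance divided by the product of the two standard deviations, which is the same as the formula above because the 1/T factors cancel.

```python
import math

def covariance(x, y):
    """(1/T) * sum of (X_t - mean of X) * (Y_t - mean of Y)."""
    T = len(x)
    mx, my = sum(x) / T, sum(y) / T
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / T

def correlation(x, y):
    """Covariance rescaled by the two standard deviations."""
    return covariance(x, y) / math.sqrt(covariance(x, x) * covariance(y, y))

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]      # rises exactly with x
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]    # falls exactly with x
y_flat = [1.0, -1.0, 0.0, -1.0, 1.0]   # no linear relation with x

print(covariance(x, y_up), correlation(x, y_up))
print(covariance(x, y_down), correlation(x, y_down))
print(covariance(x, y_flat), correlation(x, y_flat))
```

The three cases correspond to the perfect positive, perfect negative and zero association entries in the list above. Note that y_flat is clearly related to x in a nonlinear way, which the correlation coefficient does not pick up.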

Example: Commodity returns

Load the workfile commod.wf1 that contains monthly data on the prices of the commodities copper, lead, silver and gold. For copper, lead and silver compute the returns to commodities (expressed as a percentage) as follows:

    R_X = 100*LOG(X/X(-1))

or, using the built-in EViews function DLOG,

    R_X = 100*DLOG(X)

Open the three series you have created as a Group (click on each series in the workfile view while holding CTRL, then place the cursor anywhere in the highlighted area as shown and right click) and compute the descriptive statistics using individual samples. You should obtain the following output.

Again using the View menu you can compute the covariance and correlation matrices (in each case you will need to select View/Covariance Analysis and then select or deselect either Covariance or Correlation, depending on which matrix output you want).


In the case of the correlation matrix you should obtain

indicating a high positive correlation between the returns to lead and copper. A way of visualising covariance is to use a scatterplot. On the View menu, click on Graph/Scatter and then on the right hand side of the dialog box, select Scatterplot Matrix from the drop down menu.

The “Fit lines” option will allow you to draw regression lines in the scatter plots (however you are not up to running regressions yet, see section….). The “Axis borders” options will enable you to show histograms of each of the variables on the borders of the scatterplots. You can experiment with these functions, however the most important outcome is that you can visualise the relationships of primary concern, and that is the correlation between the variables as seen below.


The strong positive relationship between copper and lead returns is quite evident.

[Figure: scatterplot matrix of R_COPPER, R_LEAD and R_SILVER.]

Now that you understand some of the key characteristics of your data, you can begin running regressions.

3. Introduction to Linear Regression

Linear regression estimates a relationship in a population using a sample of data.

Model specification

Many economic theories can be represented by the relationship between $Y$, the dependent variable, and $X_1$ to $X_K$, a set of independent, or explanatory, variables. Assuming a linear relationship, the linear regression model is

$$Y_t = \beta_0 + \beta_1 X_{1,t} + \beta_2 X_{2,t} + \dots + \beta_K X_{K,t} + u_t$$

where the sample period is $t = 1, 2, \dots, T$, and $\beta_k$, $k = 1, 2, \dots, K$, are the unknown population coefficients. The disturbance term is given by $u_t$, which can arise from measurement error in $Y_t$ or from errors in the specification of the relationship between $Y_t$ and the $X_t$’s. The expected value (mean) of $Y_t$ is given by

$$E(Y_t) = \beta_0 + \beta_1 X_{1,t} + \beta_2 X_{2,t} + \dots + \beta_K X_{K,t}$$

You will use the same equation form in EViews when running regressions to examine the relationship between the dependent variable (Y) and the independent variables (Xs).
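To make the model concrete, here is a minimal Python sketch that estimates the coefficients of a constant and two explanatory variables. It assumes estimation by ordinary least squares through the normal equations, which is the standard textbook approach; it is not a description of how EViews carries out the calculation internally. The function name ols and the data are hypothetical, and the data are generated without a disturbance term so that the estimates should recover the chosen coefficients.

```python
def ols(y, xs):
    """Least-squares estimates [b0, b1, ..., bK] for y on a constant and xs."""
    rows = [[1.0] + [x[t] for x in xs] for t in range(len(y))]
    k = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yt for r, yt in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [u - factor * v for u, v in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Made-up data generated exactly from Y = 2 + 3*X1 - 1*X2 (so u_t = 0).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [2 + 3 * a - 1 * b for a, b in zip(x1, x2)]
print([round(b, 6) for b in ols(y, [x1, x2])])
```

With real data the disturbances are not zero, so the estimates will differ from the population coefficients.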

Assumptions

• The disturbance term has zero mean.
• The disturbance variance is constant for all observations (homoskedasticity).
• The disturbances corresponding to different observations have zero correlation (no autocorrelation).
• $X_t$ is nonstochastic, that is, it is taken as given.
• $Y_t$ is stationary.
• The disturbances are uncorrelated with the explanatory variables.
• There is no perfect linear relationship between the explanatory variables (no multicollinearity).
• The disturbances are normally distributed.

Example: California School Districts Data in 1998

The workfile california.wf1 contains cross-sectional data for 420 school districts in California. The data set contains measurements of several different variables for each district. These are:


AVGINC = district average income (in $1,000s)
CALW_PCT = % qualifying for CalWorks (a public assistance program)
COMP_STU = computers per student (=COMPUTER/ENROL_TOT)
COMPUTER = number of computers
EL_PCT = % of English learners
ENROL_TOT = total enrolment
EXPN_STU = expenditures per student ($s)
MATH_SCR = average maths score
MEAL_PCT = % qualifying for reduced-price lunch
READ_SCR = average reading score
STR = student teacher ratio
TEACHERS = number of teachers
TESTSCR = average of maths and reading test scores

As a first example of doing a multiple regression in EViews, consider regressing average test scores (the dependent variable) on the student teacher ratio and the percentage of English learners in the school district (the independent variables). Using the Quick/Estimate Equation option on the main menu bar will yield the following dialogue box

The variables are entered as a “list”. This format follows the regression equation at the top of the page. The first variable is “testscr”, which is the dependent variable (or the Y), followed by the constant term “c”, and the two independent variables (or the Xs).
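The link between the list and the regression equation can be spelled out with a small Python sketch. It assumes the list typed into the dialogue box is testscr c str el_pct (the variable names are taken from the table above) and that the coefficients are numbered in the order in which the terms are listed; both points are assumptions made for this illustration, and the function name is hypothetical.

```python
def list_to_equation(spec):
    """Turn 'y c x1 x2' into an explicit equation with numbered coefficients."""
    dependent, *regressors = spec.split()
    terms = []
    for i, name in enumerate(regressors, start=1):
        # 'c' stands for the constant term, so it carries no variable.
        terms.append(f"c({i})" if name.lower() == "c" else f"c({i})*{name}")
    return f"{dependent} = " + " + ".join(terms)

print(list_to_equation("testscr c str el_pct"))
```

Read this way, the first coefficient plays the role of $\beta_0$ and the remaining two play the roles of $\beta_1$ and $\beta_2$ in the regression model.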

The output obtained is:


Rather than typing out each of the variable names individually (which can become time-consuming), you can use the cursor to select the variables required for analysis.

Click on the variables required, retaining the order of the regression equation, i.e. dependent variable first, followed by the independent variables: click on the dependent variable first, then hold CTRL and click on the independent variables. Hold the cursor over the blue highlighted area and right click to follow the Open/as Equation path. Click on “as Equation” and EViews will insert the variables in order, with the ‘c’ constant at the end of the equation.


Note: If you use this method, you will not have to specify the “c” constant. EViews will insert it automatically in the regression.

You can see that the equation from using the list method and the equation from selecting the variables are the same.

The regression output is also the same.

Now that you can obtain regression outputs, you can learn to interpret them correctly, and your life begins to get really interesting.
