Application of regression analysis
Economic structure and air pollution in a transition economy: The case of the Czech Republic
Gabriela Jandová
Michaela Krčílková
Structure of presentation
I. Definition of regression model
II. Compilation of regression model
III. Analysis of the results
I. Regression model
represents reality by means of a system of equations.
explains relationships between variables.
enables quantification of these relationships.
II. Compilation of regression model
Conceptual model
Hypotheses
Equations
Data collection
Calculation
Verification
Errors of the model
Conceptual model
is a graphical scheme.
serves to specify the mutual relationships being sought.
is a tool for defining the subject of the investigation.
should clarify our thinking and help in choosing the research methods.
Our conceptual model
[Diagram. Economic system: agriculture, industry, services, individual people. Ecological system: air, soil, water, organisms. Political system. INPUTS: production resources. OUTPUTS: products, waste.]
Hypotheses
are formulated expectations and suppositions.
Their confirmation or rejection is the goal of the regression analysis.
Our hypotheses
There is a relationship between economic structure and air pollution.
Industry is the biggest polluter of air.
There was a significant improvement in air quality during the 1990s. A decline in economic activity is not the cause of this improvement.
Equations
should capture mainly the essential, permanent relationships between the examined phenomena.
consist of explained and explanatory variables.
Partial regression coefficients measure the effect of a given explanatory variable on the explained variable, holding the other explanatory variables constant.
Requirements on the variables:
Measurability
Accessibility
Conclusiveness
Testability
Standardized methods of acquisition
Comparability
Time series
Mutual independence
Uniqueness
Convenience
Our equations
$$Y_n = \beta_{n1} X_1 + \beta_{n2} X_2 + \beta_{n3} X_3 + \varepsilon_n$$
Where:
$Y_n$ = dependent variables (NO, CO, Dust)
$X_1$ = Gross value added in agriculture
$X_2$ = Gross value added in industry
$X_3$ = Gross value added in services
$\varepsilon_n$ = random error term
$\beta_{n1}, \beta_{n2}, \beta_{n3}$ = partial regression coefficients
Data collection
Sources:
Statistical office reports
Library
Internet
Journals
Interviews
Our data 1.
The underlying data needed to compile the basic matrices were acquired from the regional branches of the Czech Statistical Office and from the Czech Hydrometeorological Institute.
Tab. P.1. Regional macroeconomic indicators, Jihočeský kraj (South Bohemian Region)

Indicator                                      1993     1994     1995     1996     1997     1998     1999
Gross value added at basic prices (mil. CZK)   54426    62649    71278    80985    87061    94350    95532
Gross domestic product at market prices
  (mil. CZK)                                   57613    66261    76274    87533    93491    100985   103321
Share of the region in the GDP of the
  Czech Republic in % (CR = 100)               5,6      5,6      5,5      5,6      5,6      5,5      5,5
Gross domestic product
  in mil. ECU                                  1686     1940     2198     2540     2602     2780     2801
  in mil. PPS                                  5677     5907     6252     6898     7029     6918     7171
Gross domestic product per capita
  in CZK                                       92061    105733   121614   139692   149207   161165   164986
  in ECU                                       2694     3096     3505     4054     4153     4437     4473
  in PPS 1)                                    9072     9425     9968     11008    11217    11040    11451
Gross domestic product per capita
  CR average = 100                             93,2     92,4     91,0     92,0     91,5     90,7     89,9
  in PPS (EUR15 = 100) 2)                      56,8     56,4     56,5     59,6     57,8     54,5     53,9
  in PPS (EUR25 = 100) 3)                      .        .        65,6     68,9     66,7     62,9     62,2
  in PPS (CECC10 = 100) 4)                     .        .        150,8    154,0    148,3    141,0    140,0

1) PPS - unit for measuring purchasing power
2) EUR15 - average of the European Union countries
3) EUR25 = EUR15 + CECC10
4) CECC10 - average of the countries: Bulgaria, Czech Republic, Estonia, Lithuania, Latvia, Hungary, Poland, Romania,
The table above shows the form of the indicators for one region.
To acquire the underlying data, it was necessary to contact all 14 regions.
Tab. P.2. Structure of gross value added by OKEČ sector, in %, Jihočeský kraj (South Bohemian Region)

Sector                                                  1993   1994   1995   1996   1997   1998   1999
Gross value added, total                                100,0  100,0  100,0  100,0  100,0  100,0  100,0
of which:
A Agriculture, hunting, forestry                        9,6    9,1    9,1    9,7    8,4    8,1    7,8
B Fishing                                               0,4    0,4    0,5    0,4    0,3    0,3    0,3
C Mining and quarrying                                  0,4    0,3    0,4    0,4    0,3    0,4    0,3
D Manufacturing                                         22,9   24,7   26,3   29,3   31,1   30,4   30,3
E Electricity, gas and water supply                     7,6    8,3    7,5    8,9    6,1    6,3    6,4
F Construction                                          10,5   9,7    11,8   10,0   10,0   8,8    9,1
G Trade, repair of motor vehicles and consumer goods    12,9   10,7   9,9    8,2    9,8    10,8   10,7
H Hotels and restaurants                                1,6    2,1    1,9    1,5    1,6    1,3    1,2
I Transport, storage, post and telecommunications       8,7    8,2    7,9    7,3    7,7    8,2    7,9
J Banking and insurance                                 5,7    4,7    3,8    2,9    2,8    3,9    3,6
K Real estate, renting, business services,
  research and development                              7,3    8,5    7,3    7,2    7,3    7,2    7,0
L Public administration, defence, compulsory
  social security                                       4,4    4,1    4,3    4,6    5,5    5,4    5,6
M Education                                             3,3    3,6    3,6    3,8    3,6    3,2    3,9
N Health, veterinary and social work                    3,0    3,4    3,6    3,4    3,5    3,5    3,6
O Other public, social and personal services            1,8    2,0    2,3    2,3    2,0    2,1    2,3
P Private households with employed persons              0,0    0,0    0,0    0,0    0,0    0,0    0,0
Q Extraterritorial organizations and bodies             0,0    0,0    0,0    0,0    0,0    0,0    0,0
Our data 2.
The underlying data were adjusted and used to compile the basic matrices.
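The adjustment step is not spelled out on the slides. The sketch below shows one way it may have worked for Jihočeský kraj: multiply the total gross value added from Tab. P.1 by the summed sector shares from Tab. P.2. The grouping of OKEČ sections (A-B as agriculture, C-F as industry, G-Q as services) and the names `GVA_TOTAL`, `SHARES`, `GROUPS` and `sector_gva` are assumptions made for this illustration.

```python
# Sketch: rebuild the sector GVA of Jihočeský kraj from Tab. P.1 and Tab. P.2.
# ASSUMPTION: the grouping of OKEČ sections below is inferred, not stated in the slides.

GVA_TOTAL = {1995: 71278.0, 1998: 94350.0}  # mil. CZK, Tab. P.1

# Sector shares in % of total GVA, Tab. P.2
SHARES = {
    1995: {"A": 9.1, "B": 0.5, "C": 0.4, "D": 26.3, "E": 7.5, "F": 11.8,
           "G": 9.9, "H": 1.9, "I": 7.9, "J": 3.8, "K": 7.3, "L": 4.3,
           "M": 3.6, "N": 3.6, "O": 2.3, "P": 0.0, "Q": 0.0},
    1998: {"A": 8.1, "B": 0.3, "C": 0.4, "D": 30.4, "E": 6.3, "F": 8.8,
           "G": 10.8, "H": 1.3, "I": 8.2, "J": 3.9, "K": 7.2, "L": 5.4,
           "M": 3.2, "N": 3.5, "O": 2.1, "P": 0.0, "Q": 0.0},
}

# Hypothetical aggregation of OKEČ sections into the three model variables
GROUPS = {
    "agriculture": "AB",          # X1
    "industry": "CDEF",           # X2
    "services": "GHIJKLMNOPQ",    # X3
}


def sector_gva(year):
    """Return the GVA of each aggregated sector in mil. CZK, rounded to 2 decimals."""
    total = GVA_TOTAL[year]
    shares = SHARES[year]
    return {
        name: round(total * sum(shares[s] for s in sections) / 100.0, 2)
        for name, sections in GROUPS.items()
    }


for year in (1995, 1998):
    print(year, sector_gva(year))
```

Under this grouping the 1995 values come out at 6842,69 / 32787,88 / 31789,99 and the 1998 values at 7925,40 / 43306,65 / 43023,60, which agrees with the X1, X2, X3 entries of the BUD row in the basic matrices. That suggests the explanatory variables are expressed in mil. CZK.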
Our data 3.
Example of basic matrices for NOx

1995     Y1          X1                   X2                X3
         NOx         GVA in agriculture   GVA in industry   GVA in services
         [t/year]    [mil. CZK]           [mil. CZK]        [mil. CZK]
STR      29165,60    7344,72              51209,02          43558,27
PLZ      7811,00     5661,14              29755,50          33621,60
KVA      8622,00     1330,23              15577,70          18098,10
LIB      4464,50     1256,89              21627,16          20413,61
HKR      7577,20     4288,26              27058,33          28991,07
PAR      12782,30    4107,95              24967,80          24220,90
VYS      3356,90     5977,97              25117,50          19139,54
JIHM     9699,50     6523,43              50180,26          68746,96
OLO      6718,60     5250,74              29444,00          31770,27
ZLI      4723,60     3296,60              35827,20          23076,20
MSL      38261,00    4053,76              80640,79          60372,01
BUD      7593,30     6842,69              32787,88          31789,99
PHA      7536,20     263,62               54305,93          208787,83

1998     Y1          X1                   X2                X3
         NOx         GVA in agriculture   GVA in industry   GVA in services
         [t/year]    [mil. CZK]           [mil. CZK]        [mil. CZK]
STR      17412,10    8680,00              67163,24          66309,47
PLZ      6236,10     6420,74              38260,60          43274,10
KVA      9373,30     1374,52              18910,00          21409,10
LIB      2553,00     1698,94              28663,82          27520,32
HKR      4063,00     5110,49              37051,00          37689,80
PAR      9298,60     5010,54              32744,90          32815,50
VYS      2648,30     7759,32              32459,80          24506,50
JIHM     5721,70     8214,84              61024,60          98410,50
OLO      4510,70     6013,80              36500,40          41010,80
ZLI      4016,80     3762,37              41984,70          39761,40
MSL      23062,60    4554,83              94740,40          82715,60
BUD      7593,00     7925,40              43306,65          43023,60
PHA      7536,20     389,85               67833,73          322015,27
Calculation
Ordinary least squares method (OLS)
Two-stage least squares method
Instrumental variables
Maximum likelihood method
Generalized least squares
Non-linear least squares
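For reference, the first method in the list, OLS, chooses the coefficients that minimise the sum of squared residuals. The matrix symbols $\mathbf{X}$ (explanatory variables, one row per region), $\mathbf{y}$ (explained variable) and $\hat{\boldsymbol{\beta}}$ (estimated coefficients) are introduced here for illustration and do not appear on the slides. Assuming $\mathbf{X}^\top \mathbf{X}$ is invertible:

$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \lVert \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \rVert^2 = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}$$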
Our calculation
Method of calculation: OLS
Results:
                     NO                DUST              CO
                     1995     1998     1995     1998     1995      1998
β11 - AGRICULTURE    -0,97    -0,62    -0,39    -0,04    -15,65    -10,68
β12 - INDUSTRY       0,59     0,30     0,50     0,15     4,44      2,61
β13 - SERVICES       -0,12    -0,04    -0,10    -0,03    -1,03     -0,47
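A minimal sketch of such an OLS fit, using the 1995 basic matrix for NOx and NumPy's `numpy.linalg.lstsq`. The slides do not say which software or data scaling was used, so the printed coefficients need not match the reported ones exactly. The variable names and the choice of the uncentered R² are assumptions of this sketch.

```python
import numpy as np

REGIONS = ["STR", "PLZ", "KVA", "LIB", "HKR", "PAR", "VYS",
           "JIHM", "OLO", "ZLI", "MSL", "BUD", "PHA"]

# 1995 basic matrix for NOx, transcribed from the slides.
# Columns: Y1 (NOx), X1 (GVA agriculture), X2 (GVA industry), X3 (GVA services)
DATA_1995 = np.array([
    [29165.60, 7344.72, 51209.02, 43558.27],
    [7811.00, 5661.14, 29755.50, 33621.60],
    [8622.00, 1330.23, 15577.70, 18098.10],
    [4464.50, 1256.89, 21627.16, 20413.61],
    [7577.20, 4288.26, 27058.33, 28991.07],
    [12782.30, 4107.95, 24967.80, 24220.90],
    [3356.90, 5977.97, 25117.50, 19139.54],
    [9699.50, 6523.43, 50180.26, 68746.96],
    [6718.60, 5250.74, 29444.00, 31770.27],
    [4723.60, 3296.60, 35827.20, 23076.20],
    [38261.00, 4053.76, 80640.79, 60372.01],
    [7593.30, 6842.69, 32787.88, 31789.99],
    [7536.20, 263.62, 54305.93, 208787.83],
])

y = DATA_1995[:, 0]
X = DATA_1995[:, 1:]  # no column of ones: the model equation has no additive constant

# Least-squares solution of X @ beta ~ y
beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta

# ASSUMPTION: uncentered R^2, a common choice for models without a constant;
# the slides do not say which definition of R^2 was used.
r2_uncentered = 1.0 - (residuals @ residuals) / (y @ y)

for name, b in zip(["agriculture", "industry", "services"], beta):
    print(f"{name:12s} {b: .3f}")
print(f"R2 (uncentered) {r2_uncentered:.3f}")
```

Replacing the first column of `DATA_1995` with the CO or dust emissions of the same regions would give the other two equations.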
Verification
Statistical verification
R-squared
• R2 should be at least 0,66.
t-statistic
• Every computed t-value should exceed, in absolute value, the critical t-value given in statistical tables.
F-statistic
• Every computed F-value should exceed the critical F-value given in statistical tables.
Confidence interval
• Estimated intervals must not include zero.
Logical verification
Verification of our model
Statistical verification 1.

1995    R2       t-test agriculture   t-test industry   t-test services
NO      0,735    -1,411               5,031             -2,654
CO      0,835    -4,241               7,123             -4,339
DUST    0,723    -0,630               4,821             -2,490

1998    R2       t-test agriculture   t-test industry   t-test services
NO      0,691    -1,598               4,928             -2,245
CO      0,811    -4,304               6,621             -4,033
DUST    0,629    -0,177               3,875             -2,428

Critical values of the t-test
tc 0,1     1,383
tc 0,05    1,833
tc 0,01    2,821
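The comparison of the reported statistics with the criteria can be sketched as follows. All numbers are taken from the tables above. Comparing absolute t-values with the critical values is an assumption of this sketch, and the names `REPORTED`, `strictest_level` and `verify` are hypothetical.

```python
R2_MIN = 0.66                                       # minimum R2 required by the slides
T_CRITICAL = {0.1: 1.383, 0.05: 1.833, 0.01: 2.821}  # critical t-values from the slides

# (R2, t agriculture, t industry, t services) as reported on the slides
REPORTED = {
    (1995, "NO"):   (0.735, -1.411, 5.031, -2.654),
    (1995, "CO"):   (0.835, -4.241, 7.123, -4.339),
    (1995, "DUST"): (0.723, -0.630, 4.821, -2.490),
    (1998, "NO"):   (0.691, -1.598, 4.928, -2.245),
    (1998, "CO"):   (0.811, -4.304, 6.621, -4.033),
    (1998, "DUST"): (0.629, -0.177, 3.875, -2.428),
}
SECTORS = ("agriculture", "industry", "services")


def strictest_level(t_value):
    """Smallest tabulated level at which |t| exceeds the critical value, else None."""
    passed = [level for level, tc in T_CRITICAL.items() if abs(t_value) > tc]
    return min(passed) if passed else None


def verify(year, pollutant):
    """Check the R2 threshold and the significance level of each coefficient."""
    r2, *t_values = REPORTED[(year, pollutant)]
    return {
        "r2_ok": r2 >= R2_MIN,
        "levels": {s: strictest_level(t) for s, t in zip(SECTORS, t_values)},
    }


for key in REPORTED:
    print(key, verify(*key))
```

A value of `None` marks a coefficient that is not significant at any tabulated level, and `r2_ok` is `False` for any equation below the 0,66 threshold.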
Verification of our model
Statistical verification 2.
Confidence interval:

1995            β11 - AGRICULTURE      β12 - INDUSTRY       β13 - SERVICES
                lower      upper       lower     upper      lower     upper
CO     0,1      -20,753    -10,546     3,579     5,303      -1,354    -0,699
       0,05     -22,413    -8,886      3,298     5,584      -1,460    -0,593
       0,01     -26,059    -5,240      2,682     6,200      -1,694    -0,359
NO     0,1      -1,928     -0,019      0,425     0,748      -0,179    -0,056
       0,05     -2,238     0,292       0,373     0,800      -0,199    -0,036
       0,01     -2,920     0,973       0,258     0,916      -0,242    0,007
DUST   0,1      -1,240     0,464       0,358     0,646      -0,153    -0,044
       0,05     -1,517     0,741       0,311     0,693      -0,171    -0,026
       0,01     -2,126     1,350       0,208     0,795      -0,210    0,013

1998            β11 - AGRICULTURE      β12 - INDUSTRY       β13 - SERVICES
                lower      upper       lower     upper      lower     upper
CO     0,1      -14,115    -7,249      2,066     3,158      -0,636    -0,311
       0,05     -15,232    -6,132      1,889     3,335      -0,689    -0,258
       0,01     -17,684    -3,680      1,499     3,725      -0,805    -0,142
NO     0,1      -1,155     -0,083      0,218     0,389      -0,067    -0,016
       0,05     -1,329     0,091       0,191     0,416      -0,075    -0,008
       0,01     -1,712     0,474       0,130     0,477      -0,093    0,011
DUST   0,1      -0,387     0,299       0,098     0,207      -0,045    -0,012
       0,05     -0,499     0,411       0,081     0,225      -0,050    -0,007
       0,01     -0,744     0,656       0,042     0,264      -0,062    0,005
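The tabulated intervals appear to follow from the reported coefficients and t-values: the standard error is the coefficient divided by its t-value, and the interval is the coefficient plus or minus the critical value times that standard error. The sketch below checks this for the 1995 CO coefficient of agriculture. The function names are hypothetical, and small differences in the third decimal are expected because the published coefficients are rounded.

```python
T_CRITICAL = {0.1: 1.383, 0.05: 1.833, 0.01: 2.821}  # critical t-values from the slides


def confidence_interval(coef, t_value, level):
    """Interval coef +/- tc * SE, with SE recovered from t = coef / SE."""
    std_error = abs(coef / t_value)
    half_width = T_CRITICAL[level] * std_error
    return coef - half_width, coef + half_width


def excludes_zero(interval):
    """True when both bounds lie on the same side of zero."""
    lower, upper = interval
    return lower > 0 or upper < 0


# 1995 CO, agriculture: coefficient -15,65 and t-value -4,241 from the slides
for level in T_CRITICAL:
    ci = confidence_interval(-15.65, -4.241, level)
    print(level, [round(v, 3) for v in ci], excludes_zero(ci))
```

An interval that contains zero, such as the 1995 NO interval for agriculture at the 0,05 level, fails the confidence-interval criterion.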
Verification of our model
Logical verification
The coefficients for industry are positive.
The coefficients for services and agriculture are negative.
Errors of the model
Indicators of errors:
Low value of R-squared
Coefficients are not significant
Zero lies in the confidence intervals
Reasons for errors:
Bad choice of variables
Omission of important factors
Equations are not identified
Errors in data collection
Low number of observations
Experiments with our model
Calculation with additive constant

1995                 NO          CO           DUST
β10 - ADDITIVE       -4170,18    -35130,38    -2152,01
β11 - AGRICULTURE    -0,50       -11,70       -0,15
β12 - INDUSTRY       0,62        4,71         0,52
β13 - SERVICES       -0,11       -0,93        -0,09

1998                 NO          CO           DUST
β10 - ADDITIVE       -426,67     -28732,85    -370,57
β11 - AGRICULTURE    -0,58       -8,30        -0,01
β12 - INDUSTRY       0,31        2,85         0,16
β13 - SERVICES       -0,04       -0,44        -0,03
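A sketch of the same kind of experiment on the 1998 basic matrix for NOx: the additive constant is obtained by adding a column of ones to the explanatory variables before the OLS fit. As before, the slides do not document the exact procedure, so the output need not reproduce the table above, and the variable names are illustrative.

```python
import numpy as np

# 1998 basic matrix for NOx, transcribed from the slides.
# Columns: Y1 (NOx), X1 (GVA agriculture), X2 (GVA industry), X3 (GVA services)
DATA_1998 = np.array([
    [17412.10, 8680.00, 67163.24, 66309.47],
    [6236.10, 6420.74, 38260.60, 43274.10],
    [9373.30, 1374.52, 18910.00, 21409.10],
    [2553.00, 1698.94, 28663.82, 27520.32],
    [4063.00, 5110.49, 37051.00, 37689.80],
    [9298.60, 5010.54, 32744.90, 32815.50],
    [2648.30, 7759.32, 32459.80, 24506.50],
    [5721.70, 8214.84, 61024.60, 98410.50],
    [4510.70, 6013.80, 36500.40, 41010.80],
    [4016.80, 3762.37, 41984.70, 39761.40],
    [23062.60, 4554.83, 94740.40, 82715.60],
    [7593.00, 7925.40, 43306.65, 43023.60],
    [7536.20, 389.85, 67833.73, 322015.27],
])

y = DATA_1998[:, 0]
X = DATA_1998[:, 1:]

# The first column of ones carries the additive constant
X_const = np.column_stack([np.ones(len(y)), X])

beta, *_ = np.linalg.lstsq(X_const, y, rcond=None)
residuals = y - X_const @ beta

# Degrees of freedom: number of observations minus number of estimated parameters
dof = len(y) - X_const.shape[1]

labels = ["additive", "agriculture", "industry", "services"]
for name, b in zip(labels, beta):
    print(f"{name:12s} {b: .3f}")
print("degrees of freedom:", dof)
```

With 13 regions and 4 estimated parameters the fit has 9 degrees of freedom. The critical values 1,383, 1,833 and 2,821 used in the verification coincide with the one-tailed Student t values for 9 degrees of freedom in standard tables.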
III. Analysis of the results
is an important step towards a correct interpretation of the model.
concludes with the confirmation or rejection of the hypotheses.
Analysis of results: First hypothesis
1. There is a relationship between economic structure and air pollution.

t-values:
CO      β11 - AGRICULTURE   β12 - INDUSTRY   β13 - SERVICES
1995    -4,241              7,123            -4,339
1998    -4,304              6,621            -4,033

NOx     β11 - AGRICULTURE   β12 - INDUSTRY   β13 - SERVICES
1995    -1,411              5,031            -2,654
1998    -1,598              4,928            -2,245

DUST    β11 - AGRICULTURE   β12 - INDUSTRY   β13 - SERVICES
1995    -0,630              4,821            -2,490
1998    -0,177              3,875            -2,428

Critical values for t-test
tc 0,1     1,383
tc 0,05    1,833
tc 0,01    2,821

The significance of the coefficients confirms our first hypothesis.
Analysis of results: Second hypothesis
2. Industry is the biggest polluter of air.

                     NO                DUST              CO
                     1995     1998     1995     1998     1995      1998
β11 - AGRICULTURE    -0,97    -0,62    -0,39    -0,04    -15,65    -10,68
β12 - INDUSTRY       0,59     0,30     0,50     0,15     4,44      2,61
β13 - SERVICES       -0,12    -0,04    -0,10    -0,03    -1,03     -0,47

The coefficients for industry have the largest values and are positive. This confirms our second hypothesis.
Analysis of results: Third hypothesis
3. There was a significant improvement in air quality during the 1990s. A decline in economic activity is not the cause of this improvement.

                     NO                DUST              CO
                     1995     1998     1995     1998     1995      1998
β11 - AGRICULTURE    -0,97    -0,62    -0,39    -0,04    -15,65    -10,68
β12 - INDUSTRY       0,59     0,30     0,50     0,15     4,44      2,61
β13 - SERVICES       -0,12    -0,04    -0,10    -0,03    -1,03     -0,47

All coefficients decrease in absolute value between 1995 and 1998, which confirms our third hypothesis.
Conclusion
All hypotheses are confirmed.
Our recommendations are:
to use the model under the conditions of a non-transition economy.
to use the model in a country with a higher number of regions.
END