Post on 22-Sep-2020
Running head: Analysis WHR 2014 to 2018
Analysis of the World Happiness Report
2014 to 2018
Koehler, Philip Alexander
Professor
Vittorio Merola
Intermediate Statistics
POL 502.02
Abstract
To find out which variables influence our happiness the most, eleven variables from the World
Happiness Report were analyzed in this paper. Of the six significant variables, Social Support
had the highest positive influence on happiness (coded as Life Ladder): per unit increase in
Social Support (scale 0 – 1), happiness increased by 4.6 units. The highest negative influence
came from the variables Perception of Corruption and Confidence in the National Government
(both on a scale of 0 – 1): per unit increase in Perception of Corruption, happiness decreased
by 2.0 units, and per unit increase in Confidence in the National Government, happiness
decreased by 1.9 units.
Contents
1. Introduction
2. The World Happiness Report
2.1 Highlighted Variables
2.2 Excluded Variables
3. Analysis
3.1 Overview descriptive measures
3.1.1 Dataset: “WHR” – World Happiness Report, Years 2014 - 2018
3.2 Distribution of the data:
3.2.1 Shapiro-Wilk Test
3.2.2 Anderson-Darling Test
3.2.3 QQNorm
3.3 Outliers
3.3.1 Delete Outliers
3.3.2 Retest
3.4 Linear Model
3.4.1 Interpretation
3.5 Testing for Robustness
3.6 Testing Further Hypotheses
3.7 Visualizing the Results
4. Conclusion
References
Appendix
4.1 Appendix A
4.2 Appendix B
4.3 Appendix C
4.4 Appendix D
1. Introduction
Create all the happiness you are able to create; remove all the misery you are able to remove.
Every day will allow you, -will invite you to add something to the pleasure of others, -or to
diminish something of their pains.
Jeremy Bentham
How can we increase our happiness and the happiness of our loved ones? This thesis offers
answers to this fundamental question by analyzing variables that influence happiness. The data is
derived from the World Happiness Report (WHR), which relies on the Gallup World Poll. To
offer an in-depth understanding of which variables influence happiness, the results of the past five
reports (2014 – 2018) were analyzed. The programming language R was used to analyze and to
graph the data. The code can be downloaded here.
Using the variable Life Ladder, the WHR analyzes the happiness of over 150 countries. None
of the other variables were used to calculate the variable Life Ladder; they are consulted to
supplement the interpretation.
To understand which variables influence our happiness the most, eleven independent variables
from the WHR were chosen. The variables are explained at the beginning of this paper. The
following chapter lists descriptive measures: mean, confidence interval, standard deviation, count,
minimum value, and maximum value. The normal distribution of the continuous variables was
tested with the Shapiro-Wilk Test, the Anderson-Darling Test, and QQ plots. Variables that are
not normally distributed are checked for outliers. A linear model is then run, and its results are
interpreted and graphed.
2. The World Happiness Report
The World Happiness Report (WHR, 2019) is issued annually by the United Nations Sustainable
Development Solutions Network and measures the self-perceived happiness of 156 countries.
The data comes from the Gallup World Poll, which surveys about 1,000 inhabitants in each of
the 156 countries (Gallup, N.A.). The WHR includes a wide variety of data, from GDP per Capita
to Generosity and the Perception of Corruption. The jewel in the report is the variable Life
Ladder. The two graphs in Appendix A offer an overview of the Life Ladder variable combined
with the variable Country, for all countries and for the top five countries.
2.1 Highlighted Variables
The following variable review is based on the original questions the WHR used: the Gallup
World Poll Questions (Gallup, 2006), the Global Health Observatory (WHO, N.A.), and the
variable explanation of the authors of the study: John F. Helliwell, Haifang Huang, and Shun
Wang (2016).
Life Ladder¹
The variable Life Ladder measures self-perceived happiness. The survey question presents a
vertical scale from 0 to 10, and the respondents are asked to imagine it as a ladder and to state
how high their happiness would climb on that imaginary ladder: 10 equals the highest level of
happiness a respondent can imagine and 0 the lowest. The results captured are numerical and lie
on an ordinal Likert-type scale.
The variable Life Ladder must be interpreted with caution. The respondents are asked “to think
of a ladder, with the best possible life for them being a 10, and the worst possible life being a 0”
(WHR, 2019), and the best and worst imagined life differs vastly from respondent to respondent.
The social comparison theory² states that humans link their happiness to their surroundings. A
respondent who is considered poor in a developed country might rank lower on the Life Ladder
than a rich respondent in a poor country, even if the respondent from the developed country has
more wealth than the rich respondent from the poor country. Hence, Life Ladder is a great
indicator of self-reported happiness but not a good indicator of life circumstances.
Also, the variable Life Ladder is based on this single question and not on other variables, as
some write (Flerlage, 2016). Additional variables are used to aid the interpretation of the results.
¹ The original variable names differ slightly. The variable names that are used in this paper are similar to the variable names in the WHR.
GINI Index - average between 2000 and 2016
The WHR also includes data sets issued by the World Bank, such as the GINI index. The GINI
index is a widely accepted measure of how evenly income is distributed in a country. It is
measured on a scale from 0 to 1, in which 0 stands for maximum equality and 1 for maximum
inequality. Alternatively, the values are multiplied by 100, so that the scale ranges from 0 to 100.
An overrepresented age group, old or young, might influence a country's GINI Index and make
it more difficult to interpret.
Due to many missing values in the 2018 Gallup report, the average of the GINI Index for the
years 2000 to 2016 is used.
² “Social comparison theory states that individuals determine their own social and personal worth based on how they stack up against others they perceive as somehow faring better or worse.” Psychology Today. (N.A.)
GDP per Capita
This variable represents the average Gross Domestic Product (GDP) per person in a country. The
GDP is the value of all goods and services a country produces, measured in US Dollars. The
WHR data set reports the variable on a natural logarithmic scale, which explains its range of
roughly 6.5 to 11.7 in the descriptive measures in chapter 3.1.
Social Support
The Social Support variable is based on one question: "If you were in trouble, do you have
relatives or friends you can count on to help you whenever you need them, or not?" (Gallup,
2006) and can only be answered with yes (1) or no (0). The average of the answer for the country
is displayed as the result in the WHR.
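Because the country value is an average of 0/1 answers, it can be read as the share of respondents
who answered yes. The short R sketch below illustrates this with ten made-up answers; the numbers
are hypothetical and not taken from the Gallup data. It also shows why a "one unit increase" in
such a variable is a very large step: it corresponds to moving from nobody (0) to everybody (1)
answering yes.

```r
# Hypothetical yes (1) / no (0) answers from ten respondents in one country.
answers <- c(1, 1, 0, 1, 1, 1, 0, 1, 1, 1)

# The country-level value is the share of "yes" answers.
social_support <- mean(answers)
print(social_support)
```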
Healthy Life Expectancy at Birth
This variable is based on the WHO's Global Health Observatory and includes more than 100
health factors (WHO, N.A.). It states the number of years a person can expect to live in good
health, counted from the day s/he is born.
Freedom to make Life Choices
This variable is based on the Gallup question: “Are you satisfied or dissatisfied with your
freedom to choose what you do with your life?”. The respondent can choose yes (1) or no (0).
The result is the average of the answers each respondent gave per country.
Generosity
The Gallup report asks: “Have you donated money to a charity in the past month?” and the
answers yes (1) or no (0) are reported. The national average is then combined with the variable
GDP per capita: the WHR reports the residual from regressing the answers on GDP per capita,
which is why the variable can take negative values. A high value indicates higher Generosity
and vice versa.
Perception of Corruption
The Perception of Corruption is measured with the following two questions: “Is corruption
widespread throughout the government or not?” and “Is corruption widespread within businesses
or not?”. The answers are also binary, yes (1) and no (0), and the result is the average of the
answers the respondents gave within a country.
Confidence in the National Government
The confidence the citizens have in the national government is measured between 0 – not
confident and 1 – confident.
Positive Experience Index
Measures the well-being of the respondents one day before the survey was taken. Five questions
are asked, and the average is taken. Positive answers are coded with 1 and negative answers
(including “I don't know” and similar answers) are coded with 0.
Negative Experience Index
This index is based on five questions. Negative outcomes are coded with 1 and all other answers
with 0.
2.2 Excluded Variables
The following variables were excluded because they have a high number of missing values,
cannot be interpreted with the information provided, combine data from other data sets that
cannot be reviewed, and/or are not used in this analysis: Democratic Quality, Delivery Quality,
GINI Index for 2018, GINI of household, Most people can be trusted (all columns).
3. Analysis
3.1 Overview descriptive measures
First, the following measures are reported to give an overview of the data.
Functions used
summary(); inference()
3.1.1 Dataset: “WHR” – World Happiness Report, Years 2014 - 2018
Variable                               Mean     95% Confidence Interval  SD      N    Min        Max
Life Ladder                            5.4304   5.3475, 5.5133           1.1295  713  2.662      7.858
GINI Index                             0.3884   0.378, 0.3909            0.0824  629  0.2110     0.6260
GDP per Capita                         9.2698   9.1809, 9.3587           1.1923  691  6.466      11.693
Social Support                         0.8063   0.7974, 0.8152           0.1209  708  0.2902     0.9873
Healthy Life Expectancy at Birth       63.9743  63.447, 64.5016          7.1072  698  44.90      76.80
Freedom to make Life Choices           0.7624   0.7526, 0.7722           0.132   700  0.3035     0.9852
Generosity                             0        -0.0121, 0.0121          0.1612  683  -0.336385  0.677743
Perception of Corruption               0.7365   0.7222, 0.7507           0.1877  665  0.04731    0.97634
Confidence in the National Government  0.4851   0.47, 0.5003             0.1969  650  0.07971    0.99360
Positive Experience Index              0.7089   0.7011, 0.7167           0.1059  706  0.3694     0.9436
Negative Experience Index              0.2838   0.2774, 0.2903           0.0875  707  0.0927     0.6426
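The measures in this table can be computed for any numeric variable with a few lines of base R.
The following is a minimal sketch, not the code used for this paper: the helper name describe()
and the example values are hypothetical, and the interval is a t-based confidence interval for the
mean, which is assumed to match what the reported intervals represent.

```r
# Summarise one numeric variable: mean, 95% CI of the mean, SD, N, min, max.
describe <- function(x, level = 0.95) {
  x <- x[!is.na(x)]                      # drop missing values first
  n <- length(x)
  m <- mean(x)
  s <- sd(x)
  # half-width of the t-based confidence interval for the mean
  half <- qt(1 - (1 - level) / 2, df = n - 1) * s / sqrt(n)
  c(mean = m, ci_low = m - half, ci_high = m + half,
    sd = s, n = n, min = min(x), max = max(x))
}

# Hypothetical Life Ladder scores, not values from the WHR data set.
life_ladder <- c(7.6, 5.4, NA, 4.1, 6.3, 5.0, 3.5, 6.9)
stats <- describe(life_ladder)
print(round(stats, 4))
```

N counts only the non-missing values, which is why N differs between the variables in the table.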
3.2 Distribution of the data:
Secondly, it is important to test whether the data is normally distributed, because strong
departures from normality limit how far the later results can be trusted.
This can be tested using the Shapiro-Wilk Test and the Anderson-Darling Test.
Some of the variables cannot be tested because they are based on dummy variables (0/1 answers)
and therefore violate the continuity and normality assumptions. Hence, only the continuous
variables are tested.
3.2.1 Shapiro-Wilk Test
Functions used
shapiro.test()
Interpretation:
If the p-value is below 0.05 then we reject the H0 – the hypothesis that the data is normally
distributed.
Variable                          P-Value    Reject H0 (Yes / No)
Life Ladder                       1.822e-05  Yes
GINI Index                        6.966e-11  Yes
GDP per Capita                    6.89e-12   Yes
Healthy Life Expectancy at Birth  2.317e-14  Yes
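The decision rule can be tried out on simulated data. The sketch below is only an illustration
with made-up samples (one drawn from a normal distribution, one from a clearly skewed
exponential distribution); it does not use the WHR data.

```r
set.seed(1)                                   # reproducible illustration
normal_sample <- rnorm(500, mean = 5.4, sd = 1.1)
skewed_sample <- rexp(500, rate = 1)          # clearly not normal

res_normal <- shapiro.test(normal_sample)
res_skewed <- shapiro.test(skewed_sample)

# Reject H0 (normality) when the p-value is below 0.05.
print(res_normal$p.value < 0.05)
print(res_skewed$p.value < 0.05)
```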
The result of the Shapiro-Wilk test could be a big concern for the WHR creators (if they tested
this linear model). However, the Shapiro-Wilk test is known to reject H0 easily: in large data
sets like the one from the WHR, even small deviations from the normal distribution produce
very low p-values. Hence, the Anderson-Darling test is used as a second test alongside the
Shapiro-Wilk test:
3.2.2 Anderson-Darling Test
Functions used
ad.test()
Interpretation:
If the p-value is below 0.05 then we reject the H0 – the hypothesis that the data is normally
distributed.
Variable                          P-Value    Reject H0 (Yes / No)  A Value
Life Ladder                       0.001121   Yes                   1.4216
GINI Index                        4.873e-14  Yes                   5.6949
GDP per Capita                    2.2e-16    Yes                   7.5848
Healthy Life Expectancy at Birth  2.2e-16    Yes                   11.518
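The A value in the last column is the Anderson-Darling test statistic: the larger it is, the
further the sample lies from a normal distribution. As a rough illustration, the statistic can be
computed in base R as sketched below. The helper ad_statistic() is hypothetical and written for
this example; it follows the standard formula with the mean and standard deviation estimated
from the sample and does not reproduce the p-value calculation of ad.test().

```r
# Anderson-Darling statistic A for the normality hypothesis
# (mean and SD estimated from the sample).
ad_statistic <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  z <- (x - mean(x)) / sd(x)
  log_p  <- pnorm(z, log.p = TRUE)     # log F(z_i)
  log_1p <- pnorm(-z, log.p = TRUE)    # log(1 - F(z_i))
  i <- seq_len(n)
  -n - mean((2 * i - 1) * (log_p + rev(log_1p)))
}

set.seed(2)
a_normal <- ad_statistic(rnorm(500))   # simulated normal sample
a_skewed <- ad_statistic(rexp(500))    # simulated skewed sample
print(c(normal = a_normal, skewed = a_skewed))
```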
Taking the large data set into consideration, not all variables should be disregarded. However,
both tests show that the GINI Index, GDP per Capita, and the Healthy Life Expectancy at Birth
variables are critical. Hence, the function qqnorm() will be used for a visual test.
3.2.3 QQNorm
In addition to the conducted tests above, the graphs (Appendix B) indicate that the variables
GDP per Capita and Healthy Life Expectancy at Birth are not normally distributed. Hence, these
two variables will be excluded from the data set.
The graph of the variable Life Ladder indicates that this variable is mostly normally distributed
and therefore will be used in the further analysis.
The graph of the GINI Index indicates that the outliers might have caused the problems. Hence,
for this variable the outliers will be removed and the variable retested.
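What the visual test looks for can also be expressed as a number: in a QQ plot of normally
distributed data the points lie close to a straight line. The sketch below uses simulated samples
(an assumption for illustration, not the WHR variables) and measures the straightness as the
correlation between theoretical and sample quantiles.

```r
set.seed(3)
normal_sample <- rnorm(300)
skewed_sample <- rexp(300)

# plot.it = FALSE returns the plot coordinates without drawing;
# drop the argument (and add qqline()) to see the actual graph.
qq_normal <- qqnorm(normal_sample, plot.it = FALSE)
qq_skewed <- qqnorm(skewed_sample, plot.it = FALSE)

# Correlation between theoretical and sample quantiles:
# close to 1 means the points lie near a straight line.
r_normal <- cor(qq_normal$x, qq_normal$y)
r_skewed <- cor(qq_skewed$x, qq_skewed$y)
print(c(normal = r_normal, skewed = r_skewed))
```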
3.3 Outliers
Outliers are defined as values that lie more than 1.5 times the interquartile range below the 25%
quartile or above the 75% quartile.
Functions used
boxplot.stats()
Interpretation
The fourth element of the output, labeled “$out”, lists the outliers.
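A small example shows how boxplot.stats() applies this rule. The values below are hypothetical
GINI-like numbers chosen so that exactly one of them lies beyond the upper limit.

```r
# Hypothetical GINI-like values; 0.62 is placed far from the rest.
gini <- c(0.30, 0.32, 0.35, 0.36, 0.38, 0.40, 0.41, 0.43, 0.62)

res <- boxplot.stats(gini)   # default coef = 1.5
print(res$stats)             # lower whisker, lower hinge, median, upper hinge, upper whisker
print(res$out)               # values beyond the whiskers
```

Here the hinges are 0.35 and 0.41, so the limits lie 1.5 * 0.06 = 0.09 beyond them, at 0.26 and
0.50, and only 0.62 is reported as an outlier.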
Outliers
Variable     Outliers
Life Ladder  No outliers
GINI Index   0.6260000 0.6260000 0.6260000 0.6260000 0.6260000 0.6113333 0.6113333 0.6113333
             0.6240000 0.6240000 0.6240000 0.6240000 0.6240000
3.3.1 Delete Outliers
Functions used
Recode()
Explanation
Outliers are recoded as “NA”³
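The paper notes that the outlier “0.6113333” could not be recoded and that no error message was
shown. The source does not state the cause. One possible explanation, offered here only as an
assumption, is that the stored value has more decimal places than the printed one (for example
0.61133333...), so an exact comparison with the typed number finds no match. The sketch below
reproduces this behaviour with made-up values and shows that matching against the values
returned by boxplot.stats() avoids the problem.

```r
# Hypothetical values; the last one is a mean of three numbers and is
# stored as 0.61133333..., although R prints it as 0.6113333.
gini <- c(0.30, 0.32, 0.35, 0.36, 0.38, 0.40, 0.41, 0.43, 0.626,
          (0.600 + 0.610 + 0.624) / 3)
print(gini[10])

# Recoding by the typed, rounded number: nothing matches and no error appears.
typed <- gini
typed[typed == 0.6113333] <- NA
print(sum(is.na(typed)))

# Recoding by the outlier values themselves, as returned by boxplot.stats().
out <- boxplot.stats(gini)$out
recoded <- gini
recoded[recoded %in% out] <- NA
print(sum(is.na(recoded)))
```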
3.3.2 Retest
Variable    Shapiro-Wilk P-Value      Anderson-Darling P-Value
            Before      After         Before      After
GINI Index  6.966e-11   6.477e-09     4.873e-14   5.716e-12
The tests show that the variable's p-values increased after the outliers were removed. The visual
test also confirms the results (Appendix C). However, because the p-values are still extremely
low and because of the visual test with the QQ plot, the variable will not be used in the analysis.
3.4 Linear Model
A linear model describes the relationship between a dependent variable and one or several
independent variables.
The WHR uses the variable “Life Ladder” as the main indicator for happiness in a country. The
following multiple linear regression analyzes the influence the independent variables have on the
dependent variable Life Ladder.
³ Outlier “0.6113333” could not be recoded. No error message shown.
This is the model being used, where $Y_i$ = Life Ladder:
$$Y_i = \beta_0 + \beta_1\,\text{Social Support}_i + \beta_2\,\text{Freedom to make Life Choices}_i + \beta_3\,\text{Generosity}_i + \beta_4\,\text{Perception of Corruption}_i + \beta_5\,\text{Confidence in the National Government}_i + \beta_6\,\text{Positive Experience Index}_i + \beta_7\,\text{Negative Experience Index}_i + \epsilon_i$$
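A model of this form is fitted in R with lm(). The following sketch uses simulated data with
only three of the seven predictors; the data frame whr, its column names, and the coefficients
used to generate the data are illustrative assumptions and not the paper's data set or code.

```r
set.seed(4)
n <- 200
# Simulated country-year data; column names are illustrative only.
whr <- data.frame(
  social_support = runif(n, 0.3, 1),
  freedom        = runif(n, 0.3, 1),
  corruption     = runif(n, 0, 1)
)
whr$life_ladder <- 1 + 4.6 * whr$social_support + 2.4 * whr$freedom -
  2.0 * whr$corruption + rnorm(n, sd = 0.5)

model <- lm(life_ladder ~ social_support + freedom + corruption, data = whr)
print(round(coef(summary(model)), 4))   # estimate, std. error, t value, p-value
```

The rows of the printed matrix correspond to the β terms of the equation, and its last column
holds the p-values on which the significance decision is based.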
Variable                               Estimate  P-Value   Statistically Significant
Social Support                         4.6204    2e-16     Yes
Freedom to make Life Choices           2.3891    3.76e-16  Yes
Generosity                             0.3107    0.0707    No
Perception of Corruption               -2.018    2e-16     Yes
Confidence in the National Government  -1.9358   2e-16     Yes
Positive Experience Index              0.8405    0.0094    Yes
Negative Experience Index              0.818     0.0259    Yes
3.4.1 Interpretation
Social Support
The variable is positive and significant. For every unit increase in Social Support the Life Ladder
increases on average by 4.6204 units.
Freedom to make Life Choices
The variable is positive and significant. For every unit increase in Freedom to make Life Choices
the Life Ladder increases on average by 2.3891 units.
Generosity
The variable is not significant. Hence, this variable will not be interpreted.
Perception of Corruption
The variable is negative and significant. For every unit increase in Perception of Corruption the
Life Ladder decreases on average by 2.018 units.
Confidence in the National Government
The variable is negative and significant. For every unit increase in Confidence in the National
Government the Life Ladder decreases on average by 1.9358 units.
Positive Experience Index
The variable is positive and significant. For every unit increase in the Positive Experience Index
the Life Ladder increases on average by 0.8405 units.
Negative Experience Index
The variable is positive and significant. For every unit increase in the Negative Experience Index
the Life Ladder increases on average by 0.818 units.
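Since all six significant predictors are measured on a scale from 0 to 1, a full unit increase
spans the whole scale. A smaller step is often easier to picture. The sketch below takes the
estimates reported above and computes the expected change in Life Ladder for an increase of 0.1
in each predictor, with the other predictors held constant; the step size of 0.1 is chosen here
only for illustration.

```r
# Estimates as reported in the paper's coefficient table.
estimates <- c(
  "Social Support"                        =  4.6204,
  "Freedom to make Life Choices"          =  2.3891,
  "Perception of Corruption"              = -2.0180,
  "Confidence in the National Government" = -1.9358,
  "Positive Experience Index"             =  0.8405,
  "Negative Experience Index"             =  0.8180
)

# Expected change in Life Ladder for a 0.1 increase in each predictor,
# holding the other predictors constant.
delta <- 0.1
print(round(estimates * delta, 3))
```

For Social Support, for example, an increase of 0.1 corresponds to an expected increase of about
0.46 units on the Life Ladder.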
3.5 Testing for Robustness
Functions used
Robust.se()
coeftest()
Interpretation:
All variables are significant, and the estimates did not change. The standard error slightly
increased and the significance slightly decreased.
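The pattern described above (unchanged estimates, different standard errors) follows from how
robust standard errors work: only the variance estimate of the coefficients is replaced. The
sketch below computes heteroskedasticity-robust standard errors by hand in base R on simulated
data. It uses the HC1 variant, which is an assumption, because the source does not state which
variant the functions listed above applied.

```r
set.seed(5)
n <- 300
x <- runif(n)
# Error spread grows with x, so the constant-variance assumption fails.
y <- 1 + 2 * x + rnorm(n, sd = 0.2 + x)
model <- lm(y ~ x)

X <- model.matrix(model)
e <- residuals(model)
k <- ncol(X)
bread <- solve(crossprod(X))        # (X'X)^-1
meat  <- crossprod(X * e)           # X' diag(e^2) X
vcov_hc1 <- bread %*% meat %*% bread * n / (n - k)

se_classic <- sqrt(diag(vcov(model)))
se_robust  <- sqrt(diag(vcov_hc1))
t_robust   <- coef(model) / se_robust
p_robust   <- 2 * pt(abs(t_robust), df = n - k, lower.tail = FALSE)
print(cbind(estimate = coef(model), se_classic, se_robust, p_robust))
```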
3.6 Testing Further Hypotheses
Functions used
linearHypothesis()
Interpretation:
Three hypothesis tests were run. The first test was run to evaluate the functionality of the test.
The second and third tests were conducted to understand whether those variables are similar and
therefore likely to amplify each other's effect on the variable Life Ladder in the linear
model:
1. Hypothesis 1: Social Support = Confidence in the National Government
The p-value is extremely low; hence, we can reject the hypothesis H1.
2. Hypothesis 2: Perception of Corruption = Confidence in the National Government
The high p-value of 0.6789 indicates that we fail to reject H2.
3. Hypothesis 3: Positive Experience Index = Negative Experience Index
The exceedingly high p-value of 0.9609 indicates that we fail to reject H3.
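A hypothesis of the form "coefficient A = coefficient B" can also be tested in base R by
comparing the full model with a restricted model in which both variables share one coefficient.
The sketch below shows this on simulated data; the variable names x1, x2, and x3 and all numbers
are made up for illustration. For a single equality restriction in a linear model, this F-test is
the same kind of test that the function listed above performs.

```r
set.seed(6)
n <- 300
d <- data.frame(x1 = runif(n), x2 = runif(n), x3 = runif(n))
# x1 and x2 share the same true coefficient (-2); x3 differs clearly (+4).
d$y <- 5 - 2 * d$x1 - 2 * d$x2 + 4 * d$x3 + rnorm(n, sd = 0.5)

full <- lm(y ~ x1 + x2 + x3, data = d)

# H0: coefficient of x1 equals coefficient of x2.
restricted_12 <- lm(y ~ I(x1 + x2) + x3, data = d)
p_12 <- anova(restricted_12, full)[["Pr(>F)"]][2]

# H0: coefficient of x1 equals coefficient of x3.
restricted_13 <- lm(y ~ I(x1 + x3) + x2, data = d)
p_13 <- anova(restricted_13, full)[["Pr(>F)"]][2]

print(c(x1_equals_x2 = p_12, x1_equals_x3 = p_13))
```

A high p-value means only that the data cannot distinguish the two coefficients; it does not
show that the two variables measure the same thing.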
3.7 Visualizing the Results
Functions used
ggplot()
Interpretation:
The dependent variable Life Ladder was matched with each significant variable from the linear
model and a linear regression line laid on top of the plot, including a 95% confidence interval
which is shaded grey. The graphs can be found in Appendix D.
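The content of such a graph can be sketched without additional packages. The example below uses
simulated data and base R graphics instead of ggplot2, so it is not the code behind the graphs in
Appendix D; it only shows how the regression line and the grey 95% confidence band are obtained
from the fitted model.

```r
set.seed(7)
# Simulated data; the column names mirror the paper's variables for readability.
d <- data.frame(social_support = runif(150, 0.3, 1))
d$life_ladder <- 2 + 4.5 * d$social_support + rnorm(150, sd = 0.6)

fit  <- lm(life_ladder ~ social_support, data = d)
grid <- data.frame(social_support = seq(0.3, 1, length.out = 50))
band <- predict(fit, newdata = grid, interval = "confidence", level = 0.95)

plot(d$social_support, d$life_ladder, pch = 16, col = "grey40",
     xlab = "Social Support", ylab = "Life Ladder")
# Shade the 95% confidence band, then draw the fitted line on top.
polygon(c(grid$social_support, rev(grid$social_support)),
        c(band[, "lwr"], rev(band[, "upr"])),
        col = adjustcolor("grey", alpha.f = 0.5), border = NA)
lines(grid$social_support, band[, "fit"], lwd = 2)
```

Note that each of these graphs relates Life Ladder to a single variable, whereas the estimates of
the linear model hold the other variables constant, so the two views can differ.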
4. Conclusion
Seven independent variables were analyzed, out of which six were significant. These six
variables were used to analyze how they influence the variable Life Ladder, which the WHR uses
to determine the happiness ranking. The variable Social Support had the highest positive influence
on the dependent variable Life Ladder. The variables Perception of Corruption and Confidence in
the National Government had the highest negative influence. Hence, social support is
likely to be a big contributor to our happiness. If we can, then we should avoid corrupt states and
states that have a high confidence in their government.
The variables Negative Experience Index, Confidence in the National Government, and
Perception of Corruption showed unexpected results. The estimates of the linear model for the
Negative Experience Index and Confidence in the National Government indicate that Life
Ladder increases with these variables. However, the graph in Appendix D indicates the opposite
effect: as the Negative Experience Index and Confidence in the National Government increase,
Life Ladder decreases.
The variable Confidence in the National Government offered interesting results: it increases
as the variable Life Ladder decreases. This result is remarkably interesting for research on
populism. Further research could analyze how confidence in the national government decreases
the happiness of a country. Analyzing the confidence in the government and the happiness over a
period of time is especially interesting for Turkey under President Erdogan. The confidence of his
voters in the government increased after his election, but did their happiness increase as well?
Will the annulment of the Istanbul mayoral election change their happiness?
References
Flerlage, Ken. (2016). https://www.kenflerlage.com/2016/08/whats-happiest-country-in-world.html
(accessed: 05/07/2019).
Gallup. (2006). World Poll Questions.
https://media.gallup.com/dataviz/www/WP_Questions_WHITE.pdf (accessed: 05/07/2019).
Gallup. (N.A.). How Does the Gallup World Poll Work?
https://www.gallup.com/178667/gallup-world-poll-work.aspx (accessed: 05/07/2019).
Psychology Today. (N.A.). Social Comparison Theory.
https://www.psychologytoday.com/us/basics/social-comparison-theory (accessed: 05/07/2019).
WHR. (2019). FAQ. https://worldhappiness.report/faq/ (accessed: 05/07/2019).
WHR. (2019). World Happiness Report.
https://s3.amazonaws.com/happiness-report/2019/WHR19.pdf (accessed: 05/07/2019).
WHO. (N.A.). Healthy life expectancy (HALE) at birth.
https://www.who.int/gho/mortality_burden_disease/life_tables/hale_text/en/ (accessed:
05/07/2019).
Helliwell, John F., Huang, Haifang, and Wang, Shun. (2016). Statistical Appendix for “The
Distribution of World Happiness”.
https://s3.amazonaws.com/happiness-report/2016/StatisticalAppendixWHR2016.pdf (accessed:
05/07/2019).
Appendix
4.1 Appendix A
Countries are ranked from top to bottom and from left to right. Hence, Finland has the happiest
inhabitants in the years 2014-2018 and the Central African Republic the unhappiest.
4.2 Appendix B
4.3 Appendix C
4.4 Appendix D