econometrics final project

Upload: brock-prince

Post on 16-Aug-2015

Econometrics Final Project

Brock Prince

12/12/14

Introduction:

I have been a Seattle Seahawks fan since I was a child, and in recent years I have seen a drastic increase in their fan base. Growing up, I noticed that the Broncos, Patriots, Cowboys and Packers were much more popular than the Seahawks, but I could never figure out why. Even though Seattle was the nearest city with an NFL team to my hometown of Post Falls, my friends and most of the community seemed to choose teams from across the country. In my opinion, a true fan wouldn’t choose a team just because it was doing well in a particular year. Do fans really choose their favorite teams solely on the basis of winning percentages and Super Bowl trophies?

This year, following Super Bowl XLVIII, in which the Seahawks defeated the Broncos in a landslide, Russell Wilson had the highest-selling jersey according to NFL.com. Ideally, merchandise sales for each individual team would be the best metric for measuring team popularity. However, after thoroughly searching the web, I could not obtain these figures. The next best thing, in my opinion, is the number of Facebook likes for an NFL team’s page, which I believe is an appropriate measure of team popularity in this situation.

For this project I will try to pinpoint some of the factors that contribute to an NFL team’s popularity, as well as how much influence each has on it. I will estimate a multiple linear regression model with team popularity as my response variable. My explanatory variables will be the size (population) of the state in which an NFL team is located, the number of wins over the last five years, and the number of Super Bowl wins the team has attained over the history of the NFL. I will be trying to estimate the equation:

y = b0 + b1x1 + b2x2 + b3x3 + ε

The variables I will include in my multiple linear regression model are as follows:

y = Number of Facebook likes for any particular NFL team

x1 = Size (population) of the state in which the team is located

x2 = Number of Super Bowl wins

x3 = Number of wins for the team over the last five seasons

My data for team popularity came from Facebook, which lists the number of likes for each NFL team page. The population of each state was obtained from the United States Census website. Data for the number of Super Bowl wins and the number of wins over the last five seasons came from NFL.com. I do not believe there should be any issues with the data sources, because the numbers are straightforward and I retrieved them from their original sources.

The table below includes the actual data I used in my regression analysis. Some of the potential outliers are highlighted in yellow for low and red for high numbers.

Page 3: Econometrics Final Project

Red: Largest quantity of Facebook likes. Yellow: Smallest quantity of Facebook likes.

Descriptive Statistics

Team                 | Size of State | Super Bowl Wins | Wins (five years) | Facebook Likes
Arizona Cardinals    |     6,626,624 | 0 | 38 |   950,000
Atlanta Falcons      |     9,992,167 | 0 | 49 | 1,660,000
Baltimore Ravens     |     5,928,814 | 2 | 51 | 2,080,000
Buffalo Bills        |    19,651,127 | 0 | 28 |   630,000
Carolina Panthers    |     9,848,060 | 0 | 35 | 1,330,000
Chicago Bears        |    12,882,135 | 1 | 44 | 3,850,000
Cincinnati Bengals   |    11,570,808 | 0 | 44 |   980,000
Cleveland Browns     |    11,570,808 | 0 | 23 |   960,000
Dallas Cowboys       |    26,448,193 | 5 | 41 | 7,230,000
Denver Broncos       |     5,268,367 | 2 | 46 | 3,160,000
Detroit Lions        |     9,895,622 | 0 | 29 | 1,560,000
Green Bay Packers    |     5,742,713 | 4 | 55 | 4,420,000
Houston Texans       |    26,448,193 | 0 | 39 | 1,690,000
Indianapolis Colts   |     6,570,902 | 2 | 48 | 2,130,000
Jacksonville Jaguars |    19,552,860 | 0 | 26 |   470,000
Kansas City Chiefs   |     6,044,171 | 1 | 34 | 1,240,000
Miami Dolphins       |    19,552,860 | 2 | 35 | 1,840,000
Minnesota Vikings    |     5,420,380 | 0 | 36 | 1,740,000
New England Patriots |     6,692,824 | 3 | 61 | 5,230,000
New Orleans Saints   |     4,625,470 | 1 | 55 | 3,910,000
New York Giants      |    19,651,127 | 4 | 43 | 3,580,000
New York Jets        |    19,651,127 | 1 | 42 | 1,780,000
Oakland Raiders      |    38,332,521 | 3 | 29 | 2,770,000
Philadelphia Eagles  |    12,773,801 | 0 | 43 | 2,750,000
Pittsburgh Steelers  |    12,773,801 | 6 | 49 | 5,840,000
San Diego Chargers   |    38,332,521 | 0 | 46 | 1,580,000
San Francisco 49ers  |    38,332,521 | 5 | 50 | 3,580,000
Seattle Seahawks     |     6,971,406 | 1 | 43 | 2,530,000
St. Louis Rams       |     6,044,171 | 1 | 24 |   560,000
Tampa Bay Buccaneers |    19,552,860 | 1 | 28 |   810,000
Tennessee Titans     |     6,495,978 | 0 | 36 |   780,000
Washington Redskins  |       646,499 | 3 | 28 | 1,650,000

Table 1: Summary Statistics

Variable          | Mean          | Standard Deviation | Minimum Value | Maximum Value
Size of State     | 14,059,107.22 | 10,294,319.14      | 646,499.00    | 38,332,521.00
Super Bowl Wins   | 1.50          | 1.76               | 0.00          | 6.00
Wins (five years) | 39.94         | 9.86               | 23.00         | 61.00
Facebook Likes    | 2,352,187.50  | 1,641,988.83       | 470,000.00    | 7,230,000.00
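As a check on Table 1, the Super Bowl wins row can be recomputed from the team data. This is a minimal Python sketch, assuming the table reports the sample standard deviation (n - 1 denominator); the list name `sb_wins` is only illustrative.

```python
import statistics

# Super Bowl wins for the 32 teams, in the order of the data table
# (Arizona Cardinals first, Washington Redskins last).
sb_wins = [0, 0, 2, 0, 0, 1, 0, 0, 5, 2, 0, 4, 0, 2, 0, 1,
           2, 0, 3, 1, 4, 1, 3, 0, 6, 0, 5, 1, 1, 1, 0, 3]

mean = statistics.mean(sb_wins)
# statistics.stdev uses the n - 1 (sample) denominator; that this matches
# the table's method is an assumption.
std = statistics.stdev(sb_wins)

print(f"mean={mean:.2f} std={std:.2f} min={min(sb_wins)} max={max(sb_wins)}")
```

The same four calls applied to the other columns should give the remaining rows of the table.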

There are a few outliers in my data set. For teams such as the Dallas Cowboys and the New England Patriots, their social media presence is much higher than my model predicts. On the other side of the spectrum, the St. Louis Rams and the Jacksonville Jaguars have lower than expected Facebook likes. There are some other outside factors influencing these teams, but I could not pinpoint them in my model.

As the summary statistics show, the range between the minimum and maximum values is wide for every variable. The lowest number of Facebook likes for a team is 470,000 (Jaguars) and the highest is 7,230,000 (Cowboys). These extremes may cause problems in the regression analysis as outliers; I return to them later in the project.

Table 2: Correlations

                  | Size of State | Super Bowl Wins | Wins (five years) | Facebook Likes
Size of State     | 1             |                 |                   |
Super Bowl Wins   | 0.204267777   | 1               |                   |
Wins (five years) | -0.086774106  | 0.38682029      | 1                 |
Facebook Likes    | 0.112784657   | 0.784091823     | 0.62644206        | 1

The correlation table shows how strongly each pair of the four variables is correlated. Ideally, correlations between the response variable and the explanatory variables are large, while correlations between explanatory variables are small. A correlation of .50 or higher between any two explanatory variables would be a cause for concern, and a value of .70 or higher would more than likely be a real problem. The largest correlation between two explanatory variables here is .387, between Super Bowl wins and wins over the last five years. As you can see from the table, Super Bowl wins and Facebook likes are very highly correlated with a value of .784. This is consistent with the “bandwagon” effect I described in the introduction.

People seem to become fans of a team because everyone else is, which in turn traces back to the team’s historical success. This is good for my regression model, because it shows a fairly strong correlation between my response variable and two of my explanatory variables.
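To illustrate how an entry of the correlation table is obtained, the sketch below computes the Pearson correlation between Super Bowl wins and Facebook likes from the team data. The helper name `pearson` and the choice to express likes in thousands are my own; rescaling does not change a correlation.

```python
import math

# Team order follows the data table (Arizona Cardinals ... Washington Redskins).
sb_wins = [0, 0, 2, 0, 0, 1, 0, 0, 5, 2, 0, 4, 0, 2, 0, 1,
           2, 0, 3, 1, 4, 1, 3, 0, 6, 0, 5, 1, 1, 1, 0, 3]
# Facebook likes in thousands, same order.
likes_k = [950, 1660, 2080, 630, 1330, 3850, 980, 960,
           7230, 3160, 1560, 4420, 1690, 2130, 470, 1240,
           1840, 1740, 5230, 3910, 3580, 1780, 2770, 2750,
           5840, 1580, 3580, 2530, 560, 810, 780, 1650]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson(sb_wins, likes_k)
print(round(r, 3))
```

The result should land close to the .784 reported in the correlation table.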

The following figures show the relationship between each of the three explanatory variables and the response variable.

Fig. 1 Social Media and Wins (x-axis: Wins (5 years); y-axis: Facebook Likes)

As expected, the more an NFL team wins, the greater social media presence they receive.

Fig. 2 Facebook Likes and SB wins (x-axis: Super Bowl Wins; y-axis: Facebook Likes)

Super Bowl wins over the last 48 years are also a good predictor of how popular a team is on social media. There is a steady, positive, linear relationship between this explanatory variable and team popularity.

Fig. 3 Facebook Likes and Size of State (x-axis: Size of State; y-axis: Facebook Likes)

There is not much of a correlation between the size of an NFL team’s state and its social media presence. This is probably due to the technology available today: you can now watch any game on your TV at home, even one between teams across the nation. The success of a team seems to be a much better predictor of popularity.

The following graphs show relationships involving variables I considered including in my model but, because of weak or outlier-driven correlations and other issues, chose to leave out.

Fig. 4 Facebook Likes (x-axis: Super Bowl Appearances; y-axis: Facebook Likes)

Even though Super Bowl appearances were a good predictor of team popularity, the amount of actual Super Bowl victories was a better predictor for the popularity of a team.

Fig. 5 Facebook Likes and Annual Revenue (x-axis: Annual Revenue; y-axis: Facebook Likes)

Annual revenue was one of my first choices for a response variable, but some of the outliers seem to skew the data. If you were to remove the outlier in the top right (Cowboys), the trend line would be much different.

Fig. 6 Annual Revenue and Number of Wins (x-axis: Wins (5 seasons); y-axis: Annual Revenue)

I also thought there might be a positive correlation between annual revenue and wins. If a team were to win more, one would expect its revenue to be higher because of increased merchandise sales from popularity. The graph shows that the relationship is actually pretty flat, so I decided to leave this out of my final model.

Fig. 7 Annual Revenue and SB Wins (x-axis: Super Bowl Wins; y-axis: Annual Revenue)

The relationship between annual revenue and Super Bowl wins also seems to be mostly flat. It is only slightly positive because of the Dallas Cowboys outlier.

Hypothesis tests

H0: B1 = 0

Ha: B1 > 0

The first hypothesis test tests the individual significance of the explanatory variable, state size, on team popularity.

H0: B2 = 0

Ha: B2 > 0

The second hypothesis test tests the individual significance of the explanatory variable, Super Bowl wins, on team popularity.

H0: B3 = 0

Ha: B3 > 0

The third hypothesis test tests the individual significance of the explanatory variable, wins over the last five seasons, on team popularity.

Variable      | Coefficient  | Std. Error | t-statistic | p-value     | Lower 95% Interval | Upper 95% Interval
Intercept     | -1120715.489 | 744459.246 | -1.50541    | 0.143417676 | -2645671.126       | 404240.1482
Size of State | 0.002690623  | 0.0160387  | 0.167758    | 0.867979376 | -0.030163169       | 0.035544414
SB Wins       | 590059.0405  | 101360.056 | 5.821416    | 2.96224E-06 | 382432.3784        | 797685.7026
Wins          | 63849.4316   | 17781.5756 | 3.590763    | 0.001244003 | 27425.52509        | 100273.3381

R Square      | 0.737856
Adj. R Square | 0.709769143
Observations  | 32
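For readers who want to reproduce the regression output, the following is a minimal sketch of ordinary least squares solved through the normal equations in plain Python. It is not necessarily how the original output was produced. The helper names `solve` and `ols` are my own, and state population is rescaled to millions for numerical stability, so the state coefficient here equals the reported one multiplied by 1,000,000.

```python
# (state population in millions, Super Bowl wins, wins over five seasons, Facebook likes)
DATA = [
    (6.626624, 0, 38, 950_000),     # Arizona Cardinals
    (9.992167, 0, 49, 1_660_000),   # Atlanta Falcons
    (5.928814, 2, 51, 2_080_000),   # Baltimore Ravens
    (19.651127, 0, 28, 630_000),    # Buffalo Bills
    (9.848060, 0, 35, 1_330_000),   # Carolina Panthers
    (12.882135, 1, 44, 3_850_000),  # Chicago Bears
    (11.570808, 0, 44, 980_000),    # Cincinnati Bengals
    (11.570808, 0, 23, 960_000),    # Cleveland Browns
    (26.448193, 5, 41, 7_230_000),  # Dallas Cowboys
    (5.268367, 2, 46, 3_160_000),   # Denver Broncos
    (9.895622, 0, 29, 1_560_000),   # Detroit Lions
    (5.742713, 4, 55, 4_420_000),   # Green Bay Packers
    (26.448193, 0, 39, 1_690_000),  # Houston Texans
    (6.570902, 2, 48, 2_130_000),   # Indianapolis Colts
    (19.552860, 0, 26, 470_000),    # Jacksonville Jaguars
    (6.044171, 1, 34, 1_240_000),   # Kansas City Chiefs
    (19.552860, 2, 35, 1_840_000),  # Miami Dolphins
    (5.420380, 0, 36, 1_740_000),   # Minnesota Vikings
    (6.692824, 3, 61, 5_230_000),   # New England Patriots
    (4.625470, 1, 55, 3_910_000),   # New Orleans Saints
    (19.651127, 4, 43, 3_580_000),  # New York Giants
    (19.651127, 1, 42, 1_780_000),  # New York Jets
    (38.332521, 3, 29, 2_770_000),  # Oakland Raiders
    (12.773801, 0, 43, 2_750_000),  # Philadelphia Eagles
    (12.773801, 6, 49, 5_840_000),  # Pittsburgh Steelers
    (38.332521, 0, 46, 1_580_000),  # San Diego Chargers
    (38.332521, 5, 50, 3_580_000),  # San Francisco 49ers
    (6.971406, 1, 43, 2_530_000),   # Seattle Seahawks
    (6.044171, 1, 24, 560_000),     # St. Louis Rams
    (19.552860, 1, 28, 810_000),    # Tampa Bay Buccaneers
    (6.495978, 0, 36, 780_000),     # Tennessee Titans
    (0.646499, 3, 28, 1_650_000),   # Washington Redskins
]

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows):
    """Return (coefficients, R squared) for likes regressed on an intercept and three regressors."""
    X = [[1.0, s, sb, w] for s, sb, w, _ in rows]
    y = [likes for *_, likes in rows]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in X]
    ybar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - ybar) ** 2 for yi in y)
    return beta, 1 - sse / sst

beta, r2 = ols(DATA)
for name, b in zip(["Intercept", "State (millions)", "SB Wins", "Wins"], beta):
    print(f"{name}: {b:,.1f}")
print(f"R Square: {r2:.4f}")
```

Solving the normal equations directly is adequate for four coefficients and 32 observations; a statistics package would typically use a more numerically robust decomposition.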

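I fail to reject the null hypothesis for the size-of-state coefficient (p-value of .868), but I reject the null hypotheses for the Super Bowl wins and five-year wins coefficients, both of which have p-values well below .05.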
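The test decisions can be traced back to the regression output: each t-statistic is the coefficient divided by its standard error, and each 95% interval is the coefficient plus or minus a critical value times the standard error. The sketch below assumes 32 - 4 = 28 degrees of freedom and uses critical values from a standard t table (2.0484 two-sided, 1.701 one-sided at the 5% level); these constants and the helper name `t_and_interval` are not taken from the report.

```python
# Coefficient estimates and standard errors copied from the regression output.
estimates = {
    "Size of State": (0.002690623, 0.0160387),
    "SB Wins": (590059.0405, 101360.056),
    "Wins": (63849.4316, 17781.5756),
}

# Assumed critical values for 28 degrees of freedom, from a standard t table.
TWO_SIDED_CRIT = 2.0484   # used for the 95% confidence interval
ONE_SIDED_CRIT = 1.701    # used for Ha: B > 0 at the 5% level

def t_and_interval(coef, se, t_crit=TWO_SIDED_CRIT):
    """Return the t-statistic and the 95% confidence interval for one coefficient."""
    return coef / se, (coef - t_crit * se, coef + t_crit * se)

for name, (coef, se) in estimates.items():
    t, (lo, hi) = t_and_interval(coef, se)
    decision = "reject H0" if t > ONE_SIDED_CRIT else "fail to reject H0"
    print(f"{name}: t = {t:.3f}, 95% CI = ({lo:.6g}, {hi:.6g}), {decision}")
```

Because the alternatives are one-sided, the comparison uses the one-sided critical value; the p-values in the output appear to be two-sided, and the conclusions are the same either way.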

Before running the regression analysis I expected all of the variables to have a positive relationship with team popularity. I assumed that states with a larger population would have more popular NFL teams. However, large states such as New York, California, and Texas tend to have multiple NFL teams, which could spread the population’s loyalties out over several teams. The coefficient of .0027 is not statistically significant, which suggests that state size has little effect on the number of Facebook likes a team receives.

I also expected that Super Bowl wins would be a good predictor of team popularity. A team that has been historically good should have a larger fan base. This variable received a coefficient of 590,059 and is statistically significant with a very small p-value; a p-value of less than .05 tells us to reject the null hypothesis. The coefficient tells us that if a team’s number of Super Bowl wins increased by one, holding all other variables constant, we would expect its number of Facebook likes to increase by 590,059.

I expected wins over the last five seasons to also have a positive coefficient, for a reason similar to Super Bowl wins: a team that has been doing better during the regular season should be more popular than a team that has been doing poorly. This coefficient is statistically significant with a p-value of less than .05. Its coefficient of 63,849 tells us that, holding all other variables constant, each additional game won is expected to increase the number of Facebook likes by 63,849.
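To make the coefficient interpretations concrete, the sketch below plugs the Seattle Seahawks row from the data table into the estimated equation and then adds one hypothetical Super Bowl win. The function name `predicted_likes` is illustrative; the coefficients are copied from the regression output.

```python
# Coefficients copied from the regression output.
INTERCEPT = -1120715.489
B_STATE = 0.002690623     # likes per resident of the team's state
B_SB_WINS = 590059.0405   # likes per Super Bowl win
B_WINS = 63849.4316       # likes per win over the last five seasons

def predicted_likes(state_pop, sb_wins, wins_5yr):
    """Fitted number of Facebook likes from the estimated regression equation."""
    return INTERCEPT + B_STATE * state_pop + B_SB_WINS * sb_wins + B_WINS * wins_5yr

# Seattle Seahawks: state population 6,971,406; 1 Super Bowl win; 43 wins; 2,530,000 likes.
seahawks = predicted_likes(6_971_406, 1, 43)
residual = 2_530_000 - seahawks

# Holding everything else constant, one more Super Bowl win shifts the
# prediction by exactly the Super Bowl coefficient.
one_more_sb = predicted_likes(6_971_406, 2, 43)

print(f"predicted: {seahawks:,.0f}")
print(f"residual:  {residual:,.0f}")
print(f"effect of one more Super Bowl win: {one_more_sb - seahawks:,.0f}")
```

A positive residual means the team has more likes than the model predicts; the same calculation for other teams shows which ones sit far from the fitted line.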

The R square value from the regression is .738, which tells us that about 74% of the variance in the response variable can be explained by the explanatory variables. As far as statistical issues go, the only potential problem I can think of is multicollinearity between the Super Bowl wins variable and the wins over the last five seasons variable. However, I don’t think it is enough to cause any major problems, since the wins were collected from only the last five years and not the entire 48. Some teams that were good a long time ago are not good anymore.
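Two quick calculations support this paragraph. The first recomputes adjusted R square from R square, the number of observations, and the number of explanatory variables. The second is a rough variance inflation factor based only on the pairwise correlation between Super Bowl wins and five-year wins; a full VIF would regress each explanatory variable on all the others, so this is an approximation, and the variable names are my own.

```python
R2 = 0.737856   # R Square from the regression output
N = 32          # observations
K = 3           # explanatory variables

# Adjusted R square penalizes R square for the number of explanatory variables.
adj_r2 = 1 - (1 - R2) * (N - 1) / (N - K - 1)

# Correlation between Super Bowl wins and five-year wins (Table 2).
R_SB_WINS = 0.38682029
# Pairwise approximation of the variance inflation factor (assumption: the
# third regressor, state size, is ignored here).
vif_pairwise = 1 / (1 - R_SB_WINS ** 2)

print(f"adjusted R square: {adj_r2:.6f}")
print(f"approximate VIF:   {vif_pairwise:.3f}")
```

A common rule of thumb treats VIF values above roughly 5 to 10 as a sign of problematic multicollinearity, and the value here is far below that range.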

The results from my project show that there is a pretty strong correlation between team popularity and the success of the team. This is interesting to me because it suggests that many people probably switch teams when the team they used to like starts performing badly. This could be useful for NFL teams, because they can look at other ways besides success to attract fans. Bigger marketing efforts and brand building can help retain fans during slumps or losing streaks.

I do believe that the Seahawks’ adoption of the “12th Man” fan identity will help them retain some of their fan base when they aren’t doing as well in upcoming seasons. However, I believe that there are more factors that influence popularity. Marketing efforts of a professional football organization probably play a large role. The Dallas Cowboys, for example, are thought of as America’s Team. They sell large quantities of merchandise each year, even when the team isn’t performing very well. This could be because of the brand that they have created around the Dallas Cowboys logo.

If I had all the data I wanted at my fingertips, I would have run this study differently. I would have used merchandise revenue for all 32 teams as my response variable. My explanatory variables would consist of distance from the nearest NFL team, winning percentage over the last five years, Super Bowl wins, and the amount of money put into marketing the team and building a brand around its logo. However, I am satisfied with the significant results that this project has given me about the correlations between the success of a team and team popularity.