1/26

Why a graph

A presentation at the University of Bratislava

Bertil Thorslund ...

Post on 20-Dec-2015


2/26

Who am I?

• Educated in sociology rather than statistics

• Fascinated by figures and data

• Keen on making information available, hence the Internet

• Newly retired after 15 years in Social Insurance

3/26

4/26

5/26

What data?

• Swedish Stock Exchange

• Swedish sickness insurance

What software?

• MS Office (Excel, PowerPoint)

6/26

Graphs vs Diagrams

[Graph: line graph, vertical axis 0 to 10 000 000, horizontal axis monthly from 1998-01 to 2010-07]

[Diagram: Social insurance, divided into Sickness insurance, Pensions, Family benefits]

7/26

Two branches in statistics

• Estimation of error, example: Gallup polls

• Knowing it all, example: number of sicklisted

8/26

Estimation of error

9/26

Estimation of error

[Figure: estimates for March and April, each with a 95 % interval]

Difference? Yes

10/26

Estimation of error

[Figure: estimates for March and April, each with a 95 % interval]

Difference? No
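The two slides above can be illustrated with a small sketch. It assumes the comparison is made by checking whether the two 95 % intervals overlap, which is a rough visual rule rather than a formal test. The poll figures and the helper names `proportion_ci` and `intervals_overlap` are made up for the illustration and are not from the presentation.

```python
import math

def proportion_ci(p, n, z=1.96):
    """Normal-approximation 95 % interval for a sample proportion p based on n answers."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return (p - half_width, p + half_width)

def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical poll results (assumption, not data from the slides).
march = proportion_ci(0.40, 1000)
april_far = proportion_ci(0.50, 1000)   # clearly higher than March
april_near = proportion_ci(0.42, 1000)  # only slightly higher than March

for april in (april_far, april_near):
    answer = "No" if intervals_overlap(march, april) else "Yes"
    print("Difference?", answer)
```

With 1 000 answers per month each interval is roughly three percentage points wide on either side, so a two-point change disappears inside the error while a ten-point change does not.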

11/26

An example of a line (or curve) graph

This was yesterday. NYSE opens at 15:30

Describe the developments during the day (Nov 16th 2010)

12/26

What would the price of H&M shares be if it had developed like the stock exchange in general?

When was the largest number of shares bought and sold?
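One way to answer the first question is to let the share follow the index proportionally. The sketch below assumes exactly that. The function name `price_if_following_index` and all numbers are hypothetical and are not read from the graph.

```python
def price_if_following_index(start_price, index_start, index_now):
    """Price the share would have if it had changed by the same percentage as the index."""
    return start_price * index_now / index_start

# Hypothetical values (assumption): opening share price and index levels.
start_price = 240.0   # share price at the start of the day
index_start = 1100.0  # general index at the start of the day
index_now = 1089.0    # general index at the end of the day, 1 % lower

hypothetical_price = price_if_following_index(start_price, index_start, index_now)
print(round(hypothetical_price, 2))
```

Comparing this figure with the actual closing price shows whether the share did better or worse than the market as a whole.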

13/26

Sicklisting, net days paid during a 12-month period, Sweden

[Graph: net days, vertical axis 0 to 120 000 000, horizontal axis 2001-12 to 2011-12 in half-year steps]

Paid to women: 59,6%

Data until July 2010
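A 12-month figure of this kind can be built from monthly readings as a rolling sum. The sketch below assumes that this is how the graph was produced. It uses the monthly net days for 2006 and 2007 from the table on slide 25, and `rolling_sum` is a helper name chosen for the illustration.

```python
def rolling_sum(values, window=12):
    """Sum of each run of `window` consecutive values, oldest run first."""
    return [sum(values[i - window:i]) for i in range(window, len(values) + 1)]

# Monthly net days, January 2006 to December 2007 (from the table on slide 25).
monthly = [
    5644695, 5584749, 5507037, 5430565, 5595094, 5537153,
    5169002, 5285729, 5055910, 5171331, 5376693, 5171803,  # 2006
    5239557, 4992476, 4826415, 4922322, 4869841, 4774286,
    4636689, 4715844, 4408490, 4475319, 4497882, 4254215,  # 2007
]

totals = rolling_sum(monthly)
print(totals[0])   # 12 months ending December 2006
print(totals[-1])  # 12 months ending December 2007
```

Each point replaces the oldest month with the newest one, so seasonal ups and downs cancel out and only the trend remains.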

14/26

Measurements need to be

• Reliablecan be repeated

• Validwhat you intendedrelevant

15/26

Choosing the best graph type

Bar graph or histogram

[Graph: bar graph, vertical axis 0 to 120 000 000, one bar per year 1998 to 2009]

16/26

Net days paid to sicklisted persons

[Graph: bar graph, vertical axis 0 to 120 000 000, one bar per year 1998 to 2009]

Legends, everything to identify what is shown

Values are ’jumping’ from one bar to the next

17/26

Stacked bar graph

Net days paid to sicklisted persons

[Graph: stacked bar graph, vertical axis 0 to 120 000 000, one bar per year 1998 to 2009, series Men and Women]

18/26

Grouped bar graph

Legends important

Net days paid to sicklisted persons

[Graph: grouped bar graph, vertical axis 0 to 70 000 000, two bars per year 1998 to 2009, series Women and Men]

19/26

Net days paid to sicklisted persons

What are the changes observed?

• Increase in the late 1990s

• Peaking in 2002

• Persistent decrease after 2002

• Same pattern for men and women

But there is possibly a change that is more visible in another variation of a bar graph

[Graph: grouped bar graph, vertical axis 0 to 70 000 000, two bars per year 1998 to 2009, series Women and Men]

20/26

Bar graph with fractions

Net days paid to sicklisted persons

[Graph: 100 % stacked bar graph, vertical axis 0% to 100%, one bar per year 1998 to 2009, series Men and Women]

Throughout this increase and decrease, the proportion men/women remains unchanged

And what is it that you see in this graph?
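A bar graph with fractions shows each group's share of the yearly total instead of the absolute value. A minimal sketch of that calculation follows. The yearly figures are hypothetical and only chosen so that the totals differ while the split stays the same. `shares` is a helper name for the illustration.

```python
def shares(men, women):
    """Return the fractions of the total that fall on men and on women."""
    total = men + women
    return men / total, women / total

# Hypothetical net days per year as (men, women); not data from the slides.
net_days = {
    2002: (40_000_000, 60_000_000),
    2009: (16_000_000, 24_000_000),
}

for year in sorted(net_days):
    men_share, women_share = shares(*net_days[year])
    print(year, f"men {men_share:.1%}", f"women {women_share:.1%}")
```

The totals fall from 100 to 40 million, yet both years print the same split, which is exactly what the fraction graph makes visible and the stacked graph hides.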

21/26

Popular graph in presentation of survey data. Any other suggestion for valid data?

[Graph: horizontal bar graph with fractions, axis 0% to 100%. Rows: My work tasks, My boss, My wage development, My career possibilities. Legend: not so contented, neutral, very contented]

22/26

Net days paid to sicklisted persons

Bar graph – horizontal axis other than time

[Graph: bar graph, vertical axis 0 to 40 000 000, one bar per region, Region 1 to Region 21]

You only see that the regions differ in size

23/26

Net days paid to sicklisted persons

Bar graph – horizontal axis other than time

Make sure values are ’comparable’: indexes/percentages

Presented here is the value for 2009 as a percentage of the value for 2002 (the peak year)

[Graph: bar graph, vertical axis 0% to 60%, one bar per region, Region 1 to Region 21]

Which region has the smallest decrease? Which has the biggest?
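The index on this slide is the 2009 value divided by the 2002 value. A minimal sketch of the calculation follows. The two regions and their figures are hypothetical, since the slides give no regional numbers, and `percent_of_peak` is a name chosen for the illustration.

```python
def percent_of_peak(value_2002, value_2009):
    """The 2009 value expressed as a percentage of the 2002 (peak year) value."""
    return 100 * value_2009 / value_2002

# Hypothetical regions as (net days 2002, net days 2009); not data from the slides.
regions = {
    "Large region": (30_000_000, 12_000_000),
    "Small region": (2_000_000, 1_100_000),
}

index = {name: percent_of_peak(v2002, v2009) for name, (v2002, v2009) in regions.items()}

# A high percentage means little has changed since the peak, so the decrease is small.
smallest_decrease = max(index, key=index.get)
biggest_decrease = min(index, key=index.get)
print(index)
print("Smallest decrease:", smallest_decrease)
print("Biggest decrease:", biggest_decrease)
```

In absolute numbers the large region dominates in both years. The index removes the size effect and shows that the large region has fallen further from its peak.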

24/26

Net days paid to sicklisted persons

[Graph: line graph, vertical axis 0 to 10 000 000, horizontal axis monthly from 1998-01 to 2010-07]

Monthly data -> smaller ’jumps’ from one reading to the next -> line graph

But, in this case, there isn’t much new information.

25/26

Net days, by month (rows) and year (columns)

Month        2006        2007        2008        2009        2010
01      5 644 695   5 239 557   4 496 565   3 672 634   2 802 973
02      5 584 749   4 992 476   4 266 754   3 474 681   2 641 495
03      5 507 037   4 826 415   4 018 454   3 386 635   2 577 025
04      5 430 565   4 922 322   4 396 018   3 434 322   2 803 822
05      5 595 094   4 869 841   4 068 182   3 197 541   2 787 850
06      5 537 153   4 774 286   3 967 066   3 355 602   2 827 567
07      5 169 002   4 636 689   3 733 712   3 114 203   2 701 936
08      5 285 729   4 715 844   3 513 358   2 982 793   2 663 313
09      5 055 910   4 408 490   3 661 944   2 968 837   2 787 679
10      5 171 331   4 475 319   3 756 819   3 044 564
11      5 376 693   4 497 882   3 647 082   3 091 718
12      5 171 803   4 254 215   3 671 000   3 131 939

What can you make out from those data?

26/26

Net days paid to sicklisted persons

[Graph: line graph, vertical axis 0 to 6 000 000, horizontal axis months October to September, one line per twelve-month period: 2006/2007, 2007/2008, 2008/2009, 2009/2010]

In this decrease every month has a lower value than the same month the year before

So, what is it that you see in this graph?

27/26

[The table of monthly net days for 2006 to 2010 from slide 25 is shown again]

The message is much more obvious in the graph, right?
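The message of the graph can also be checked directly against the table. The sketch below compares every month with the same month one year earlier, using the figures from the table above. `months_not_lower` is a helper name chosen for the illustration.

```python
# Monthly net days per calendar year, January first (from the table on slide 25).
net_days = {
    2006: [5644695, 5584749, 5507037, 5430565, 5595094, 5537153,
           5169002, 5285729, 5055910, 5171331, 5376693, 5171803],
    2007: [5239557, 4992476, 4826415, 4922322, 4869841, 4774286,
           4636689, 4715844, 4408490, 4475319, 4497882, 4254215],
    2008: [4496565, 4266754, 4018454, 4396018, 4068182, 3967066,
           3733712, 3513358, 3661944, 3756819, 3647082, 3671000],
    2009: [3672634, 3474681, 3386635, 3434322, 3197541, 3355602,
           3114203, 2982793, 2968837, 3044564, 3091718, 3131939],
    2010: [2802973, 2641495, 2577025, 2803822, 2787850, 2827567,
           2701936, 2663313, 2787679],  # January to September only
}

def months_not_lower(data):
    """Return (year, month) pairs whose value is not below the same month a year earlier."""
    exceptions = []
    for year in sorted(data):
        previous = data.get(year - 1)
        if previous is None:
            continue  # no earlier year to compare with
        for month, (now, before) in enumerate(zip(data[year], previous), start=1):
            if now >= before:
                exceptions.append((year, month))
    return exceptions

print(months_not_lower(net_days))
```

An empty list means there are no exceptions: every month from January 2007 to September 2010 is lower than the same month the year before. The table holds the same information as the graph, but it takes a check like this to see it.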

28/26

An ill-health measure – distribution by benefit

[Graph: circle divided into three parts: Rehabilitation, Sickness cash benefit, Invalidity pension]

Estimate what percentage each of those three ingredients makes up

Think of it as minutes on a clock. Example: 12 minutes would be 12/60 = 20 %

30/26

The area of the circle represents the number of people.

If the radius is increased by 10 percent, how much bigger is the circle?

If the number of people is increased by 30 percent how much bigger should the radius be?
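Both questions follow from the fact that the area of a circle grows with the square of its radius. A short worked sketch follows, with function names chosen for the illustration.

```python
import math

def area_factor(radius_factor):
    """Factor by which the area grows when the radius is multiplied by radius_factor."""
    return radius_factor ** 2

def radius_factor_for_area(area_ratio):
    """Factor by which the radius must grow for the area to be multiplied by area_ratio."""
    return math.sqrt(area_ratio)

# Radius up 10 percent: the area grows by 1.1 * 1.1 - 1.
print(f"{area_factor(1.10) - 1:.0%}")             # 21%

# Number of people (area) up 30 percent: the radius grows by sqrt(1.3) - 1.
print(f"{radius_factor_for_area(1.30) - 1:.1%}")  # 14.0%
```

A circle drawn with a 30 percent larger radius would therefore overstate a 30 percent increase in people: its area would be 69 percent bigger.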

31/26

Spiderweb graph

[Graph: spiderweb graph, radial axis 0,0% to 150,0%, one spoke per duration: 1-14 days, 15-28 days, 29-59 days, 60-89 days, 90-179 days, 180-364 days, 1-2 yrs, 2-3 yrs, 3-4 yrs, 4-5 yrs, 5-6 yrs, 6+ yrs]

Many variables (measurements) at the same time

Intuitive interpretation but rather demanding

32/26

Nomogram – a very special kind of graph

33/26

And now onto new ideas of what you can do with graphs.

Admire the ideas of prof. Hans Rosling and the software created by his son and daughter-in-law.

www.gapminder.org
