How the British Population Survey can enhance geodemographics
Martin Callingham, Visiting Professor, Birkbeck College, University of London
Royal Statistical Society OAC Conference
September 6th
Regionality
Bespoke combinations
Comparison of systems
Changing character over time
Resolution of ambiguity
Importance of each census variable
Highly resolved OAC
Introduction
Centre of gravity of each OAC 52 sub-group
Regionality
OAC types are distributed unevenly across the country
So the mix in an area is unique
It is a principle of geodemographics that a type in Brighton is the same as a type in Bolton
Is this true?
Figure: Wolverhampton. Index against national (vertical axis) for each of the 52 OAC sub-groups (horizontal axis)
Figure: Solihull. Index against national (vertical axis) for each of the 52 OAC sub-groups (horizontal axis)
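The index against national plotted for each area can be sketched as follows. This is a minimal illustration, assuming the index is the share of a sub-group in the area divided by its share nationally; the function names and the toy data are hypothetical and are not taken from the talk.

```python
from collections import Counter


def type_shares(codes):
    """Return each type's share of the given list of OAC codes."""
    counts = Counter(codes)
    total = len(codes)
    return {t: n / total for t, n in counts.items()}


def index_vs_national(area_codes, national_codes):
    """Index = share of a type in the area / share of that type nationally.

    An index of 1.0 means the area matches the national mix for that type;
    types absent from the area get an index of 0.0.
    """
    area = type_shares(area_codes)
    national = type_shares(national_codes)
    return {t: area.get(t, 0.0) / share for t, share in national.items()}


# Hypothetical data: one OAC sub-group code per person.
national = [1] * 50 + [2] * 30 + [3] * 20
area = [1] * 2 + [2] * 6 + [3] * 2

indices = index_vs_national(area, national)
for oac_type in sorted(indices):
    print(oac_type, round(indices[oac_type], 2))
```

In this toy area, type 2 is over-represented (index above 1) and type 1 is under-represented (index below 1), which is the kind of uneven mix the two charts show.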
Bespoke combinations
Because the mix in an area is unique, some areas have none or very few of particular OAC types
It is often sensible to combine the poorly represented types to form bigger hybrid types
But what are the characteristics of these bespoke types?
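One way to picture a bespoke combination is sketched below. It assumes that poorly represented types are those below a chosen share of the area, and that the hybrid type's characteristics are the population-weighted mean of the merged types' profiles. The threshold, the variable name and the data are all hypothetical.

```python
def combine_rare_types(counts, profiles, min_share=0.05, label="hybrid"):
    """Merge types below min_share of the area into one hybrid type.

    counts:   {type: number of people in the area}
    profiles: {type: {variable: mean value}} (hypothetical type profiles)
    Returns (new_counts, new_profiles); the hybrid profile is the
    population-weighted mean of the profiles of the merged types.
    """
    total = sum(counts.values())
    rare = [t for t, n in counts.items() if n / total < min_share]
    kept = {t: n for t, n in counts.items() if t not in rare}
    new_profiles = {t: profiles[t] for t in kept}
    rare_total = sum(counts[t] for t in rare)
    if rare_total > 0:
        variables = profiles[rare[0]].keys()
        new_profiles[label] = {
            v: sum(counts[t] * profiles[t][v] for t in rare) / rare_total
            for v in variables
        }
        kept[label] = rare_total
    return kept, new_profiles


# Hypothetical area: two types fall below 5% and are merged.
counts = {"1a": 900, "2b": 60, "3c": 30, "4a": 10}
profiles = {
    "1a": {"pct_students": 5.0},
    "2b": {"pct_students": 8.0},
    "3c": {"pct_students": 10.0},
    "4a": {"pct_students": 30.0},
}
new_counts, new_profiles = combine_rare_types(counts, profiles)
print(new_counts)
print(new_profiles["hybrid"])
```

The hybrid profile depends on which types happen to be rare in the area, so the same label can describe different populations in different places.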
Figure: Relationship between the levels in the system and mean Area %. Horizontal axis: Number of Levels. Vertical axis: Area %. R squared = 0.99
Effectiveness of a system depends only upon the number of levels
OAC       Of Mosaic 61
OAC 52    0.98
OAC 21    0.82
OAC 7     0.62
Comparison of systems
There are a variety of different geodemographic systems
Which is the ‘best’?
Which is the best for me?
Figure: Sun, OAC sub-group indices
Figure: Telegraph, OAC sub-group indices
Changing character over time
An area will change over time due to population movements
It will also change over time due to social and economic factors
What are the types of changes that could happen?
Are some OAC types more likely to change than others?
Distribution of standard deviation of the distance from centroid of OAC subgroups
Resolution of ambiguity
Geodemographics are created by using cluster analysis
This seeks to group similar records together
Sometimes records are really quite different from the mass
But these still have to be put into a cluster on the basis of ‘least worst’ fit
What effect does this variation cause?
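The 'least worst' fit can be illustrated with a nearest-centroid assignment. This is a sketch only: it assumes Euclidean distance on already standardised variables and a hypothetical distance threshold for flagging records that sit far from their cluster centre. It is not a description of how OAC itself was built.

```python
import math


def nearest_centroid(record, centroids):
    """Return (cluster, distance) for the closest centroid (Euclidean)."""
    return min(
        ((name, math.dist(record, centre)) for name, centre in centroids.items()),
        key=lambda pair: pair[1],
    )


def flag_poor_fits(records, centroids, max_distance):
    """Assign every record, flagging those further than max_distance."""
    results = []
    for record in records:
        cluster, distance = nearest_centroid(record, centroids)
        results.append((cluster, distance, distance > max_distance))
    return results


# Hypothetical two-variable example with two cluster centres.
centroids = {"A": (0.0, 0.0), "B": (10.0, 10.0)}
records = [(1.0, 1.0), (9.0, 10.0), (5.0, 4.0)]

fits = flag_poor_fits(records, centroids, max_distance=3.0)
for cluster, distance, poor in fits:
    print(cluster, round(distance, 2), "poor fit" if poor else "ok")
```

The third record still has to go somewhere, so it lands in cluster A even though it is far from both centres. Records like this are the source of the ambiguity discussed here.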
DEFINITION
percentage of resident population aged 0-4
percentage of resident population aged 5-14
percentage of resident population aged 25-44
percentage of resident population aged 45-64
percentage of resident population aged over 65
percentage of people identifying as Indian, Pakistani or Bangladeshi
percentage of people identifying as Black African, Black Caribbean or Other Black (1)
percentage of people not born in the UK
number of people per hectare
percent of residents over 16 who are not living in a couple and are separated or divorced (2)
percentage of households with one person who is not a pensioner
percentage of households which are single pensioner households
percentage of households which are lone parent households with dependent children
percentage of households which are cohabiting or married couple households with no children
percentage of households comprising one family and no others with non-dependent children living with their parents
percent of households that are public sector rented accommodation
percent of households that are private/other rented accommodation
percent of all household spaces which are terraced
percent of all household spaces which are detached
percent of all household spaces which are purpose built, converted and communal building flats
percent of occupied household spaces without central heating
average household size
average number of people per room
percent of people aged 16-74 with a higher education qualification
percent of people aged 16-74 in employment working in routine or semi-routine occupations
percent of households with 2 or more cars
percent of people aged 16-74 in employment who usually travel to work by public transport (3)
percent of people aged 16-74 in employment who work mainly from home
percentage of working age population with limiting long term illness (7)
percent of people who provide unpaid care (6)
percent of people aged 16-74 who are students (4)
percent of economically active people aged 16-74 who are unemployed
percentage of economically active people aged 16-74 who work part time (5)
percentage of economically inactive women aged 16-74 who are looking after the home
percent of all people aged 16-74 in employment working in agriculture and fishing
percent of all people aged 16-74 in employment working in mining, quarrying and construction
percent of all people aged 16-74 in employment working in manufacturing
percent of all people aged 16-74 in employment working in hotel and catering
percent of all people aged 16-74 in employment working in health and social work
percent of all people aged 16-74 in employment working in financial intermediation
percent of all people aged 16-74 in employment working in wholesale/retail trade
Importance of each census component
41 variables are used in the creation of OAC
Each has been carefully selected
But fewer variables make for better segmentation (up to a point)
What actually is the impact of each variable on the use of OAC?
Could the number of variables be reduced?
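One possible way to gauge the impact of a single variable is to drop it, reassign every record to its nearest cluster centre, and count how many records change cluster. The sketch below assumes that approach, with Euclidean distance and hypothetical data. It is an illustration of the question, not the method used in the talk.

```python
import math


def assign(record, centroids, skip=None):
    """Nearest centroid by Euclidean distance, optionally ignoring one variable."""
    def dist(a, b):
        return math.sqrt(
            sum((x - y) ** 2 for i, (x, y) in enumerate(zip(a, b)) if i != skip)
        )
    # Ties go to the first centroid in dictionary order.
    return min(centroids, key=lambda name: dist(record, centroids[name]))


def variable_impact(records, centroids, n_variables):
    """Share of records that change cluster when each variable is left out."""
    base = [assign(r, centroids) for r in records]
    impact = []
    for v in range(n_variables):
        changed = sum(
            assign(r, centroids, skip=v) != b for r, b in zip(records, base)
        )
        impact.append(changed / len(records))
    return impact


# Hypothetical example: the centroids differ only on variable 0.
centroids = {"A": (0.0, 0.0), "B": (10.0, 0.0)}
records = [(2.0, 5.0), (8.0, -3.0), (4.0, 9.0)]

impact = variable_impact(records, centroids, n_variables=2)
print(impact)
```

A variable whose removal moves few or no records, like variable 1 here, is a candidate for dropping. A variable whose removal reshuffles many records is doing real work in the classification.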
Figure: Asian % (1300 level)
Highly resolved OAC
The highest number of clusters in OAC is 52
A greater number would give greater discrimination
Creating OAC with many more clusters is a trivial process
But what do they mean?
Summary
Regionality
Bespoke combinations
Comparison of systems
Changing character over time
Resolution of ambiguity
Importance of each census variable
Highly resolved OAC
THE END