Public Attitudes to the Environment in Scotland - Technical Report

Chapter 6 - WEIGHTINGS

6.1 INTRODUCTION

This chapter explains the technicalities of how the various weights were calculated for this survey. The following chapter provides design factors for some key questions in the survey, and explains how to use these.

Weighting removes the bias in population totals that may be introduced by the differential sampling strategy and response rates that were part of the survey.

The final weights for use with the survey data are made up of three components. The first arises from the sample design: households and individuals were selected with unequal probabilities, because of the rural boost and because only one adult per household was selected for interview although some households contain more adults than others. The second corrects for internal evidence of differential response rates by households and individuals. The third is optional and applies only to individuals (not households): it uses post-stratification to match the sample to national population estimates. This final weight has been used in the analyses presented in the main report of the survey.

Three weights have been provided to the Scottish Executive in a separate Excel file:

  • hweight: a household weight for analyses at a household level

  • iweight1: an individual weight, not post-stratified to population totals

  • iweight2: an individual weight, post-stratified to match population totals

In each case the weights were normalised to sum to the total sample size of 4,119.

The weights were not affected by whether questionnaire A or B was administered. The effects of design and of non-response bias together resulted in rural areas being over-represented in the survey. Table 6.1 gives, by questionnaire type and rural/urban category, the base number of respondents, the sum of the household weights, and the base number as a percentage of the sum of weights. Where a percentage is above 100%, that type of respondent is over-represented in the survey, as was the case for small remote towns and for both rural categories.

Table 6.1
Total base numbers and sum of household weights by questionnaire type and rural/urban indicator.

Category | No. of respondents | Sum of household weights | Respondents as % of sum of weights
All households | 4119 | 4119 | 100%
Type of questionnaire
A | 1989 | 1990 | 100%
B | 2130 | 2131 | 100%
Urban/rural indicator
The primary cities | 1332 | 1616 | 82%
Other urban | 1019 | 1255 | 81%
Small accessible towns | 338 | 392 | 86%
Small remote towns | 286 | 130 | 221%
Accessible rural | 765 | 494 | 155%
Remote rural | 379 | 233 | 162%
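The final column of Table 6.1 can be reproduced approximately from the first two. The following is a minimal sketch in Python, using only the urban/rural figures printed in the table; the helper name `representation_pct` is an illustrative assumption and not part of the survey deliverables.

```python
# Respondent counts and sums of household weights, transcribed from Table 6.1.
table_6_1 = {
    "The primary cities": (1332, 1616),
    "Other urban": (1019, 1255),
    "Small accessible towns": (338, 392),
    "Small remote towns": (286, 130),
    "Accessible rural": (765, 494),
    "Remote rural": (379, 233),
}


def representation_pct(respondents, weight_sum):
    """Respondents as a percentage of the sum of weights (hypothetical helper)."""
    return 100.0 * respondents / weight_sum


for category, (respondents, weight_sum) in table_6_1.items():
    pct = representation_pct(respondents, weight_sum)
    # Values above 100% indicate over-representation in the achieved sample.
    status = "over-represented" if pct > 100 else "under-represented"
    print(f"{category:<24} {pct:6.1f}%  {status}")
```

Because the published sums of weights are rounded to whole numbers, the recomputed percentages may differ from the published column by about one percentage point.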

Individual weights were affected by the rural/urban indicator in a similar manner to the household weights. The individual weights showed that age and gender groups were differentially represented in the sample, as is shown in table 6.2. Older people and women are over-represented in the sample. The results are shown for the data before post-stratification to population totals. After post-stratification a similar pattern, but slightly more extreme, was found.

Table 6.2
Percentage of respondents to sum of individual weights (without post-stratification) by questionnaire type and age / gender.

Columns 2 to 4 refer to males and columns 5 to 7 to females.

Category | Males: no. of respondents | Males: sum of individual weights | Males: respondents as % of sum of weights | Females: no. of respondents | Females: sum of individual weights | Females: respondents as % of sum of weights
All households | 1729 | 1816 | 95% | 2390 | 2303 | 104%
Type of questionnaire
A | 830 | 875 | 95% | 1159 | 1102 | 105%
B | 899 | 941 | 96% | 1231 | 1201 | 102%
Age group
16-24 | 147 | 222 | 66% | 197 | 262 | 75%
25-34 | 236 | 265 | 89% | 404 | 386 | 105%
35-44 | 333 | 338 | 98% | 442 | 442 | 100%
45-54 | 293 | 329 | 89% | 346 | 401 | 86%
55-64 | 272 | 263 | 103% | 342 | 335 | 102%
65-74 | 284 | 256 | 111% | 321 | 265 | 121%
75+ | 164 | 143 | 115% | 338 | 212 | 160%

6.2 WEIGHTING FACTORS

Design weights

To recap on the sampling approach, the issued sample was stratified by the six-category rural/urban indicator, defined as the predominant category for each ED. The Postcode Address File (PAF) was stratified by this indicator and a sample of EDs was drawn; 8 addresses plus 2 substitute addresses were then selected per ED. The EDs were sampled with probability proportional to the number of addresses per ED, and the sample was balanced by Mosaic categorisation within each stratum. The sampling fractions differed between strata.

To make the issued sample representative of the PAF, each selected household needs to be given a weight inversely proportional to its selection probability. Data on the number of households in the PAF and in the issued sample were provided by the sampling agency. These data have been used to calculate the household design weights given in table 6.3. The weights have been normalised so that, summed over the issued sample, they equal the issued sample size.

Table 6.3
Household design weights to make sample representative of the PAF

Urban-rural indicator | Weight | Number in issued sample
Primary cities | 1.1851020 | 2660
Other urban | 1.1855619 | 2060
Small accessible town | 1.1669937 | 660
Small remote towns | 0.5057057 | 480
Accessible rural | 0.6411200 | 1440
Remote rural | 0.6702913 | 700
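The normalisation of the design weights can be checked directly from Table 6.3: each weight multiplied by the number of issued addresses in its stratum should sum to the issued sample size. The sketch below is illustrative only. The `normalise` helper and the raw weights passed to it in the usage note are assumptions, since the PAF counts needed to derive the actual weights are not reproduced in this chapter.

```python
# (design weight, number in issued sample), transcribed from Table 6.3.
design_weights = {
    "Primary cities": (1.1851020, 2660),
    "Other urban": (1.1855619, 2060),
    "Small accessible town": (1.1669937, 660),
    "Small remote towns": (0.5057057, 480),
    "Accessible rural": (0.6411200, 1440),
    "Remote rural": (0.6702913, 700),
}

issued_total = sum(n for _, n in design_weights.values())
weighted_total = sum(w * n for w, n in design_weights.values())
print(issued_total, round(weighted_total, 2))


def normalise(raw_weights, counts):
    """Rescale per-stratum weights so that sum(weight * count) == sum(count).

    Hypothetical helper: raw_weights would be inverse selection probabilities.
    """
    scale = sum(counts) / sum(w * n for w, n in zip(raw_weights, counts))
    return [w * scale for w in raw_weights]
```

For example, `normalise([2.0, 1.0], [10, 30])` rescales two made-up strata so that the weighted total equals the 40 issued addresses.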

The characteristics of the unweighted and the reweighted issued sample were then checked against the proportions in the PAF for two classifications:

  • Local Authority

  • Predominant MOSAIC classification

The results are given in tables 6.4 and 6.5.

Table 6.4
Percentages by Mosaic category of the ED in issued sample, PAF and weighted issued sample

Mosaic category for ED | Number in issued sample | Issued sample (%) | PAF (%) | Weighted sample (%)
Urban establishment | 930 | 11.6 | 12.3 | 12.4
Burdened borrowers | 800 | 10.0 | 10.6 | 10.5
Better off tenants | 1130 | 14.1 | 14.6 | 14.6
Industrial success | 570 | 7.1 | 7.3 | 7.3
Low rise council | 640 | 8.0 | 8.0 | 8.0
Council flats | 420 | 5.2 | 6.0 | 6.1
Low spending elders | 520 | 6.5 | 7.3 | 7.3
Hi-rise tenements | 430 | 5.4 | 6.3 | 6.3
Metro lifestyles | 690 | 8.6 | 10.1 | 10.0
White collar owners | 1340 | 16.8 | 13.0 | 13.2
Open countryside | 530 | 6.6 | 4.4 | 4.4

As expected, the re-weighting has made the sample match the sampling frame much better than the unweighted sample, and no further design re-weighting for these categories should be required.

Table 6.5
Percentages by local authority in issued sample, PAF and weighted issued sample

Local authority | Number in issued sample | Issued sample (%) | PAF (%) | Weighted sample (%)
Aberdeen | 340 | 4.2 | 4.5 | 4.8
Aberdeenshire | 390 | 4.9 | 4.1 | 3.6
Angus | 190 | 2.4 | 2.1 | 2.5
Argyll & Bute | 240 | 3.0 | 1.9 | 1.9
Clackmannanshire | 80 | 1.0 | 0.9 | 1.1
Dumfries & Galloway | 270 | 3.4 | 2.8 | 2.7
Dundee | 190 | 2.4 | 3.1 | 2.8
East Ayrshire | 120 | 1.5 | 2.2 | 1.6
East Dunbartonshire | 90 | 1.1 | 1.8 | 1.3
East Lothian | 150 | 1.9 | 1.7 | 1.5
East Renfrewshire | 90 | 1.1 | 1.5 | 1.3
Edinburgh | 650 | 8.1 | 9.4 | 9.5
Falkirk | 200 | 2.5 | 2.8 | 2.7
Fife | 550 | 6.9 | 6.7 | 6.8
Glasgow | 920 | 11.5 | 13.5 | 13.6
Highland | 480 | 6.0 | 4.1 | 4.3
Inverclyde | 140 | 1.8 | 1.7 | 2.1
Midlothian | 100 | 1.2 | 1.4 | 1.3
Moray | 160 | 2.0 | 1.6 | 1.7
North Ayrshire | 160 | 2.0 | 2.7 | 2.1
North Lanarkshire | 410 | 5.1 | 5.9 | 5.7
Orkney | 50 | 0.6 | 0.4 | 0.4
Perth & Kinross | 240 | 3.0 | 2.5 | 2.6
Renfrewshire | 290 | 3.6 | 3.6 | 4.2
Scottish Borders | 170 | 2.1 | 2.2 | 1.8
Shetland | 60 | 0.8 | 0.4 | 0.4
South Ayrshire | 220 | 2.8 | 2.2 | 2.5
South Lanarkshire | 430 | 5.4 | 5.6 | 5.7
Stirling | 130 | 1.6 | 1.5 | 1.5
West Dunbartonshire | 170 | 2.1 | 1.9 | 2.5
West Lothian | 250 | 3.1 | 2.8 | 3.1
Western Isles | 70 | 0.9 | 0.6 | 0.5

Non-response weights - household contact non-response

The issued sample consisted of 8,000 addresses, 10 per ED, of which two per ED were replacements to be used only if required to achieve a target of 8 valid addresses (replacing any properties that were empty, businesses, holiday homes, demolished, or not located).

Interviewers attempted to contact 6,743 addresses. Of these, 427 were not valid addresses for households. Thus an attempt was made to contact 6,316 households. Initial contact was made with 4,582 (72.5%) of these. After the initial contact, the selected person was identified and completed an interview in 4,119 households (89.9% of those where initial contact was made). This gives an overall response rate for valid addresses of 65.2%. These figures count as household non-responders the 246 cases where the SIS was not returned. This approach errs on the side of underestimating the response rates, as some of these unreturned SISs may not have been valid addresses.
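The response rates quoted above follow from the counts given in the text. A short sketch of the arithmetic is shown here; the variable names (for example `contact_rate` and `completion_rate`) are illustrative labels rather than terms used in the survey documentation.

```python
# Counts quoted in the text.
attempted = 6743      # addresses interviewers attempted to contact
invalid = 427         # not valid addresses for households
contacted = 4582      # households where initial contact was made
interviewed = 4119    # households with a completed interview

valid = attempted - invalid

contact_rate = 100.0 * contacted / valid           # household contact stage
completion_rate = 100.0 * interviewed / contacted  # person contact stage
overall_rate = 100.0 * interviewed / valid         # overall, valid addresses

print(valid)
print(f"{contact_rate:.1f}% {completion_rate:.1f}% {overall_rate:.1f}%")
```

The overall rate is the product of the two stage rates, which is why the two sets of non-response weights described in this section are later multiplied together.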

The influence of the following factors on household non-response was investigated by logistic regression: rural/urban indicator, unitary authority, mosaic classification and whether the household lived at an address with more than one household (multiple occupancy indicator). All of these had some influence on non-response, even after adjusting for the other factors. The rural/urban indicator had the greatest influence on non-response with much better response rates in rural areas. Local authority was also important, even after allowing for the rural/urban indicator.

Mosaic classification and the multiple occupancy indicator had smaller effects, but still significant even after adjusting for other factors. Interactions between these factors were investigated, but they were not found to be important. Details of simple response rates by each of these factors are in table 6.6.

Figure 6.1
Household non-response weights

A household non-response weight was calculated in proportion to the inverse of the response probability predicted by the logistic regression model, and normalised so that the sum of the weights equals the total sample size for the responders. It ranged from 0.76 to 1.64, and its distribution is shown in figure 6.1.
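The step from predicted response probabilities to normalised weights can be sketched as follows. This is a minimal illustration only: the probabilities are made-up values, not output from the survey's logistic regression model, and the function name `nonresponse_weights` is an assumption.

```python
def nonresponse_weights(predicted_probs):
    """Inverse-probability weights, rescaled to sum to the number of responders.

    predicted_probs: one predicted probability of response per responding
    household (hypothetical inputs; the fitted model is not reproduced here).
    """
    raw = [1.0 / p for p in predicted_probs]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


# Made-up probabilities for four responding households.
example_probs = [0.60, 0.70, 0.75, 0.85]
weights = nonresponse_weights(example_probs)
print([round(w, 3) for w in weights])
```

Households with a low predicted probability of response receive the largest weights, since each stands in for more non-responding households with similar characteristics.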

Table 6.6
Household contact response rates by different factors

By local authority (LA) area

LA area | Number of households | Response rate (%)
Aberdeen | 272 | 73.9
Aberdeenshire | 312 | 72.8
Angus | 151 | 65.6
Argyll & Bute | 182 | 68.7
Clackmannanshire | 64 | 50.0
Dumfries & Galloway | 208 | 72.6
Dundee | 144 | 67.4
East Ayrshire | 95 | 68.4
East Dunbartonshire | 72 | 86.1
East Lothian | 120 | 88.3
East Renfrewshire | 72 | 72.2
Edinburgh | 514 | 67.7
Falkirk | 160 | 79.4
Fife | 432 | 77.5
Glasgow | 733 | 68.1
Highland | 360 | 82.8
Inverclyde | 111 | 78.4
Midlothian | 78 | 82.1
Moray | 128 | 83.6
North Ayrshire | 126 | 87.3
North Lanarkshire | 328 | 68.0
Orkney | 38 | 81.6
Perth & Kinross | 189 | 69.3
Renfrewshire | 230 | 74.3
Scottish Borders | 134 | 78.4
Shetland | 48 | 89.6
South Ayrshire | 175 | 76.0
South Lanarkshire | 344 | 62.2
Stirling | 104 | 73.1
West Dunbartonshire | 136 | 69.1
West Lothian | 200 | 60.0
Western Isles | 56 | 87.5

By rural/urban indicator, Scottish Mosaic category and multiple occupancy

Category | Number of households | Response rate (%)
Rural/urban indicator
The primary cities | 2109 | 69.4
Other urban | 1643 | 70.4
Small accessible towns | 519 | 74.0
Small remote towns | 380 | 82.4
Accessible rural | 1135 | 74.4
Remote rural | 530 | 79.6
Scottish Mosaic
Urban establishment | 737 | 69.5
Burdened borrowers | 640 | 68.3
Better off tenants | 901 | 76.6
Industrial success | 451 | 73.4
Low rise council | 511 | 78.9
Council flats | 327 | 69.1
Low spending elders | 414 | 71.3
Hi-rise tenements | 343 | 67.9
Metro lifestyles | 541 | 67.8
White collar owners | 1055 | 75.4
Open countryside | 396 | 74.0
Multiple occupancy
Single household | 5980 | 72.4
Multiple households | 336 | 75.9

Non-response weights - person contact non-response

Of the households where an initial contact was made, a completed interview with the selected person was not obtained in a further 463 households. This was because the person refused or, in households with more than one person, because the selected person could not be contacted, or because only a partial interview could be obtained. The number of people in the household (known for all but 145 of the non-responders) was the most important predictor of person non-response. Mosaic category was also important in predicting person non-response, but other factors were not. Response rates appear somewhat lower in more affluent areas. These predictors were used to produce a person non-response weight, again inversely proportional to the response probability. This weight had a relatively small range, 0.93 to 1.12, when normalised to sum to the total responders.

Table 6.7
Person contact non-response

Category | Number of households | Response rate (%)
Persons in household
1 | 1716 | 96.7
2 | 2091 | 91.2
3 or more | 620 | 87.4
Mosaic code
Urban establishment | 512 | 90.2
Burdened borrowers | 437 | 88.3
Better off tenants | 690 | 88.6
Industrial success | 331 | 91.5
Low rise council | 403 | 88.3
Council flats | 226 | 90.7
Low spending elders | 295 | 87.1
Hi-rise tenements | 233 | 100.0*
Metro lifestyles | 367 | 91.0
White collar owners | 795 | 90.2
Open countryside | 293 | 87.0
Total | 4582 | 89.9

* This is not related to the number of single adult households in these areas: just 61% of these interviews were with single adult households and 35% were with 2 adult households.

Combined household non-response weight

The combined household non-response weight was the product of the household contact and person contact non-response weights. Its distribution was very similar to that of the household contact non-response weight shown in figure 6.1.

6.3 COMBINED WEIGHT FOR HOUSEHOLDS AND FOR INDIVIDUAL RESPONDENTS

The combined household weight is obtained by multiplying the design weight by the two household non-response weights. This gives a range of values from 0.40 to 1.81. This is the column 'hweight' on the Excel weighting spreadsheet provided. It has 1,088 distinct values, but only 123 if rounded to two decimal places. A rounded version (hweightr) is also provided on the spreadsheet.

To get a combined individual weight, which attempts to make the sample representative of the population in private households, this weight needs to be multiplied by the number of people in the household (table 6.8 below). For households of 5 or more people, the multiplier was capped at 5.

Table 6.8
Responding households by number of people in household

Persons in household | 1 | 2 | 3 | 4 | 5 | 6 | 7
Households | 1663 | 1913 | 355 | 154 | 26 | 7 | 1

The final individual weight (iweight1) was normalised to sum to the number of respondents. Its range is 0.21 to 4.19, and a histogram of its distribution for the responders is shown at figure 6.2. It has 1,176 distinct values, but the version rounded to the nearest 0.02 has only 166 distinct values. The rounded version is in column 'iweight1r' of the Excel weighting spreadsheet. Use of rounded rather than exact weights makes almost no difference to the tables or the analysis.
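The construction of the individual weight from the household weight can be sketched as below. The household weights and household sizes are made-up values for illustration, and the function name `individual_weights` is an assumption; only the rule itself (multiply by household size capped at 5, then normalise) comes from the text.

```python
def individual_weights(household_weights, persons_in_household, cap=5):
    """Household weight times household size (capped), rescaled to sum to n.

    Hypothetical helper illustrating the rule described in section 6.3.
    """
    raw = [
        hw * min(persons, cap)
        for hw, persons in zip(household_weights, persons_in_household)
    ]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


# Made-up example: four responding households.
hweight = [1.0, 1.0, 0.5, 1.5]
persons = [1, 2, 7, 3]   # the 7-person household is treated as 5

iweight = individual_weights(hweight, persons)
print([round(w, 2) for w in iweight])
```

The multiplication compensates for the fact that a person in a larger household had a smaller chance of being the one adult selected for interview.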

Figure 6.2
Combined individual weight

6.4 AGE/SEX AND LOCAL AUTHORITY DISTRIBUTION OF RESPONDENTS COMPARED TO GRO POPULATION ESTIMATES.

The age and sex distribution of the respondents was compared to the GRO mid-year population estimates by age and sex (footnote 6). The respondents were grouped into 5-year age bands, and table 6.9 presents figures for the respondents both unweighted and weighted by the combined individual weight (iweight1).

Compared with the GRO (Scotland) figures, females are over-represented in the unweighted sample, while young people, especially young men, are under-represented. The weighted sample goes some way to correct this imbalance. The weighting helps in several ways: most obviously because younger people tend to live in larger households, but also because both the design weights and the non-response weights give lower weights to rural areas, where the proportion of young people tends to be slightly lower. But a difference from the GRO(S) figures still persists. This difference is very similar to an equivalent difference found in the Scottish Household Survey.

Table 6.9
Age and sex distribution of respondents compared with the GRO(S) mid year population estimates (2000)* by age and sex.

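Columns 2 to 4 refer to males and columns 5 to 7 to females.

Age group | Males: unweighted survey | Males: weighted survey | Males: GRO population estimates | Females: unweighted survey | Females: weighted survey | Females: GRO population estimates
All ages | 42.0 | 44.1 | 47.9 | 58.0 | 55.9 | 52.1
16-19 | 1.3 | 2.3 | 3.2 | 1.7 | 2.5 | 3.1
20-24 | 2.3 | 3.0 | 4.0 | 3.1 | 3.9 | 3.8
25-29 | 2.6 | 3.0 | 4.2 | 4.0 | 4.1 | 4.1
30-34 | 3.1 | 3.4 | 4.9 | 5.8 | 5.3 | 4.9
35-39 | 4.2 | 4.2 | 5.0 | 5.9 | 5.6 | 5.1
40-44 | 3.9 | 4.0 | 4.5 | 4.9 | 5.1 | 4.6
45-49 | 3.7 | 4.2 | 4.0 | 4.2 | 4.9 | 4.1
50-54 | 3.4 | 3.7 | 4.1 | 4.2 | 4.8 | 4.2
55-59 | 3.4 | 3.2 | 3.3 | 3.9 | 4.0 | 3.5
60-64 | 3.3 | 3.2 | 3.0 | 4.4 | 4.1 | 3.3
65-69 | 3.4 | 3.1 | 2.6 | 3.8 | 3.3 | 3.1
70-74 | 3.5 | 3.1 | 2.2 | 4.0 | 3.2 | 2.8
75-79 | 1.8 | 1.7 | 1.6 | 3.5 | 2.5 | 2.4
80 and over | 2.2 | 1.8 | 1.4 | 4.7 | 2.7 | 3.1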

Note 1: Entries in the table are percentages of the total estimated population, or of the total number of respondents (weighted or unweighted).

Note 2 (*): The 2000 mid-year population estimates are used as these are the only ones easily available with single-year breakdowns.

The data in table 6.9 can be used to calculate a final individual weight that will make the weighted sample match the GRO mid year population. The factors to be multiplied into the existing weights to achieve this are given in table 6.10. Notice that the weights imply that women are over-represented in the sample at all but the youngest and the very oldest age groups. The pattern for men is different, with under-representation up to age 60 and then over-representation at the oldest age groups, compared with the GRO(S) population.

A final individual weight was obtained by multiplying iweight1 by this factor and then normalising so that the final individual weight (iweight2) sums to the total responders. The weight iweight2 has a range from 0.167 to 5.78, and a histogram is provided at figure 6.3. A version rounded to the nearest 0.02 is also provided on the Excel weighting spreadsheet (iweight2r).
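The post-stratification step can be sketched as follows. The factor for each age and sex cell is taken here to be the GRO(S) population share divided by the weighted survey share, using three cells from table 6.9; this ratio is an assumption about how the factors were derived, although it reproduces the published values in table 6.10 to within rounding. The function name `post_stratify` and the example respondents are hypothetical.

```python
# Percentages from table 6.9 for three example cells: (sex, age group).
weighted_share = {("M", "16-19"): 2.3, ("F", "16-19"): 2.5, ("M", "70-74"): 3.1}
gro_share = {("M", "16-19"): 3.2, ("F", "16-19"): 3.1, ("M", "70-74"): 2.2}

# Assumed derivation: population share divided by weighted sample share.
factors = {cell: gro_share[cell] / weighted_share[cell] for cell in gro_share}


def post_stratify(iweight1, cells, cell_factors):
    """Multiply each weight by its cell factor, then rescale to sum to n."""
    raw = [w * cell_factors[c] for w, c in zip(iweight1, cells)]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


# Made-up respondents, one per example cell.
example_cells = [("M", "16-19"), ("F", "16-19"), ("M", "70-74")]
iweight2 = post_stratify([1.2, 0.9, 0.9], example_cells, factors)
print({cell: round(f, 2) for cell, f in factors.items()})
```

Because table 6.9 reports percentages to one decimal place, the recomputed factors differ slightly from those in table 6.10, which were presumably calculated from unrounded figures.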

Table 6.10
Factor required to be applied to the weights to make the sample representative of the GRO(S) population estimates.

Age group | Males | Females
16-19 | 1.38 | 1.25
20-24 | 1.31 | 0.99
25-29 | 1.43 | 1.01
30-34 | 1.45 | 0.94
35-39 | 1.20 | 0.91
40-44 | 1.15 | 0.91
45-49 | 0.95 | 0.84
50-54 | 1.11 | 0.88
55-59 | 1.04 | 0.87
60-64 | 0.94 | 0.82
65-69 | 0.86 | 0.95
70-74 | 0.70 | 0.90
75-79 | 0.96 | 0.98
80 and over | 0.77 | 1.17

Figure 6.3
Final weight adjusted to GRO(S) population estimates

Note that this final stage of weighting was not carried out for the Scottish Household Survey, so any comparisons with that survey should be made using the weight iweight1.

As a final check the local authority population estimates provided by GRO(S) were checked against the responders, weighted according to the final individual weight (iweight2). The two sets of percentages were found to be in very good agreement with each other.

6.5 SOME CAUTIONS ABOUT WEIGHTING

This chapter has provided three different weights that could be used in conjunction with this survey. This section will discuss when they should be used, and also in what circumstances a completely unweighted analysis may be more appropriate.

The purpose of weights is to make the results of a survey applicable to the whole population of an area or country. Here we have two kinds of weights. The first is a household weight and, as such, is likely to be applicable to only a minority of the analyses to be carried out here. In addition, two weights are supplied that could be used for individual analyses. The first adjusts for design factors and for internal evidence of non-response by area and size of household. The second adjusts for these factors and also post-stratifies to national population estimates. The terminology for types of weight used here is taken from Lessler and Kalsbeek (1992) (footnote 7). The household weight and the first individual weight incorporate design weights and non-response weights; the second individual weight (iweight2) also includes a post-stratification weight.

Weighting for non-response and for the design has a cost in reducing precision. Non-response weighting is based on strong assumptions. It implicitly assumes that the responses of the non-responders (had we been able to get them) would be the same as those for responders with the same measured characteristics. This is a strong assumption, but one that is the norm in survey research. It may also be one that is worth making when the purpose of analysis is to obtain national population estimates or trends. Any inferences from such analyses will be to make statements about (e.g.) the proportion of the Scottish population who hold a particular view.
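One common way to gauge the loss of precision from unequal weights is Kish's approximate effective sample size, the squared sum of the weights divided by the sum of the squared weights. The sketch below illustrates that approximation with made-up weights; it is not necessarily the method used to calculate the design factors reported in the next chapter.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Made-up weights: equal weights lose nothing, unequal weights lose precision.
equal = [1.0, 1.0, 1.0, 1.0]
unequal = [0.5, 0.5, 1.5, 1.5]

print(kish_effective_sample_size(equal))
print(kish_effective_sample_size(unequal))
```

The more variable the weights, the smaller the effective sample size relative to the actual number of respondents.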

Not all inferences are of this type, however. We may be interested in comparing the views of urban and rural residents on a certain question. Here the interest may simply be in a comparison, and the inference might equally well (and with fewer assumptions) be made for a population consisting of the cross-section of the Scottish population included (by design and response) in the survey and willing to answer the questions. For this analysis unweighted calculations could be appropriate and, indeed, may sometimes give more precise answers to the questions posed. Simpler analyses will allow exploratory work to be carried out more easily and are not necessarily invalid. For these reasons, design effects are calculated in the next chapter for both weighted and unweighted analyses.

Page updated: Monday, June 27, 2005