PUBLIC ATTITUDES TO THE ENVIRONMENT IN SCOTLAND - TECHNICAL REPORT
Chapter 6 - WEIGHTINGS
6.1 INTRODUCTION
This chapter explains the technicalities of how the various weights were calculated for this survey. The following chapter provides design factors for some key questions in the survey, and explains how to use these.
Weighting removes the bias in population totals that may be introduced by the differential sampling strategy and response rates that were part of the survey.
The final weights for use with the survey data are made up of three different components. The first is from the sample design because the households and individuals in the survey were selected with unequal probabilities - due to the rural boost, and the fact that some households contain more adults than others, but only one adult per household was selected for interview. The second corrects for internal evidence of differential response rates by households and individuals. The third is optional and applies only to individuals (not households). It uses post-stratification to match the sample to national population estimates. This final weight has been used in analyses presented in the main report of the survey.
Three weights have been provided to the Scottish Executive in a separate Excel file.
In each case the weights were normalised to sum to the total sample size of 4,119.
The weights were not affected by whether questionnaire A or B was administered. The effects of design and of non-response bias together resulted in rural areas being over-represented in the survey. Table 6.1 gives the base number of respondents, the sum of the household weights, and the base number as a percentage of the sum of weights, by questionnaire type and rural/urban category. Where the percentage is above 100%, that type of respondent is over-represented in the survey, as was the case for small remote towns and for both rural categories.
Table 6.1
Total base numbers and sum of household weights by questionnaire type and rural/urban indicator.
| No. of respondents | Sum of household weights | % of respondents to sum of weights |
All households | 4119 | 4119 | 100% |
Type of questionnaire | | | |
A | 1989 | 1990 | 100% |
B | 2130 | 2131 | 100% |
Urban/rural indicator | |
The primary cities | 1332 | 1616 | 82% |
Other urban | 1019 | 1255 | 81% |
Small accessible towns | 338 | 392 | 86% |
Small remote towns | 286 | 130 | 221% |
Accessible rural | 765 | 494 | 155% |
Remote rural | 379 | 233 | 162% |
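The final column of table 6.1 is the unweighted base divided by the sum of weights. The following is a minimal Python sketch of that arithmetic, using the rounded figures published in the table. The function and variable names are illustrative only and are not part of the survey files. Because the published sums of weights are rounded, a recomputed percentage can differ from the table by about a point.

```python
# Unweighted respondents as a percentage of the sum of household weights.
# Figures are the rounded values published in table 6.1.
table_6_1 = {
    "The primary cities": (1332, 1616),
    "Other urban": (1019, 1255),
    "Small accessible towns": (338, 392),
    "Small remote towns": (286, 130),
    "Accessible rural": (765, 494),
    "Remote rural": (379, 233),
}


def representation_pct(respondents, sum_of_weights):
    """Values above 100 mean the group is over-represented before weighting."""
    return 100.0 * respondents / sum_of_weights


ratios = {area: representation_pct(n, w) for area, (n, w) in table_6_1.items()}

for area, pct in ratios.items():
    flag = "over" if pct > 100 else "under"
    print(f"{area:<24}{pct:6.1f}%  ({flag}-represented)")
```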
Individual weights were affected by the rural/urban indicator in a similar manner to the household weights. The individual weights showed that age and gender groups were differentially represented in the sample, as is shown in table 6.2. Older people and women are over-represented in the sample. The results are shown for the data before post-stratification to population totals. After post-stratification a similar pattern, but slightly more extreme, was found.
Table 6.2
Percentage of respondents to sum of individual weights (without post-stratification) by questionnaire type and age/gender.
| Males | Females |
| No of respondents | Sum of individual weights | % of respondents to sum of weights | No of respondents | Sum of individual weights | % of respondents to sum of weights |
All households | 1729 | 1816 | 95% | 2390 | 2303 | 104% |
Type of questionnaire | |
A | 830 | 875 | 95% | 1159 | 1102 | 105% |
B | 899 | 941 | 96% | 1231 | 1201 | 102% |
Age group | |
16-24 | 147 | 222 | 66% | 197 | 262 | 75% |
25-34 | 236 | 265 | 89% | 404 | 386 | 105% |
35-44 | 333 | 338 | 98% | 442 | 442 | 100% |
45-54 | 293 | 329 | 89% | 346 | 401 | 86% |
55-64 | 272 | 263 | 103% | 342 | 335 | 102% |
65-74 | 284 | 256 | 111% | 321 | 265 | 121% |
75+ | 164 | 143 | 115% | 338 | 212 | 160% |
6.2 Weighting factors
Design weights
To recap on the sampling approach, the issued sample was stratified by the six-category rural/urban indicator, defined as the predominant one for each ED. The Postcode Address File (PAF) was stratified by this indicator and a sample of EDs was drawn; 8 addresses plus 2 substitute addresses were then selected per ED. The EDs were sampled with probability proportional to the number of addresses per ED, and the sample was balanced by Mosaic categorisation within each stratum. The sampling fractions differed between strata.
To make the issued sample representative of the PAF each selected household needs to be given a weight inversely proportional to its selection probability. Data on the number of households in the PAF and in the issued sample were provided by the sampling agency. These data have been used to calculate household design weights which are given in table 6.3. The weights have been normalised so that their sum adds to the total in the sample.
Table 6.3
Household design weights to make sample representative of the PAF
Urban-rural indicator | Weight | Number in issued sample |
Primary cities | 1.1851020 | 2660 |
Other urban | 1.1855619 | 2060 |
Small accessible town | 1.1669937 | 660 |
Small remote towns | 0.5057057 | 480 |
Accessible rural | 0.6411200 | 1440 |
Remote rural | 0.6702913 | 700 |
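The normalisation described above can be checked directly from table 6.3: multiplying each stratum weight by its issued sample count and summing should return the issued total. The short Python sketch below does this with the published figures, and also compares each stratum's share of the issued sample with its weighted share. The variable names are illustrative. The PAF household counts themselves are not reproduced in this chapter, so the sketch does not attempt to recompute the weights from selection probabilities.

```python
# Published design weights and issued-sample counts from table 6.3.
design = {
    "Primary cities": (1.1851020, 2660),
    "Other urban": (1.1855619, 2060),
    "Small accessible town": (1.1669937, 660),
    "Small remote towns": (0.5057057, 480),
    "Accessible rural": (0.6411200, 1440),
    "Remote rural": (0.6702913, 700),
}

issued_total = sum(n for _, n in design.values())
weighted_total = sum(w * n for w, n in design.values())

# Share of each stratum before and after applying the design weight.
issued_share = {s: 100.0 * n / issued_total for s, (_, n) in design.items()}
weighted_share = {s: 100.0 * w * n / weighted_total for s, (w, n) in design.items()}

print(f"issued total = {issued_total}, weighted total = {weighted_total:.2f}")
for s in design:
    print(f"{s:<24}{issued_share[s]:6.1f}% issued  {weighted_share[s]:6.1f}% weighted")
```

Strata with weights above 1 (the urban strata) gain share after weighting, which reverses the rural boost in the issued sample.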
The characteristics of the unweighted and the reweighted issued sample were then checked against the proportions in the PAF for two classifications: the Mosaic category of the ED and the local authority. The results are given in tables 6.4 and 6.5.
Table 6.4
Percentages by Mosaic category of the ED in issued sample, PAF and weighted issued sample
Mosaic category for ED | Number in issued sample | Issued sample (%) | PAF (%) | Weighted sample (%) |
Urban establishment | 930 | 11.6 | 12.3 | 12.4 |
Burdened borrowers | 800 | 10 | 10.6 | 10.5 |
Better off tenants | 1130 | 14.1 | 14.6 | 14.6 |
Industrial success | 570 | 7.1 | 7.3 | 7.3 |
Low rise council | 640 | 8.0 | 8.0 | 8.0 |
Council flats | 420 | 5.2 | 6.0 | 6.1 |
Low spending elders | 520 | 6.5 | 7.3 | 7.3 |
Hi-rise tenements | 430 | 5.4 | 6.3 | 6.3 |
Metro lifestyles | 690 | 8.6 | 10.1 | 10.0 |
White collar owners | 1340 | 16.8 | 13.0 | 13.2 |
Open countryside | 530 | 6.6 | 4.4 | 4.4 |
As expected, the re-weighting has made the sample match the sampling frame much better than the unweighted sample, and no further design re-weighting for these categories should be required.
Table 6.5
Percentages by local authority in issued sample, PAF and weighted issued sample
| Number in issued sample | Issued sample (%) | PAF (%) | Weighted sample (%) |
Aberdeen | 340 | 4.2 | 4.5 | 4.8 |
Aberdeenshire | 390 | 4.9 | 4.1 | 3.6 |
Angus | 190 | 2.4 | 2.1 | 2.5 |
Argyll & Bute | 240 | 3 | 1.9 | 1.9 |
Clackmannanshire | 80 | 1 | 0.9 | 1.1 |
Dumfries & Galloway | 270 | 3.4 | 2.8 | 2.7 |
Dundee | 190 | 2.4 | 3.1 | 2.8 |
East Ayrshire | 120 | 1.5 | 2.2 | 1.6 |
East Dunbartonshire | 90 | 1.1 | 1.8 | 1.3 |
East Lothian | 150 | 1.9 | 1.7 | 1.5 |
East Renfrewshire | 90 | 1.1 | 1.5 | 1.3 |
Edinburgh | 650 | 8.1 | 9.4 | 9.5 |
Falkirk | 200 | 2.5 | 2.8 | 2.7 |
Fife | 550 | 6.9 | 6.7 | 6.8 |
Glasgow | 920 | 11.5 | 13.5 | 13.6 |
Highland | 480 | 6 | 4.1 | 4.3 |
Inverclyde | 140 | 1.8 | 1.7 | 2.1 |
Midlothian | 100 | 1.2 | 1.4 | 1.3 |
Moray | 160 | 2 | 1.6 | 1.7 |
North Ayrshire | 160 | 2 | 2.7 | 2.1 |
North Lanarkshire | 410 | 5.1 | 5.9 | 5.7 |
Orkney | 50 | 0.6 | 0.4 | 0.4 |
Perth & Kinross | 240 | 3 | 2.5 | 2.6 |
Renfrewshire | 290 | 3.6 | 3.6 | 4.2 |
Scottish Borders | 170 | 2.1 | 2.2 | 1.8 |
Shetland | 60 | 0.8 | 0.4 | 0.4 |
South Ayrshire | 220 | 2.8 | 2.2 | 2.5 |
South Lanarkshire | 430 | 5.4 | 5.6 | 5.7 |
Stirling | 130 | 1.6 | 1.5 | 1.5 |
West Dunbartonshire | 170 | 2.1 | 1.9 | 2.5 |
West Lothian | 250 | 3.1 | 2.8 | 3.1 |
Western Isles | 70 | 0.9 | 0.6 | 0.5 |
Non-response weights - household contact non-response
The sampling frame consisted of 8000 households, 10 per ED, of which two were replacements to be used only if required to achieve a target of 8 valid addresses (replacing any properties that were empty, businesses, holiday homes, demolished, or not located).
Interviewers attempted to contact 6,743 addresses. Of these, 427 were not valid addresses for households. Thus an attempt was made to contact 6,316 households. Initial contact was made with 4,582 (72.5%) of these. After the initial contact, the selected person was identified and completed an interview in 4,119 households (89.9% of those where initial contact was made). This gives an overall response rate for valid addresses of 65.2%. These figures count as household non-responders the 246 cases where the SIS was not returned. This approach errs on the side of underestimating the response rates, as some of these unreturned SISs may not have been valid addresses.
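The response rates quoted above follow from the four counts given in the text. As a quick check, the arithmetic can be written out in Python as below (the variable names are illustrative only). The overall rate is the product of the two stage rates.

```python
# Counts quoted in the text.
attempted = 6743    # addresses interviewers attempted to contact
invalid = 427       # not valid household addresses
contacted = 4582    # households where initial contact was made
interviewed = 4119  # households with a completed interview

valid = attempted - invalid

contact_rate = 100.0 * contacted / valid          # household contact stage
interview_rate = 100.0 * interviewed / contacted  # person stage
overall_rate = 100.0 * interviewed / valid        # both stages combined

print(f"valid addresses : {valid}")
print(f"contact rate    : {contact_rate:.1f}%")
print(f"interview rate  : {interview_rate:.1f}%")
print(f"overall rate    : {overall_rate:.1f}%")
```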
The influence of the following factors on household non-response was investigated by logistic regression: rural/urban indicator, unitary authority, mosaic classification and whether the household lived at an address with more than one household (multiple occupancy indicator). All of these had some influence on non-response, even after adjusting for the other factors. The rural/urban indicator had the greatest influence on non-response with much better response rates in rural areas. Local authority was also important, even after allowing for the rural/urban indicator.
Mosaic classification and the multiple occupancy indicator had smaller effects, but still significant even after adjusting for other factors. Interactions between these factors were investigated, but they were not found to be important. Details of simple response rates by each of these factors are in table 6.6.
Figure 6.1
Household non-response weights

A household non-response weight was calculated in proportion to the inverse of the predicted probability of response from the logistic regression model, normalised to make the sum of the weights equal to the total sample size for the responders. It ranged from 0.76 to 1.64 and its distribution is shown in figure 6.1.
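The step from predicted probabilities to weights can be sketched as follows. This is a minimal illustration only: the predicted probabilities are hypothetical values invented for the example, not output from the model fitted for this survey, and the function name is an assumption.

```python
def nonresponse_weights(predicted_probs):
    """Inverse-probability weights, scaled to sum to the number of responders."""
    for p in predicted_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("predicted probabilities must lie in (0, 1]")
    raw = [1.0 / p for p in predicted_probs]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


# Hypothetical predicted response probabilities for five responding households.
probs = [0.60, 0.70, 0.72, 0.80, 0.88]
weights = nonresponse_weights(probs)

for p, w in zip(probs, weights):
    print(f"p = {p:.2f}  ->  weight = {w:.3f}")
```

Households of a type that was less likely to respond receive the larger weights, so that the responders stand in for similar households that did not respond.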
Table 6.6
Household contact response rates by different factors
| Number of households | Response rate (%) | LA area | Number of households | Response rate (%) |
Rural/Urban Indicator | | | Aberdeen | 272 | 73.9 |
The primary cities | 2109 | 69.4 | Aberdeenshire | 312 | 72.8 |
Other urban | 1643 | 70.4 | Angus | 151 | 65.6 |
Small accessible towns | 519 | 74.0 | Argyll & Bute | 182 | 68.7 |
Small remote towns | 380 | 82.4 | Clackmannanshire | 64 | 50.0 |
Accessible rural | 1135 | 74.4 | Dumfries & Galloway | 208 | 72.6 |
Remote rural | 530 | 79.6 | Dundee | 144 | 67.4 |
Scottish Mosaic | | | East Ayrshire | 95 | 68.4 |
Urban establishment | 737 | 69.5 | East Dunbartonshire | 72 | 86.1 |
Burdened borrowers | 640 | 68.3 | East Lothian | 120 | 88.3 |
Better off tenants | 901 | 76.6 | East Renfrewshire | 72 | 72.2 |
Industrial success | 451 | 73.4 | Edinburgh | 514 | 67.7 |
Low rise council | 511 | 78.9 | Falkirk | 160 | 79.4 |
Council flats | 327 | 69.1 | Fife | 432 | 77.5 |
Low spending elders | 414 | 71.3 | Glasgow | 733 | 68.1 |
Hi-rise tenements | 343 | 67.9 | Highland | 360 | 82.8 |
Metro lifestyles | 541 | 67.8 | Inverclyde | 111 | 78.4 |
White collar owners | 1055 | 75.4 | Midlothian | 78 | 82.1 |
Open countryside | 396 | 74.0 | Moray | 128 | 83.6 |
Multiple occupancy | | | North Ayrshire | 126 | 87.3 |
Single household | 5980 | 72.4 | North Lanarkshire | 328 | 68.0 |
Multiple households | 336 | 75.9 | Orkney | 38 | 81.6 |
| | | Perth & Kinross | 189 | 69.3 |
| | | Renfrewshire | 230 | 74.3 |
| | | Scottish Borders | 134 | 78.4 |
| | | Shetland | 48 | 89.6 |
| | | South Ayrshire | 175 | 76.0 |
| | | South Lanarkshire | 344 | 62.2 |
| | | Stirling | 104 | 73.1 |
| | | West Dunbartonshire | 136 | 69.1 |
| | | West Lothian | 200 | 60.0 |
| | | Western Isles | 56 | 87.5 |
Non-response weights, person contact non-response
Of the households where an initial contact was made, a completed interview with the selected person was not obtained in a further 463 households. This was because the person refused or, in households with more than one person, because the selected person could not be contacted, or because only a partial interview could be obtained. The number of people in the household (known for all but 145 of the non-responders) was the most important predictor of person non-response. Mosaic category was also important in predicting person non-response, but other factors were not. Response rates appear somewhat lower in more affluent areas. These factors were used to produce a person non-response weight, again inversely proportional to the response probability. This weight had a relatively small range, 0.93 to 1.12, when normalised to sum to the total responders.
Table 6.7
Person contact non-response
| Number of households | Response rate (%) |
Persons in household |
1 | 1716 | 96.7 |
2 | 2091 | 91.2 |
3 or more | 620 | 87.4 |
Mosaic code |
Urban establishment | 512 | 90.2 |
Burdened borrowers | 437 | 88.3 |
Better off tenants | 690 | 88.6 |
Industrial success | 331 | 91.5 |
Low rise council | 403 | 88.3 |
Council flats | 226 | 90.7 |
Low spending elders | 295 | 87.1 |
Hi-rise tenements | 233 | 100.0* |
Metro lifestyles | 367 | 91.0 |
White collar owners | 795 | 90.2 |
Open countryside | 293 | 87.0 |
Total | 4582 | 89.9 |
* This 100% response rate is not explained by the number of single-adult households in these areas: just 61% of these interviews were with single-adult households and 35% were with two-adult households.
Combined household non-response weight
The combined household non-response weight was the product of the household and person non-response weights, and had a distribution very similar to that of the first of these, shown in figure 6.1.
6.3 COMBINED WEIGHT FOR HOUSEHOLDS AND FOR INDIVIDUAL RESPONDENTS
The combined household weight is obtained by multiplying the design weight by the two household non-response weights. This gives a range of values from 0.40 to 1.81. This is the column 'hweight' on the Excel weighting spreadsheet provided. It has 1,088 distinct values, but only 123 if rounded to two decimal places. A rounded version (hweightr) is also provided on the spreadsheet.
To get a combined individual weight, which will attempt to make the sample representative of the population in private households, this weight needs to be multiplied by the number of people in the household (table 6.8 below). Households of 5 or more were given a weight of 5.
Table 6.8
Responding households by number of people in household
Persons in household | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
Households | 1663 | 1913 | 355 | 154 | 26 | 7 | 1 |
The final individual weight (iweight1) was normalised to sum to the number of respondents. Its range is 0.21 to 4.19 and a histogram of its distribution for the responders is shown in figure 6.2. It has 1,176 distinct values, but the version rounded to the nearest 0.02 has only 166 distinct values. The rounded version is in column 'iweight1r' of the Excel weighting spreadsheet. Use of rounded rather than exact weights makes almost no difference to the tables or the analysis.
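The construction of the individual weight can be sketched in a few lines of Python. The household weights and household sizes below are hypothetical, and the function names are illustrative. The sketch reads "households of 5 or more were given a weight of 5" as capping the household-size multiplier at 5, which is an interpretation of the text rather than a documented formula.

```python
def individual_weights(hweights, persons, cap=5):
    """Household weight times capped household size, scaled to sum to n."""
    raw = [h * min(k, cap) for h, k in zip(hweights, persons)]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


def round_to_step(weight, step=0.02):
    """Round a weight to the nearest multiple of `step` (0.02 in the report)."""
    return round(round(weight / step) * step, 2)


# Hypothetical combined household weights and household sizes.
hweight = [0.40, 0.95, 1.10, 1.81]
persons = [1, 2, 3, 7]

iweight1 = individual_weights(hweight, persons)
iweight1r = [round_to_step(w) for w in iweight1]

for h, k, w, wr in zip(hweight, persons, iweight1, iweight1r):
    print(f"hweight = {h:.2f}, persons = {k}  ->  iweight1 = {w:.4f} (rounded {wr:.2f})")
```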
Figure 6.2
Combined individual weight

6.4 AGE/SEX AND LOCAL AUTHORITY DISTRIBUTION OF RESPONDENTS COMPARED TO GRO POPULATION ESTIMATES.
The age and sex distribution of the respondents was compared to the GRO mid-year population estimates by age and sex [6]. The respondents were grouped into 5-year age bands, and figures are presented in table 6.9 for the respondents, both unweighted and weighted by the combined individual weight (iweight1).
Compared with the GRO (Scotland) figures, females are over-represented in the unweighted sample, while young people, especially young men, are under-represented. The weighted sample goes some way to correct this imbalance. The weighting helps in several ways: most obviously because younger people tend to live in larger households, but also because both the design weights and the non-response weights give lower weights to rural areas, where the proportion of young people tends to be slightly lower. But a difference from the GRO(S) figures still persists. This difference is very similar to an equivalent difference found in the Scottish Household Survey.
Table 6.9
Age and sex distribution of respondents compared with the GRO(S) mid year population estimates (2000)* by age and sex.
| Males | | Females |
Unweighted survey | Weighted survey | GRO population estimates | Unweighted survey | Weighted survey | GRO population estimates |
All ages | 42.0 | 44.1 | 47.9 | 58.0 | 55.9 | 52.1 |
| |
16-19 | 1.3 | 2.3 | 3.2 | 1.7 | 2.5 | 3.1 |
20-24 | 2.3 | 3.0 | 4.0 | 3.1 | 3.9 | 3.8 |
25-29 | 2.6 | 3.0 | 4.2 | 4.0 | 4.1 | 4.1 |
30-34 | 3.1 | 3.4 | 4.9 | 5.8 | 5.3 | 4.9 |
35-39 | 4.2 | 4.2 | 5 | 5.9 | 5.6 | 5.1 |
40-44 | 3.9 | 4.0 | 4.5 | 4.9 | 5.1 | 4.6 |
45-49 | 3.7 | 4.2 | 4.0 | 4.2 | 4.9 | 4.1 |
50-54 | 3.4 | 3.7 | 4.1 | 4.2 | 4.8 | 4.2 |
55-59 | 3.4 | 3.2 | 3.3 | 3.9 | 4.0 | 3.5 |
60-64 | 3.3 | 3.2 | 3 | 4.4 | 4.1 | 3.3 |
65-69 | 3.4 | 3.1 | 2.6 | 3.8 | 3.3 | 3.1 |
70-74 | 3.5 | 3.1 | 2.2 | 4.0 | 3.2 | 2.8 |
75-79 | 1.8 | 1.7 | 1.6 | 3.5 | 2.5 | 2.4 |
80 and over | 2.2 | 1.8 | 1.4 | 4.7 | 2.7 | 3.1 |
Note 1 Entries in the table are percentages of the total estimated population, or of the total number of respondents (weighted or unweighted).
Note 2 Mid-year population estimates for 2000 are used as these are the only ones easily available with single-year breakdowns.
The data in table 6.9 can be used to calculate a final individual weight that will make the weighted sample match the GRO mid year population. The factors to be multiplied into the existing weights to achieve this are given in table 6.10. Notice that the weights imply that women are over-represented in the sample at all but the youngest and the very oldest age groups. The pattern for men is different, with under-representation up to age 60 and then over-representation at the oldest age groups, compared with the GRO(S) population.
A final individual weight was obtained by multiplying iweight1 by this factor and then normalising so that the final individual weight (iweight2) sums to the total responders. The weight (iweight2) has a range from 0.167 to 5.78 and a histogram is provided at figure 6.3. A version of this rounded to the nearest 0.02 is also provided on the Excel weighting spreadsheet (iweight2r).
Table 6.10
Factor required to be applied to the weights to make the sample representative of the GRO(S) population estimates.
Age group | Males | Females |
16-19 | 1.38 | 1.25 |
20-24 | 1.31 | 0.99 |
25-29 | 1.43 | 1.01 |
30-34 | 1.45 | 0.94 |
35-39 | 1.20 | 0.91 |
40-44 | 1.15 | 0.91 |
45-49 | 0.95 | 0.84 |
50-54 | 1.11 | 0.88 |
55-59 | 1.04 | 0.87 |
60-64 | 0.94 | 0.82 |
65-69 | 0.86 | 0.95 |
70-74 | 0.70 | 0.90 |
75-79 | 0.96 | 0.98 |
80 and over | 0.77 | 1.17 |
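The factors in table 6.10 appear to be the ratio of the GRO(S) percentage to the weighted survey percentage for each age and sex cell of table 6.9. The Python sketch below recomputes a few of them from the rounded percentages in table 6.9 and then applies them to a handful of hypothetical respondents. Because table 6.9 is rounded to one decimal place, the recomputed factors agree with table 6.10 only approximately. The respondent values and function name are assumptions made for illustration.

```python
# (weighted survey %, GRO(S) %) for selected cells of table 6.9.
cells = {
    ("M", "16-19"): (2.3, 3.2),
    ("F", "16-19"): (2.5, 3.1),
    ("M", "70-74"): (3.1, 2.2),
    ("F", "80 and over"): (2.7, 3.1),
}

# Post-stratification factor: population share divided by weighted sample share.
factors = {cell: gro / weighted for cell, (weighted, gro) in cells.items()}


def poststratify(iweight1, respondent_cells, factors):
    """Multiply each weight by its cell factor, then rescale to sum to n."""
    raw = [w * factors[c] for w, c in zip(iweight1, respondent_cells)]
    scale = len(raw) / sum(raw)
    return [r * scale for r in raw]


# Hypothetical respondents: their iweight1 values and age/sex cells.
iweight1 = [1.20, 0.90, 0.75, 1.15]
respondent_cells = [("M", "16-19"), ("F", "16-19"), ("M", "70-74"), ("F", "80 and over")]
iweight2 = poststratify(iweight1, respondent_cells, factors)

for cell, f in factors.items():
    print(f"{cell}: factor = {f:.2f}")
```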
Figure 6.3
Final weight adjusted to GRO(S) population estimates

Note that this final stage of weighting was not carried out for the Scottish Household Survey. So any comparisons with this survey should be made using the weight iweight1.
As a final check the local authority population estimates provided by GRO(S) were checked against the responders, weighted according to the final individual weight (iweight2). The two sets of percentages were found to be in very good agreement with each other.
6.5 Some cautions about weighting.
This chapter has provided three different weights that could be used in conjunction with this survey. This section will discuss when they should be used, and also in what circumstances a completely unweighted analysis may be more appropriate.
Weights are intended to make the results of a survey applicable to the whole population of an area or country. Here we have two kinds of weights. The first is a household weight and, as such, is likely to be applicable to only a minority of the analyses to be carried out here. In addition, two weights are supplied that could be used for individual analyses. The first adjusts for design factors and for internal evidence of non-response by area and size of household. The second adjusts for these factors and also post-stratifies to national population estimates. The terminology for types of weight used here is taken from Lessler and Kalsbeek (1992) [7]. The household weight and the first individual weight incorporate design weights and non-response weights, and the third weight (the second individual weight) also includes a post-stratification weight. Weighting for non-response and for the design has a cost in reducing precision.
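One common way to get a rough sense of this loss of precision is Kish's approximation for the effective sample size, (sum of weights) squared divided by the sum of squared weights. This approximation is not used or cited in this report, and the design factors in the next chapter may be calculated differently. The Python sketch below is offered only as an illustration, with hypothetical weights.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size for a set of weights."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical weights: eight respondents, equal versus unequal weighting.
equal = [1.0] * 8
unequal = [0.4, 0.6, 0.8, 1.0, 1.0, 1.2, 1.4, 1.6]

n_equal = kish_effective_n(equal)
n_unequal = kish_effective_n(unequal)

print(f"equal weights   : effective n = {n_equal:.2f}")
print(f"unequal weights : effective n = {n_unequal:.2f}")
```

The more variable the weights, the smaller the effective sample size relative to the actual number of respondents.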
Non-response weighting is based on strong assumptions. It implicitly assumes that the responses of the non-responders (had we been able to get them) would be the same as those for responders with the same measured characteristics. This is a strong assumption, but one that is the norm in survey research. It may also be one that is worth making when the purpose of analysis is to obtain national population estimates or trends. Any inferences from such analyses will be to make statements about (e.g.) the proportion of the Scottish population who hold a particular view.
Not all inferences are of this type, however. We may be interested in comparing the views of urban and rural residents on a certain question. Here the interest may simply be in a comparison, and the inference might equally well (and with fewer assumptions) be made for a population consisting of the cross-section of the Scottish population included (by design and response) in the survey and willing to answer the questions. For this analysis unweighted calculations could be appropriate and, indeed, may sometimes give more precise answers to the questions posed. Simpler analyses will allow exploratory work to be carried out more easily and are not necessarily invalid. For these reasons, design effects are calculated in the next chapter for both weighted and unweighted analyses.