Scotland's People: Scottish Household Survey Fieldwork Outcomes 2005/2006

3. Weighting

Two types of weighting are potentially necessary with a sample of this kind. The first is intrinsic to the survey design and represents weights necessary to compensate for unequal probabilities of selection for individuals, households or other units of analysis. The second may be necessary to counteract the effects of non-response bias. Although these represent two distinct rationales for weighting, in terms of analysis the different weights are combined into a single weighting variable for each unit of analysis.

In the SHS, there are five weights that can be used: four in the main survey dataset and one specific to the travel diary. LA_WT and IND_WT are used for most analyses, with the others used for smaller, specific subsets of the sample.

  • LA_WT is used for analysis of data about the household and data collected from or about the HIH (highest income householder) and spouse. This includes all variables asked in the first part of the interview, apart from the questions about the random schoolchild and the random child receiving childcare.
  • IND_WT is used for analysis of data collected from the random adult and of derived variables about the random adult. This includes all variables in the second part of the interview.
  • KID_WT is used for analysis of questions related to the random schoolchild: HE9 to HE17N inclusive (see Questionnaire).
  • RANKIDWT is used for question HE5, where a child receiving childcare is selected at random from all the children receiving childcare in the household.
  • TRAV_WT, contained in the travel diary data, is used for analysing that data.

Design weighting

Weighting for analysis based on household data

The weight for analysis of household data, LA_ WT, has two main elements. Firstly, it is necessary to 'weight up' those local authorities which were under-sampled and 'weight down' those which were over-sampled (this is a weight of the first type mentioned above, which adjusts for unequal probabilities of selection). Secondly, the weight addresses any disproportionality introduced by response rates differing from the target for each local authority. The combination of these two elements is shown in Table 3-1. (The weights for some local authorities vary between one quarter and the next because the number of achieved interviews fluctuates between quarters.) The final weighted sample profile across the two years should, therefore, correctly reflect the distribution of households across Scotland's local authorities.
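The combination of the two elements can be illustrated with a minimal sketch. It assumes that the weight for a local authority is its share of all households divided by its share of achieved interviews, which is one common way to correct for both disproportionate sampling and differing response in a single step; the exact calculation used for the SHS is not set out here. The function name, area names and figures are hypothetical and are not SHS data.

```python
def la_household_weights(households, achieved):
    """Return one weight per local authority.

    Assumed formula (not confirmed by the report):
    weight = share of all households / share of achieved interviews.
    """
    total_households = sum(households.values())
    total_achieved = sum(achieved.values())
    return {
        la: (households[la] / total_households) / (achieved[la] / total_achieved)
        for la in households
    }


# Hypothetical figures, for illustration only.
households = {"Area A": 100_000, "Area B": 20_000, "Area C": 5_000}
achieved = {"Area A": 400, "Area B": 100, "Area C": 60}

weights = la_household_weights(households, achieved)

# The weighted number of interviews equals the unweighted number.
weighted_total = sum(weights[la] * achieved[la] for la in achieved)
```

In this sketch the small, over-sampled "Area C" receives a weight well below 1 and the under-sampled "Area A" a weight above 1, which is the same pattern seen for the island authorities and the large cities in Table 3-1.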

Weights are calculated for each local authority so that each quarterly data file is nationally representative. This should allow any published findings to be reproduced by selecting the relevant quarter's data. In practice, however, it may not be possible to reproduce exactly some of the results from earlier publications, for two reasons. First, the data for a quarter may have been changed subsequently (e.g. to correct errors that were identified later). Second, there is some overlap between the quarter in which interviews take place and the quarter's data with which they are processed. For example, the data processed as Q4 2005 contained data from interviews carried out in the first quarter of 2006, so although these cases were weighted as Q4, they have a value of 1 for the Quarter variable.

Table 3-1: Weights to account for disproportionate sampling and differences in household response rates by local authority and quarter, 2005/2006

Local authority | 2005 Q1 | 2005 Q2 | 2005 Q3 | 2005 Q4 | 2006 Q1 | 2006 Q2 | 2006 Q3 | 2006 Q4
Aberdeen City | 1.13 | 1.19 | 1.30 | 1.02 | 1.14 | 1.25 | 1.19 | 1.16
Aberdeenshire | 1.14 | 1.13 | 0.92 | 1.13 | 1.17 | 1.01 | 0.86 | 1.30
Angus | 0.85 | 1.35 | 1.03 | 1.18 | 0.98 | 1.08 | 1.23 | 1.64
Argyll and Bute | 0.88 | 1.12 | 0.84 | 0.95 | 0.98 | 1.16 | 0.76 | 1.18
Clackmannanshire | 0.49 | 0.52 | 0.39 | 0.34 | 0.41 | 0.42 | 0.42 | 0.45
Dumfries and Galloway | 0.91 | 1.41 | 1.26 | 1.47 | 1.00 | 0.97 | 1.12 | 1.34
Dundee City | 0.97 | 1.07 | 1.14 | 1.20 | 0.97 | 1.18 | 1.02 | 1.03
East Ayrshire | 0.92 | 0.85 | 1.05 | 1.96 | 1.00 | 1.17 | 0.85 | 1.30
East Dunbartonshire | 1.11 | 0.97 | 1.13 | 0.78 | 1.04 | 1.00 | 1.31 | 0.99
East Lothian | 1.24 | 0.75 | 1.18 | 1.26 | 0.96 | 0.98 | 0.90 | 1.15
East Renfrewshire | 0.69 | 1.04 | 0.85 | 0.83 | 0.99 | 0.69 | 0.99 | 0.91
Edinburgh City | 1.21 | 0.94 | 1.19 | 0.96 | 1.03 | 1.01 | 1.18 | 0.99
Eilean Siar | 0.32 | 0.38 | 0.30 | 0.30 | 0.46 | 0.44 | 0.31 | 0.31
Falkirk | 1.01 | 0.98 | 0.93 | 0.93 | 1.18 | 1.30 | 1.12 | 0.99
Fife | 0.91 | 1.11 | 0.97 | 1.15 | 1.00 | 1.08 | 0.97 | 0.98
Glasgow City | 1.22 | 1.17 | 1.16 | 1.21 | 1.22 | 1.16 | 1.23 | 1.00
Highland | 0.94 | 1.20 | 1.11 | 1.34 | 1.13 | 1.17 | 1.05 | 1.17
Inverclyde | 1.38 | 0.98 | 0.82 | 1.01 | 1.22 | 0.97 | 1.19 | 0.84
Midlothian | 0.84 | 0.85 | 0.86 | 0.93 | 0.75 | 0.70 | 0.85 | 0.96
Moray | 0.77 | 0.82 | 0.78 | 0.93 | 0.82 | 0.81 | 0.68 | 1.05
North Ayrshire | 1.35 | 0.81 | 1.20 | 1.47 | 1.10 | 0.95 | 1.26 | 1.07
North Lanarkshire | 1.21 | 1.01 | 1.11 | 1.02 | 0.95 | 0.99 | 1.12 | 0.97
Orkney | 0.19 | 0.21 | 0.16 | 0.19 | 0.23 | 0.26 | 0.15 | 0.26
Perth and Kinross | 1.11 | 1.29 | 1.40 | 0.87 | 1.50 | 1.33 | 1.35 | 1.03
Renfrewshire | 1.15 | 1.24 | 1.00 | 0.85 | 1.12 | 1.12 | 1.04 | 1.18
Scottish Borders | 0.81 | 1.07 | 1.32 | 0.71 | 0.98 | 1.09 | 1.11 | 0.97
Shetland | 0.17 | 0.21 | 0.17 | 0.25 | 0.26 | 0.20 | 0.22 | 0.23
South Ayrshire | 1.17 | 1.00 | 0.97 | 1.14 | 1.11 | 1.06 | 1.13 | 1.23
South Lanarkshire | 1.07 | 1.27 | 1.13 | 1.05 | 1.14 | 1.10 | 1.01 | 1.14
Stirling | 0.93 | 0.72 | 0.80 | 0.73 | 0.68 | 0.72 | 0.74 | 0.91
West Dunbartonshire | 1.09 | 0.99 | 1.12 | 1.48 | 1.05 | 0.99 | 1.28 | 0.96
West Lothian | 1.34 | 0.85 | 1.08 | 1.03 | 0.79 | 1.00 | 1.25 | 0.86

No other weight is applied across all cases to compensate for unequal probabilities of selection. Strictly speaking, however, a corrective weight should be applied in those cases in which the Multiple Occupancy Indicator (MOI) on the Postcode Address File (PAF) is found to be inaccurate, because a property-type bias might otherwise be introduced. For example, if tenement properties were consistently found to contain multiple dwellings when the MOI had indicated that they contained just one, each achieved interview at such an address should be given a weight proportional to the actual number of dwellings, to compensate for the reduced probability of selection for each dwelling at that address. All properties within that local authority area should then be weighted back down slightly so that the actual and weighted sample sizes remain the same.

In practice, the MOI has been found to be inaccurate in only about 2% of cases. The impact of weighting to correct for these would have been negligible so it was decided not to weight by the MOI in order to avoid additional complexity in the weighting scheme for the survey.
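Although this correction was not applied, the calculation described above can be illustrated with a short sketch for a single local authority. It assumes the MOI indicated one dwelling at every sampled address, so each interview's raw weight is simply the number of dwellings actually found. The function name and the figures are hypothetical.

```python
def moi_corrective_weights(actual_dwellings):
    """Weight each interview by the dwellings found at its address,
    then rescale so the weighted sample size equals the actual one.

    Assumes the MOI indicated a single dwelling at every address.
    """
    n_interviews = len(actual_dwellings)
    scale = n_interviews / sum(actual_dwellings)
    return [dwellings * scale for dwellings in actual_dwellings]


# Hypothetical local authority: ten interviews, one of them at an
# address that turned out to contain four dwellings.
dwellings_found = [1, 1, 1, 1, 1, 1, 1, 1, 1, 4]
moi_weights = moi_corrective_weights(dwellings_found)
```

Here the interview at the four-dwelling address carries four times the weight of the others, and every other case is weighted down slightly so that the weights still sum to ten.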

Similarly, in theory an additional weight should be applied in cases where a dwelling contains more than one household, only one of which is interviewed, in order to adjust for the lower probability of selection for each of the households in that dwelling. In practice, however, as only a very small number of dwellings were found to contain more than one household, the use of such a weight would make very little difference to the overall results, and it was therefore felt that it was not worthwhile introducing further complication to the weighting calculations.

Weighting for analysis based on individual (random adult) data

Using the Postcode Address File produces a sample of households, so for analysis of individual level data it is also necessary to weight the responses of the random adult by the number of adults resident in the household who were eligible for interview. 2 The reason for this is that individuals living in larger households have a lower probability of selection than adults in, for example, single adult households where that one person must be sampled.

As a result, the unweighted profile of 'random adult' respondents will tend to be skewed towards those sections of the population most likely to live in households with fewer adults (older people and older females in particular) and away from those likely to live in households with larger numbers of adults (younger people). Once the data are weighted by the number of eligible adults in the household, the profile should correct itself significantly. In most surveys of this kind, however, some under-representation of younger people and males, and over-representation of older people and females, is likely to remain because of the effects of non-response bias. Depending on the extent of the remaining skew, it may be necessary to adopt further corrective measures, but this has not been the case so far.

Analysis of data based on the random adult also requires a further weight to take account of differences between the number of such interviews completed in each local authority area and the actual adult population of such areas. Like the element of the household data weight which adjusts for differences in fieldwork outcomes by local authority, this is intended not to compensate for unequal probabilities of selection but to ensure that the final profile of 'individual' data correctly reflects the relative populations of the different local authority areas once variations in fieldwork outcomes have been taken into account. This is not identical to the weight described for analysis of household data, since variation in response rates for the second part of the interview may have produced a slightly different distribution from that of 'householder' interviews. The weight for each case is created by multiplying the relevant local authority weight by the number of adults in the household, and then scaling the result so that the number of weighted cases is the same as the total number of random adult interviews. The local authority weights are summarised in Table 3-2.
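The construction of the case-level weight described above can be sketched as follows. This is a minimal illustration rather than the survey's own code; the function name, area names, local authority weights and cases are all hypothetical.

```python
def random_adult_weights(cases, la_weights):
    """Return one weight per random adult interview.

    cases      -- list of (local_authority, eligible_adults_in_household)
    la_weights -- local authority weight for the random adult data

    Raw weight = local authority weight x number of eligible adults;
    the raw weights are then scaled so that they sum to the number
    of interviews.
    """
    raw = [la_weights[la] * n_adults for la, n_adults in cases]
    scale = len(cases) / sum(raw)
    return [r * scale for r in raw]


# Hypothetical local authority weights and interviews.
la_weights = {"Area A": 1.2, "Area B": 0.8}
cases = [("Area A", 1), ("Area A", 2), ("Area B", 1), ("Area B", 3)]

ind_weights = random_adult_weights(cases, la_weights)
```

Within the same area, the adult from the two-adult household receives twice the weight of the adult living alone, reflecting a selection probability half as large. The same pattern (a household count multiplied by the local authority weight, then rescaled) is what the text describes for the random schoolchild and random childcare weights.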

Table 3-2: Weights to account for disproportionate sampling and differences in random adult response rates by local authority and quarter, 2005/2006

Local authority | 2005 Q1 | 2005 Q2 | 2005 Q3 | 2005 Q4 | 2006 Q1 | 2006 Q2 | 2006 Q3 | 2006 Q4
Aberdeen City | 1.10 | 1.28 | 1.22 | 1.11 | 1.21 | 1.25 | 1.21 | 1.21
Aberdeenshire | 1.10 | 1.02 | 0.86 | 1.05 | 1.15 | 0.86 | 0.81 | 1.07
Angus | 0.87 | 1.39 | 1.13 | 1.14 | 0.92 | 1.10 | 1.20 | 1.56
Argyll and Bute | 0.87 | 1.13 | 0.87 | 0.88 | 1.00 | 1.16 | 0.76 | 0.99
Clackmannanshire | 0.46 | 0.49 | 0.38 | 0.36 | 0.37 | 0.39 | 0.41 | 0.42
Dumfries and Galloway | 0.76 | 1.43 | 1.23 | 1.52 | 0.93 | 0.90 | 1.07 | 1.12
Dundee City | 1.07 | 1.34 | 1.23 | 1.57 | 1.01 | 1.32 | 1.16 | 1.13
East Ayrshire | 0.92 | 0.84 | 1.14 | 1.68 | 0.93 | 1.13 | 0.78 | 1.22
East Dunbartonshire | 1.09 | 0.86 | 1.03 | 0.69 | 1.12 | 0.96 | 1.23 | 0.84
East Lothian | 1.28 | 0.79 | 1.32 | 1.29 | 1.09 | 1.11 | 1.00 | 1.30
East Renfrewshire | 0.61 | 0.89 | 0.78 | 0.77 | 1.00 | 0.60 | 0.96 | 1.06
Edinburgh, City of | 1.32 | 1.03 | 1.22 | 1.03 | 1.12 | 1.08 | 1.31 | 1.06
Eilean Siar | 0.28 | 0.37 | 0.31 | 0.27 | 0.48 | 0.39 | 0.29 | 0.29
Falkirk | 0.97 | 0.97 | 0.96 | 0.92 | 1.16 | 1.36 | 1.21 | 1.06
Fife | 0.89 | 1.03 | 0.96 | 1.08 | 0.97 | 1.05 | 0.98 | 0.97
Glasgow City | 1.31 | 1.20 | 1.12 | 1.28 | 1.33 | 1.39 | 1.37 | 1.14
Highland | 0.95 | 1.27 | 1.31 | 1.28 | 1.05 | 1.16 | 1.00 | 1.10
Inverclyde | 1.35 | 0.92 | 0.71 | 1.08 | 1.34 | 0.87 | 1.14 | 0.91
Midlothian | 0.73 | 0.84 | 0.77 | 1.04 | 0.71 | 0.65 | 0.84 | 0.90
Moray | 0.67 | 0.72 | 0.76 | 0.85 | 0.75 | 0.67 | 0.63 | 0.99
North Ayrshire | 1.35 | 0.83 | 1.27 | 1.31 | 1.08 | 0.93 | 1.17 | 0.95
North Lanarkshire | 1.26 | 0.95 | 1.17 | 1.01 | 1.05 | 0.99 | 1.13 | 1.01
Orkney | 0.20 | 0.19 | 0.15 | 0.17 | 0.19 | 0.22 | 0.13 | 0.23
Perth and Kinross | 1.29 | 1.44 | 1.43 | 1.02 | 1.57 | 1.37 | 1.47 | 1.11
Renfrewshire | 1.07 | 1.20 | 0.90 | 0.76 | 1.04 | 1.13 | 0.93 | 1.01
Scottish Borders | 0.75 | 1.09 | 1.52 | 0.76 | 1.12 | 1.15 | 1.12 | 1.07
Shetland | 0.19 | 0.20 | 0.17 | 0.23 | 0.21 | 0.20 | 0.18 | 0.20
South Ayrshire | 1.16 | 1.06 | 0.99 | 1.05 | 1.23 | 1.07 | 1.25 | 1.25
South Lanarkshire | 1.08 | 1.27 | 1.11 | 0.99 | 1.06 | 1.11 | 0.96 | 1.15
Stirling | 1.05 | 0.72 | 0.82 | 0.73 | 0.61 | 0.74 | 0.71 | 0.77
West Dunbartonshire | 1.62 | 1.06 | 1.10 | 1.51 | 0.96 | 0.98 | 1.31 | 0.94
West Lothian | 1.14 | 0.79 | 1.12 | 1.14 | 0.78 | 0.94 | 1.33 | 0.93

Weighting for analysis based on the 'random schoolchild'

Data collected about a 'random schoolchild' need to be weighted so that this information correctly represents the population of schoolchildren resident within households. Without this weighting, the data would over-represent the characteristics and experiences of 'only' children and under-represent those of children from larger families. The weight for the random schoolchild case is created by combining the number of schoolchildren in the household and the relevant local authority weight, and scaling the result so that the number of weighted cases is the same as the total number of random schoolchildren about whom the questions were asked.

Weighting for the selection of a random child receiving childcare

In households with more than one child using some form of childcare, one child is selected randomly by the CAPI script and questions about the use of childcare are asked in relation to that person. This data needs to be weighted to account for the lower probability of each child being selected in households with multiple children. The weight for the random child is created by combining the number of children in the household using childcare and the relevant local authority weight, and scaling the result so that the number of weighted cases is the same as the total number of children about whom the questions were asked.

Weighting for analysis based on the Travel Diary

Examination of the SHS data suggests that significantly fewer interviews take place on Fridays, Saturdays and Sundays than on other days of the week. As differences in the proportions of adults interviewed on each day of the week will affect the Travel Diary data's representativeness of travel patterns for the week as a whole, it was decided to introduce a weight to compensate for this. This simply 'up-weights' interviews carried out on days of the week on which fewer than one-seventh of all interviews have taken place and 'down-weights' those carried out on days on which more than one-seventh of all interviews have been completed.
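The day-of-week adjustment described above can be sketched as follows. This is a minimal illustration that assumes the weight for a day is the ratio of an even one-seventh share to that day's actual share of interviews; the function name and the interview counts are hypothetical, and the sketch covers only this first step, not the full Travel Diary weight.

```python
from collections import Counter


def day_of_week_weights(interview_days):
    """Weight each day so that it contributes one-seventh of the
    weighted interviews.

    Assumes every day of the week has at least one interview.
    """
    counts = Counter(interview_days)
    target = len(interview_days) / 7  # interviews per day if spread evenly
    return {day: target / n for day, n in counts.items()}


# Hypothetical distribution of 100 interviews across the week.
interview_days = (
    ["Monday"] * 20 + ["Tuesday"] * 20 + ["Wednesday"] * 20
    + ["Thursday"] * 20 + ["Friday"] * 10 + ["Saturday"] * 5
    + ["Sunday"] * 5
)

day_weights = day_of_week_weights(interview_days)

# The weighted number of interviews equals the unweighted number.
weighted_total = sum(day_weights[day] for day in interview_days)
```

Days with fewer than one-seventh of the interviews (Friday to Sunday in this example) receive weights above 1, and the busier days receive weights below 1, so that each day carries equal weight in the analysis.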

It is also apparent that the distribution of interviews by the day of the week differs for certain sub-sections of the adult population. For example, disproportionately more adults in full-time employment are interviewed at the weekend (due to their greater availability then), thus yielding an inaccurate picture of the travel patterns of those in full-time employment. The Travel Diary weighting factor is therefore refined to compensate for this.

The weight created for any analysis of the Travel Diary combines the above weighting factors and the existing 'random adult' weights. Further information about the Travel Diary, including a comparison to the National Travel Survey, is available in the Travel Diary User Guide. 3

No additional corrective weighting

The weighting scheme for the SHS is intentionally simple. This reflects, in part, a desire to keep the processes of the survey straightforward so that the data can be made available for analysis as quickly as possible. It also reflects the limited extent to which the SHS data differs substantially from comparator data, as shown below. Thus, no additional corrective weighting has ever been applied to the data beyond that required to account for sample design and differential response rates between local authorities.

This aspect of the survey has been subject to review by the Office for National Statistics as part of a major study comparing non-respondents to the SHS with Census data. 4 This study concluded that, while comparison with the Census showed some bias in the SHS, the bias was not substantial, although some corrective weighting would be recommended. Further work looking at the scope for corrective weighting has been undertaken, and this is likely to lead to revised weighting arrangements for the 2007-2010 phase of the survey.

Page updated: Monday, July 30, 2007