PUBLIC ATTITUDES TO THE ENVIRONMENT IN SCOTLAND - TECHNICAL REPORT
Chapter 7 COMPLEX STANDARD ERRORS AND DESIGN EFFECTS
7.1 Introduction
Design factors are required to adjust the confidence intervals calculated from standard packages that assume simple random sampling.
The design factors for means and proportions ranged from 1.0 to 1.6. Design effects will be lower when comparisons are made between groups that are differentially weighted and also when comparisons are standardised (as in a regression analysis) for factors that are differentially weighted. Post-stratification will usually reduce design effects, but for this survey it was fairly neutral since the age/sex differences in most questions were fairly small.
It is proposed that these average design factors could be applied for exploratory analyses. A set of bootstrap results is available that allows the standard errors of important comparisons to be calculated using macros written for standard packages.
For exploratory analyses it is suggested that design factors of 1.2 are used for unweighted means, and 1.4 for weighted means. For sex comparisons the equivalent design factors might be taken as 1.0 and 1.2. As a crude approximation, the values of chi-squared tests might be adjusted by dividing the chi-squared statistics by the square of the design factors.
7.2 Complex standard errors and design effects
Since the survey design involves stratification and clustering, the standard errors of means and proportions calculated from survey data will be different from those assuming a simple random sample. If we use the mean of the survey responses for an item $x$ to estimate the population mean, $\mu_x$, then the standard error of the mean expresses our uncertainty in this estimate. For a simple random sample of size $n$ the expression for the standard error of the mean is
$$\text{s.e.}(x) = \frac{\text{s.d.}(x)}{\sqrt{n}},$$
where $\text{s.d.}(x)$ is an estimate of the standard deviation of $x$ calculated from the sample and $n$ is the sample size.
There are two occasions on which we may wish to calculate a weighted mean, with weights $w$, rather than an unweighted mean. The first, which does not apply in this survey, is when the observations in a data file are in fact means of subgroups. So, for example, a record with a weight of 2 would correspond to the mean of two observations. This is the assumption that is made in many statistical analysis packages when a weighted analysis is requested (e.g. in simple tables and frequencies in SPSS and in PROC MEANS in SAS [8]). For this type of weighted mean [$\text{w.mn}(x)$] the expression for the standard error is
$$\text{s.e.}(\text{w.mn}(x)) = \frac{\text{s.d.}_w(x)}{\sqrt{\sum w}} \qquad (1)$$
where $\text{s.d.}_w(x)$ is the weighted estimate of the standard deviation of $x$. The formulae for proportions are identical, since a proportion can be considered as the mean of a 0/1 variable, except that by convention the $(n-1)/n$ adjustment is usually not carried out. These are the formulae used by SPSS in calculating the standard error of weighted means and proportions. Since the weights provided for the EAS 2002 have been normalised to add to the total sample size, $\sum w = n$ when the whole sample is used and there are no missing observations.
The second type of weighting, the one that applies here, is weighting to make the sample representative of the population because of unequal sampling fractions. The advantage of such a reweighting is to reduce the bias associated with different survey responses in different strata. However, this reduction in bias comes at the cost of increasing the variability of the results. In the case of the Environmental Attitude Survey responses, the bias introduced by using unweighted means or proportions rather than weighted ones is usually small (see chapter 6).
The standard error of a weighted mean where no bias is introduced is
$$\text{s.e.}(\text{w.mn}(x)) = \text{s.d.}_w(x)\,\frac{\sqrt{\sum w^2}}{\sum w} \qquad (2)$$
where $\text{s.d.}_w(x)$ is the weighted standard deviation of $x$ and the sums run over the weights $w$ of the observations used. The ratio of expression (2) to expression (1) is always greater than 1.0 when the weights are unequal. The square of this ratio represents a design effect (DE) from unequal probability sampling where there is no bias, and no allowance for stratification or clustering. For the EAS sample these design effects are 1.10 for hweight, 1.37 for iweight1, and 1.46 for iweight2. The design factor (deft) is the simple ratio of the standard errors, without taking the square. The defts here are thus 1.05, 1.17 and 1.21.
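The design effect from unequal weighting can be computed directly from a set of weights. The sketch below assumes the usual Kish approximation, DE = n Σw² / (Σw)²; the function names and the example weights are illustrative only and are not taken from the EAS data file.

```python
import math

def weighting_design_effect(weights):
    """Kish approximation to the design effect from unequal weights:
    n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)

def weighting_deft(weights):
    """Design factor: the square root of the design effect."""
    return math.sqrt(weighting_design_effect(weights))

# Hypothetical weights (not EAS values): half the sample at 0.5, half at 1.5
example_weights = [0.5] * 50 + [1.5] * 50
print(round(weighting_design_effect(example_weights), 2))  # 1.25
print(round(weighting_deft(example_weights), 2))           # 1.12
```

Taking square roots of the design effects quoted above (1.10, 1.37, 1.46) in the same way gives the defts of 1.05, 1.17 and 1.21.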
In addition to the effect of weighting, further design effects are induced by the clustering and stratification of the sample. The bootstrap methods described below can be used to calculate standard errors that are appropriate to the specific sampling design used in the EAS. The extent to which these complex standard errors (s.e.c.(x)) differ from the simple weight-adjusted standard errors (expression (2)) will depend on the extent to which the particular x is associated with the stratification factors and is clustered within the primary sampling units (PSUs).
Bootstrap methods to calculate complex standard errors and design effects.
A bootstrap methodology has been used to calculate complex standard errors and design effects. Special methods are required to ensure that bootstrap methods are adequate to deal with clustered and stratified samples. These methods have been evaluated favourably in a recent paper that compared the accuracy of different re-sampling methods for the Labour Force Survey [9].
A simple bootstrap method involves taking many samples from the observed data, with replacement, and calculating the variation between bootstrap samples. When a sample is clustered by PSUs and stratified, we need to mimic the sample design by selecting a sample of PSUs, with replacement, within each of the strata. This procedure itself does not produce bootstraps that are equivalent to the original stratified sample, so a recalibration step is required to adjust the results [10].
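The resampling step can be sketched as follows. This is only an illustration of drawing PSUs with replacement within strata; it does not include the recalibration step, and the data layout, identifiers and function names are assumptions rather than the code used for the EAS.

```python
import random
from collections import Counter, defaultdict

def resample_psus(psu_strata, rng):
    """Draw one bootstrap replicate of PSUs.

    psu_strata maps each PSU id to its stratum. Within every stratum the
    same number of PSUs as in the sample is drawn with replacement.
    Returns a Counter giving how many times each PSU was selected.
    """
    by_stratum = defaultdict(list)
    for psu, stratum in psu_strata.items():
        by_stratum[stratum].append(psu)
    counts = Counter()
    for stratum in sorted(by_stratum):
        psus = by_stratum[stratum]
        counts.update(rng.choices(psus, k=len(psus)))
    return counts

def replicate_weights(respondent_psu, base_weights, counts):
    """Multiply each respondent's weight by the number of times
    their PSU was drawn (zero if it was not drawn)."""
    return [w * counts[psu] for psu, w in zip(respondent_psu, base_weights)]

# Hypothetical layout: two strata with four PSUs each
psu_strata = {f"ed{i}": ("urban" if i < 4 else "rural") for i in range(8)}
rng = random.Random(2002)
counts = resample_psus(psu_strata, rng)
```

Repeating the draw many times gives one set of replicate weights per bootstrap sample, which is the form in which bootstrap results are applied to the data later in this chapter.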
The EAS sample selection procedure was initially stratified by the rural/urban indicator (6 categories). Then a sample of EDs was selected, balanced by the predominant Scottish Mosaic classification (11 categories) of the ED. This is equivalent to a stratified sample of the cross-classification of the rural/urban index and the Mosaic classification. The respondents were thus grouped into strata by this cross-classification, but some cells were empty and strata with few selected responding EDs were pooled to ensure a minimum of 4 EDs per stratum. The bootstrap re-sampling of clusters was then carried out within each of these 39 strata and a calibration adjustment applied. Weighted (without post-stratification, i.e. using iweight1) and unweighted design effects were then calculated from the bootstrap samples. The post-stratified (using iweight2) design effects and defts were calculated by first adjusting each bootstrap sample to the population totals. The results presented here are from 200 bootstraps, which assure accuracy to 0.05 for the defts.
The results were checked for robustness in various ways. They were repeated to ensure consistency and they were compared with model-based inference using multi-level modelling. Consistent and robust results were obtained. All calculations were carried out in R, with code specially written for this application [11].
It is only necessary to carry out the bootstrap procedure once. Sets of bootstrap results are produced that can be applied to any statistic of interest. These results have been made available to the Scottish Executive, along with some basic instructions for their use. This allows users to calculate the design effect directly for any statistic of interest.
Sample defts for means and proportions and their differences are given for selected questions. The differences can have smaller design effects than the overall means, particularly when the groups being compared are internally homogeneous in the weights of the observations. Some examples are included with the results below. Other analyses, such as chi-squared tests and regression analyses, will not give valid inferences for the survey data when carried out with standard methods. Again, the set of bootstrap results can readily be used to carry out a correct analysis.
The section below summarises results for a few variables. A full explanation of how to use the method is given in relation to the first set of environmental attitude questions (G1), and summary results are given for a selection of other variables.
7.3 Results for design factors and complex standard errors
General methods
A single set of bootstraps was used for all the calculations. This took the form of four sets of 200 bootstrap weights (unweighted, and weighted with hweight, iweight1, iweight2) that could be applied to the original data. Each separate bootstrap estimate can then be obtained by calculating the weighted statistic from the data file. A sample statistic can readily be calculated for the 200 bootstraps by calculating 200 weighted means from the survey data. It is likely that publicly-available SPSS and SAS macros could be identified that would facilitate their use.
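A minimal sketch of this calculation is given below: the statistic is recomputed under each set of bootstrap weights and the standard deviation of the results is taken as the standard error. The toy responses and the way the toy weight sets are generated (resampling individuals) are stand-ins chosen only so that the example runs; they are not the EAS bootstrap weights or design, and all names are hypothetical.

```python
import random
import statistics

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def bootstrap_se(values, weight_sets):
    """Standard deviation of the weighted mean across bootstrap weight sets."""
    estimates = [weighted_mean(values, w) for w in weight_sets]
    return statistics.stdev(estimates)

# Toy 0/1 responses; each weight set stands in for one supplied bootstrap weight
rng = random.Random(1)
values = [1 if rng.random() < 0.3 else 0 for _ in range(500)]
weight_sets = []
for _ in range(200):
    draws = rng.choices(range(len(values)), k=len(values))
    w = [0] * len(values)
    for i in draws:
        w[i] += 1
    weight_sets.append(w)

se = bootstrap_se(values, weight_sets)
```

Dividing a standard error obtained in this way by the simple-random-sampling standard error gives the design factor for that statistic.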
Results for question G1
This is a rating scale question, shown in full in the questionnaire at appendix 3. Results for the percentages reporting 'very worried' for each of the topics described in Question G1 are given in table 7.1. As expected, the unweighted design factor was lower than the weighted ones. The percentages were calculated after excluding the respondents who replied 'Don't know' or 'not heard of', so the number of respondents differs for each part of the question. The design effects are due (to an approximately equal extent) to clustering of answers to questions and to reweighting where this was used.
The following example illustrates how to use the table to calculate the standard error of the proportion answering 'very worried' to Question G1a. The expression for the standard error of a weighted proportion calculated as $p\%$, assuming simple random sampling, is given by
$$\text{s.e.}(p) = \sqrt{\frac{p\,(100-p)}{\sum w}}.$$
This is what would be calculated by SAS or SPSS. Here we will calculate it directly, which will not be necessary if computer output is used to get the s.e.
The weights for the EAS have been normalised to sum to the sample size (4119), so the number of respondents answering the question (4061) will be very close to the sum of the weights in the bottom line of this formula. So the formula for the standard error becomes
$$\text{s.e.}(p) = \sqrt{\frac{p\,(100-p)}{n}}, \qquad n = 4061.$$
We can use this formula to get the standard error for the first question (G1a) for the three types of weighting that can be applied in the survey. The calculations are summarised in table 7.2. In order to get the correct complex standard error one simply multiplies this standard error by the design factor and then calculates the approximate 95% confidence interval from the formula
$$p \pm 1.96 \times \text{complex s.e.}$$
Table 7.1
Design factors (deft) for %s reporting very worried to topics in Question G1 ('don't know' and 'not heard of' treated as missing)
Q no | Topic | N responding | % very worried (un-weighted) | % very worried (iweight1) | % very worried (iweight2) | Design factor (un-weighted) | Design factor (iweight1) | Design factor (iweight2) |
G1A | Pollution of rivers, lochs and seas | 4061 | 30.8% | 31.1% | 30.5% | 1.18 | 1.38 | 1.35 |
G1B | Raw sewage put into the sea | 4015 | 50.0% | 50.9% | 49.9% | 1.19 | 1.39 | 1.42 |
G1C | Quality of drinking water | 4084 | 26.3% | 27.0% | 26.9% | 1.21 | 1.36 | 1.38 |
G1D | Nuclear Waste | 4018 | 47.5% | 48.4% | 47.9% | 1.26 | 1.4 | 1.4 |
G1E | Damage to the ozone layer | 3977 | 35.4% | 36.1% | 35.5% | 1.16 | 1.37 | 1.39 |
G1F | Road traffic | 4063 | 27.0% | 27.9% | 27.1% | 1.22 | 1.32 | 1.34 |
G1G | Fumes and smoke from factories | 4038 | 20.0% | 20.7% | 20.3% | 1.03 | 1.22 | 1.23 |
G1H | Global warming by greenhouse effect | 3916 | 26.0% | 26.5% | 26.2% | 1.09 | 1.29 | 1.25 |
G1I | Acid rain | 3807 | 21.1% | 21.8% | 21.5% | 1.08 | 1.34 | 1.36 |
G1J | Pesticides, fertilisers and chemical sprays | 4015 | 28.3% | 27.7% | 26.7% | 1.2 | 1.36 | 1.32 |
G1K | Waste disposal | 4011 | 25.1% | 26.3% | 25.6% | 1.16 | 1.35 | 1.33 |
G1L | Protection of wildlife | 4019 | 29.3% | 28.9% | 28.4% | 1.2 | 1.32 | 1.28 |
G1M | Generation of electricity by nuclear power | 3911 | 21.3% | 20.9% | 20.5% | 1.09 | 1.25 | 1.28 |
G1N | Using up non-renewable resources | 3862 | 21.6% | 22.0% | 21.9% | 1.15 | 1.31 | 1.37 |
G1O | Overfishing | 3956 | 20.7% | 20.0% | 19.5% | 1.15 | 1.31 | 1.3 |
G1P | Forestry | 3961 | 11.8% | 11.9% | 11.9% | 1.11 | 1.25 | 1.23 |
G1Q | Farming methods | 3914 | 12.5% | 12.4% | 12.0% | 1.16 | 1.28 | 1.27 |
G1R | Protection of areas of conservation interest | 3971 | 16.9% | 16.8% | 16.5% | 1.09 | 1.18 | 1.18 |
G1S | Derelict land in town and cities | 3988 | 13.0% | 13.3% | 13.0% | 1.04 | 1.1 | 1.1 |
G1T | New development in the countryside | 4012 | 16.5% | 16.2% | 15.8% | 1.26 | 1.36 | 1.33 |
G1U | Lack of access to parks | 4000 | 9.7% | 10.2% | 10.2% | 1.19 | 1.31 | 1.31 |
G1V | Fish farming | 3897 | 8.5% | 7.9% | 7.6% | 1.1 | 1.15 | 1.18 |
G1W | Genetically modified crops | 3868 | 26.3% | 25.7% | 25.1% | 1.16 | 1.25 | 1.25 |
| Average design factor | | | | | 1.15 | 1.30 | 1.30 |
Table 7.2
Illustration of calculation of confidence intervals from design factors for % responding 'very worried' to Question G1a (n respondents = 4061)
Weighting | % | simple s.e. | deft | complex s.e. | 95% C.I. lower | | 95% C.I. upper |
Unweighted | 30.8% | 0.72% | 1.18 | 0.86% | 29.1% | - | 32.4% |
iweight1 | 31.1% | 0.73% | 1.38 | 1.00% | 29.2% | - | 33.1% |
iweight2 | 30.5% | 0.70% | 1.35 | 1.0% | 28.6% | - | 32.4% |
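The unweighted row of table 7.2 can be approximately reproduced from the published percentage, number of respondents and design factor. The sketch below uses a hypothetical helper, `complex_ci`; because the inputs are themselves rounded, the last digit of the results may differ slightly from the table.

```python
import math

def complex_ci(p, n, deft, z=1.96):
    """Return the simple s.e., the complex s.e. and the approximate
    confidence interval for a percentage p based on n respondents."""
    simple_se = math.sqrt(p * (100.0 - p) / n)
    complex_se = simple_se * deft
    return simple_se, complex_se, (p - z * complex_se, p + z * complex_se)

# Unweighted figures for G1a taken from tables 7.1 and 7.2
simple_se, complex_se, (low, high) = complex_ci(p=30.8, n=4061, deft=1.18)
print(f"simple s.e. {simple_se:.2f}%, complex s.e. {complex_se:.2f}%")
print(f"approximate 95% C.I. {low:.1f}% to {high:.1f}%")
```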
Question G1 can also be considered as a score, with a mean score calculated for those who gave a valid response from 4 = 'very worried' to 1 = 'not worried at all'. Design factors can be calculated for these mean scores. The results (details not shown) gave a mean design factor for unweighted analysis of 1.21 and of 1.41 and 1.49 for weighted analyses. The results to illustrate the calculation of confidence intervals for question G1a are provided below. The expression for the standard error of a weighted mean is given by equation (1) above. Again, this will usually come straight out of computer output, but here we calculate it from the formula, assuming as above that the sum of the weights for the responders is 4061. Table 7.3 gives the results using the same procedures as used above for the proportion very worried.
Table 7.3
Illustration of calculation of confidence intervals from design factors for score for Question G1a (no of respondents = 4061)
Weighting | mean score | st.dev. | simple s.e. | deft | complex s.e. | 95% C.I. lower | | 95% C.I. upper |
Unweighted | 3.00 | 0.858 | 0.013 | 1.21 | 0.016 | 2.97 | - | 3.03 |
iweight1 | 3.01 | 0.873 | 0.014 | 1.41 | 0.019 | 2.97 | - | 3.05 |
iweight2 | 3.00 | 0.860 | 0.013 | 1.39 | 0.019 | 2.96 | - | 3.04 |
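As a worked check of the unweighted row of table 7.3, using only the figures in the table and taking the sum of the weights as 4061:
$$\text{simple s.e.} = \frac{0.858}{\sqrt{4061}} \approx 0.0135, \qquad \text{complex s.e.} \approx 1.21 \times 0.0135 \approx 0.016, \qquad 3.00 \pm 1.96 \times 0.016 \approx (2.97,\ 3.03).$$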
We can use the same approach for calculating differences between groups in means or proportions. To obtain a standard error for a difference in means or proportions we simply calculate this difference for each set of bootstrap weights and then calculate the standard deviation of the bootstrap results. To calculate the design factor we calculate the ratio of this quantity to the standard error calculated on the basis of simple random sampling. For question G1, design factors have been computed for differences between males and females and for differences between rural areas (defined as remote rural, accessible rural and small remote town) and the other three more urban areas. Table 7.4 shows the mean design factor for these comparisons along with a summary of the results for the means and proportions.
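The procedure for a difference can be sketched in the same way. The example below is a toy illustration with hand-made weight sets and hypothetical function names; it only shows the mechanics of recomputing the difference under each set of bootstrap weights and dividing the resulting standard deviation by the simple standard error.

```python
import statistics

def weighted_pct(flags, weights):
    return 100.0 * sum(f * w for f, w in zip(flags, weights)) / sum(weights)

def difference(flags, groups, weights):
    """Weighted % in group 'a' minus weighted % in group 'b'."""
    pct = {}
    for g in ("a", "b"):
        rows = [(f, w) for f, grp, w in zip(flags, groups, weights) if grp == g]
        pct[g] = weighted_pct([f for f, _ in rows], [w for _, w in rows])
    return pct["a"] - pct["b"]

def deft_for_difference(flags, groups, weight_sets, simple_se):
    """Bootstrap s.e. of the difference divided by the simple s.e."""
    boot = [difference(flags, groups, w) for w in weight_sets]
    return statistics.stdev(boot) / simple_se

# Toy data: four respondents, two per group, and three hand-made weight sets
flags = [1, 0, 1, 0]
groups = ["a", "a", "b", "b"]
weight_sets = [[1, 1, 1, 1], [2, 0, 1, 1], [0, 2, 1, 1]]
deft = deft_for_difference(flags, groups, weight_sets, simple_se=25.0)
```

In real use every weight set must leave a positive total weight in each group, otherwise the group percentage is undefined.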
Table 7.4
Average design factors for different statistics calculated for the 23 items in Question G1
Summary measure | Design factor (unweighted) | Design factor (iweight1) | Design factor (iweight2) |
% very worried | 1.15 | 1.30 | 1.30 |
Mean score | 1.20 | 1.36 | 1.37 |
Difference by sex in %very worried | 0.99 | 1.16 | 1.18 |
Difference by urban-rural in %very worried | 1.19 | 1.13 | 1.15 |
We can see a very marked reduction in the design factors for the differences compared with those for the means, especially for the weighted analyses. This is true for the two factors, sex and rurality, because they made a large contribution to the weighting. This might not be true for other comparisons.
Consider how one might use these average design factors to calculate a confidence interval for the sex difference in the proportion very worried for Question G1a. We first calculate the percentages very worried by sex and their difference. If this were being done in SPSS then we would obtain the simple standard error that is illustrated in table 7.5 [12]. The individual s.e.s for the sub-groups are calculated in the same manner as described above, and the s.e. for the difference is just the square root of the sum of the squares of the s.e.s for the two groups. Calculating a 95% confidence interval from the complex standard error gives (-4.93% to 1.75%), compared with the interval (-4.44% to 1.24%) from the simple standard error.
Table 7.5
Illustration of calculation of the design effect for a difference using weight iweight2 for question G1a, by sex of respondent.
| sum of weights | %very worried | Simple s.e. | deft | Complex s.e. |
Men | 1960 | 29.70% | 1.03% | | |
Women | 2106 | 31.29% | 1.01% | | |
Difference | | -1.59% | 1.44% | 1.18 | 1.70% |
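The figures in table 7.5 can be checked with a few lines of arithmetic. The sketch below uses only the values printed in the table and the average design factor of 1.18; the variable and function names are illustrative.

```python
import math

def se_of_difference(se_a, se_b):
    """s.e. of a difference between two independent estimates."""
    return math.sqrt(se_a ** 2 + se_b ** 2)

simple_se = se_of_difference(1.03, 1.01)   # men, women (percentage points)
complex_se = 1.18 * simple_se              # apply the design factor
diff = 29.70 - 31.29                       # men minus women
ci = (diff - 1.96 * complex_se, diff + 1.96 * complex_se)
print(f"simple s.e. {simple_se:.2f}, complex s.e. {complex_se:.2f}")
print(f"approximate 95% C.I. {ci[0]:.2f}% to {ci[1]:.2f}%")
```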
To summarise the results for question G1 very briefly, design factors of around 1.15 for unweighted analyses and 1.3 for weighted analyses seem appropriate for overall proportions. Similar results apply to derived scores for these variables and the proportions in other categories (results not shown). Design factors for comparisons of sex and rurality are considerably lower, generally below 1.2 for both weighted and unweighted analyses.
Design factors for other questions
Tables 7.6 and 7.7 show design factors for questions in set G6a and those in set G9 (see questionnaire at appendix 3 for further detail). For questions G6a the percentages who would be unhappy living next to each type of hazard were considered, and for G9 the percentage agreeing or strongly agreeing with the statement is used. In both cases 'don't know' responses were excluded.
The tables show that the average design factors for these questions are very similar to those for question G1. Some individual questions in G6a, however, show larger or smaller design factors than the average, which may reflect a geographic clustering of experience of such installations. In question G9 the question about attitudes to making industry pay also had a much larger design effect.
Table 7.6
Design factors (deft) for %s reporting unhappy to topics in Question G6a ('don't know' and 'not heard of' treated as missing)
| % reporting'unhappy' | Design factor |
Q no | Topic | N responding | un-weighted | iweight1 | iweight2 | un-weighted | iweight1 | iweight2 |
G6a.a | Motorway | 4021 | 86.6% | 85.5% | 85.0% | 1.26 | 1.47 | 1.50 |
G6a.b | Nuclear power station | 4050 | 94.7% | 95.1% | 95.0% | 1.11 | 1.17 | 1.19 |
G6a.c | Waste incinerator | 4021 | 94.9% | 95.2% | 95.0% | 1.11 | 1.25 | 1.30 |
G6a.d | Nuclear waste processing plant | 4053 | 97.3% | 97.5% | 97.3% | 1.06 | 1.20 | 1.32 |
G6a.e | Rubbish dump / landfill site | 4043 | 96.2% | 96.4% | 96.4% | 1.16 | 1.26 | 1.32 |
G6a.f | Coal-fired power station | 3950 | 85.6% | 86.0% | 85.9% | 1.21 | 1.36 | 1.38 |
G6a.g | Wind farm | 3859 | 37.8% | 37.3% | 36.6% | 1.39 | 1.42 | 1.41 |
G6a.h | Recycling centre | 3929 | 60.0% | 59.0% | 58.0% | 1.35 | 1.50 | 1.53 |
G6a.i | Storage site for nuclear waste | 4036 | 97.8% | 98.0% | 97.9% | 1.09 | 1.15 | 1.24 |
G6a.j | Oil terminal | 3850 | 90.6% | 91.0% | 90.8% | 1.20 | 1.36 | 1.42 |
| Average design factor | | | | | 1.20 | 1.32 | 1.36 |
Table 7.7
Design factors (deft) for %s reporting agree or strongly agree to topics in Question G9 ('don't know' treated as missing)
| | % agree or strongly agree | Design factor |
Q G9 | Topic | N responding | un-weighted | iweight1 | iweight2 | un-weighted | iweight1 | iweight2 |
.a | Industry should be prevented from causing damage to environment | 3925 | 43.3% | 43.7% | 43.7% | 1.20 | 1.33 | 1.35 |
.b | New jobs should be created even if this sometimes causes damage | 3943 | 76.2% | 76.7% | 76.6% | 1.22 | 1.35 | 1.35 |
.c | Those who pollute the environment should be made to pay for it | 3926 | 25.7% | 25.5% | 25.5% | 1.31 | 1.58 | 1.61 |
.d | The Scot Exec(Cen Gov) should find money to protect the environment | 4045 | 95.5% | 95.1% | 95.1% | 1.13 | 1.33 | 1.38 |
.e | Industry should be prevented from causing damage to environment | 3731 | 51.3% | 51.1% | 50.7% | 1.21 | 1.39 | 1.43 |
| Average design factor | | | | | 1.22 | 1.40 | 1.42 |
The calculation of design factors was again carried out for differences by sex and by rurality for each of these questions. Results are shown in table 7.8. Again, results were broadly similar to those for Question G1 with much reduced design effects for comparisons.
Table 7.8
Average design factors for different statistics calculated for the items in Question G6a (10 items) and question G9 (5 items).
Summary measure | Design factor (unweighted) | Design factor (iweight1) | Design factor (iweight2) |
Questions G6a | | | |
% unhappy | 1.20 | 1.32 | 1.36 |
Difference by sex in %unhappy | 1.00 | 1.17 | 1.22 |
Difference by urban-rural in %unhappy | 1.13 | 1.05 | 1.08 |
Questions G9 | | | |
% agree or strongly agree | 1.22 | 1.40 | 1.42 |
Difference by sex in % agree or strongly agree | 0.96 | 1.14 | 1.19 |
Difference by urban-rural in % agree or strongly agree | 1.17 | 1.10 | 1.12 |
Design factors for questions in versions A and B
The same approach as above can be used for questions that appear in only one of the two versions of the questionnaire. The only difference is that the maximum number of respondents is lower (1989 for version A and 2130 for version B) and thus the corresponding confidence intervals will be wider than those for questions asked of all respondents.
Means and design effects for selected questions from each version are given in tables 7.9 and 7.10. It can be seen that the scale of the design effects is similar to the questions where all respondents answered.
Questions with larger design effects tend to be those that vary more with the rural/urban classification. For example, only 24% of urban respondents had heard of sustainable development, compared with 36% of remote rural respondents (question SD1). In remote rural areas, 42% of respondents thought they were more than 100 miles from a source of radioactivity and only 15% did not know the answer to this question. In contrast, only 7% of respondents in the primary cities thought they were more than 100 miles from a source of radioactivity and 41% did not know the answer to this question. Similarly, in version B, respondents in rural areas were less likely to be within 20 minutes walk of a recycling facility and were less worried about litter.
Table 7.9
Design factors (deft) for selected questions from Version A ('don't know' treated as missing except for R13)
| % reporting | Design factor |
Question number | Topic | N responding | un-weighted | iweight1 | iweight2 | un-weighted | iweight1 | iweight2 |
SD1 | Heard of sustainable development | 1903 | 28% | 28% | 28% | 1.31 | 1.53 | 1.58 |
SD3 | Strongly agree people should change | 1874 | 82% | 82% | 82% | 1.23 | 1.35 | 1.42 |
SD4 | Strongly agree self should change | 1886 | 44% | 48% | 49% | 1.10 | 1.33 | 1.36 |
CC1 | Definitely agree climate changing | 1858 | 66% | 66% | 67% | 1.22 | 1.30 | 1.34 |
CC4 | Flooding risk very high many or few areas | 1909 | 24% | 24% | 24% | 1.26 | 1.34 | 1.34 |
R13 (don't know) | Know how far home is from source of RA | 1989 | 29% | 29% | 29% | 1.27 | 1.40 | 1.41 |
R13 (if know) | Home is less than 100 miles from source of RA | 1410 | 80% | 81% | 82% | 1.52 | 1.48 | 1.52 |
| Average design factor | | | | | 1.27 | 1.39 | 1.42 |
Table 7.10
Design factors (deft) for selected questions from Version B ('don't know' treated as missing)
| % reporting | Design factor |
Question number | Topic | N responding | un-weighted | iweight1 | iweight2 | un-weighted | iweight1 | iweight2 |
WR1 | Recycling within 20 mins walk | 1901 | 58% | 61% | 62% | 1.58 | 1.67 | 1.69 |
WR6a | Sometimes or always compost | 2100 | 21% | 20% | 20% | 1.26 | 1.35 | 1.36 |
DW2 | Satisfied with water quality | 2126 | 76% | 76% | 76% | 1.15 | 1.32 | 1.36 |
DW7a | Use bottled or filtered water | 2122 | 29% | 29% | 29% | 1.14 | 1.33 | 1.33 |
NP1 | Aware of national park proposals | 2074 | 71% | 71% | 71% | 1.16 | 1.39 | 1.38 |
WH1 | Wildlife habitats very important | 2103 | 72% | 71% | 71% | 1.18 | 1.33 | 1.31 |
L1 | Litter very big problem | 2110 | 58% | 58% | 57% | 1.27 | 1.45 | 1.44 |
| Average design factor | | | | | 1.25 | 1.41 | 1.41 |
The same conclusion was drawn for a few other attitude questions that were explored. Further tables could be produced but the benefit of this needs to be considered. It seems unlikely that the design effects could cover exactly the analyses that are likely to be carried out for the EAS. A practical approach would be to assume average design effects for exploratory analyses, but use the sets of bootstrap results to calculate standard errors for important comparisons, as they are carried out.
A similar approach has been proposed by Statistics Canada in relation to some of its surveys. Statistics Canada points out that there may be an issue of confidentiality because it is possible to identify sets of respondents from the same PSU. This does not seem to be a serious problem here, since so many PSUs are included.
For exploratory analyses it is suggested that design factors of 1.2 are used for unweighted means, and 1.4 for weighted means. For sex comparisons the equivalent design factors might be taken as 1.0 and 1.2. As a crude approximation, the values of chi-squared tests might be adjusted by dividing the chi-squared statistics by the square of the design factors.
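The crude chi-squared adjustment can be illustrated as follows. The statistic of 6.0 is a made-up value, and the p-value helper covers only the one-degree-of-freedom case (for which the upper tail can be written with the complementary error function); both are assumptions for illustration.

```python
import math

def adjusted_chi_squared(chi_squared, deft):
    """Divide the chi-squared statistic by the square of the design factor."""
    return chi_squared / deft ** 2

def p_value_1df(chi_squared):
    """Upper-tail probability for chi-squared with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_squared / 2.0))

raw = 6.0  # hypothetical statistic from a standard package
adj = adjusted_chi_squared(raw, deft=1.4)
print(f"raw {raw:.2f} (p = {p_value_1df(raw):.3f}), "
      f"adjusted {adj:.2f} (p = {p_value_1df(adj):.3f})")
```

In this made-up case the unadjusted statistic exceeds the 5% critical value of 3.84 while the adjusted one does not, which shows why the adjustment matters for borderline results.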