2. Using the information in this report
How data is displayed in tables
All tables are presented in the format "dependent
variable by independent variable" where the independent
variable is being used to examine or explain variation in
the dependent variable. Thus, a table titled 'housing
tenure by household type' shows how housing tenures vary
among different household types. Where the tables show
column percentages, the dependent variable is shown in the
rows and the columns show the independent variable. Where
the tables show row percentages, this is switched and the
dependent variable is shown in the columns.
All tables have a descriptive and numerical base showing
the population or population sub-group examined in it.
While all results have been calculated using weighted data,
the bases shown give the unweighted counts. It should
therefore be noted that the results and bases presented
cannot be used to calculate how many respondents gave a
certain answer.
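
The small example below sketches why a weighted percentage cannot be combined with an unweighted base to recover a respondent count. The four respondents, their weights and all the names used are hypothetical and chosen only for illustration; they are not taken from the survey.

```python
# Hypothetical respondents: (answered_yes, survey_weight).
respondents = [(True, 0.5), (True, 0.5), (False, 1.5), (False, 1.5)]


def weighted_percentage(rows):
    """Percentage answering yes, calculated from the survey weights."""
    total_weight = sum(weight for _, weight in rows)
    yes_weight = sum(weight for answered_yes, weight in rows if answered_yes)
    return 100 * yes_weight / total_weight


# The base shown under a table is the unweighted count of respondents.
unweighted_base = len(respondents)

# The published result is the weighted percentage.
weighted_pct = weighted_percentage(respondents)

# Multiplying the two does not give the number who actually said yes.
naive_count = weighted_pct / 100 * unweighted_base
actual_count = sum(1 for answered_yes, _ in respondents if answered_yes)

print(f"weighted result: {weighted_pct:.0f}%, unweighted base: {unweighted_base}")
print(f"naive count: {naive_count:.0f}, actual count: {actual_count}")
```

In this illustration the weighted result multiplied by the unweighted base gives one respondent, although two respondents gave that answer.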
In general, percentages in tables have been rounded to the
nearest whole number. Zero values are shown as a dash (-),
values greater than 0% but less than 0.5% are shown as 0% and
values of 0.5% but less than 1% are rounded up to 1%. Columns
or rows may not add to 100% because of rounding or where
multiple responses to a question are possible.
In some tables, percentages have been removed from columns and
replaced with '*' where the base on which percentages would be
calculated is less than 100. This data is judged to be
insufficiently reliable for publication.
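
The display conventions described above can be sketched as a small formatting function. This is an illustration only, not the code used to produce the tables: the names format_percentage and MIN_BASE are hypothetical, and rounding exact halves upwards is an assumption, since the report says only that values are rounded to the nearest whole number.

```python
import math

# Bases below this are treated as too small to publish (from the text above).
MIN_BASE = 100


def format_percentage(value, base):
    """Return the table cell text for a percentage and its unweighted base."""
    if base < MIN_BASE:
        return "*"   # percentage suppressed as insufficiently reliable
    if value == 0:
        return "-"   # true zero is shown as a dash
    if value < 0.5:
        return "0%"  # greater than 0% but less than 0.5%
    if value < 1:
        return "1%"  # 0.5% but less than 1% is rounded up
    # Assumption: exact halves round upwards (e.g. 26.5 becomes 27).
    return f"{math.floor(value + 0.5)}%"


for value, base in [(0, 500), (0.3, 500), (0.7, 500), (26.6, 500), (50, 80)]:
    print(value, base, format_percentage(value, base))
```

Because each cell is rounded separately, a column formatted in this way will not always sum to exactly 100%.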
Variations in base sizes for tables
Because the questionnaire is administered using CAPI
(computer-assisted personal interviewing), item non-response
is kept to a minimum. Bases occasionally fluctuate slightly due to small
amounts of missing information (where, for example, the age
or sex of household members has been refused and where
derived variables such as household type use this
information).
Some questions apply only to individual survey years and
the bases are correspondingly lower. Occasionally,
questions are introduced in the course of a survey year and
again the base size is lower.
The sample base appendix gives details of frequencies
and bases for the main dependent variables.
Income imputation
One section of the questionnaire is substantially
affected by missing information. In the section on
household income, approximately 33% of respondents either
refuse to answer the questions or are unable to provide
information that is sufficiently reliable to report, for
example, because there are no details of the level of
income received for one or more components of their income.
After the survey, statistical analysis of the
characteristics of households where income is available
allows income data to be imputed for households where
income data is missing. After imputation, missing income
data is reduced to only 3% of households (see Glossary for
more details).
The income information in the report includes the income
of the Highest Income Householder (HIH) and their spouse or
partner (where there is one). Income from employment,
pensions and benefits and income from other sources is
included. The income of other household members is only
included if it represents 'other' income for the HIH or
spouse, i.e. where the other household member contributes to
household resources by paying 'dig money'.
The current income information collected through the
Scottish Household Survey (SHS) is only intended to provide
estimates by income band. The survey asks for income only
for use as a "background" variable when analysing other
topics, or for selecting the data for particular sub-groups
of the population (such as the low paid) for further
analysis. The SHS cannot be used as a source of figures on
average income or average earnings. (See Scottish Household
Survey: Methodology 2003/2004 for further details).
Statistical significance
Where reference is made in the text to differences
between sub-groups of the sample, these differences have
been tested and found to be significant at the 95%
confidence level.
All survey data has a degree of error associated with it
because it is based on a sample of the population. Any
proportion measured in the survey has an associated
sampling error, usually expressed as ±x% at the 95%
confidence limits. Technically, all results should be
quoted in this way. For example, based on the survey
results we can be 95% confident that between 26.3% and
27.7% of adults smoke. However, it is less cumbersome to
simply report the percentage as 27% (see Table 6.55). Where
sample sizes are small or comparisons are made between
sub-groups of the sample, the sampling error needs to be
taken into account. There are formulae to calculate whether
differences are statistically significant (i.e. they are
unlikely to have occurred by chance) and Appendix 1
provides a simple way to estimate if differences are
significant.
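
As a rough illustration of the calculations referred to above, the sketch below uses the standard textbook formulae for a proportion under simple random sampling. It is not the method set out in Appendix 1: it ignores weighting and any effect of the survey design, and the function names and the sample sizes in the usage lines are hypothetical.

```python
import math

# Multiplier for a 95% confidence interval under a normal approximation.
Z_95 = 1.96


def confidence_interval(p, n, z=Z_95):
    """Approximate interval for a proportion p (0 to 1) from a sample of n.

    Assumes simple random sampling, which a real survey design may not meet.
    """
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


def difference_is_significant(p1, n1, p2, n2, z=Z_95):
    """Check whether two sub-group proportions differ by more than the
    sampling error of their difference, under the same assumption."""
    se_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return abs(p1 - p2) > z * se_diff


# Hypothetical sample sizes, for illustration only.
low, high = confidence_interval(0.27, 15000)
print(f"95% interval: {100 * low:.1f}% to {100 * high:.1f}%")
print(difference_is_significant(0.27, 1000, 0.30, 1000))
print(difference_is_significant(0.27, 5000, 0.30, 5000))
```

The same three-point difference between sub-groups is not significant with the smaller hypothetical bases but is with the larger ones, which is why sampling error matters most where sample sizes are small.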