Scottish Household Survey Analytical Topic Report: Volunteering

ANNEX 5 REGRESSION ANALYSIS

As part of the programme of analysis conducted for this project, the factors associated with volunteering in Scotland have been investigated in detail using a regression approach.

While the cross tabulations and associated statistical tests reported elsewhere provide considerable information on the prevalence of volunteering in specific segments of Scotland's population (e.g. women, younger people), it is desirable to supplement these individual views with a more comprehensive investigation of how key factors such as age and gender are related to whether or not an individual engages in voluntary service.

The regression approach reported here offers two specific advantages, as follows.

Firstly, it provides an estimate of how much of the variability in volunteering - that is, whether someone does or does not perform voluntary service - is 'explained' (in statistical terms) by a group of key demographic factors.

Secondly, it allows the effect of each factor on the likelihood of volunteering to be assessed when adjusted for the possible effects of other related factors. This latter point is best illustrated by example. Suppose a simple cross tabulation suggested that women were more likely to be volunteers than men, but it was also known that women were less likely than men to be in full-time employment. In such a case, a cross tabulation cannot reveal whether the apparently increased propensity of women to volunteer is really an effect of gender, or is wholly or in part related to the fact that women may have more free time due to not being in full-time employment. What is really of interest is the independent effect of gender on volunteering after adjusting for employment status.

The regression approach provides this type of adjustment. In this way, regression - while no substitute for the explicit estimates of prevalence given in the cross tabulations - does provide additional valuable insights into the relationships between a number of key factors and volunteering in Scotland.
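The gender and employment example can be made concrete with a small sketch. The counts below are hypothetical, invented purely for illustration (they are not survey data), and the sketch uses simple stratification by employment status as a stand-in for the regression adjustment; it is not the method used in this report. It shows how a crude comparison can suggest a gender effect that disappears once employment status is held fixed.

```python
# Hypothetical counts, invented purely for illustration (not survey data).
# Each entry is (volunteers, non_volunteers).
counts = {
    "full-time": {"women": (10, 90), "men": (30, 270)},
    "not full-time": {"women": (90, 210), "men": (30, 70)},
}


def odds_ratio(group_a, group_b):
    """Odds of volunteering in group_a relative to the odds in group_b."""
    vol_a, non_a = group_a
    vol_b, non_b = group_b
    return (vol_a / non_a) / (vol_b / non_b)


def pooled(sex):
    """Add up volunteers and non-volunteers across employment strata."""
    vol = sum(stratum[sex][0] for stratum in counts.values())
    non = sum(stratum[sex][1] for stratum in counts.values())
    return vol, non


# Crude comparison: employment status is ignored.
crude_or = odds_ratio(pooled("women"), pooled("men"))

# Stratified comparison: women and men are compared within each status.
stratum_ors = {
    status: odds_ratio(groups["women"], groups["men"])
    for status, groups in counts.items()
}

print(f"crude odds ratio (women vs men): {crude_or:.2f}")
for status, value in stratum_ors.items():
    print(f"odds ratio within '{status}': {value:.2f}")
```

In these invented counts, women appear more likely to volunteer overall only because more of them fall in the group that is not in full-time employment, where volunteering is more common for both sexes.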

Method

The approach used to identify volunteers among respondents to the 2005 Scottish Household Survey has been described earlier. This approach yields a binary or dichotomous quantity - that is, a respondent is either a volunteer or not.

To assess the effect of factors such as age or economic status on the probability of an individual's being a volunteer, the appropriate statistical technique is that of logistic regression. A full explanation of this method is not feasible here, but interpretation of the results which follow requires that three technical terms be broadly understood.
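For orientation, the general form of a logistic regression model can be sketched as follows. The notation is an assumption introduced here and does not appear in the report: p is the probability that a respondent is a volunteer, x_1 to x_k are indicator variables for the factor levels (e.g. male, self-employed), and the beta terms are the fitted coefficients.

```latex
\[
\log\!\left(\frac{p}{1-p}\right)
  = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k ,
\qquad
\mathrm{OR}_j = e^{\beta_j}
\]
```

Under this form, the exponential of each coefficient is the odds ratio for the corresponding factor level relative to its reference category, adjusted for the other factors in the model.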

The objective is to predict the probability that a respondent engages in voluntary service. The relationship between a predictive factor (for example gender) and the probability of volunteering is expressed via a quantity known as the odds ratio. The odds of a binary outcome (here, whether the individual is or is not a volunteer worker) are the number of individuals in a group for whom the outcome is observed divided by the number for whom it is not. The odds ratio is the odds in one group relative to the corresponding odds in a second group. An odds ratio of one indicates equal likelihood for both groups; an odds ratio greater than one indicates that the outcome is more likely in the first group, and an odds ratio lower than one that it is less likely. An odds ratio may be presented with an associated confidence interval (see below). A simple worked example of odds ratio calculation now follows.

Odds Ratio - Worked Example

Suppose there are two groups of individuals. The first group consists of 150 women, of whom 32 perform voluntary work. The odds of volunteering among these women are calculated as the number who do volunteer divided by the number who do not:

odds (women) = 32 / (150 - 32) = 32 / 118 ≈ 0.271

The second group consists of 144 men, of whom 17 indicate that they engage in voluntary work. The odds of volunteering among these men is given by

odds (men) = 17 / (144 - 17) = 17 / 127 ≈ 0.134

The odds ratio - that is, the odds of a woman being a volunteer relative to the odds of a man being a volunteer - is given by

odds ratio = (32 / 118) / (17 / 127) ≈ 2.03

The odds ratio is greater than one, so women are more likely to volunteer than men. The above example illustrates the calculation of a 'raw' or unadjusted odds ratio. In the statistical model developed for this project, the odds ratios are adjusted for the effects of other factors which might plausibly be relevant.
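The worked example can be reproduced in a few lines of Python. This is a minimal sketch using only the counts given in the text (32 of 150 women and 17 of 144 men); the function and variable names are illustrative.

```python
def odds(volunteers, total):
    """Odds = number who volunteer divided by number who do not."""
    return volunteers / (total - volunteers)


odds_women = odds(32, 150)  # 32 / 118
odds_men = odds(17, 144)    # 17 / 127

# Unadjusted ('raw') odds ratio: women relative to men.
odds_ratio = odds_women / odds_men

print(f"odds (women): {odds_women:.3f}")
print(f"odds (men):   {odds_men:.3f}")
print(f"odds ratio:   {odds_ratio:.2f}")
```

The result is an unadjusted odds ratio of roughly 2, meaning the odds of volunteering among these women are about twice the odds among these men.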

Confidence Intervals

A second concept which should be understood is that of the confidence interval. Simply stated, this is a range of values - expressed as a lower and an upper limit - within which the unknown 'true' value of an estimated quantity (here, an odds ratio) is expected to fall. Confidence intervals are expressed in terms of specific levels of confidence. For example, a 95% confidence interval is calculated in such a way that, if the survey were repeated many times, 95% of the intervals so obtained would be expected to contain the true value. Interpretation of the confidence interval depends on the nature of the analysis which generated it. In a logistic regression model such as that used here, the inclusion of the value one in the confidence interval around an estimated odds ratio indicates a result which is non-significant in statistical terms.
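As an illustration, an approximate 95% confidence interval can be attached to the unadjusted odds ratio from the worked example (32 of 150 women, 17 of 144 men). The sketch below assumes the common log-scale normal approximation for a two-by-two table; the report does not state how its intervals were calculated, and the intervals in its results table come from the fitted regression model rather than from this formula.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Approximate confidence interval for the odds ratio of a 2x2 table.

    a, b: volunteers and non-volunteers in the first group
    c, d: volunteers and non-volunteers in the second group
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    estimate = (a / b) / (c / d)
    # Standard error of the log odds ratio (log-scale normal approximation).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Women: 32 volunteers, 118 non-volunteers. Men: 17 and 127.
estimate, lower, upper = odds_ratio_ci(32, 118, 17, 127)
includes_one = lower <= 1.0 <= upper

print(f"odds ratio {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
print("interval includes one:", includes_one)
```

Because the interval is built on the log scale, it is not symmetric around the estimate; the check against the value one mirrors the significance rule described above.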

P Value

Finally, the p value is the probability of obtaining a result at least as extreme as the one observed if only the random play of chance were operating, that is, if there were no actual effect in the population of interest. All p values fall within a range bounded by zero and one. Large p values (e.g. 0.2) are interpreted as indicating that the observed result could plausibly have arisen due merely to chance, while small p values (e.g. 0.01) suggest that the result reflects an effect which is actually present in the population from which the sample is drawn.

A value of p = 0.05 is commonly regarded as an informal 'threshold' of statistical significance, with values of 0.05 or lower being considered significant (i.e. indicative of a real effect) and values greater than 0.05 being treated as non-significant. While this is a useful guideline, it can be misleading - it is incorrect to place a completely different interpretation on the result of a statistical test simply because the observed p value is (say) 0.06 rather than 0.05.

In the results presented in this report, the p values reported relate to a test of the hypothesis that the odds ratio for a specific factor is one; that is, that there is no relationship between the factor (e.g. gender) and the probability of being a volunteer.
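A p value of this kind can be sketched for the unadjusted odds ratio from the worked example (32 of 150 women, 17 of 144 men). The sketch assumes a two-sided Wald-type test on the log odds ratio with a normal approximation; the report does not specify the test used, so this is illustrative only.

```python
import math


def wald_p_value(a, b, c, d):
    """Two-sided p value for the hypothesis that the odds ratio is one.

    a, b: volunteers and non-volunteers in the first group
    c, d: volunteers and non-volunteers in the second group
    """
    log_or = math.log((a / b) / (c / d))
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = log_or / se_log
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p = wald_p_value(32, 118, 17, 127)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A log odds ratio of zero corresponds to an odds ratio of one, so the further the estimate lies from zero relative to its standard error, the smaller the p value.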

Example Regression Analysis Table

Odds ratio estimates for the probability of engaging in nine or more hours of voluntary work in an average month

Factor                                        Odds ratio  95% CI         p

AGE (relative to 45-59 years):-
  16-24 years                                 0.73        0.43 to 1.22   0.23
  25-34 years                                 1.04        0.70 to 1.54   0.85
  35-44 years                                 0.71        0.52 to 0.96   0.03
  60-74 years                                 0.96        0.60 to 1.54   0.86
  75 years and over                           0.79        0.40 to 1.54   0.49

SEX (relative to female):-
  male                                        1.23        0.97 to 1.55   0.09

ECONOMIC STATUS (relative to full-time employment):-
  self-employed                               1.68        1.05 to 2.68   0.03
  in part-time employment                     1.13        0.79 to 1.62   0.50
  looking after home / family                 1.85        1.12 to 3.08   0.02
  permanently retired                         1.59        0.93 to 2.72   0.09
  unemployed                                  1.39        0.63 to 3.08   0.41
  in higher or further education              1.18        0.57 to 2.47   0.66
  permanently sick or disabled                0.93        0.44 to 1.97   0.84
  other                                       1.64        0.81 to 3.32   0.17

INCOME (relative to £10,001-£15,000):-
  £0-£6000                                    0.77        0.42 to 1.40   0.39
  £6001-£10000                                0.82        0.53 to 1.28   0.39
  £15001-£20000                               0.79        0.52 to 1.20   0.27
  £20001-£25000                               0.69        0.45 to 1.06   0.09
  £25001-£30000                               0.87        0.56 to 1.36   0.55
  £30000-£40000                               0.96        0.63 to 1.45   0.84
  £40000+                                     0.57        0.37 to 0.89   0.01

URBAN/RURAL (relative to large urban):-
  other urban                                 1.18        0.89 to 1.58   0.25
  small accessible towns                      0.86        0.60 to 1.24   0.43
  small remote towns                          0.59        0.34 to 1.05   0.07
  accessible rural                            1.15        0.82 to 1.60   0.42
  remote rural                                0.86        0.55 to 1.32   0.48

ETHNIC GROUP (relative to white):-
  non-white                                   1.00        0.47 to 2.14   0.99

DEPRIVATION (relative to non-deprived):-
  deprived                                    1.33        0.88 to 2.00   0.18

LONGSTANDING ILLNESS / DISABILITY (relative to none):-
  disability only                             1.08        0.60 to 1.94   0.80
  illness or health problem only              0.66        0.44 to 0.99   0.04
  both disability and illness/health problem  0.54        0.27 to 1.06   0.07

NOTE: The model is based on 1,392 respondents for whom non-missing values of all variables are available.
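The table can be read using the rules set out in the Method section. The sketch below takes a handful of rows copied from the table and reports, for each, whether the 95% confidence interval excludes one and in which direction the odds differ from the reference category. The helper names are illustrative, and the selection of rows is arbitrary.

```python
# (factor level, odds ratio, CI lower, CI upper, p), copied from the table.
rows = [
    ("35-44 years", 0.71, 0.52, 0.96, 0.03),
    ("male", 1.23, 0.97, 1.55, 0.09),
    ("self-employed", 1.68, 1.05, 2.68, 0.03),
    ("looking after home / family", 1.85, 1.12, 3.08, 0.02),
    ("permanently retired", 1.59, 0.93, 2.72, 0.09),
    ("£40000+", 0.57, 0.37, 0.89, 0.01),
    ("illness or health problem only", 0.66, 0.44, 0.99, 0.04),
]


def ci_excludes_one(lower, upper):
    """True when the whole interval lies above one or below one."""
    return lower > 1.0 or upper < 1.0


def describe(odds_ratio, lower, upper):
    """Plain-language reading of one row of the table."""
    if not ci_excludes_one(lower, upper):
        return "not significantly different from the reference group"
    direction = "higher" if odds_ratio > 1.0 else "lower"
    return f"significantly {direction} odds than the reference group"


for name, odds_ratio, lower, upper, p in rows:
    print(f"{name}: OR {odds_ratio:.2f} (p = {p:.2f}), "
          f"{describe(odds_ratio, lower, upper)}")
```

For these rows, the intervals that exclude one are exactly those with p values of 0.05 or lower, as the Method section leads one to expect. Rows such as 'male' (p = 0.09) fall just outside the threshold and, in line with the caution given earlier, should not be read as showing no relationship at all.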

Page updated: Friday, January 18, 2008