CHAPTER 5 OPTIONS FOR THE DESIGN OF A LONGITUDINAL STUDY
5.1 In this chapter we discuss the key issues in the design of a new longitudinal study and make recommendations. In developing these design options we have drawn on the consultations with the Scottish Government and external stakeholders, reviews of similar longitudinal studies, nationally and internationally, and our investigation of administrative data. We have also drawn on our own expertise as a research team in longitudinal research and survey methodology.
5.2 We first present an overview of the different design options (Figure 5.1) and then discuss each in turn considering their advantages, disadvantages and implications. The design options cover:
- Age range
- Timing and frequency of contact(s) in compulsory education
- Timing and frequency of contacts after S4
- Frequency of the recruitment of new cohorts
- Interim measures
- Survey administration, including the potential for carrying out the first sweep in school.
- Design of the sample, including whether it should be random, clustered and/or boosted.
- Sampling frame
- Extent to which information can be derived and linked from existing administrative sources
Age range
5.3 As noted in the previous chapter, stakeholders suggested that a new longitudinal study should contact young people while they are still in compulsory education, which is earlier than has been the practice with the SSLS. Opinion, however, varied about the most appropriate age and stage at which the study should start, ranging from P7 to S3. In considering this question, we return to the focus of the study: it is about 'youth transitions'.
5.4 Youth, as an economic and social concept, refers to a separate stage in the lifecycle between childhood and adulthood. As such it is a period of transition where young people have to negotiate a complex set of changes at the individual, institutional and societal level as they move from dependence to independence. This move generally involves at least four distinct aspects: finishing full-time education; settling into a more or less stable source of livelihood through employment and/or career choice; moving from the parental home and setting up new living arrangements; and forming close personal relationships outside of their family (OECD 1996). A consideration of what 'youth transitions' refers to therefore helps to set the parameters for the study. While it is important to examine the antecedents of young people's choices and destinations, a decision has to be made as to what are the core concerns of a study of youth transitions and what is feasible within the available resources. Thus, for example, while a study starting in P1 would provide fascinating data, it would be difficult to argue that the collection of data on the early primary stage has priority in a study that aims to document and understand youth transitions.
5.5 It is also relevant, as we outlined in chapter 2, to consider the concept of a 'branching point' in respect of the most appropriate starting age for a study of youth transitions. This refers to the stage at which young people's educational pathways start to diverge as they make different choices and follow different routes within the education system, with implications for their future destinations. In Scotland in recent years, the end of S4 has marked the end of compulsory education and the point at which a majority of the cohort is eligible to leave school at age 16. It has also marked the end of the stage in which most young people follow a relatively common curriculum based on the national curricular framework and the point at which they take their first set of external examinations. S4 has therefore been a key 'branching' or transition point and is the reason why the SSLS has surveyed young people after this stage.
5.6 But recent and further possible changes in Scottish education mean that the key transition or branching points that have hitherto applied are now less certain, and this has implications for decisions about the timing of a study of transitions. The status of S4 is set to change in the light of a Curriculum for Excellence (CfE), but the precise implications of CfE for the timing and frequency of a survey are difficult to determine at this stage. Currently there is uncertainty about what CfE will mean for the nature and timing of curricular pathways and young people's transitions or branching points. A Curriculum for Excellence sets out the purposes and principles for education from 3 to 18 in Scotland and aims to develop a more flexible and personalised curriculum (Scottish Executive 2004). Critically, CfE also conceptualises the stages of secondary education in a different way from previously: the threshold age is seen as 15, or the end of S3, rather than 16 and the end of S4. S4 will be regarded as part of the final stage of secondary education up to S6. CfE conceives of S3 as the end of a distinct phase of education after which young people will follow more differentiated programmes and greater specialisation on an incremental year by year basis. Thus the end of S3 rather than S4 may well become a key transition or branching point. Nevertheless, a Curriculum for Excellence also envisages that external exams will not take place before the end of S4. 6
5.7 We suggest that there is therefore a strong case that a new longitudinal study should contact young people while they are still in compulsory education, reflecting the timing of young people's 'branching points' and the need to assess and track the effect of the reforms of the compulsory schooling system and the growing curricular flexibility and differentiation on young people's later experiences and outcomes.
5.8 The end age of a longitudinal study of youth transitions can be seen as when young people have reached independent adulthood. We noted in chapter 2 that young people's transitions are increasingly prolonged. Firstly, it is becoming more common for young people to embark on their post-16 paths later than has traditionally been the case, for example to enter higher education later after a gap year(s) or to undertake work-based training in their later teens rather than at 16. Secondly, it now takes longer for young people, including graduates, to establish themselves in the labour market and short-term destinations do not necessarily provide a good indication of medium and longer-term outcomes (OECD 1996, Bynner et al 2002, Purcell and Elias 2004). A third consideration is the policy interest in the motivations, experiences and outcomes of those who return to education in their 20s, perhaps after an unsatisfactory period in the labour market. A further issue in assessing the age at which the study should end is a practical one: how long can one sustain the participation of respondents in a study and what are the costs of doing so? Overall, the mid to late 20s would seem to be a reasonable outer bound of a longitudinal study of youth transitions.
Figure 5.1: Design options for a longitudinal study
OPTION | Advantages | Disadvantages |
|---|---|---|
Timing of contact(s) in compulsory education |
S1 & S3 | Potential to examine pupils' views of Primary/Secondary transition, and experiences up to S3 (end of a CfE stage). Capture early career choices, subject choices, and factors leading to non-participation in education, employment or training | 2 sweeps in compulsory stages cost more than 1 sweep. S4 exam results not available for linkage for a considerable time after survey. |
S2 & S4 | As S1 & S3, plus S4 stage of a CfE starts. S3/S4 exam results can be linked | 2 sweeps in compulsory stages cost more than 1 sweep. Longer period elapsed since Primary/Secondary transition |
S3 only | 1 contact less expensive than 2. Potential to examine pupils' experiences and choices at end of S3 (end of a CfE stage). | Less information about experiences and choices in early stages of secondary school. S4 exam results not available for linkage for a considerable time |
S4 only | 1 contact less expensive than 2. Potential to examine pupils' experiences and choices at a CfE transition point S3/S4. S3/S4 exam results can be linked | Less information about experiences and choices in early stages of secondary school. |
S4+1 year | 1st contact point for SSLS - consistent with time-series. S3/S4 exam results can be linked immediately | Too late to capture timely explanations of young people's choices - only retrospective information. Limited information on their school experience and lack of prior data to use to control for later outcomes |
Timing & frequency of post-S4 contacts |
age 16/17 (ie S4+1 as above) | Key transition point at start of adulthood. S3/S4 exam results can be linked. Continuity with SSLS | No disadvantages identified |
age 18/19 | Key transition point between school, HE and training. SQA results in S5-S6 and FE can be linked. Continuity with SSLS | No disadvantages identified |
age 22/23? | Some young people may finish HE or training by this age - capture outcomes. Some early entrants to labour market may have returned to education. Some previous SSLS surveys at this age | Too soon to capture outcomes of late entrants to FE/HE or Government-supported training, especially gap year students & false-starters. |
OR |
age 23/24? | Most young people may finish HE or training by this age - capture outcomes. Some previous SSLS surveys at this age | Destinations of graduates may be temporary at this stage - not time to get settled in careers. A gap of 4 years between this sweep and the previous one at 18-19 |
age 26/27? | Capture longer-term outcomes in labour market, housing, family formation & life-style. Pick up later entrants to education and training | Sample attrition more likely over time. 5 post-S4 sweeps may be too costly |
Recruitment of new cohorts |
Every year | Allows for timely evaluation of policy interventions | Very costly, and burdensome on all involved. Time to analyse data is too short before next survey |
Every 2 years | Earlier SSLS were 2-yearly. Not as costly/burdensome as annual surveys. Frequency allows for consolidation of expertise, interest and awareness of survey. Allows for timely evaluation of policy interventions | Cost and burden influenced by both frequency of cohorts and frequency of sweeps |
Every 3, 4 or 5 years | Recent SSLS have had 4 year gaps. Not as costly/burdensome as annual or biennial surveys. Allows analysis of longer term effects of social and policy change | Long gap between surveys and sweeps associated with loss of expertise, interest and awareness of survey. Long gap makes data less relevant to policy makers |
Interim S4 cohort (5 years since last SSLS) | The most recent SSLS cohort was recruited at S4+1 in 2003. The earliest that a new cohort can be recruited is 2008. An interim S4 cohort surveyed at 16/17 and 18/19 etc will provide data on post-16 outcomes with a gap of 5 years between cohorts | This option is proposed to run in parallel with the recruitment of a new cohort in the early stages of secondary school |
Methods of administration |
Postal questionnaire | Method used for SSLS - Sent to home address - Cheaper than other methods of administration | Low response rates leading to poor quality data |
Questionnaire administered at school | Administered within school by survey organisation. Relatively low-cost data collection with very high response rates/minimal attrition. Potential for drawing large within-school samples. Easy identification and follow-up of non-attenders who may be truants | Needs 2-stage sample - selecting sample of schools - to be cost-effective. Potential burden on schools |
Telephone interview | Contacted at home address. Better response rates than postal questionnaire, and more potential for filtering questions. Cheaper than face-to-face interviews | More expensive than postal questionnaire. Potential problem of obtaining telephone numbers for some respondents |
Face-to-face interview | Better response and data quality than postal questionnaire. May be the only way of obtaining response from hard-to-reach groups | Most expensive option. Potential problem of obtaining addresses, and finding respondents at home. Fieldwork may be more convenient and cheaper if sample is a school cluster rather than random. |
Internet survey | Cheapest option for some respondents. May be more attractive to some young people than other survey methods. Better response and data quality than postal questionnaire. | Not everyone has access to a computer. Problems of obtaining accurate email addresses. |
Mixed methods | Most effective option may be to use a range of the above options ie 1st telephone interview, but use other methods if can't get telephone number | Need to analyse responses gained from different methods to control for resulting bias. |
Sample |
Random - all Scotland | Ideal for postal questionnaire survey - no design effects | More costly for methods that include face-to-face interviews because of dispersal of the sample across Scotland |
2-stage cluster sampling | Focusing on selected schools (or selected post-code sectors) brings reduced cost for face-to-face methods. Cluster based on sample of schools best suited to administration within school S1-S4. Post-code sector sampling best suited to administration at home at post-16 stages | Clustered sample brings danger of design effects because pupils in a school are more homogeneous than population as a whole (smaller design effects for post-code sector sampling) |
Sample of schools + sample of pupils | As above | As above + small sample. If administered within school, some disruption to schools in drawing random sample |
Sample of schools + whole year group | As above + larger sample size within sample schools, and less disruptive to schools. Provides capacity to analyse non-response to later sweeps + construct weights. Enables over-sampling of groups of interest in later survey sweeps | Slightly more costly because slightly larger sample numbers |
Boosted sample of those at risk of non-participation in education, employment or training & other sub-groups | Larger sample numbers of sub-groups of interest, at later sweeps of the survey. | Initial target sample is biased (eg in favour of low attainers and areas of deprivation) and needs weighting to compensate. |
Sampling frame |
Administrative data from schools via ScotXed | Pupil-level data for all pupils and all stages at publicly-funded schools, including name, address and telephone number. Includes SCN for data linkage, and pupil characteristics to inform weighting and boosted samples. | Less information available for independent schools and special arrangements will need to be made to collect these. Need to address legal and practical issues re use of unique pupil identifiers |
SQA data used for sample | Method used for SSLS. Covers all pupils attempting NQ. Includes name, date of birth and SCN | Does not include pupils not presented for NQ (eg low attainers and those sitting other exams). Limited coverage of independent schools. For SSLS the details were updated by schools |
Linking administrative data |
ScotXed data on (1) pupil characteristics (2) attendance etc | Includes ethnicity, 1st language, free-meal entitlement, additional support needs, other schools attended. Includes termly data on attendance/absence and exclusions. Available at all school stages. Can be linked using SCN | Less information available for independent schools and special arrangements will need to be made to collect these. Needs derived variables to be defined. Need to address legal and practical issues re use of unique pupil identifiers |
SQA data | All NQ per year, including details of subject, level of course, institution and result. Available to link to all survey sweeps. Can be linked using SCN. | Needs some work to derive cumulative attainment variables, institution types and progression within subject/faculty. Does not include non-SQA exams |
Data from other agencies eg SFC, HEFCE, UCAS | Data on educational courses attempted and qualifications achieved in post-compulsory stages. Should be possible to link by SCN if MIAP strategy progresses. | Data linkage not yet tried and tested. May need permission of respondent. |
Data on Government-supported training from Scottish Enterprise | Data on government-supported training including date started, date completed and qualifications achieved - solving problem of lack of awareness by trainees. Should be possible to link by SCN if MIAP strategy progresses. | Data linkage not yet tried and tested. May need permission of respondent. |
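The design-effect disadvantage noted for clustered samples in Figure 5.1 can be illustrated with a short calculation. The sketch below uses Kish's standard approximation for equal-sized clusters, deff = 1 + (m - 1) x rho, where m is the number of pupils surveyed per school and rho is the intra-school correlation. It is not taken from this appraisal: the function names, the value of rho (0.05), the pupils-per-school figures and the round total of 10,000 are all illustrative assumptions.

```python
def design_effect(pupils_per_school: float, intra_school_corr: float) -> float:
    """Kish approximation of the design effect for equal-sized clusters."""
    return 1.0 + (pupils_per_school - 1.0) * intra_school_corr


def effective_sample_size(achieved_n: int, pupils_per_school: float,
                          intra_school_corr: float) -> float:
    """Size of a simple random sample giving the same precision."""
    return achieved_n / design_effect(pupils_per_school, intra_school_corr)


# Illustrative assumptions only: rho = 0.05, achieved sample of 10,000.
RHO = 0.05
ACHIEVED_N = 10_000

for label, m in [("sample of pupils (25 per school)", 25),
                 ("whole year group (100 per school)", 100)]:
    deff = design_effect(m, RHO)
    n_eff = effective_sample_size(ACHIEVED_N, m, RHO)
    print(f"{label}: deff = {deff:.2f}, effective n = {n_eff:.0f}")
```

Under these assumptions, taking more pupils from each school raises the design effect, so a larger clustered sample does not yield a proportionate gain in precision. The actual size of the effect depends on the intra-school correlation of each outcome, which would need to be estimated from data.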
Options for the timing and frequency of contacts in compulsory education
5.9 We have suggested that a new study should start in the compulsory stage of secondary education but the timing and frequency of contacts at this stage are not easy to determine. We suggest that there are several possible variations in respect of the number and timing of contacts:
- S1 and S3
- S2 and S4
- S3 only
- S4 only
5.10 One obvious consideration in deciding the number and timing of contacts is cost: if the initial sweep is S1 or S2 then another contact would be necessary at least two years later to collect information on young people's more recent experiences and views and also to maintain contact with them.
5.11 As discussed at various points in this report, data on young people's school experience and earlier attitudes and aspirations are necessary to understand their post-school transitions, and this is the rationale for recommending that the longitudinal study should start in the compulsory stage of schooling. But what is the appropriate balance? As noted in chapter 1, this appraisal aims to analyse the needs for longitudinal data on young people's experiences in secondary school and subsequent transitions to further/higher education, training and employment. To achieve this aim, is it sufficient to contact young people towards the end of compulsory education (in S3 or S4) or is a prior contact necessary? A contact in S1 (perhaps in October/November) would, for example, provide an opportunity to collect information about young people's experience of the primary - secondary transition; and their expectations of secondary school, attitudes to education and aspirations at their entry to the secondary stage. This would provide useful baseline information. A second sweep in S3 would help monitor the greater personalisation and choice envisaged in a Curriculum for Excellence. But more intensive surveying in the lower secondary school would alter the focus of the longitudinal study of young people's post-school transitions to some extent.
5.12 Whether one or two contacts take place in the compulsory stage of school, there would need to be a contact in either S3 or S4. Which would be better? A survey towards the end of S3 would provide the opportunity to collect contemporaneous data on young people's experiences and views at the end of the first stage of their secondary education. On the other hand, surveying them in S4 would gather data on the new provision that they have embarked on while also asking about their early experiences at school.
5.13 A factor to take into account is when attainment data for most young people will be available. Currently most young people take the bulk of their first external exams at the end of S4. Initial results are available from SQA in September and post appeal results early in the following year. If the intention of a Curriculum for Excellence not to bring forward the timing of external exams is realised then this timescale will continue to be the case. This would mean that for young people surveyed in S3 there will be a gap of at least 18 months before their attainment data are available.
Recommendations
- The initial contact survey should take place during the compulsory stage.
- If possible, there should be two contacts in the compulsory stages, S1/S2 and S3/S4.
Options for the timing and frequency of post S4 contacts
5.14 The most recent SSLS cohorts have been surveyed four times, at the ages of 16-17; 18-19; 21-22 and 23-24. We have reported the strong support among stakeholders for an earlier sweep and also the considerable interest in maintaining contact with young people up to their mid or late twenties. But extending the study at both ends and maintaining the SSLS practice of survey sweeps at two yearly intervals is likely to be prohibitively expensive. It may be necessary to consider a trade off between the age range covered and the number of sweeps. It may be appropriate to survey respondents less intensively after 18-19 (while taking measures to maintain contact details) to enable the age range covered to be extended into the mid/late twenties.
5.15 Taking account of both substantive and financial considerations, we suggest that the initial post S4 contacts should be at the ages of 16-17 (the year after compulsory education) and two years later at 18-19. These contact points are well established by SSLS and all of those consulted during the research thought that they continue to be appropriate time points at which to monitor young people's transitions. Surveys at these points capture young people's movements out of school into and through other forms of full-time education and/or the labour market and training, employment and unemployment. Continuing to survey young people at 16-17 and 18-19 will also provide continuity and comparability with the earlier data collected by SSLS. The maintenance of the time series of youth transitions in Scotland (which dates from the mid 1970s) is a unique resource for the country and there is a strong argument for seeking to preserve it, certainly at these ages.
5.16 If it is decided that the cohort will be first contacted in the compulsory stage, it would not be necessary to make decisions about the number and timing of sweeps after the ages of 18-19 until at least 2012 so there is scope to review this. Apart from consideration of the more protracted nature of young people's transitions, another factor to take into account is comparability with SSLS. We suggest that comparability is less of an issue at these later sweeps since the contact at 23-24 is a recent addition to SSLS and response rates have been poor. Considering these various factors, we suggest that a minimum of two sweeps after the ages of 18-19 would be necessary, possibly a sweep at age 22-23 followed by another at 26-27. This, however, is only a tentative suggestion at this stage.
Recommendations
- We recommend that there should be at least four survey sweeps after S4.
- The first two sweeps should be at ages 16-17, 18-19.
- Subsequent sweeps could possibly be at 22-23 and 26-27.
Options for the recruitment of new cohorts
5.17 For a number of years the SSLS recruited a new cohort every two years; this changed latterly and there was a gap of four years between cohorts 3 and 4. There has been no fixed pattern in terms of the gaps between different cohorts in the American and Australian longitudinal studies. The American studies have had the longest gaps between cohorts with studies being carried out in 1972, 1980, 1988 and 2002. The current series of Australian studies have smaller gaps and have started new cohorts in 1995, 1998 and 2003. In England, YCS initially started with annual cohorts but since 1992 these have been biennial with the exception of cohort 13 which was pushed back by a year to coincide with LSYPE.
5.18 In considering how often a new cohort of young people should be surveyed, there are several issues to take into account: what is timely from a policy and research perspective? what is affordable? and what is administratively feasible? We have proposed extending the cohort (starting earlier and finishing later) so from a cost and an administrative perspective it would be better to have longer gaps between each cohort. We suggest the recruitment of a new cohort every four or five years. This suggestion for longer gaps between cohorts was generally viewed favourably in the consultation process, and was seen as reflecting the policy cycle.
5.19 Tables 5.1 - 5.3 give an overview of possible options in respect of the frequency and timing of surveys and the recruitment of new cohorts.
Table 5.1: One initial sweep in compulsory education
| | Sweep 1 S4/14-15 | Sweep 2 16-17 | Sweep 3 18-19 | Sweep 4* 22-23 | Sweep 5* 26-27 |
|---|---|---|---|---|---|
| 1st cohort | 2008 | 2010 | 2012 | 2016* | 2020* |
| 2nd cohort | 2012 | 2014 | 2016 | 2020* | 2024* |
| 3rd cohort | 2016 | 2018 | 2020 | 2024* | 2028* |
| etc | | | | | |
* provisional only
Table 5.2: Two initial sweeps in compulsory education
| | Sweep 1 S1 or S2 | Sweep 2 S3 or S4 | Sweep 3 16-17 | Sweep 4 18-19 | Sweep 5* 22-23 | Sweep 6* 26-27 |
|---|---|---|---|---|---|---|
| Cohort 1 | 2008 | 2010 | 2012 | 2014 | 2018* | 2022* |
| Cohort 2 | 2012 | 2014 | 2016 | 2018 | 2022* | 2026* |
| Cohort 3 | 2016 | 2018 | 2020 | 2022 | 2026* | 2030* |
| Etc | | | | | | |
* provisional only
Table 5.3: Initial sweep in the year after compulsory education (post S4)
| | Sweep 2 16-17 | Sweep 3 18-19 | Sweep 4 22-23 | Sweep 5* 26-27 |
|---|---|---|---|---|
| Cohort 1 | 2008 | 2010 | 2014 | 2018* |
| Cohort 2 | 2012 | 2014 | 2018 | 2022* |
| Cohort 3 | 2016 | 2018 | 2022* | 2026* |
| Etc | | | | |
* provisional only
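The schedules in Tables 5.1 and 5.2 follow a simple rule: each cohort starts a fixed number of years after the previous one, and each sweep falls a fixed number of years after that cohort's first sweep. As a rough illustration, the Python sketch below regenerates the years in those two tables and counts how many sweeps would be in the field in each calendar year. The function names and the offsets are our reading of the tables rather than part of any agreed design, and the later sweeps remain provisional.

```python
from collections import Counter
from typing import Dict, List


def sweep_schedule(first_year: int, cohort_gap: int,
                   sweep_offsets: List[int], n_cohorts: int) -> Dict[int, List[int]]:
    """Return {cohort number: [fieldwork year of each sweep]}."""
    schedule = {}
    for cohort in range(1, n_cohorts + 1):
        start = first_year + (cohort - 1) * cohort_gap
        schedule[cohort] = [start + offset for offset in sweep_offsets]
    return schedule


def fieldwork_load(schedule: Dict[int, List[int]]) -> Counter:
    """Count how many sweeps (across all cohorts) fall in each year."""
    return Counter(year for years in schedule.values() for year in years)


# Years since a cohort's first sweep, as read off the tables (assumption).
ONE_SWEEP_OFFSETS = [0, 2, 4, 8, 12]       # Table 5.1
TWO_SWEEP_OFFSETS = [0, 2, 4, 6, 10, 14]   # Table 5.2

table_5_1 = sweep_schedule(2008, 4, ONE_SWEEP_OFFSETS, 3)
table_5_2 = sweep_schedule(2008, 4, TWO_SWEEP_OFFSETS, 3)
load = fieldwork_load(table_5_2)

for cohort, years in table_5_2.items():
    print(f"Cohort {cohort}: {years}")
for year in sorted(load):
    print(f"{year}: {load[year]} sweep(s) in the field")
```

Changing `cohort_gap` or the offsets shows how a different recruitment frequency would alter the number of sweeps running at the same time, which bears on both cost and administrative burden.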
Interim Measures
5.20 The contract for SSLS has now ended (March 2007). Table 5.4 shows the coverage of the two most recent cohorts of young people: members of cohort 3 were last surveyed in 2006 at the age of 22-23 while cohort 4 was last surveyed at age 18-19 in 2005.
Table 5.4 Most recent SSLS cohorts and survey sweeps, and proposed interim cohort
| | Finish S4 in June of … | Sweep 1 (recruitment) Age 16-17 | Sweep 2 Age 18-19 | Sweep 3 Age 21-22 | Sweep 4 Age 23-24 |
|---|---|---|---|---|---|
| SSLS Cohort 3 | 1998 | 1999 | 2001 | 2004 | 2006 |
| SSLS Cohort 4 | 2002 | 2003 | 2005 | | |
| Interim Cohort | 2007 | 2008 | 2010 | 2014 | |
5.21 If a new longitudinal study starts in the compulsory stage of schooling then there will be a considerable gap in the data on young people's transitions in Scotland. The very earliest that a new study could be commissioned and in the field is likely to be 2008; this means that the earliest point at which data would be available on young people at age 16-17 is 2010 or alternatively 2012 (tables 5.2 and 5.3). This means a gap of at least seven years in the data available from SSLS and the new longitudinal study. We recommend that if the Scottish Government decides to proceed with an initial contact at the compulsory stage, it should seriously consider carrying out an interim study to fill the gap in data between SSLS and the new study. The design and methodology of the interim study could be the same as that of Design B (see chapter 6).
Recommendations
- There should be a 4-year gap between cohorts
- As an interim measure, a new cohort of those in S4 in 2007 should be surveyed in order to reduce the gap in data since the most recent SSLS cohort.
Options for survey administration
5.22 In the past, the SSLS has been conducted by postal questionnaire. It first contacted young people in the year after the end of compulsory education when a proportion had already left school and a questionnaire was posted to them at their home address. This method of contact has become less effective in recent years because of problems associated with low response rates. However, timing the first contact with young people while they are still in compulsory education opens up the possibility of administering the survey to them in school.
5.23 The key advantage of this approach would be the benefit in terms of response rate. A recent example of a survey administered to a national sample of pupils in schools across Scotland achieved a response rate of 89%; this was without any follow-up procedures to capture those absent on the day (Howieson et al 2006). The ESYTC is another example of a survey that has gained the co-operation of schools and been successfully administered to pupils in the school setting. The ESYTC is particularly interesting since it is a longitudinal study, it has annual sweeps, and it includes the whole of an S1 cohort in schools in Edinburgh. In the publicly-funded schools, the survey achieved a 99.4% response rate to the first sweep and 96% at sweep 2 (McVie 2001).
5.24 The main potential disadvantage of a school-administered questionnaire is the possible burden on schools and their willingness to participate. Several interviewees noted, for example, the negative response of some schools to participation in the Scottish Survey of Achievement (SSA). However, a longitudinal survey would differ in certain key respects from the SSA in terms of the demands it might place on schools. The SSA is demanding in that it entails pupils taking attainment tests in a particular subject, as well as filling out questionnaires; it also requires teachers to assess the levels of attainment of sample pupils. Its sample design means that schools have to extract specific pupils from their class. A longitudinal study should be less demanding on schools than the SSA since it would only require pupils to complete a questionnaire and would not involve any teacher assessment.
5.25 Assessing the response of schools to the idea of a survey administered on school premises was not part of the remit of the Options Appraisal. However, a small number of head teachers with whom we have had informal contacts indicated that they would be willing to participate in such a survey. The Association of Directors of Education in Scotland and the Scottish Council for Independent Schools were also positive about a school-based survey. They perceived a longitudinal study of youth transitions as a worthwhile exercise to be involved in and one that could provide them with critical information about the outcomes of schooling. Several commented that it would be helpful if they had easy access to the findings of the study and the scope to identify analyses of specific relevance to them.
Methods for administering the survey in school
5.26 We believe that there are two main options for administering the survey in school to pupils. The first is a paper self-completion questionnaire for the young people to complete and the second is a web-based questionnaire to be completed online. We have provided costs for both approaches.
5.27 Each option has potential strengths and weaknesses. Paper-based self-completion questionnaires have been used in many school surveys and most schools will be used to the administration procedures. Typically the survey agency commissioned will produce the paper questionnaires and these will either be sent to the school in advance of the survey day or brought to the school on the survey day by the administrator. It is less demanding for the school for the survey organisation staff to administer the questionnaire. Although this is more expensive, it is important in minimising demands on the school and in reassuring pupils about confidentiality. Questionnaires can then be completed by pupils in an appropriate lesson and the only facilities that are required are an appropriate room and the supply of pens. This means that the survey burden for schools can be largely kept to a minimum.
5.28 One of the key drawbacks of paper-based self-completion questionnaires is that data quality is not as high as with interviewer-administered or electronic questionnaires. There is no check to ensure that individual items are completed correctly or indeed completed at all, and the resulting errors or omissions require edit procedures to be set up, which can add greatly to the research costs. In addition, any routing in the questionnaire must be kept to a minimum.
5.29 In a web-based approach data quality is greatly enhanced as the survey software ensures that individual questions are completed correctly. In addition to this, web-based questionnaires can incorporate more complex routing and text substitution in their design.
5.30 The web-based approach also has significant cost and time savings in comparison with paper-based self-completion questionnaires. The cost of programming and hosting a web-based questionnaire is significantly lower than that of printing and scanning the equivalent number of paper questionnaires. The difference in our guideline costs between using a web-based questionnaire and paper questionnaires for an achieved sample size of 10,000 interviews is around £45,000.
5.31 Obviously the web-based approach requires schools to have IT resources and Internet access available for use in survey administration. Where schools have good facilities, a web-based approach should help to minimise the demands on them. In these schools pupils could log onto a website and complete the questionnaire on-line at a time convenient to the school, perhaps as part of a PSE or other class. The questionnaire could be hosted on the survey organisation's website or elsewhere.
5.32 However, while we envisage that all schools will have at least some IT resource and internet access at the time of the new study, the extent of this will vary. Schools with fewer resources would find it significantly more difficult to set aside time for their computer rooms to be used to complete a survey. This would then have the potential to introduce quite serious bias into the study as school level non-response could be associated with the availability of IT resources within the school. It remains to be seen whether developments associated with GLOW, the national schools intranet, which aims to offer access to all young people in publicly-funded schools in Scotland by 2009/2010, might help to facilitate access. 7
Confidentiality
5.33 Respondents to any survey need to be assured that their responses will be treated in confidence, but in a school-based survey it is even more critical to reassure young people about confidentiality. If young people are to respond fully and honestly to questions about their school experience they need to be confident that their teachers and other school staff will not see their answers. In addition to the normal statements about confidentiality, various strategies can be used to reinforce the principle of confidentiality, for example, only survey staff being present when young people complete a paper-based questionnaire and the provision of envelopes so respondents can seal their completed questionnaire. A web-based questionnaire where respondents send the completed document straight to an external website can also provide reassurance.
Pupils opting in
5.34 While pupils in schools are to some extent a 'captive audience' for a survey, it is clearly a fundamental principle that they and, depending on their age, their parent/carer must consent to participation. We suggest that this should be done on an opt-out rather than an opt-in basis. Evidence suggests that the requirement to opt in to a study not only results in lower response rates but may also skew the sample because certain sub-groups in the population are less likely to opt in. The experience of the ESYTC gives an indication of the impact of the opt-out consent method on response rates. In this case, the ESYTC team sent out letters to parents explaining the study and offering them the possibility of opting their child out of the study. This resulted in an overall opt-out rate of only 3% (McVie 2001).
Coverage of young people
5.35 While a survey administered in school offers the enormous advantage of high overall response rates, the problem of contacting those who are absent or in alternative provision remains. There is also the question of whether a study should cover young people in special schools.
5.36 In respect of absentees, they need to be identified and asked to complete the questionnaire on a subsequent occasion by the most appropriate method. In all likelihood this would best be achieved by following up the young person away from the school as this would minimise the disruption that would be caused to lessons etc. by holding a second survey day on school premises.
5.37 As absentees will potentially be of high interest for the study we would recommend that extensive efforts be put in place to follow up these young people. We would expect that multiple methods of data collection would need to be employed to reach absentees. While initial attempts would be made using postal questionnaires (likely to be completed by some absentees), telephone and face-to-face follow-ups would be required to gain the participation of the more reluctant.
Pupils who move schools
5.38 Pupils who move schools are a group of interest in policy terms: for example, frequent school moves can be associated with low attainment and with an increased likelihood of not being in education, training or employment after leaving school. We suggest that a longitudinal study should aim to follow up respondents who move schools within Scotland even if they subsequently attend a school not in the sample.
Pupils not in school
5.39 A longitudinal study will also need to consider specific arrangements to include those pupils who do not attend mainstream schooling on a regular basis. This group includes pupils who are in alternative educational provision, those who have been excluded and chronic truants. These are among the groups of most interest in policy terms.
5.40 Interviewing young people who are in alternative educational provisions can follow a similar model to that of the young people in mainstream education. The institutions would be contacted by researchers in the first instance in order to gain co-operation, and then the survey would be administered on site. Special measures may be required for young people with additional support needs including the interviewer on site reading out the questions to respondents.
5.41 However, interviewing chronic truants and the long term excluded will require extra measures and, as a result, will be considerably more costly than interviewing other pupils. Potential approaches for interviewing all absentee pupils are detailed above and it must be assumed that these particular groups of absentee pupils will require the most proactive data collection with telephone and face to face follow ups.
Pupils with additional support needs
5.42 SSLS has not included special schools which cater exclusively for young people with additional support needs although those pupils who had been presented for SQA awards are likely to have been included. 8 SSLS has included young people with additional support needs attending mainstream schools; this has become increasingly common in a policy context which promotes the 'mainstreaming' of school pupils with such needs (Scottish Executive 2002).
5.43 We suggest that a new study should consider including young people with additional support needs in mainstream schools and units attached to mainstream schools. But to do so effectively, it has to be recognised that this will require additional resources. For example, to ensure the meaningful participation of young people with mild to moderate learning difficulties, it would be necessary to employ various strategies such as extra survey staff to act as readers or to administer the survey on an individual basis using CAPI. If and when a longitudinal study is put out to tender, these would be aspects that organisations tendering should be asked to address. But we would point out that there is a limit to what a general survey can do before it becomes prohibitively expensive or tokenistic - for some young people a specific study designed to address their circumstance and needs would have to be funded.
Using mixed methods for administering the survey
5.44 Mixed methods of administering the survey are most appropriate for surveying young people outside of the school context - especially (but not exclusively) for surveys at the post-compulsory stages. Respondents are contacted at their home address using a mixed methodology of telephone, postal, web and face to face interviewing. The principle behind this mixed-method approach is to ensure a high level of coverage while at the same time maintaining cost efficiencies.
5.45 Response rates using mixed methods are considerably higher than those for surveys carried out by postal questionnaire alone. If there is a telephone number for the young person s/he can be interviewed by telephone, but where this proves impossible the survey team can send an interviewer to "chase" and carry out a face-to-face interview. Experience with the YCS found that telephone numbers were only available for a minority of the sampled young people (30%). In addition, there was a marked difference in the availability of telephone numbers by attainment (telephone numbers were available for 40% of the young people with 8+ A-C GCSEs compared with 11% of those with no qualifications). Given that having a telephone number significantly increased the response rate across all groups, this exaggerated a trend that was already present in the postal-only respondents that made up the bulk of the YCS sample.
5.46 However, for the Scottish sample there will be telephone numbers for a larger proportion of respondents, and we are proposing a face-to-face "chase" for non-responders, so the difference between attainment groups should not be pronounced. In addition, all respondents can be given the option of an internet questionnaire. Further details of this method, with appropriate cost estimates, are shown below as a section of our recommended options.
5.47 While a major advantage of using a mixed-method approach is higher response rates at a reasonable cost, care must be taken in questionnaire design because different modes of data collection can sometimes produce different results. It is therefore important to ensure that the questionnaire is designed using "unimode construction", that is, the writing and presenting of questions to respondents in a way that produces (as far as practicable) a common mental stimulus regardless of survey mode (Dillman 1978). The main principles of unimode construction are:
- Making the response options the same across all modes.
- Not changing the basic question structure from mode to mode.
- Reducing the number of response categories to achieve mode similarity.
- Dealing effectively with routing on the self-completion questionnaire.
5.48 These principles will have to be followed when designing the questionnaire for the follow-up sweeps, in particular, the questionnaire structured for telephone interviewing as this approach differs most from the other forms of data collection.
Recommendations
- Surveys at the compulsory stages should be administered within the school context by a survey organisation using paper and internet questionnaires. Mixed methods should be used to contact non-attenders and those in alternative educational provision.
- Surveys at the post-compulsory stages should be administered by mixed methods including telephone interviews, face-to-face interviews, postal and internet questionnaires, in order to maximise response.
Sampling Options
Sample design
5.49 The overall aim of sample design is to achieve a nationally-representative sample of young people in Scotland so that findings of the survey provide a representative picture of young people's experiences and pathways. However, creating a nationally-representative sample is problematic because of differences in response rates. In recent years, the SSLS had very low response rates from some groups of young people, and this meant that the sample did not adequately represent the low attainers, truants or those not in education, training or employment or those at risk of being so, even after applying weighting procedures. Achieving high response rates is the most important way of reducing this problem.
5.50 Ideally, the overall sample size of the study will be fairly large, in order to cover the main variations in young people's experiences, including urban and rural differences. We estimate that an achieved sample of 10,000 at sweep 1 will provide a good level of coverage of those young people at risk of not engaging in education, employment or training. However, there are issues concerning geographical variations in the composition of the population. For example, young people from minority ethnic groups form a significant minority of the school population (over 10,000 secondary school pupils in Scotland were from minority ethnic groups at the 2005 school census (Scottish Executive 2006c)). But they tend to be fairly concentrated in particular areas and schools, for example, they were 10% of the Glasgow secondary roll, compared with 3% in the country as a whole. Thus, a simple random sample of the whole of Scotland, such as that used in the SSLS, produced relatively small sample numbers of minority ethnic pupils. On the other hand, young people living in the rural areas of Scotland are sparsely distributed and attend relatively small schools.
5.51 We believe it is essential for the longitudinal study to cover the whole of Scotland so that policy issues such as migration from rural areas can be investigated. We are aware that some surveys restrict their samples to the mainland areas of Scotland south of the Great Glen in order to reduce field work costs, but we do not believe this practice is appropriate for a longitudinal study of Scottish young people.
5.52 It may be possible to boost the initial samples to ensure that sub-sample numbers in particular categories are large enough for special studies - for example, young people who live in areas of deprivation or young people from minority ethnic groups could be over-sampled in order to ensure large numbers for analysis. However, boosting the sample in this way would alter the overall representativeness of the sample, and care would be needed to identify and document over-sampled sub-groups.
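The effect of boosting on representativeness, and the way documented selection probabilities allow it to be corrected, can be illustrated with a minimal sketch. It assumes that inverse-probability (design) weights would be applied at the analysis stage, which is standard survey practice but is not specified in this report. All figures (a boosted group of 100 pupils among 1,000, and the two sampling fractions) are hypothetical.

```python
def weighted_share(sample_counts, selection_probs, group):
    """Share of `group` after weighting each case by 1 / selection probability."""
    weights = {g: 1.0 / selection_probs[g] for g in sample_counts}
    weighted_total = sum(sample_counts[g] * weights[g] for g in sample_counts)
    return sample_counts[group] * weights[group] / weighted_total


# Hypothetical population: 100 pupils in the boosted group, 900 others.
# The boosted group is sampled at 50%, everyone else at 10%.
selection_probs = {"boosted": 0.50, "other": 0.10}
sample_counts = {"boosted": 50, "other": 90}

unweighted = sample_counts["boosted"] / sum(sample_counts.values())
weighted = weighted_share(sample_counts, selection_probs, "boosted")

print(f"Unweighted share of boosted group: {unweighted:.1%}")
print(f"Design-weighted share:             {weighted:.1%}")
```

In this illustration the boosted group makes up over a third of the unweighted sample but returns to its true population share of one tenth once the weights are applied, which is why the over-sampled sub-groups and their selection probabilities need to be recorded.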
5.53 The target population from which the sample for the new study would be drawn is the cohort of young people in the same year stage (eg S4) of compulsory schooling in Scottish schools, including both publicly-funded and independent schools, but not special schools. 9 The proposed sample includes pupils with additional support needs in mainstream schools, but excludes pupils in special schools, and children educated at home because in each case sample numbers would be too small for analysis, and a disproportionate amount of survey resources would be needed to include them. (Special studies based on boosted samples would be more appropriate for these groups).
5.54 To some extent, the sample design is dependent on the methods to be used to administer the survey. The SSLS is a postal questionnaire survey, and for this type of survey a random sample is ideal because all addresses in Scotland can be reached by post at no extra cost to the survey. However, as we have noted, response rates to postal questionnaire surveys have been poor. Other methods of administering the survey are recommended in order to ensure the highest possible response rates, but these are more labour-intensive and so require some clustering of the sample in order to be cost-efficient. For example, if the first contact of the survey is to be administered within the school, it becomes necessary to select a sample of schools in which to conduct the survey. In this case, a 2-stage process of selecting a sample is needed, and the most appropriate method seems to be first to select a representative sample of schools, and then a representative sample of pupils within each school. The main disadvantage of a sample that is clustered by school is that this type of sample tends to be associated with a "design effect" because the within-school sample of pupils is more homogeneous than pupils in the overall population, and the design effect must be controlled for in all analyses.
5.55 Similarly, if the first (or subsequent) contacts are to be administered at the home address by mixed methods that include face-to-face interviewing with hard to reach groups, it is more cost-efficient to have the sample clustered by geographical areas rather than scattered across Scotland. In this case, an alternative method of sampling would be to cluster by post-code sectors (as has been done in YCS). Clustering by post-code sector is more efficient for fieldwork where mixed methods of survey administration are used, including telephone interviewing backed up by face-to-face interviewing. There is some indication from BMRB's work in England that there are lower design effects where samples are clustered by post-code sector.
Selecting a sample of schools
5.56 At the 2005 school census, there were 442 secondary schools in Scotland - 385 publicly-funded schools and 57 independent schools (Scottish Executive 2006c). A decision needs to be made as to the number of schools to include in the sample - and this is a trade-off between reducing survey costs on the one hand, and reducing the design effects of clustering on the other. Although in general selecting fewer schools will reduce both the operational and fieldwork costs, pupils selected from the same secondary school will tend to be more homogeneous than pupils in the general population. The design effect of selecting a sample that is clustered in this fashion is equivalent to a reduction in the total sample size. Keeping everything else constant, the fewer schools selected the higher the design effects due to clustering and the smaller the effective sample size.
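The trade-off between the number of schools and the effective sample size can be sketched with the widely used Kish approximation, deff = 1 + (m - 1) x rho, where m is the average number of respondents per school and rho is the intra-school correlation. This is an illustration only: the report does not state which formula or correlation value applies, and the value of rho used below (0.05) is purely hypothetical.

```python
def design_effect(avg_cluster_size: float, icc: float) -> float:
    """Kish approximation of the design effect for equal-sized clusters."""
    return 1.0 + (avg_cluster_size - 1.0) * icc


def effective_sample_size(n: int, n_clusters: int, icc: float) -> float:
    """Achieved sample size divided by the design effect."""
    avg_cluster_size = n / n_clusters
    return n / design_effect(avg_cluster_size, icc)


ACHIEVED_SAMPLE = 10_000  # achieved sample proposed for sweep 1
ASSUMED_ICC = 0.05        # hypothetical intra-school correlation

for n_schools in (50, 100, 150, 300):
    n_eff = effective_sample_size(ACHIEVED_SAMPLE, n_schools, ASSUMED_ICC)
    print(f"{n_schools:>3} schools -> effective sample of about {n_eff:,.0f}")
```

Under these assumptions the effective sample size rises steadily as the same 10,000 respondents are spread over more schools, which is the pattern described above. The actual size of the effect would depend on the true intra-school correlation for each survey measure.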
5.57 With this in mind, we propose that 150 schools (approximately one third of Scottish secondary schools) should be included in the surveys of each cohort - 130 publicly-funded schools and 20 independent schools (see Table 5.5). The school-roll size and socio-economic characteristics of secondary schools vary enormously across Scotland, and it will be important that the sample of schools selected for the survey should be representative of this range.
5.58 The method of sampling needs careful consideration. One approach would be to take a simple random sample of schools. In this case, we must take account of the fact that there are more small schools than large schools. So, the proportion of pupils sampled within each school would need to be constant, otherwise the sample of pupils might be unrepresentative because the larger number of small schools (mostly rural) would have a greater chance of selection than the smaller number of larger schools (in city areas). Another method of selecting schools might be to choose a sample stratified by known characteristics, including size, geography and local-area deprivation. A further possibility is to sample secondary schools with probability proportional to the number of pupils, in which case, if the same number of pupils is selected within each school, each pupil will have the same chance of being selected in the sample; however, a disadvantage of this approach is that it might reduce the representation of small rural schools. The SSA already has an established methodology for selecting schools, and any future longitudinal survey should learn from this experience.
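The property of probability-proportional-to-size (PPS) sampling described above can be checked with a short calculation. The sketch below is illustrative only: the three school rolls and the fixed take of 30 pupils per school are hypothetical, the total S1 roll is taken from the 2005 census figure quoted in this chapter, and the first-stage probability uses the usual approximation (number of schools x school roll / total roll), which holds while that value is below 1.

```python
def school_selection_prob(school_roll, total_roll, n_schools):
    """Approximate chance that a school is selected under PPS sampling."""
    return n_schools * school_roll / total_roll


def pupil_selection_prob(school_roll, total_roll, n_schools, take):
    """Chance that a pupil is selected: school stage times within-school stage."""
    within_school = take / school_roll
    return school_selection_prob(school_roll, total_roll, n_schools) * within_school


TOTAL_S1_ROLL = 61_725  # S1 pupils at the 2005 school census
N_SCHOOLS = 150         # number of schools proposed for the sample
TAKE = 30               # hypothetical fixed number of pupils per school

for roll in (40, 150, 300):  # hypothetical small, average and large schools
    p_school = school_selection_prob(roll, TOTAL_S1_ROLL, N_SCHOOLS)
    p_pupil = pupil_selection_prob(roll, TOTAL_S1_ROLL, N_SCHOOLS, TAKE)
    print(f"roll {roll:>3}: P(school) = {p_school:.3f}, P(pupil) = {p_pupil:.4f}")
```

Each pupil's chance of selection is the same whatever the size of the school, but the small school's own chance of being selected is much lower than the large school's. This is the reduced representation of small rural schools noted above.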
Approaches to securing the participation of schools
5.59 The main source of non-response at the first sweep of the research, and therefore potential bias, will be schools which refuse to participate. While there will also be non-response from parents who opt out of the study and from young people who cannot be located, these will be of a significantly lesser order. As such, achieving the participation of the proposed 150 schools will need to be the main target at the first sweep of the survey. To achieve this we recommend that a reserve sample of schools is drawn at the same time as the main sample to compensate for any problems that might be encountered at the fieldwork stage.
5.60 A recent report for DfES by the University of Surrey and BMRB set out a number of ways in which school level response rates can be increased (Sturgis, Smith and Hughes 2006). This research was commissioned as a critical review of the very low participation rates among English schools in PISA 2003, but contains useful observations and recommendations at a more general level. One of the report's key findings was that schools felt overburdened by the number of research requests that they receive and this is something that, anecdotally, we understand is also happening within Scottish schools. We suggest that the Executive should review its other school based research so that, as far as possible, schools are not asked to take part in more than one survey in any year. It would help to encourage schools to participate if they could be reassured that efforts are being made to rationalise and co-ordinate the demands being made on them in respect of the variety of national and international surveys in existence.
5.61 Ensuring that the study is high profile and viewed as relevant and useful will also be an important factor in increasing schools' participation. In addition to the support of the Government, it will also be beneficial to secure endorsements from other relevant parties such as local authorities. In the American study ELS:2002 the sampled schools received letters from responsible officers at the state and/or district level endorsing the study and a large number of endorsements were gained from various professional and voluntary organisations such as the American Federation of Teachers, National Association of Independent Schools and the National Parents and Teachers Association. We suggest that similar endorsements should be secured for any new longitudinal study from the relevant organisations and groups in Scotland.
5.62 Flexibility in approach and a respect for the concerns that the schools have in taking part in the research will also help to increase participation. The ESYTC managed to recruit all mainstream schools in Edinburgh for the study, in part due to the flexibility they were able to show. The structure of the study was such that schools were given a relatively long fieldwork period within which they could conduct the survey. In addition to this, the research team negotiated with the schools as to whether the survey would be conducted in a single day or spread over a number of weeks to suit timetable patterns. While there were practical benefits for the research team in conducting the survey over one or two days, school preferences were paramount.
Sample Size
5.63 We estimate that an achieved sample of 10,000 respondents is required for the first sweep of the study, so that there will be sufficient sample numbers of low attaining young people, and those who are not in education, employment or training, at future sweeps which will inevitably suffer some attrition.
5.64 The 2005 School Census shows the overall number of pupils in a given year stage eg over 60,000 in S1 (Scottish Executive 2006c). If, as proposed, 150 schools are selected then the numbers per year stage in 150 schools will be over 20,000 on average (Table 5.5).
Table 5.5: Estimated sample numbers
| National Population (2005 census) | Publicly-funded | Independent | Total |
|---|---|---|---|
| Number of secondary schools | 385 | 57 | 442 |
| Number of pupils in S1 | 58879 | 2846 | 61725 |
| Average per school | 152.9 | 49.9 | 139.6 |
| Initial Sample estimate | | | |
| Number of secondary schools | 130 | 20 | 150 |
| Total number of pupils in S1 | 19881 | 999 | 20880 |
| Numbers for administration of 1st contact within school | | | |
| Target sample @ 50% of year stage within sample schools | 9941 | 499 | 10440 |
| Estimated achieved sample based on 97% response rate | 9642 | 484 | 10127 |

Source: Pupils in Scotland 2005, Statistical Bulletin Edn/B1/2006/1. Revised version April 2006
5.65 If the 1st contact survey is administered within the school, we can hope for a 97% response rate on the basis of other surveys. Thus, a 50% sample of the year stage in sample schools will be sufficient to provide a sample of 10,000 if the survey is administered within the school.
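The arithmetic behind Table 5.5 can be reproduced in a few lines. The sketch below uses only the census counts, the proposed numbers of schools, the 50% sampling fraction and the 97% response rate given in this chapter; the variable and function names are illustrative.

```python
# 2005 census counts and proposed numbers of sample schools, by sector.
SCHOOLS = {"publicly_funded": 385, "independent": 57}
S1_PUPILS = {"publicly_funded": 58_879, "independent": 2_846}
SAMPLE_SCHOOLS = {"publicly_funded": 130, "independent": 20}

SAMPLING_FRACTION = 0.50  # share of the year stage sampled within each school
RESPONSE_RATE = 0.97      # response rate hoped for with in-school administration


def estimate(sector):
    """Return (pupils in sample schools, target sample, achieved sample)."""
    average_per_school = S1_PUPILS[sector] / SCHOOLS[sector]
    pupils_in_sample_schools = average_per_school * SAMPLE_SCHOOLS[sector]
    target = pupils_in_sample_schools * SAMPLING_FRACTION
    achieved = target * RESPONSE_RATE
    return pupils_in_sample_schools, target, achieved


total_achieved = 0.0
for sector in SCHOOLS:
    in_schools, target, achieved = estimate(sector)
    total_achieved += achieved
    print(f"{sector}: {in_schools:,.0f} pupils, target {target:,.0f}, achieved {achieved:,.0f}")
print(f"Total achieved sample: {total_achieved:,.0f}")
```

The rounded results agree with the figures in Table 5.5, giving a total achieved sample of just over 10,000.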
5.66 However, if the 1st contact survey is administered at home using mixed methods (ie interviews by telephone, face to face, or internet) expected response rates may be between 65% and 70%. In this case, to get an achieved sample of 10,000 it would be necessary to start with an initial sample of 14,250 - 15,250 pupils.
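The relationship between response rate and the initial sample required can be sketched as a simple division. This is an illustration under the assumption of a single uniform response rate; the function name is hypothetical. Exact division gives figures slightly above the guideline range quoted above, which should be read as a rounded estimate of the same order.

```python
import math


def required_initial_sample(target_achieved: int, response_rate: float) -> int:
    """Smallest initial sample expected to yield the target at a given response rate."""
    return math.ceil(target_achieved / response_rate)


TARGET = 10_000  # achieved sample required at the first sweep

for rate in (0.97, 0.70, 0.65):
    print(f"Response rate {rate:.0%}: initial sample of {required_initial_sample(TARGET, rate):,}")
```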
5.67 If we are more pessimistic, and assume that the response rates among young people who are low attainers or not in education, employment or training are as low as 60%, we estimate that the achieved sample numbers for these groups at the 1st contact will be around 1,400 - which should be sufficiently large for most analyses.
Selecting sample members within schools
5.68 As shown by Table 5.5, a 50% sample of pupils within selected schools should provide a 10,000 achieved sample if the first survey sweep takes place at some point during the compulsory school stages S1-S4. Nevertheless, we recommend that the whole year group of pupils should be included in the initial contact, since it tends to be easier and less disruptive for schools to provide access to a whole year group rather than requiring them to extract individual pupils from a class on a random basis (as is the case with the SSA). We considered an alternative approach of selecting whole classes from the year group, but fear that the sample would be biased since membership of classes may be subject to ability-grouping. Surveying the whole year group within the sampled schools would reduce the risk of bias.
5.69 A major advantage of surveying the whole year group in the sample of schools is that it would provide the capacity to analyse the nature of non response to subsequent sweeps and to construct weighting variables on a sound basis. It would also provide the basis for over sampling of certain sub-groups in later sweeps of the survey.
5.70 The collection of data on the whole year group in the selected schools also offers the possible basis for a range of other studies on a cost-effective basis. It would be possible, for example, to link the survey data to other administrative data which is routinely collected for the full year group; such a combined dataset would provide a rich resource for cross-sectional analysis of specific issues. It may also be worth exploring the potential of a survey of a whole year group to provide the basis for the selection of samples for other follow-up surveys.
5.71 The main disadvantage is cost; as we detail in the next chapter, we estimate that the difference between surveying a sample of young people (10,000) and surveying a whole year group (assuming 20,000 young people) is between £10,000 and £32,500, depending on whether a paper or web-based method is used.
Selecting a sample where the 1st contact survey is administered at home by mixed methods
5.72 If the survey is to be administered at home using mixed methods as in Design B, the sample could potentially be drawn as a random sample of young people, as was the design for SSLS. However, the field work costs for a completely random sample would be relatively high, and we recommend that a sample be drawn on the basis of post-code sectors to reduce the field work costs. ScotXed records should be used as a sampling frame to draw random samples by post-code sector. We estimate that a target sample of 14,250 to 15,250 pupils may be necessary to achieve 10,000 respondents in view of the higher levels of refusal or non-response that are likely.
Sample frame
5.73 The sampling frame for SSLS has been the records held by the SQA. These records only cover those who have been candidates for SQA examinations and so exclude the minority of students who have taken examinations offered by other providers as well as pupils who have not been entered for public examinations. Schools are therefore asked to supplement the sample drawn from the SQA records with other eligible students. The SQA record provides the young person's name, home address, date of birth, gender, presenting centre, exam presentations and awards.
5.74 We propose that the data collected as part of the ScotXed data exchange scheme 10 should be used as the sampling frame for a new longitudinal study. The data that ScotXed holds are drawn directly from schools by Local Authorities who make a return to ScotXed, and they offer a number of advantages over the SQA record as a sampling frame. If the first sweep of the longitudinal study is to take place earlier in compulsory education then many young people will not have yet been entered for external exams and so will not have an SQA record. ScotXed data have a further advantage in that they also include pupils who are not presented for external exams. The ScotXed data therefore offer a more comprehensive sampling frame than the SQA record.
5.75 Another major advantage of ScotXed data as a sampling frame is the additional information that they hold on young people including free school meal entitlement; looked after status; ethnicity; nationality; asylum status; home language; previous school attended; and a variety of information about additional support needs including whether the pupil has an Individualised Education Programme or Record of Needs, the level and nature of their difficulty, the extent of integration, and the extent of support and adaptation needed. This extra information provides the capacity to boost the sample of particular sub-groups of policy concern.
5.76 The main disadvantage of ScotXed as a sampling frame is that it does not include data from independent mainstream schools 11 (it covers publicly-funded schools and independent special schools). The ways pupil data are held by schools in the independent sector are diverse and several different management information systems are used. However, a number of the independent schools, especially the larger co-educational day schools, use the Phoenix Management Information System while others use SEEMIS. These are the two management information systems in use by publicly-funded schools in Scotland from which data are supplied to ScotXed. It would be necessary for any future survey team to work with the independent schools to collect appropriate pupil-level data for inclusion in the sampling framework.
5.77 It must also be remembered that the data supplied to ScotXed come from schools and so reflect their knowledge of pupils and their circumstances. In the case of looked-after children, for example, schools may not have complete knowledge of a pupil's status and it may be necessary to supplement it with information from the Local Authority.
Parental information and possible parental interviews
5.78 The SSLS asks young people for information on their parents'/carers' education, status and occupation; these data are used to construct socio-economic variables which are central to the analysis of social inequalities. Since the SSLS is sent to young people at their home it is possible for them to check such information with their parents/carers. Obviously if the survey is administered in school then this is not possible. One strategy would be as part of the briefing process to ask young people to consult their parents about this in advance of completing the questionnaire; this could be reinforced through the communications with parents/carers as part of the consent procedure.
5.79 An alternative approach would be to include a direct contact with parents/carers to collect data on their education and employment history as well as information on their involvement in their child's education and their aspirations for their child. This was an option supported by a number of Scottish Government staff and external stakeholders in the consultation.
5.80 In England parental interviews form an integral part of LSYPE and have also been used in the American study ELS:2002. If parental interviews were to be adopted as part of the new study, we would recommend for reasons of economy that the approach taken should be similar to that of ELS:2002, that is a 15 minute telephone interview conducted at the time of the first sweep. If a sample of pupils has been surveyed then all of their parents/carers would be contacted. If a whole year group in the sample schools has participated, the costs of interviewing all parents/carers would be prohibitive and so it would be necessary to sample parents/carers. This would be done by first selecting the sample of pupils to take part in the post S4 sweeps and interviewing their parents/carers.
Recommendations
- The size of the achieved sample at the initial sweep should be 10,000 pupils.
- The sample should be representative of the whole of Scotland.
- If the first survey sweep is to be carried out within the school context, the sample should be clustered by school, with a sample of 150 schools selected to provide a nationally representative sample. The whole year group within the sample schools should be included in the initial sample to avoid disruption within the schools.
- If the first sweep is to be conducted at the home address using mixed methods, the sample should be clustered by post-code sectors, to provide a nationally representative sample.
- It may be appropriate to over-sample sub-groups of pupils at the initial stages.
Options for linkage of administrative data
5.81 There is considerable potential for using factual data from administrative sources to provide components of a longitudinal study of young people's transitions. Administrative data can complement the proposed survey by providing information for sampling and tracing as well as data of a factual nature on qualifications and destinations thus reducing the amount of factual information that needs to be asked in a survey. However, administrative data are not an alternative to a survey because they cannot provide information on young people's attitudes, aspirations and choices, and cannot be used to explain differences in young people's transitions. Possible sources of administrative data at different ages and stages are illustrated by Figure 5.2, and further details of each are listed in Appendix 4.
5.82 We suggest that administrative data should be used for three key purposes:
- To provide background details for selecting a cohort sample, including boosted samples of young people who may be the focus of policy interest, such as those most likely not to engage later in education, employment or training.
- To track the destinations and statuses of sample members at later stages of the study to alleviate problems of non-response to the survey.
- To use data on education, training and other activities at different stages or phases of young people's transitions that can be linked to survey data within the longitudinal study.
Figure 5.2: Administrative data by age/stage
Ages/stages | |
12-18 | ScotXed Personal characteristics: Sex, date of birth, first language, ethnicity, nationality, religion, free school meal entitlement, looked-after status, disability, special educational needs, postcode (to which area deprivation can be linked); Attendance: Measures of attendance, different types of absence, and exclusions; Schools attended: Indicators of pupils who moved schools between stages; school contextual information |
12 onwards | Scottish Qualifications Authority (SQA) National Qualifications: Record of each NQ attempted and its result at |
15-18 | Careers service survey of school leavers Post-16 destinations: education; training; job; not in education, employment or training |
16+ | Further Education Statistics (FES): Information on courses attempted at FE college, including level and result | Scottish Qualifications Authority Information on national qualifications, Higher National awards and Scottish vocational qualifications | Scottish Enterprise Information on government-supported training including type, length and result | |
17+ | UCAS Applications to HE: Information on which courses applied for and which accepted | As above | As above | |
17+ | HESA Higher Education courses attempted, completed or dropped | As above | As above | |
18+ | As above | As above | As above | Department for Work and Pensions/Jobcentre Plus Information on participation in government programmes eg New Deal, Employment Zones and Action Team for Jobs |
Policy developments relating to administrative data
5.83 Recent developments in computerised systems for administering education, training and qualifications have made it possible for the Scottish Government to access data from schools, colleges, universities, training providers, the Careers Service, and Scottish Qualifications Authority. These developments have greatly increased the quality of information on the outcomes of education and training. However, at present data from different sources are not routinely brought together for longitudinal analysis purposes. For example, in Scotland there are no plans to copy practice in England with respect to the creation of the Department for Children, Schools and Families data warehouse and National Pupil Database (Jones and Elias 2006).
5.84 The use of existing data from a range of sources is part of the Scottish Government's strategy on Managing Information Across Partners (MIAP). 12 The aim of the Scottish MIAP strategy is to "generate an environment whereby high quality information is available to inform decision-making at all levels in such a way as to provide benefits to all learners and other stakeholders in the Scottish Education Sector" (Mason 2006).
5.85 The MIAP strategy is part of the Open Scotland Information Age Framework (OSIAF) which provides common frameworks for sharing person-level data between government departments. An example is the Scottish Exchange of Educational Data (ScotXed) framework which enables the sharing of pupil-level data between schools, local authorities and government departments and agencies. Already statisticians within the Schools Directorate link ScotXed and National Qualifications data in order to analyse differences in attainment among school leavers (Scottish Executive 2007a).
5.86 However, a recent report on MIAP explored difficulties in linking data from different stages of learning, especially from school to further and higher education, and argued: "Trying to track learners through programmes of education ... can be difficult. It is however very valuable information. From analysing returns already in existence one can monitor activity in a particular phase of education but the real added value comes when data sets can be linked across phases of learning." (Mason 2006, section 8 - our emphasis).
5.87 The longitudinal study proposed in this report could provide a focus for linking information about Scottish learners across different phases of their learning and transitions. For selected cohorts the study would link factual data collected by different government bodies with survey data from young people on their experiences, attitudes, aspirations and choices in order to track and explain the routes taken by different types of young people through education, training and the labour market.
Use of administrative data for sample details and weighting
5.88 Throughout its history the sample for the SSLS has been provided from administrative sources. Initially the name, address, sex and date of birth of each sample member were transferred manually from school administrative records. From 1993 onwards the sample was extracted electronically from the SQA records, and passed to schools for updating. These methods produced nationally-representative samples of young people in the Scottish S4 cohort, including independent schools.
5.89 We noted earlier that the development of the ScotXed framework provides the opportunity to select samples of young people from publicly-funded secondary schools with more detail, and much less effort, than in the past. It also provides the opportunity to collect sample details at earlier year stages and has the potential to identify boosted samples of certain sub-groups of young people.
5.90 Linkage of ScotXed data would also be extremely useful in weighting procedures. Data on the characteristics of sample members are useful in order to provide weighting systems to compensate for non-response bias. In past SSLS a weighting system was developed which compared the numbers of respondents in each category of attainment and sex with the relevant numbers in the S4 population as a whole. Linkage of ScotXed data would enable the proposed longitudinal study to derive a weighting system based on a wider range of variables (which will be particularly important if the initial sweeps are in S1 or S2, when National Qualifications data are unlikely to be available).
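The weighting approach described above can be illustrated with a minimal sketch. It assumes a simple cell-weighting scheme in which each respondent's weight is the population share of their cell divided by the respondent share of that cell; the function name `cell_weights`, the two attainment bands and all counts are hypothetical and are not taken from the SSLS.

```python
from collections import Counter


def cell_weights(population_cells, respondent_cells):
    """Return one weight per cell: population share / respondent share."""
    pop_counts = Counter(population_cells)
    resp_counts = Counter(respondent_cells)
    pop_total = sum(pop_counts.values())
    resp_total = sum(resp_counts.values())
    return {
        cell: (pop_counts[cell] / pop_total) / (n_resp / resp_total)
        for cell, n_resp in resp_counts.items()
    }


# Illustrative (made-up) cohort: each entry is one young person's cell,
# defined here by (attainment band, sex).
population = (
    [("high", "F")] * 25 + [("high", "M")] * 25
    + [("low", "F")] * 25 + [("low", "M")] * 25
)
# Made-up respondents: low attainers and males respond less often.
respondents = (
    [("high", "F")] * 20 + [("high", "M")] * 15
    + [("low", "F")] * 10 + [("low", "M")] * 5
)

weights = cell_weights(population, respondents)

# Applying the weights restores the population mix while keeping the
# weighted total equal to the number of respondents.
weighted_total = sum(weights[cell] for cell in respondents)
```

In this sketch under-represented cells receive weights above 1 and over-represented cells weights below 1. Adding further linked variables would simply mean defining cells on more characteristics, although a cell with no respondents cannot be weighted in this way and would need to be merged with another.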
Use of administrative data for tracking sample members
5.91 Reducing sample attrition is very important for the proposed longitudinal study, so that sample members are not lost through non-contact or non-response at later survey sweeps. In some cases administrative data sources may provide a means of renewing contact with sample members, and in other cases they may provide some limited information on the characteristics of non-respondents.
5.92 While young people are under the age of 19, they may be tracked through the ScotXed and Careers Scotland datasets using the Scottish Candidate Number (SCN) as a means of identification. For example, young people may have changed addresses and schools between the S1 and S6 stages, and this information is recorded in ScotXed returns which include the SCN. In addition, Careers Scotland has procedures for contacting young people up to the age of 19, and has particular interest in contacting (and helping) those who are not in education, employment or training.
5.93 At later ages the tracking of non-contacts and non-respondents becomes more difficult. In the near future, when the system of data sharing becomes better established, it should be possible for a list of SCNs to be checked against the records of the Scottish Funding Council, the Higher Education Statistics Agency (HESA) or Scottish Enterprise in order to ascertain whether sample members are currently recorded in education or training. (It may also be possible for a list of names and other details to be checked against the records of the Department for Work and Pensions (DWP), but this may be more problematic.)
5.94 If prior consent has been given by each sample member, these agencies can be asked to provide status information and new contact details for those tracked through their databases. However, if the relevant agency is unwilling to provide this information they could be asked to forward a letter to the sample member.
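The tracking process outlined in this section might be sketched as follows. This is an illustration only: it assumes each agency can supply the set of SCNs it currently holds, and the function name, field names and SCN values are all hypothetical. Sample members who have not given consent are never looked up.

```python
def trace_non_respondents(non_respondents, agency_records):
    """Return, for each consenting non-respondent, the agencies holding their SCN.

    non_respondents: list of dicts with keys 'scn' and 'consent'.
    agency_records: dict mapping an agency name to the set of SCNs it holds.
    """
    traced = {}
    for member in non_respondents:
        if not member["consent"]:
            continue  # no prior consent, so no lookup is made
        traced[member["scn"]] = [
            agency
            for agency, scns in agency_records.items()
            if member["scn"] in scns
        ]
    return traced


# Made-up SCNs and agency holdings for illustration.
non_respondents = [
    {"scn": "000000001", "consent": True},
    {"scn": "000000002", "consent": False},
    {"scn": "000000003", "consent": True},
]
agency_records = {
    "Scottish Funding Council": {"000000001"},
    "HESA": {"000000001", "000000002"},
    "Scottish Enterprise": set(),
}

traced = trace_non_respondents(non_respondents, agency_records)
```

A consenting member who appears in no agency's records is returned with an empty list, which distinguishes "looked up but not found" from "not looked up".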
Longitudinal data from administrative sources
5.95 Administrative data can provide valuable factual information about young people's qualifications and destinations. For example, there are already many analyses of young people's attainment derived from SQA administrative data. In particular, the factual administrative data can be linked to survey data which provide complementary information such as young people's choices, perceptions and attitudes.
5.96 In England efforts are being made to create longitudinal datasets by linking data from different administrative sources. For example, the National Pupil Database developed by the Department for Children, Schools and Families (formerly DfES), which records pupil characteristics, attendance and attainment at school, is linked to the Individual Learner Record developed by the Learning and Skills Council (LSC), which records college courses and achievements including work-based learning. Data on Higher Education are now being added (see Jones and Elias 2006, chapter 2).
5.97 In Scotland, the linkage of administrative data has not progressed to the same extent as in England, but recent reports commissioned by the Scottish Government MIAP group suggest some ways forward (Mason 2006). Key requirements of data for developing a longitudinal study are that they should be available at individual-level, 13 and should include a unique identifier in the form of the Scottish Candidate Number (SCN). These issues are discussed in the recent MIAP report (Mason 2006) and further research has been commissioned by the MIAP group to investigate the potential for using the SCN as a unique learner number in tertiary education and training.
5.98 The most detailed and comprehensive data on Scottish young people are those covering the school stages between ages 12 and 17/18; ScotXed data cover all pupils who attend state-funded schools, while SQA data cover all students who study Scottish National Qualifications (NQ) and Vocational Qualifications (SVQ). The main gaps in these data relate to pupils attending independent schools, and students studying for non-SQA qualifications. Some thought needs to be given as to how to collect comparable data on these small minorities of pupils in the cohorts included in the longitudinal study.
5.99 In addition, Careers Scotland holds comprehensive data on young people in state-funded schools from age 12-19 for purposes of providing appropriate careers advice. These data are subsequently linked to information on young people's first destinations after leaving school, collected by Careers Scotland through its annual survey. The Careers Scotland survey contacts each young person directly to find out what they are doing, and an important aim of this survey is to identify young people who are not in education, employment or training so that these young people can be offered help from a careers adviser. Thus, the Careers Service survey of leavers' destinations provides comprehensive coverage of a single piece of information - which can be linked to the proposed longitudinal study using SCN. Independent schools have separate provision for careers advice, and their information on leavers' destinations is provided by each school, rather than by the young person.
5.100 Potentially, individual-level data on young people studying in further and higher education, including qualifications attempted and whether or not they are achieved, should be available from FES and HESA (and also perhaps data on higher education applications from the Universities and Colleges Admissions Service (UCAS)), but at present the linkage of these data to ScotXed records is not well established. The recent Scottish MIAP report discusses models for improving the quality of such data and linkage (Mason 2006). 14 The English MIAP is developing Individual Learner Records from data supplied by Colleges and linking these with the National Pupil Database. Work is also under way to match HESA data to the National Pupil Database for a research study on "Widening participation in Higher Education" (Jones and Elias 2006). Although at present it is not straightforward to link data from school and post-school education sectors, it seems that new frameworks are being developed to facilitate data sharing so that in future it will be possible to link data on Scottish learners from administrative sources for the proposed longitudinal study. Staff at the Scottish Funding Council described to us the work they were undertaking to link data from Further Education Statistics (FES) to data from the Higher Education Statistics Agency (HESA), the Universities and Colleges Admissions Service (UCAS) and the Student Awards Agency for Scotland (SAAS). At present the SCN cannot be used for linkage, and the data link uses surname, initials, gender, date of birth and postcode, but in future the use of a unique learner number will facilitate linkage.
5.101 Similarly, data collected by Scottish Enterprise on young people in government-supported training (GST) could provide valuable information on the experiences of young people in training. Scottish Enterprise holds records of young people undertaking a range of training schemes, such as Skillseekers and Training for Work, including dates of starting, leaving and completing the programme, details of any qualification attempted/completed, and personal details such as date of birth, gender, and phone number. The SCN is included in the record so that qualifications can be verified from the SQA and other vocational qualification boards. Data on qualifications are thought to be robust because payment to the Training Provider is dependent on qualifications achieved. Other information recorded by Scottish Enterprise includes whether the young person has "employed-status" as a trainee, information on the employer including Standard Industrial Classification (SIC), and subsequent progression by the young person after completion of a programme. Scottish Enterprise data on government-supported training would be especially valuable for a longitudinal study of young people's transitions in view of the problems with eliciting information on training from surveys. Although linkage of these data to ScotXed is still difficult, staff at Scottish Enterprise explained the moves they were currently making towards an internet-based system of data sharing, and it should be possible for data on government-supported training to be linked to the cohorts included in the proposed longitudinal study. (For example, following the previous SSLS design, a cohort of S4 pupils in session 2007/8 might be surveyed at age 18 in 2011, by which time processes of data linkage should be much better developed.)
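Linkage on surname, initials, gender, date of birth and postcode, as described above, can be illustrated with a minimal sketch of exact matching on a normalised composite key. This is not the Scottish Funding Council's actual procedure: the function names, record layouts, normalisation rules and example records are all assumptions made for illustration.

```python
def match_key(record):
    """Build a normalised composite key from the five linkage fields."""
    return (
        record["surname"].strip().upper(),
        record["initials"].replace(".", "").replace(" ", "").upper(),
        record["gender"].strip().upper(),
        record["date_of_birth"].strip(),  # assumed already in YYYY-MM-DD form
        record["postcode"].replace(" ", "").upper(),
    )


def link(left_records, right_records):
    """Pair each left record with the single right record sharing its key.

    Records with no match, or with more than one candidate match, are
    returned separately so that they can be reviewed by hand.
    """
    index = {}
    for rec in right_records:
        index.setdefault(match_key(rec), []).append(rec)
    linked, unresolved = [], []
    for rec in left_records:
        candidates = index.get(match_key(rec), [])
        if len(candidates) == 1:
            linked.append((rec, candidates[0]))
        else:
            unresolved.append(rec)
    return linked, unresolved


# Made-up records standing in for two administrative sources.
fe_records = [
    {"surname": "Smith", "initials": "J.A.", "gender": "F",
     "date_of_birth": "1991-03-14", "postcode": "EH8 9YL", "course": "NC"},
    {"surname": "Brown", "initials": "T", "gender": "M",
     "date_of_birth": "1991-07-02", "postcode": "G12 8QQ", "course": "HNC"},
]
he_records = [
    {"surname": "SMITH ", "initials": "JA", "gender": "f",
     "date_of_birth": "1991-03-14", "postcode": "eh89yl", "degree": "BSc"},
]

linked, unresolved = link(fe_records, he_records)
```

The sketch shows why a unique learner number would help: any change of surname or address, or any keying error in one source, breaks an exact match on these fields and leaves the record unresolved.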
Summary of administrative data that should be linked to the survey data
5.102 Administrative data to link with pupils' responses at 1st contact in compulsory stage:
- ScotXed data eg ethnicity, additional support needs, looked after status, FME etc
- SQA records (and other exam boards)
- Careers Scotland database
- Contextual information on the school
- Contextual information on pupils' home area
5.103 Administrative data for updating or new linkage in subsequent sweeps:
- ScotXed data
- SQA records (and other exam boards)
- Careers Scotland database
- Contextual information on neighbourhood
- FES data on participation and attainment in further education
- HE data - UCAS, HESA, SAAS
- Scottish Enterprise data on participation and achievement in government-supported training
- Jobcentre Plus data on adult programmes
The legal framework for administrative data linkage
5.104 At present the legal position regarding administrative data sharing and linkage is not well defined. Section 33 of the Data Protection Act 1998 states that administrative data may be used, where appropriate, for "statistical and historical research". Sharing of administrative data is also subject to the Human Rights Act 1998, which protects individuals' right of confidentiality.
5.105 Jones and Elias (2006) suggest that the application of the legal framework means that "departments must be careful to ensure that data sharing is lawful and that confidentiality is maintained. In practical terms this means anonymising (or pseudonymising) data before release. Anonymising requires the removal of name, address, full postcode and any other detail or combination of details that might support identification." (p71).
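As an illustration of what pseudonymising might involve in practice, the sketch below replaces the SCN with a keyed hash, drops name and address, and keeps only the outward part of the postcode. The field names, the choice of HMAC-SHA256 and the postcode rule are assumptions made for this example; whether a given release satisfies the legal requirements would depend on the full combination of details retained and is not something this code settles.

```python
import hashlib
import hmac

# Fields removed outright before release (illustrative list).
DIRECT_IDENTIFIERS = ("name", "address")


def pseudonymise(record, secret_key):
    """Return a copy of record with direct identifiers removed.

    The SCN is replaced by a keyed hash so that records for the same person
    can still be linked to one another by anyone holding the pseudonym, but
    the SCN cannot be recomputed without the secret key. The full postcode
    is reduced to its outward part (everything before the last three
    characters).
    """
    token = hmac.new(
        secret_key, record["scn"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    compact_postcode = record["postcode"].replace(" ", "").upper()
    released = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS + ("scn", "postcode")
    }
    released["pseudonym"] = token
    released["postcode_district"] = compact_postcode[:-3]
    return released


# Made-up record and key for illustration.
key = b"held-securely-by-the-data-owner"
record = {
    "scn": "000000001",
    "name": "A Example",
    "address": "1 Example Street",
    "postcode": "EH8 9YL",
    "attainment_band": "high",
}

released = pseudonymise(record, key)
```

Because the same key always produces the same pseudonym for a given SCN, datasets pseudonymised by the same data owner remain linkable to each other without exposing the SCN itself.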
5.106 There is considerable impetus within the UK government and the Office for National Statistics (ONS) to clarify these issues and move forward the use of administrative data. Currently, the Statistics and Registration Service Bill is being considered by the UK Parliament (with Scotland included through a Legislative Consent Motion of the Scottish Parliament). The Bill provides for the creation of a new Statistics Board, operating at arm's length from Ministers as a non-Ministerial department, responsible for promoting and safeguarding the quality and comprehensiveness of official statistics. It includes provision for improved data sharing and an enhanced role for the Scottish Parliament in the scrutiny of Scottish statistics.
The pros and cons of administrative data
5.107 A report to the ESRC by Jones and Elias (2006) identified the following benefits of using administrative data for research and analysis:
- 100% coverage of target population;
- Larger sample sizes for sub-groups such as young people not in education, employment or training;
- Attrition is minimised;
- Accuracy - administrative data are less subject to recall error or mis-reporting;
- Timely data - administrative data are regularly updated;
- Non-intrusive;
- Cost-saving - the data already exist;
- Linkable.
5.108 However, we would suggest that administrative data are very limited in scope, and are best used to complement survey data, for the following reasons.
- Administrative data only cover the facts that have been recorded by routine systems in schools, colleges and government agencies for their particular purposes. They tell us nothing about the young people's perceptions of their experiences, problems encountered, their attitudes and aspirations, or their reasons for making particular choices;
- Although the SQA records formal qualifications, administrative data provide no information on informal learning, or wider aspects of achievement;
- They do not provide any information on wider aspects of young people's lives such as family background, health, leaving home, housing choices or other aspects of the transition to adulthood.
Recommendations
- Administrative data should be linked to the cohort surveys to complement the information provided by respondents.