Meeting the Needs for Longitudinal Data on Youth Transitions in Scotland - An Options Appraisal

CHAPTER 6 DESIGN AND GUIDELINE COSTS

6.1 In the light of the options discussed in the previous chapter, we recommend two main designs for a new longitudinal study:

  • Design A: first survey young people while they are still in compulsory education with subsequent surveys at key points up to their mid or late 20s. Interim measures to maintain the continuity of data collection need to be considered.
  • Design B: first survey young people in the year after compulsory education as in SSLS but change the data collection strategy to mixed methods and boost sample numbers in certain sub-groups of young people.

Design A

6.2 The key features of Design A include:

  • young people surveyed while in compulsory education (one or two sweeps);
  • a sample of publicly-funded and independent schools;
  • administrative data for linkage by Scottish Candidate Number to be provided by ScotXed;
  • within the sample of schools, all pupils in the year group OR a random sample;
  • survey to be conducted in school using paper or web-based questionnaires;
  • measures to follow-up absentees and coverage of young people in alternative education provision;
  • an additional element of telephone interviews with parents to establish background information;
  • follow-up sweeps conducted with young people in their own homes using a mixed mode approach;
  • boosted sampling of young people in sub-groups of policy interest;
  • incentives to be used at follow-up sweeps to maximise response;
  • administrative data for linkage by SCN to be provided by SQA, Careers Scotland and other MIAP organisations;
  • regular communication with respondents between sweeps to increase sense of survey identity; study website.

Figure 6.1: Summary of Design A

| One sweep in compulsory education | Two sweeps in compulsory education |
| --- | --- |
| Sweep 1: S3 or S4 | Sweep 1: S1 or S2 |
| Sweep 2: 16-17 | Sweep 2: S3 or S4 |
| Sweep 3: 18-19 | Sweep 3: 16-17 |
| Sweep 4*: 22-23 | Sweep 4: 18-19 |
| Sweep 5*: 26-27 | Sweep 5*: 22-23 |
| | Sweep 6*: 26-27 |

* provisional

Costs for Design A

First sweep of research - in schools

6.3 As discussed in Chapter 5 there are two main options for data collection in schools - paper based questionnaires and web-based questionnaires. In addition, there are two options in respect of the proportion of pupils to be surveyed: the entire year group or a sample. Guideline costs are included for each approach and potential sample size. All costs exclude VAT and are based on 2007 rates so inflation may need to be added. Costs are given for one survey sweep in schools between S1 and S4, and the cost should be doubled for two sweeps.
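The adjustments mentioned above (doubling for two sweeps, adding inflation and VAT) can be sketched as a small calculation. This is only an illustration: the function name, the inflation parameters and the 17.5% VAT rate used in the example are assumptions, not figures given in this chapter.

```python
def gross_cost(net_2007, vat_rate, annual_inflation=0.0, years=0, sweeps=1):
    """Uplift a 2007 ex-VAT guideline cost for inflation, sweeps and VAT.

    vat_rate, annual_inflation and years are assumptions supplied by the
    caller; the chapter only says that costs exclude VAT and use 2007 rates.
    """
    inflated = net_2007 * (1 + annual_inflation) ** years
    return round(inflated * sweeps * (1 + vat_rate), 2)


# Paper-based survey of a 10,000 sample (£75,000 per sweep), two sweeps in
# school, no inflation uplift, VAT at an assumed 17.5%.
two_sweeps_paper = gross_cost(75_000, vat_rate=0.175, sweeps=2)
print(f"£{two_sweeps_paper:,.2f}")
```

The same function can be reused for any of the guideline costs in this chapter by changing the first argument.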

6.4 Figure 6.2 presents the costs for Design A, outlining the assumptions on which these costs are based. Figures 6.3 and 6.4 then summarise the costings.

Figure 6.2: Costs for Design A

Sweep(s) in compulsory education: paper based questionnaires

Assumptions

150 schools to be recruited

Initial recruitment of head teacher for approval via telephone unit

Face to face interviewer arranges appointment with school

Interviewer responsible for co-ordinating administration of survey on appropriate day and follow-up of absent pupils

Questionnaire piloted

Data produced in SPSS

24 page colour questionnaire

Results scanned

Guideline costs

paper based data collection in schools with a sample of 10,000 young people: £75,000 + VAT

paper based data collection taking all pupils in the relevant school year, estimated at 20,000 young people: £107,000 + VAT

Sweep(s) in compulsory education: web-based questionnaires

Assumptions

150 schools to be recruited

Initial recruitment of head teacher for approval via telephone unit

Face to face interviewer arranges appointment with school

Interviewer responsible for co-ordinating administration of survey on appropriate day and follow-up of absent pupils

Questionnaire piloted

Data produced in SPSS

Survey length equivalent of 15 minutes

Survey website hosted by the agency

Guideline costs

web-based data collection in schools with a sample of 10,000 young people: £30,000 + VAT

web-based data collection taking all pupils in the relevant school year, estimated at 20,000 young people: £40,000 + VAT

Parental interviews - telephone

Assumptions

7,500 sample members issued

6,000 achieved sample

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

parental interviews: £170,000 + VAT

Post S4 sweep of research at home address

Assumptions

10,000 sample members issued

7,500 achieved sample

80% eligible for telephone interviewing

£5 conditional incentive

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

post S4 sweep of research: £400,000 + VAT

Third sweep of research at home address

Assumptions

7,500 sample members issued

6,400 achieved sample

90% eligible for telephone interviewing

£5 unconditional incentive

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

third sweep of research: £360,000 + VAT

Fourth sweep of research at home address

Assumptions

(Note that it is difficult to judge whether assumptions so far into the future are valid)

6,400 sample members issued

5,750 achieved sample

95% eligible for telephone interviewing

£5 unconditional incentive

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

fourth sweep of research: £330,000 + VAT

Fifth sweep of research at home address

Assumptions

(Note that it is difficult to judge whether assumptions so far into the future are valid)

5,750 sample members issued

5,150 achieved sample

95% eligible for telephone interviewing

£5 unconditional incentive

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

fifth sweep of research: £320,000 + VAT

Figure 6.3: Summary of costs for Design A - one sweep in compulsory education

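| Sweep | Age/stage | Paper-based survey: sample of year group | Paper-based survey: whole year group | Web-based survey: sample of year group | Web-based survey: whole year group | Mixed methods |
| --- | --- | --- | --- | --- | --- | --- |
| Sweep 1 | S3 or S4 | £75,000 | £107,000 | £30,000 | £40,000 | |
| Sweep 1 | parents | | | | | £170,000 (telephone interviews) |
| Sweep 2 | 16-17 | | | | | £400,000 |
| Sweep 3 | 18-19 | | | | | £360,000 |
| Sweep 4* | 22-23 | | | | | £330,000 |
| Sweep 5* | 26-27 | | | | | £320,000 |

NB: Costs exclude VAT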

Figure 6.4: Summary of costs for Design A - two sweeps in compulsory education

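| Sweep | Age/stage | Paper-based survey: sample of year group | Paper-based survey: whole year group | Web-based survey: sample of year group | Web-based survey: whole year group | Mixed methods |
| --- | --- | --- | --- | --- | --- | --- |
| Sweep 1 | S1 or S2 | £75,000 | £107,000 | £30,000 | £40,000 | |
| Sweep 1 | parents | | | | | £170,000 (telephone interviews) |
| Sweep 2 | S3 or S4 | £75,000 | £107,000 | £30,000 | £40,000 | |
| Sweep 3 | 16-17 | | | | | £400,000 |
| Sweep 4 | 18-19 | | | | | £360,000 |
| Sweep 5* | 22-23 | | | | | £330,000 |
| Sweep 6* | 26-27 | | | | | £320,000 |

NB: Costs exclude VAT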
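The chapter does not state an overall total for Design A. As a rough sketch, the per-sweep guideline costs could be added up as follows, assuming that parental interviews are carried out once and that all provisional sweeps go ahead; both assumptions, and the function name, are ours rather than the report's.

```python
# Guideline costs from Figure 6.2 (ex-VAT, 2007 rates).
SCHOOL_SWEEP = {
    ("paper", "sample"): 75_000,
    ("paper", "whole"): 107_000,
    ("web", "sample"): 30_000,
    ("web", "whole"): 40_000,
}
PARENT_INTERVIEWS = 170_000
HOME_SWEEPS = [400_000, 360_000, 330_000, 320_000]  # ages 16-17 to 26-27


def design_a_total(mode, coverage, school_sweeps=1):
    """Sum fieldwork costs for Design A (assumes one round of parental
    interviews and that every provisional sweep is carried out)."""
    in_school = SCHOOL_SWEEP[(mode, coverage)] * school_sweeps
    return in_school + PARENT_INTERVIEWS + sum(HOME_SWEEPS)


for sweeps in (1, 2):
    for (mode, coverage) in SCHOOL_SWEEP:
        total = design_a_total(mode, coverage, sweeps)
        print(f"{sweeps} school sweep(s), {mode}, {coverage}: £{total:,}")
```

On these assumptions the choice of in-school mode changes the total only slightly, because the home-based sweeps account for most of the cost.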

Design B

6.5 The second potential design for the new study would be to continue the SSLS approach of first surveying young people in the year after compulsory education but with major changes to the sampling frame, to the sample, to the data collection strategy and with greater linkage of administrative data. As we have suggested, this approach could also be used as an interim measure if it was decided to proceed with Design A as the longer term strategy.

6.6 The key design features of Design B include:

  • 1st contact in the year after compulsory education (16-17)
  • sample frame ScotXed
  • sample - boosted by eg attainment, deprivation
  • 1st sweep achieved sample = 10,000
  • contacted at home using mixed methods - telephone, web, postal, face-to-face
  • contacted again at: 18-19; 22-23?; 26-27?
  • incentives to encourage response
  • administrative data for linkage by SCN to be provided by ScotXed, SQA, Careers Scotland and other MIAP agencies
  • website and other measures to maintain contact

6.7 In Design B, the first contact with young people would be in the first year after the completion of compulsory education with subsequent sweeps as outlined for Design A.

Figure 6.5: Summary of Design B

First sweep after S4

Sweep 1: 16-17

Sweep 2: 18-19

Sweep 3*: 22-23

Sweep 4*: 26-27

* provisional

6.8 A key difference between the designs is that in Design B the contact with young people in the year after compulsory education would be the first contact with them, whereas in Design A they would already have been surveyed one or more times in school. This will affect costs and may affect response rates.

Costs

6.9 One issue is that without the prior contact in school, telephone numbers may not be available for as many respondents as in Design A. In Design B the survey would be reliant on what schools can provide and what can be found via telephone number matching. As such we have assumed that only 50% of the initial sample would have available telephone numbers in our costs compared with our assumption of 80% in Design A.

6.10 The second difference is that since this would be the initial sweep of research it would therefore lack the recognition that could have been built up by continuing "keep in touch" exercises following the school based survey. This may affect response rates.

Data collection methods

6.11 In putting together our guideline costs we have assumed that the data collection strategy in Design B will be the same as that proposed for the post S4 sweeps of Design A. Thus in Design B all sweeps of research would be conducted with respondents at their home address using a mixed methodology of telephone, postal, web and face to face interviewing.

6.12 Respondents would initially be assigned to either telephone interviewing or a combined postal / face to face data collection strategy depending on whether telephone numbers are available. In addition to this, all respondents would be able to complete the survey online. This design would then be repeated for each of the subsequent sweeps.

Web-based data collection

6.13 In the first instance all respondents should have the option of completing the follow-up survey online. Where e-mail addresses are available respondents should be invited to participate via a hyperlink and password/study number sent out by e-mail. In addition to this, the existence of the online completion option could be highlighted in all advance correspondence sent to respondents along with the password/study number they would need in order to access the site.

6.14 Respondents who complete the survey online would then be removed from the sample for the other data collection strategies to which they had been assigned. We have assumed that around 500 respondents (5%) would choose to complete the survey online for the purpose of these guideline costs.

Telephone data collection

6.15 As noted above, we have assumed that telephone numbers would be available for approximately 50% of all sample members in our guideline costs. For costing purposes we have assumed that 5,000 respondents would be assigned to CATI data collection and that 3,000 (60%) would complete the interview over the telephone or have completed the interview online. We have also assumed that 1,000 respondents would be eligible to be re-assigned to the combined postal and face to face data collection strategy due to an incorrect telephone number or failure to make contact.

Combined postal and face to face data collection

6.16 A combined postal and face to face data collection strategy would be used for those for whom no telephone number is available or who were not contactable during the telephone interviewing stage. We have assumed that 6,000 respondents would be eligible for this data collection strategy.

6.17 In the first instance a postal data collection strategy would be used for these respondents with non-responders followed up by face to face interviewers using Computer-Assisted Personal Interviewing (CAPI). In our guideline costs we have assumed that 2,400 respondents (c.40%) assigned to this strategy would return a postal questionnaire or have completed the survey online. The remaining 3,600 sample members would then be transferred to face to face data collection.

6.18 Face to face interviews would be used with the most difficult to reach young people. In our costs we have assumed that around 1,800 interviews would be conducted face to face from an initial start sample of 3,600 contacts.

6.19 In total 7,200 interviews would be completed at Sweep 1.
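The sample flow described in paragraphs 6.15 to 6.19 can be checked with a few lines of arithmetic. All numbers below are the costing assumptions quoted above; the variable names are illustrative only.

```python
issued = 10_000

# Telephone (CATI) stage: numbers assumed available for 50% of the sample.
cati_assigned = issued // 2            # 5,000
cati_complete = 3_000                  # by telephone or online
cati_reassigned = 1_000                # wrong number or no contact

# Combined postal / face to face stage.
postal_pool = (issued - cati_assigned) + cati_reassigned   # 6,000
postal_complete = 2_400                # by post or online
f2f_pool = postal_pool - postal_complete                   # 3,600
f2f_complete = 1_800

total_complete = cati_complete + postal_complete + f2f_complete

print("CATI completion rate:  ", round(cati_complete / cati_assigned, 2))
print("Postal completion rate:", round(postal_complete / postal_pool, 2))
print("Face to face rate:     ", round(f2f_complete / f2f_pool, 2))
print("Achieved sample:       ", total_complete)
print("Overall response rate: ", round(total_complete / issued, 2))
```

The three stages together give the 7,200 achieved interviews stated in paragraph 6.19.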

Figure 6.6: Guideline costs for Design B

First sweep of research at home address (post S4)

Assumptions

10,000 sample members issued

7,200 achieved sample

50% eligible for telephone interviewing

£5 conditional incentive

Questionnaire piloted

Data produced in SPSS

Survey length 15 minutes / 20 pages

Guideline costs

first sweep of research: £435,000 + VAT

Second sweep of research at home address

Assumptions

7,200 sample members issued

6,100 achieved sample

90% eligible for telephone interviewing

£5 unconditional incentive

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

second sweep of research: £350,000 + VAT

Third sweep of research at home address

Assumptions

6,100 sample members issued

5,500 achieved sample

95% eligible for telephone interviewing

£5 unconditional incentive

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

third sweep of research: £325,000 + VAT

Fourth sweep of research at home address

Assumptions

(Note that it is difficult to judge whether assumptions so far into the future are valid)

5,500 sample members issued

5,000 achieved sample

95% eligible for telephone interviewing

£5 unconditional incentive

Questionnaire piloted

Data produced in SPSS

Interview length 15 minutes / 20 pages

Guideline costs

fourth sweep of research: £310,000 + VAT

Figure 6.7: Summary of costs for Design B - 1st sweep after S4

| Sweep | Age/stage | Sample using mixed methods |
| --- | --- | --- |
| Sweep 1 | 16-17 | £435,000 |
| Sweep 2 | 18-19 | £350,000 |
| Sweep 3* | 22-23 | £325,000 |
| Sweep 4* | 26-27 | £310,000 |
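The response rates and unit costs implied by the Design B assumptions are not stated in the report, but they can be derived from the issued and achieved samples in Figure 6.6. The sketch below does this; the derived rates and the overall total (which assumes both provisional sweeps go ahead) are our own calculation.

```python
# (sweep, sample issued, achieved sample, guideline cost ex-VAT) from Figure 6.6
SWEEPS = [
    ("Sweep 1", 10_000, 7_200, 435_000),
    ("Sweep 2", 7_200, 6_100, 350_000),
    ("Sweep 3", 6_100, 5_500, 325_000),
    ("Sweep 4", 5_500, 5_000, 310_000),
]

for name, issued, achieved, cost in SWEEPS:
    response_rate = achieved / issued
    cost_per_interview = cost / achieved
    print(f"{name}: response {response_rate:.1%}, "
          f"£{cost_per_interview:.2f} per achieved interview")

# Total fieldwork cost if all four sweeps are carried out.
total_cost = sum(sweep[3] for sweep in SWEEPS)
print(f"Total: £{total_cost:,}")
```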

Guideline costs for a 'Keep in Touch' Exercise

6.20 In order to maintain response between sweeps and minimise attrition we would recommend conducting a number of "keep in touch" exercises with respondents and these are outlined in the next chapter. Here we provide a unit cost for a single "keep in touch" exercise based upon an achieved sample of 10,000 respondents; this can be pro-rated for smaller sample sizes.

6.21 Guideline cost for a single "keep in touch" exercise: £8,500 + VAT.
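Pro-rating the unit cost for a smaller sample could look like the sketch below. It assumes simple linear scaling, which the report implies but does not spell out; any fixed set-up costs would make the true figure somewhat higher.

```python
KIT_UNIT_COST = 8_500      # ex-VAT, for an achieved sample of 10,000
KIT_BASE_SAMPLE = 10_000


def keep_in_touch_cost(sample_size):
    """Linearly pro-rate the keep-in-touch cost (an assumed scaling rule)."""
    return KIT_UNIT_COST * sample_size / KIT_BASE_SAMPLE


# Example: the 7,200 achieved sample assumed for Sweep 1 of Design B.
design_b_sweep1_kit = keep_in_touch_cost(7_200)
print(f"£{design_b_sweep1_kit:,.0f}")
```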

Guideline costs for data linkage and derived variables

6.22 We have recommended the linkage of administrative data as an integral part of the longitudinal study design and below give some guideline costs for this (Figure 6.8). These costs exclude overheads. It should also be noted that the costs are based on the assumption that the MIAP strategy will have been successfully implemented and that data will be available for linkage.

Figure 6.8: Data linkage and derived variables

1. Data linkage of survey(s) at compulsory stages

Data source(s)

Activities

Data from ScotXed for linkage to pupil data by SCN

Derive variables on pupil characteristics, attendance, absence and exclusions

Derive weighting variables comparing achieved sample to target sample

Identify pupils who have moved schools

Local-area statistics for linkage to pupil data by postcode of home address

Derive variables on local area deprivation, and housing characteristics (MOSAIC)

Data from SQA for linkage to student data by SCN

Derive variables on students' attainment at S2 and S3, in further education and from Skills for Work courses from SQA records of Standard Grade, NQ and VQ.

Derive variables from subject and qualification levels to describe curricular tracks

Derive variables on school-college combinations from institutional codes

School-level information for linkage to pupil data by school identifier

Obtain variables for school type, denomination, roll size, free-meal entitlement and urban-rural categorisation

Link variables for the travel-to-work area of the school, including unemployment rates and occupational structure

Derive school context variables from the average characteristics of pupils in the sample eg percentage of minority ethnic pupils, percentage with English as additional language

Estimated time for linkage in the compulsory stage: 15 days per survey sweep

Guideline costs: £3,750 + Overheads/VAT

2. Data linkage of surveys at 16/17 and 18/19

Data source(s)

Activities

Data from ScotXed for linkage to student data by SCN

Derive variables on attendance, absence and exclusions

Derive weighting variables comparing achieved sample to target sample

Identify pupils who have moved schools, and those who have left school by stage

Data on Education Maintenance Allowances (EMA) from Scottish Executive for linkage to student data by SCN

Derive variables on period funded by EMA and link to student record

School-level information for linkage to student data by school identifier

Link variables for the travel-to-work area of the school, including unemployment rates and occupational structure relevant to age of pupil at month of survey

Data from SQA (and other qualifications authorities) for linkage to student data by SCN

Derive variables on student's attainment at S3, S4, S5, S6, and in further education and training from SQA records of Standard Grade, NQ and VQ (GCSE and A-level).

Derive variables from subject and qualification levels to describe curricular tracks

Derive variables on school-college combinations from institutional codes

Data from Careers Scotland (and Job Centre Plus) for linkage to student data by SCN

Identify school leavers by stage and compare with ScotXed data

Link variables on first destinations of school leavers and participation in GST

Obtain data from Scottish Enterprise on Government-supported training (GST) for linkage to student data by SCN

Identify trainees and compare with Careers Scotland data

Derive and link variables describing participation in GST, including start/finish date, length of training, outcome of training, type of provider, occupational category

Obtain data from Scottish Funding Council for linkage to student data by SCN

Identify students in further education, and compare with Careers Scotland data

Derive and link variables describing participation in further/higher education, including start/finish date, length of study, mode of study, type and level of course, subject areas

Obtain data from UCAS on applications for higher education for linkage to student data by SCN

Derive and link variables describing type and location of institutions, and type and subject of courses, for which the student has applied.

Derive and link variables describing outcome of application, including whether they get their first choice etc.

Estimated time: 25 days per survey sweep at 16/17 and 18/19

Guideline costs: £6,250 + Overheads/VAT

3. Data linkage of surveys at 22/23 and 26/27

Data source(s)

Activities

Link survey and admin data

Derive weighting variables comparing achieved sample to target sample

Data from SQA (and other qualification authorities) for linkage to student data by SCN

Derive variables on student's attainment in further education and training from SQA records of NQ, HN and VQ.

Derive variables from subject and qualification levels to describe curricular tracks

Derive variables on institutional type and location from institutional codes

Data from Job Centre Plus for linkage to student data by SCN

Derive and link variables on those not in education, training or employment

Data from Scottish Enterprise on Government-supported training (GST) for linkage to student data by SCN

Identify trainees

Derive and link variables describing participation in GST, including start/finish date, length of training, outcome of training, type of provider, occupational category

Data from Scottish Funding Council for linkage to student data by SCN

Identify students in further education

Derive and link variables describing participation in further/higher education, including start/finish date, length of study, mode of study, type and level of course, subject areas

Data from UCAS on applications for higher education for linkage to student data by SCN

Derive and link variables describing type and location of institutions, and type and subject of courses, for which the student has applied.

Derive and link variables describing outcome of application.

Data from HESA on participation in Higher Education for linkage to student data by SCN

Derive and link variables describing type and location of institution, level and subject of course, start date, length of period of study, qualification and outcome.

Estimated time: 30 days per survey sweep at 22/23 and 26/27

Guideline costs: £7,500 + Overheads/VAT

Figure 6.9: Summary of costs for data linkage and derived variables (excludes overheads/VAT)

| | Per sweep |
| --- | --- |
| Linkage at the compulsory stage(s) | £3,750 |
| Linkage at the 16/17 and 18/19 stages | £6,250 |
| Linkage at the 22/23 and 26/27 stages | £7,500 |
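The linkage costs above can be related back to the estimated days in Figure 6.8. The sketch below derives the implied day rate and totals the linkage cost for each design; neither the day rate nor the totals are stated in the report, and the totals assume that every sweep, including the provisional ones, is linked.

```python
# stage: (estimated days per sweep, guideline cost per sweep ex-overheads/VAT)
LINKAGE = {
    "compulsory": (15, 3_750),
    "16/17 and 18/19": (25, 6_250),
    "22/23 and 26/27": (30, 7_500),
}

# Implied day rate at each stage (derived, not quoted in the report).
day_rates = {stage: cost / days for stage, (days, cost) in LINKAGE.items()}


def linkage_total(sweeps_per_stage):
    """Total linkage cost given the number of sweeps linked at each stage."""
    return sum(LINKAGE[stage][1] * n for stage, n in sweeps_per_stage.items())


design_a_one_school_sweep = linkage_total(
    {"compulsory": 1, "16/17 and 18/19": 2, "22/23 and 26/27": 2}
)
design_b = linkage_total({"16/17 and 18/19": 2, "22/23 and 26/27": 2})

print(day_rates)
print(f"Design A (one school sweep): £{design_a_one_school_sweep:,}")
print(f"Design B: £{design_b:,}")
```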

Costs for analysis and reporting

6.23 For the sake of completeness, we also include here the tasks and costs for analysis and reporting. This is discussed further in Chapter 8.

Descriptive reporting

6.24 For each survey sweep there will be:

  • data manipulation, including joining of survey sweeps;
  • descriptive cross-sectional analysis (frequencies, crosstabs etc);
  • descriptive longitudinal analysis (eg comparing destinations by timepoints);
  • reporting of key topics (eg attainment, attitudes/aspirations, career progression, migration, family formation - each of which will need to take account of progression between sweeps, and differences by gender, social class and area);
  • writing Briefing papers.

Time required for analysis will vary between sweeps, as the study becomes more complex as more data is accumulated, linked and analysed.
Estimated time for analysis of initial sweep: 30 days
Guideline costs: £7,500

Estimated time for analysis for subsequent sweeps: 60 days per survey sweep
Guideline costs: £15,000 + Overheads/VAT
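As an illustration of how descriptive reporting costs accumulate, the sketch below applies the initial-sweep and subsequent-sweep figures to a study with a given number of sweeps. The function is hypothetical, and the results exclude overheads and VAT.

```python
INITIAL_SWEEP_COST = 7_500       # 30 days
SUBSEQUENT_SWEEP_COST = 15_000   # 60 days per sweep


def descriptive_reporting_cost(n_sweeps):
    """Cumulative descriptive analysis cost over n_sweeps survey sweeps."""
    if n_sweeps < 1:
        return 0
    return INITIAL_SWEEP_COST + SUBSEQUENT_SWEEP_COST * (n_sweeps - 1)


# Design B has four sweeps; Design A with two school sweeps has six.
design_b_reporting = descriptive_reporting_cost(4)
design_a_reporting = descriptive_reporting_cost(6)
print(f"Design B: £{design_b_reporting:,}; Design A: £{design_a_reporting:,}")
```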

Special studies of policy areas

6.25 As the longitudinal study develops over time, there will be areas of policy interest that need in-depth longitudinal analysis. These will probably require statistical modelling to explore causal relationships. It is very difficult to estimate the amount of time required for such analysis and reporting but we have attempted to give an indication of this.

Time required for analysis will vary depending on the number of sweeps involved.
Estimated time for a narrow analysis based on 2 sweeps of data: 40 days.
Guideline costs: £10,000 + Overheads/VAT

Estimated time for a wider analysis based on 4 sweeps: 80-100 days.
Guideline costs: £25,000 + Overheads/VAT

Page updated: Friday, October 17, 2008