2007 Scottish Survey of Achievement (SSA) - Science, Science Literacy and Core Skills - Supporting Evidence

Annex II: Assessment materials and procedures

II.1 Science knowledge and understanding

Nature of the assessment

The 5-14 Curriculum Guidelines define Science knowledge and understanding in terms of the following three attainment outcomes:

  • Earth and Space
  • Energy and Forces
  • Living Things and the Processes of Life

Each of these three outcomes is subdivided into three strands and a total of 155 attainment targets. Together these outcomes, strands and targets provide a level-based framework to assess pupil progress in Science.

There were twelve different booklets at each stage. Each booklet covered all three Science outcomes and each booklet covered three levels (two at P3). Within each booklet tasks were grouped into 4-mark task groups relating to specific Science topics, such as 'magnets', 'water' and 'the human body'; a task group comprised anything between one and four tasks, marked in such a way that the final group total would be four marks, and a task itself comprised one or more test items.

A total of 216 task groups were created for use in the survey, 36 at each of Levels A to F, with twelve task groups for each of the three curriculum outcomes. The task groups were distributed into twelve sets of nine (six at P3), ensuring that each set contained one task group from each of the three outcomes at each level. Each set represented the content of a test booklet, of which there were twelve in total at each stage. Within each set (booklet), task groups were presented in outcome blocks, and within each block task groups were presented in order of level. Each booklet was printed in three different versions with each version varying the order of presentation of the outcome blocks in order to minimise any potential effects resulting from test fatigue. Any outcome or topic would thus appear an equal number of times at the beginning of a booklet, in the middle or towards the end of a booklet.
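The booklet design described above can be sketched in code. This is a minimal illustration rather than the survey's actual procedure: the task-group identifiers, the function names and the choice of Levels C to E for the example booklet are all assumptions, and cyclic rotation is simply one ordering scheme that places each outcome block once at the beginning, once in the middle and once at the end.

```python
from itertools import product

LEVELS = "ABCDEF"
OUTCOMES = ("Earth and Space", "Energy and Forces",
            "Living Things and the Processes of Life")
N_SETS = 12  # twelve booklets (sets) at each stage

# Hypothetical identifiers: (outcome, level, set number) for each task group.
TASK_GROUPS = list(product(OUTCOMES, LEVELS, range(1, N_SETS + 1)))


def booklet(set_number, stage_levels):
    """Return one booklet as outcome blocks, each block ordered by level."""
    return [[(outcome, level, set_number) for level in stage_levels]
            for outcome in OUTCOMES]


def versions(blocks):
    """Return the cyclic rotations of the outcome blocks (assumed scheme)."""
    return [blocks[i:] + blocks[:i] for i in range(len(blocks))]


example = booklet(1, "CDE")  # illustrative level range only
for number, version in enumerate(versions(example), start=1):
    # Show which outcome block comes first, second and third in each version.
    print(number, [block[0][0] for block in version])
```

Under these assumptions a three-level booklet holds nine task groups, and across the three versions every outcome block occupies each position exactly once.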

Every pupil was intended to attempt two different booklets, the tasks at any particular level across the two booklets then comprising a 'level test' for that pupil. Cut-off scores were applied to the test score associated with each level test and pupils classified accordingly as 'good start', 'well-established' or 'very good' at the level.
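The arithmetic of a level test follows from the booklet design: each booklet holds one 4-mark task group per outcome at a given level, so two booklets give 24 marks per level. The sketch below shows how cut-off scores might then be applied. The cut-off values are HYPOTHETICAL placeholders (the actual cut-offs are not stated in this annex), and returning no category below the lowest cut-off is likewise an assumption.

```python
MARKS_PER_TASK_GROUP = 4
OUTCOMES_PER_BOOKLET = 3   # one task group per outcome at each level
BOOKLETS_PER_PUPIL = 2

# Maximum score on one level test: 4 x 3 x 2 = 24 marks.
LEVEL_TEST_MAX = (MARKS_PER_TASK_GROUP * OUTCOMES_PER_BOOKLET
                  * BOOKLETS_PER_PUPIL)

# HYPOTHETICAL cut-offs, as a share of the maximum score, highest first.
# These are placeholders for illustration, not the survey's values.
CUT_OFFS = (("very good", 0.80),
            ("well-established", 0.65),
            ("good start", 0.50))


def classify(score, max_score=LEVEL_TEST_MAX, cut_offs=CUT_OFFS):
    """Return the highest category whose cut-off the score reaches."""
    share = score / max_score
    for label, threshold in cut_offs:
        if share >= threshold:
            return label
    return None  # assumption: below the lowest cut-off, no category


print(LEVEL_TEST_MAX, classify(16))
```

Passing the cut-offs as a parameter keeps the classification logic separate from the thresholds, which would need to be replaced with the published values.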

Rationale for developing task groups

The previous (2003) national Assessment of Achievement Programme (AAP) Science survey involved the administration to pupils of 360 tasks, randomly presented within outcome and level. Each booklet presented pupils with two tasks per level per outcome. In presenting advice on the content of the 2007 survey, the Science Reference Group advised that pupils would be better able to meet the demands of the survey if tasks were linked to lead them through a Science topic or context.

"We believe the focus should be firmly on understanding and that wherever possible we should try to assess that. We would want to ensure an appropriate balance between items which depend largely on the recall of facts and those which require pupils to demonstrate understanding. We do not think that the design used in the 2003 survey, moving in a fairly random way from target to target and outcome to outcome, is the best way to achieve this. …We think that knowledge and understanding could be assessed through sets of questions linked to a theme (possibly one of the overarching ideas) or set in contexts (our preferred option where this is possible) to which children at the target level could easily relate."

(Report of Science Reference Group to SSA Planning Group, 2006)

The Reference Group further advised that the Improving Science Education (ISE) framework (Feb 2005) would provide a useful organiser for developing survey tasks. The ISE framework clusters the 155 targets into 34 topics which are designed to provide a useful framework for teaching and learning in the classroom. The ISE topics cover from one to four levels, and there are more topics at the upper levels than at the lower levels. Taking a simple numerical count of the levels by topic, A and B are each addressed by eight topics, C by eleven, D by fourteen, E by sixteen and F by twelve (Table II.1).

Each outcome was equally represented by twelve task groups at each level while the number of task groups representing a particular topic at a level varied according to the number of topics within an outcome and level and the relative importance or weight given to that topic within the curriculum guidelines (Table II.1).

A variety of different item types were used: multiple choice, matching, sequencing, short response and open-ended.

  • multiple choice
  • matching tasks typically required pupils to match properties or attributes
  • sequencing required pupils to put statements or pictures in the correct order, for example to show the life cycle of a frog
  • in short response tasks pupils were required to give one or two word answers to an open question, for example to name a particular class of animal or plant
  • open-ended responses typically required pupils to 'describe' or 'explain', for example to describe how to separate a mixture of sand and rice or to explain why sound travels faster through water than through air.

The tasks were designed to assess pupils' scientific knowledge as well as their understanding of scientific concepts. Figure II.1 shows part of a Level A task group on the topic of 'forces'. The first of the tasks assesses whether a pupil can identify objects that sink or float (straightforward 'knowledge') while the second question probes their understanding of the scientific concept in more depth by presenting a task which requires them to apply their understanding in a specific situation (what would happen to a floating toy boat if marbles were added). A similar progression can be seen in the Level C example (Figure II.2) which gives a whole task group. Another example of a task group, this time at Level E, is given in Figure II.3.

Task development and pre-testing

A total of 164 new 4-mark task groups were developed for use in the survey. These were supplemented by individual tasks used in the previous (2003) Science survey to create a total of 216 4-mark task groups, 36 at each 5-14 level.

Task developers (who were all secondary Science teachers or practising primary teachers with a particular interest in Science education) were commissioned to produce an agreed number of 4-mark 'task groups' at a specified 5-14 level. Each task group was to focus on an identified target or targets within one of the Science outcomes and on a specific ISE topic at that level. After review, the tasks were pre-tested in a randomly selected number of primary and secondary schools throughout Scotland, with pupils assigned tasks appropriate to their age and stage. Teachers marked the pre-tests and offered comments on the tasks, the topics and the appropriateness for their pupils. In addition, all the task groups and items were validated using between three and five independent teacher judgements.

After pre-testing, the task groups were reviewed and items amended or discarded as appropriate, taking into account comments from teachers, pre-test results and validators' judgements and comments, in order to create 4-mark task groups for use in the survey. A number of tasks were repeated from the 2003 survey.

Booklet administration

Test booklets were randomly allocated to pupils using multiple matrix sampling, each pupil receiving two different booklets. Test sessions were organised and administered by the pupils' own teachers. The assessments were not timed and teachers were instructed to allow pupils sufficient time to complete each booklet. Teachers could assist pupils by, for example, reading a word or phrase or question or explaining if pupils did not understand the method of answering; they were not allowed to give any assistance in answering the questions.
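The allocation step can be illustrated with a short sketch. It assumes a simple random draw of two distinct booklets per pupil from the twelve available at a stage. The survey's actual multiple matrix sampling procedure is not detailed here, and the function name and pupil identifiers are hypothetical.

```python
import random


def allocate_booklets(pupil_ids, n_booklets=12, per_pupil=2, seed=None):
    """Give each pupil `per_pupil` different booklets, drawn at random."""
    rng = random.Random(seed)
    booklets = range(1, n_booklets + 1)
    # random.sample draws without replacement, so the booklets differ.
    return {pupil: tuple(rng.sample(booklets, per_pupil))
            for pupil in pupil_ids}


allocation = allocate_booklets(["pupil-001", "pupil-002", "pupil-003"], seed=2007)
for pupil, pair in allocation.items():
    print(pupil, pair)
```

A production allocation would probably also balance how often each booklet, and each pair of booklets, is used; this sketch does not attempt that.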

Marking

All pupil scripts were returned to SQA for processing. Pupils' item responses were transcribed onto coding sheets which provided a range of possible options for each item. The response transcription was carried out by individuals from a temping agency under the supervision of Science specialists. After keying, the items and tasks were automatically scored, test results computed, cut-off scores applied and the results weighted (see Annex I) to provide the population estimates reported in Chapter B.

Table II.1 ISE Framework organised by outcome and showing the number of task groups used in the survey for each topic and level

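Columns A to F give the number of task groups at each level.

| Group | Topic | Outcome | Levels | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|
| 4 | Energy | EF | AB | 6 | 3 | | | | |
| 5 | Energy for living things | LT | B | | 5 | | | | |
| 7 | Forces | EF | AB | 3 | 6 | | | | |
| 10 | Light and Sound | EF | C | | | 3 | | | |
| 13 | Electricity | EF | AC | 3 | | 6 | | | |
| 16 | Friction and Air Resistance | EF | CD | | | 3 | 2 | | |
| 19 | Energy Sources | EF | DE | | | | 0 | 2 | |
| 20 | Electric circuits | EF | D | | | | 4 | | |
| 24 | Sound and Light | EF | DEF | | | | 3 | 2 | 3 |
| 25 | Heat transfer | EF | DEF | | | | 2 | 3 | 2 |
| 30 | Electricity and Microelectronics | EF | EF | | | | | 2 | 3 |
| 31 | Force and Gravity | EF | DEF | | | | 1 | 3 | 4 |
| 1 | Introducing materials | ES | A | 6 | | | | | |
| 3 | Making materials change | ES | B | | 6 | | | | |
| 6 | Sun, Moon & Stars | ES | AB | 3 | 3 | | | | |
| 9 | Water | ES | ABC | 3 | 3 | 3 | | | |
| 12 | Mixing and Separating | ES | CD | | | 3 | 2 | | |
| 15 | The Earth and its Resources | ES | D | | | | 5 | | |
| 18 | Space and the Solar System | ES | CDEF | | | 3 | 3 | 3 | 2 |
| 22 | Model of matter | ES | CDEF | | | 3 | 2 | 3 | 1 |
| 23 | Acid and metals | ES | E | | | | | 3 | |
| 28 | Periodic Table | ES | EF | | | | | 0 | 4 |
| 29 | Chemical reactions | ES | EF | | | | | 3 | 5 |
| 2 | Introducing living things | LT | A | 9 | | | | | |
| 5 | Energy for living things | LT | B | | 5 | | | | |
| 8 | Plants and animals | LT | AB | 3 | 3 | | | | |
| 11 | Minibeasts | LT | BC | | 6 | 0 | | | |
| 14 | Living on Earth (Plants) | LT | CD | | | 2 | 4 | | |
| 14a | Living on Earth (Animals) | LT | C | | | 5 | | | |
| 17 | Human Body and reproduction | LT | CDE | | | 5 | 5 | 3 | |
| 21 | Our Environment | LT | DE | | | | 3 | 0 | |
| 26 | Energy flow and living things | LT | EF | | | | | 4 | 2 |
| 27 | Towards evolution | LT | EF | | | | | 4 | 2 |
| 32 | Biotechnology | LT | EF | | | | | 0 | 5 |
| 33 | Cells | LT | DEF | | | | | 1 | 3 |
| | | | | 6 | 6 | 6 | 6 | 6 | 6 |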

Figure II.1 Level A - Energy and Forces

Figure II.2 Level C - Living things and the processes of life

Figure II.3 Level E - Earth and Space

II.2 Science literacy assessment

Nature of the assessment

This was the first year that Science literacy was reported separately in a Science survey, and as such it was treated as a pilot. Investigation in the early stages of the design of the 2007 SSA identified several definitions of Science literacy. The SSA planning group agreed that the Nuffield definition best met their understanding of Science literacy and that it was close to the concept of 'understanding Science across society' as set out in the Science Strategy for Scotland. Under this definition, Science literacy is the ability to:

  • appreciate and understand the impact of science and technology on everyday life;
  • take informed personal decisions about things that involve science, such as health, diet, use of energy resources;
  • read and understand the essential points about matters that involve science;
  • reflect critically on the information included in, and (often more important) omitted from such reports;
  • take part confidently in discussions with others about issues involving science.

The items for assessing Science literacy through a written booklet can be mapped to the first four statements of the Nuffield definition.

Eighteen Science literacy tests were administered in the 2007 survey; three at each of Levels A to F. A variety of different themes featured in the tests, as the titles in Table II.2 will illustrate. Tests were developed with particular 5-14 levels in mind.

Table II.2 The 2007 survey Science literacy tests classified by 5-14 level

| Level | Tests |
|---|---|
| F | Asthma Breakthrough; A Brand New Solar System; Global Warming |
| E | The Pluto problem; Mobile Phones; Good Day, Sunshine |
| D | Hedgehog Cull; The Water Cycle; Saving Energy |
| C | The Big Bug Count; Keeping Fit and Healthy; Healthy Teeth |
| B | Ali's Thermometer; Bobby's Magnet; Sounds |
| A | The Ice Cube Race; Floating and Sinking; Growing Plants |

Each test at Levels A to F took the same general form: a source text followed by a series of questions (test items). All tests at a given level had the same number of items: Level A and B tests had 21 items, Level C tests had 24 items, Level D tests had 27 items and Level E and F tests had 30 items. The items used at all levels required only a minimal written response (particularly at Levels A and B), and could be easily marked.

The source texts could contain a mix of narrative and numbers, depending on the relevance to the Science topic. The first section of each Science literacy booklet contained a summary completion exercise, comprising one third of the total marks, for pupils to demonstrate their understanding of the main and supporting ideas in the source text. The remaining two thirds of each test comprised questions addressing skills and concepts such as hypothesising, predicting, making informed choices and fair testing. The purpose of this second section was to assess pupils' ability to think critically about the information.
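The split between the two sections can be worked through from the item counts given above. The sketch below assumes one mark per item, so that one third of the marks corresponds to one third of the items; the function and variable names are illustrative only.

```python
# Items per test at each 5-14 level, as stated in the text.
ITEMS_PER_TEST = {"A": 21, "B": 21, "C": 24, "D": 27, "E": 30, "F": 30}


def split_items(level):
    """Return (summary completion items, remaining items) for a level."""
    total = ITEMS_PER_TEST[level]
    summary = total // 3  # one third of the test (assumes one mark per item)
    return summary, total - summary


for level in "ABCDEF":
    summary, remaining = split_items(level)
    print(level, summary, remaining)
```

For Levels A, C and E this gives summary exercises of 7, 8 and 10 gaps, which agrees with the gap counts quoted for the three tests in Table II.3.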

Table II.3 describes one test at each of Levels A, C and E.

Table II.3 Overview of three newly developed Science literacy tests

'Good Day, Sunshine' - Level E

The 600-word source describes the risks and the health benefits of the sun. As with other tests at this level, a 10-gap summary completion exercise tests understanding of the text. A further 20 items comprising multiple choice and short answer formats test pupils' ability to think critically and make hypotheses, and their understanding of the concept of fair testing.

'The Big Bug Count' - Level C

The 300-word source, including a graph and a fact box, describes the recent survey carried out by the Royal Society for the Protection of Birds on the number of midges and flying insects in Scotland. As with other tests at this level, an 8-gap summary completion exercise tests understanding of the text. A further 16 items comprising multiple choice and short answer formats test pupils' ability to make predictions and think critically, and their understanding of the graph and the concept of fair testing.

'Growing Plants' - Level A

The 200-word source, including a chart, describes how three children carry out an experiment to find out what plants need in order to grow. As with other tests at this level, a 7-gap aided summary completion exercise tests pupils' understanding of the text. The multiple choice and short answer items that follow test pupils' understanding of the chart, the concept of fair testing and their ability to make basic predictions.

Test development

The three tests at each of the 5-14 levels were developed by a group of three to four teachers comprising secondary Science teachers, secondary English language teachers and primary teachers with expertise in English language and/or Science.

The 5-14 level of each source text and test was independently validated, and further validation evidence was gathered from teachers participating in the pre-testing. Each source text and test was validated at the level used in the survey.

Comprehensive pre-testing of the Science literacy tests was carried out. Each Science literacy test was completed by at least 200 pupils. Items that were found to be problematic in any way were either amended or discarded. Pre-testing was also used to inform the development of coding sheets to be used for marking.

Teachers participating in pre-testing were given the opportunity to comment on the source text, items and mark schemes.

Task administration

All pupils in the P3 sample attempted one Science literacy test at each of Levels A and B. At each of the other stages, pupils assessed in Science literacy attempted a combination of two tests from three levels: Levels B, C and D at P5, Levels C, D and E at P7, and Levels D, E and F at S2.

The three different Science literacy tests to be completed by individual pupils were presented to them in individual test booklets with separate source material. At P3 there were six different booklets; three booklets at each of the two levels. At the remaining stages there were nine different booklets; three booklets at each of three levels.

The Science literacy tests were administered by the schools themselves. The supervising teachers were asked to give out the assessment materials and to supervise the pupils while they were working. The teacher could explain what had to be done, but was not allowed to provide answers or confirm that a pupil's answers were correct. The tests were not timed, and schools had flexibility in how they were administered. They were encouraged to give pupils a break between each test, and if possible set each test on a different day. Schools were asked to return all the test material for marking once all the tests had been administered.

Task development within the 5-14 level structure

Tests were developed and validated by Science and English teachers and were pre-tested by pupils at the appropriate stages. A total of 18 tests were developed, resulting in three tests per level available for use in the survey. Alongside the pencil and paper tests, practical tasks were also developed to assess pupils' ability to take part confidently in discussions with others about issues involving Science, the fifth statement of the Nuffield definition. It would have been impossible to assess this area in a written test. A full description of the development and delivery of these tasks is included later in this chapter.

Overall achievement results by cut-off scores, as reported in Chapter C, show the expected pattern of achievement, with better achievement in earlier stages and in lower levels within stages. Average test scores indicate that tests validated at similar 5-14 levels were of similar difficulty. Looking across levels within stages shows little evidence of a clear distinction between Levels D, E and F. This phenomenon, at least as regards Levels D and E, is also seen in the main knowledge and understanding test results reported in Chapter B. Table II.4 shows the mean percentage score of each of the 18 test booklets by stage.

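Table II.4 Mean test scores, by stage and level

| Stage | Level | Test ID | Mean Percentage Score |
|---|---|---|---|
| P3 | A | SL01 | 82 |
| P3 | A | SL02 | 78 |
| P3 | A | SL03 | 74 |
| P3 | B | SL04 | 75 |
| P3 | B | SL05 | 73 |
| P3 | B | SL06 | 63 |
| P5 | B | SL04 | 87 |
| P5 | B | SL05 | 88 |
| P5 | B | SL06 | 75 |
| P5 | C | SL07 | 57 |
| P5 | C | SL08 | 60 |
| P5 | C | SL09 | 60 |
| P5 | D | SL10 | 36 |
| P5 | D | SL11 | 41 |
| P5 | D | SL12 | 38 |
| P7 | C | SL07 | 73 |
| P7 | C | SL08 | 76 |
| P7 | C | SL09 | 70 |
| P7 | D | SL10 | 46 |
| P7 | D | SL11 | 51 |
| P7 | D | SL12 | 54 |
| P7 | E | SL13 | 51 |
| P7 | E | SL14 | 43 |
| P7 | E | SL15 | 49 |
| S2 | D | SL10 | 56 |
| S2 | D | SL11 | 63 |
| S2 | D | SL12 | 63 |
| S2 | E | SL13 | 59 |
| S2 | E | SL14 | 52 |
| S2 | E | SL15 | 60 |
| S2 | F | SL16 | 53 |
| S2 | F | SL17 | 46 |
| S2 | F | SL18 | 54 |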
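The comparison across levels within a stage can be made easier to see by averaging the three test means at each stage and level. The sketch below uses the figures from Table II.4. It takes a simple unweighted mean of the three test means, which is an assumption made for illustration and not necessarily how the survey summarised its results.

```python
# Mean percentage scores from Table II.4, keyed by (stage, level).
SCORES = {
    ("P3", "A"): (82, 78, 74),
    ("P3", "B"): (75, 73, 63),
    ("P5", "B"): (87, 88, 75),
    ("P5", "C"): (57, 60, 60),
    ("P5", "D"): (36, 41, 38),
    ("P7", "C"): (73, 76, 70),
    ("P7", "D"): (46, 51, 54),
    ("P7", "E"): (51, 43, 49),
    ("S2", "D"): (56, 63, 63),
    ("S2", "E"): (59, 52, 60),
    ("S2", "F"): (53, 46, 54),
}


def level_means(scores=SCORES):
    """Unweighted mean of the three test means per stage and level."""
    return {key: round(sum(vals) / len(vals), 1)
            for key, vals in scores.items()}


for (stage, level), value in level_means().items():
    print(stage, level, value)
```

At S2 this gives 60.7, 57.0 and 51.0 for Levels D, E and F, and at P7 it gives 50.3 and 47.7 for Levels D and E, against 73.0 for Level C.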

II.3 Writing

Assessing writing achievement - writing in a Science context

As the focus of the SSA 2007 was Science, teachers were asked to submit a functional piece of writing for a given sub-sample of pupils which had been generated through a Science context. The writing could be an extended, complete piece, assessed using the 5-14 national criteria, or a short, continuous piece assessed using a 'best-fit' approach. Details of the 'best-fit' approach and some suggested topics for administration were included in the Guidance for Selecting Writing document distributed to all participating schools.

The best-fit approach

The best-fit approach was developed for assessing writing in the Assessment of Achievement Programme 2003 to address difficulties in applying extended writing criteria to short pieces of writing. The best-fit descriptors at each 5-14 level are allied to the corresponding level of the 5-14 national writing criteria. There should be no difference in the quality of the writing expected, but by using a holistic rather than an analytical approach, the assessor is able to assign a level which best 'fits' a short piece of writing. It may well be that a piece of writing reflects standards contained at a number of levels, but, using professional judgement, the teacher decides which level is the best-fit.

Teachers opting to use the best-fit approach were advised that pupils should be given ample opportunity to discuss a topic before beginning writing and should be reminded of the criteria which would be used to assess their writing.

Examples of topics for generating short pieces of writing:

Should animal culling be allowed?

Should humans interfere with nature or should natural selection be allowed to prevail?

Should drugs for treating illness be available to everyone regardless of the cost?

Is experimenting on animals ever acceptable?

Should pupils be forced to eat healthy meals at school?

Should people who drive big cars pay more for their petrol?

Should people be fined for not recycling their domestic rubbish?

Level F

There are no national writing criteria at Level F for assessing functional writing. However, schools may have devised their own criteria or be using professional judgement to identify writing at Level F. For the purpose of the survey, schools were asked to record on the register either Level F or Level E* for pupils considered to be writing beyond Level E.

Level descriptions using the best-fit approach

The following best-fit scheme was provided to evaluators:

Read the piece of writing, ideally more than once.

  • Do the language and structure meet the conventions of the genre?
  • Does the writing address the purpose of the task?

Once you are satisfied that the writer has addressed the task set, using professional judgement, mentally award the writing a level. Read the description for the appropriate level and decide if the piece of writing fits the description. Emphasis must be placed on the criteria highlighted in bold. Because you are using a best-fit approach, the piece of writing might not meet the criteria fully. This is acceptable. If the writing appears to sit equally well at two levels, look for the relative strengths and weaknesses within the writing and decide if the strengths outweigh the weaknesses or vice versa.

If in your professional judgement a piece of writing is insufficient to meet the requirements for Level A, record it as an 'N'.

Level A

The writing conveys one or two details which are linked and mostly relevant. Common linking words are used to organise ideas (e.g. and, then). A capital letter and a full stop are used to mark at least one sentence. Commonly used words are spelt accurately.

Level B

The writing conveys a main idea with sufficient information to make the message clear. The information is mostly organised logically. Common linking words are used to organise ideas into sentences (e.g. and, then, but, so, that) and punctuation is beginning to support what has been written. An increased range of commonly used words is spelt accurately.

Level C

The writing conveys a clear sense of ideas that are organised logically in the main without significant omission or repetition. There is a simple conclusion, where appropriate. The punctuation mainly supports what has been written. Less commonly used words are spelt with increasing confidence and accuracy.

Level D

Ideas are described in detail and are logically and clearly organised throughout. The writing includes relevant and consistent supporting detail. There is a simple but effective conclusion, where appropriate. There is some variety in sentence structure and most sentences are punctuated accurately. Most of the words needed for the task are accurately spelt.

Level E (or above)

The writing begins to convey discernment. Ideas are logically and clearly organised throughout and are well-linked and supported with appropriate detail. There is a well developed, effective conclusion. There is appropriate variety in sentence structure and sentences are accurately constructed, linked and punctuated. Spelling is accurate in the main.

Selecting and assessing class-based writing

A Guidance for Selecting Writing document distributed to all participating schools provided advice on selecting appropriate material and how much teacher support was permitted. The piece of writing selected was to reflect the level at which the pupil was currently working. Schools were advised that each piece of writing should be assessed by the class teacher and one other teacher or a promoted member of staff from the school using the 5-14 national writing criteria or the best-fit criteria. Teachers were asked not to annotate the level with '+' or '-' or to record two levels e.g. D/E. The level awarded was to be recorded on the register provided but not recorded on the script itself. Where there was disagreement between the two markers, teachers were asked to discuss and come to a final decision. Teachers were also asked to indicate on the register which set of criteria was used to assess each piece of writing.

Schools were asked not to submit letters, lists of bullet points, group work, leaflets or posters, imaginative or personal writing. They were advised that the piece of writing could come from the pupil's folio, jotter or from a wall display, but that scripts would not be returned to schools so a photocopy could be sent rather than the original.

Schools were informed that a proportion of randomly selected scripts would be centrally moderated by a group of teachers nominated by their education authorities.

The moderation of writing in a Science context

The moderation event took place over five days, in late October 2007. Every local authority was offered one fully funded place with the option of sending additional partially funded representatives. There were 69 teachers representing 30 local authorities.

In addition to judging scripts, the participants were addressed by guest speakers from Learning and Teaching Scotland, the University of Belfast and the University of London.

The week was punctuated with plenary discussions focusing on particular pieces of writing (which were not themselves moderated), in order to facilitate a shared understanding of the standard and an evaluation of the 5-14 national writing criteria. At the request of a number of teachers, discussions took place specifically targeting P3 teachers and P7/S2 teachers.

A qualitative evaluation of the scripts took place towards the end of the week where moderators had the opportunity to share their perceptions of the general strengths and weaknesses of the writing they had moderated.

Moderators were also given the opportunity to discuss issues and concerns relating to assessment policy and practice in general.

The moderators were organised into pairs and assigned scripts from one stage only: P3, P5, P7 or S2. Teachers assigned P3 or P5 scripts had experience in the lower or middle stages of primary, and those assigned P7 or S2 scripts had experience in upper primary or secondary. Primary teachers with upper stage experience were paired with secondary teachers.

All scripts submitted with a teacher judgement were selected for moderation and organised into batches of approximately 30 at P3, 25 at P5 and P7 and 20 at S2. Each batch was marked by two teachers working independently of each other, so that each writing script had three independent judgements as to the level: two moderator judgements and the original class teacher's judgement. Over 9,000 scripts were submitted for the survey and double marked during the moderation week.

Throughout the week exemplar scripts were discussed in plenary sessions, and levels agreed. The process began with teachers reading the piece of writing and then offering comments on the strengths and weaknesses. Participants were encouraged to challenge any comment with which they disagreed. The next step, using professional judgement, was to suggest a 5-14 level for the piece of work. The outcome typically straddled two levels. Either an analytical approach was then adopted using the bullet point descriptions of achievement stated in the national criteria for extended writing, or a holistic approach was adopted using the 'best-fit' descriptions. Some discussions involved using both approaches. Each bullet point across the levels was discussed until a level was finally agreed. This was a time-consuming process, but all teachers agreed that it was a necessary and invaluable experience. In addition to promoting the understanding of standards, the discussion permitted the production of material for creating exemplification of extended writing and evaluation of the writing criteria.

The moderators echoed the message from the 2005 and 2006 moderation exercises that the experience of working with colleagues from different schools, authorities and sectors is invaluable.

Writing in the context of Science

When interpreting the figures below it is important to understand exactly what they represent. Teachers were asked to provide examples of continuous writing within the context of Science that reflected the pupils' current level of writing. Where no writing was available, schools were given the option of generating 'short pieces of writing' and provided with a list of suggested topics for producing writing in a Science context. The exact requirements were made clear in the guidance sent to schools.

It is important to note the distinction between 'writing in the context of Science' as we have defined it and what could be considered as 'scientific writing'. Writing in the context of Science is marked on the basis of the 5-14 national writing criteria. Whilst credit might in principle be available for Science knowledge, understanding or interpretation in some other assessment context, pupils' lack or misunderstanding of Science knowledge would not militate against them in this particular assessment exercise. In other words, they were here assessed on their ability to communicate their scientific thinking even if that thinking might have been flawed. The Science context was simply a vehicle to allow pupils to display their generic writing ability. However, some teachers might have submitted 'scientific writing', for example the write-up of an experiment, if such writing was available and it met the requirements set out in the guidance document; this would nevertheless be evaluated for general writing ability and not for scientific accuracy.

The alternative interpretation of 'scientific writing' would be writing about Science that required pupils to display knowledge, understanding or interpretation of specific scientific topics or concepts. Examples of this would be essays describing a scientific theory or a lab report. In a lab report it would have been perfectly acceptable (and in fact often desirable) to summarise points in bulleted lists; in our definition of writing in a Science context bulleted lists are not acceptable. Should the Science content of the writing be the focus of assessment, then pupils would gain credit for displaying their knowledge of the Science in question, and although writing ability would be essential for the pupil to be able to transmit their ideas, it is the ideas that would be important and it is their demonstration that would be judged. This was not the case here, and it was made clear in the guidance that this is not what was required.

Many schools opted to provide short pieces of writing which addressed one of the suggested topics in the guidance document, which suggests that continuous writing in a Science context is not prevalent in schools. In the opinion of the moderators much of the pupils' writing attempted to transmit all their knowledge on a specific subject rather than displaying the pupils' general writing ability within the context of the Science subject in hand. This is arguably the result of asking pupils to write about a topic of which they may have very limited knowledge. This would not have gained them credit in this assessment, but may have in their normal class work. Care should be taken when comparing the teacher judgements with the moderated results as it is possible the judgements have been arrived at on a slightly different basis.

Comparison of achievement levels between different methods of writing assessment

A 'moderated level' was calculated from the independent level judgements of the two moderators and the original class teacher. If at least two of the three judgements agreed, that level was allocated to the script; that is, if the two moderators agreed on the level, or if one moderator and the class teacher agreed, their decision became the moderated level. Where no two judgements agreed, no moderated level was defined.
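
The majority rule described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used in the survey: the function name `moderated_level` and the representation of levels as strings are assumptions made for the example.

```python
from collections import Counter
from typing import Optional


def moderated_level(moderator_1: str, moderator_2: str, teacher: str) -> Optional[str]:
    """Return the level on which at least two of the three judges agree.

    Returns None when all three judgements differ, i.e. when no
    moderated level is defined for the script.
    """
    # most_common(1) gives the single most frequent level and its vote count
    level, votes = Counter([moderator_1, moderator_2, teacher]).most_common(1)[0]
    return level if votes >= 2 else None


# Hypothetical judgements, for illustration only
print(moderated_level("B", "B", "A"))  # both moderators agree
print(moderated_level("B", "C", "C"))  # one moderator and the teacher agree
print(moderated_level("A", "B", "C"))  # no agreement
```

Treating the three judgements symmetrically gives the same result as the two cases named in the text, since any pair that agrees among two moderators and one teacher is either the two moderators or one moderator and the teacher.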

Of the 9,031 writing pieces that were received (and that were able to be linked with the relevant teacher's judgement) 7,212 had a moderated level assigned. The vast majority of those that did not receive a moderated level were cases where there was no majority agreement about level. There were also a few cases where moderators were unable to judge a piece of work because it was too short or otherwise unsuitable.

For the pupils that did receive a moderated level Figure II.4 shows the difference between this level and that originally assigned by the class teacher.

Figure II.4: Profiles of writing achievement based on moderated and unmoderated pieces of writing, by stage

The differences in achievement, as assessed by the two methods, vary across the four stages. In P3 the pupils' class teachers judged most of them, over half, to be at Level A, with fewer, 40 per cent, at Level B. Looking at the moderated results the picture is reversed, with a larger proportion of pupils' work, almost half the scripts, put at Level B. The pupils' own class teachers therefore made less optimistic judgements of their pupils' achievement than the moderated levels would suggest were warranted.

At P5 the moderation exercise hardly changed the original picture. At P7 the moderated picture is less positive than that based on class teachers' judgements, while at S2 this discrepancy is even more severe: according to the pupils' teachers, 84 per cent of the S2 pupils were at Level D or above in writing compared with 56 per cent on the basis of moderated results.

Of course, teachers have more evidence available to them than the moderators, who must base their assessments on one piece of work. It is worth noting, however, that teachers were asked to submit a piece of work which matched the level of their judgement.

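Table II.5
Moderated levels of pupils' writing achievement by teacher judgements

Rows give the teacher's assigned level; columns give the moderated level. N is the number of scripts and % is the percentage of the scripts at that moderated level.

| Teacher's assigned level | Below A N | Below A % | A N | A % | B N | B % | C N | C % | D N | D % | E N | E % | F N | F % | Total scripts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Below A | 80 | 59 | 67 | 6 | 27 | 1 | 1 | 0 | 1 | 0 | - | - | - | - | 176 |
| A | 43 | 32 | 802 | 74 | 380 | 17 | 31 | 2 | - | - | - | - | - | - | 1,256 |
| B | 9 | 7 | 157 | 14 | 1,379 | 63 | 239 | 12 | 30 | 2 | 1 | 0 | - | - | 1,815 |
| C | 2 | 1 | 35 | 3 | 265 | 12 | 1,269 | 63 | 109 | 8 | 10 | 2 | - | - | 1,690 |
| D | 2 | 1 | 17 | 2 | 94 | 4 | 334 | 17 | 922 | 71 | 73 | 17 | - | - | 1,442 |
| E | - | - | 6 | 1 | 42 | 2 | 122 | 6 | 192 | 15 | 304 | 72 | - | - | 666 |
| F | - | - | 1 | 0 | 2 | 0 | 20 | 1 | 36 | 3 | 33 | 8 | 18 | 100 | 110 |
| Total | 136 | 100 | 1,085 | 100 | 2,189 | 100 | 2,016 | 100 | 1,290 | 100 | 421 | 100 | 18 | 100 | 7,155 |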

Table II.5 shows the distribution of original teacher judgements around the different moderated levels. For all the levels, the majority of teachers' assessments of writing agree with the moderated level. Over 90 per cent of teacher judgements are within one level of the moderated level.
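
The agreement figures quoted above can be checked directly from the counts in Table II.5. The sketch below is illustrative only: the names `COUNTS` and `scripts_within` are introduced for the example, and cells shown as '-' in the table are assumed to be zero.

```python
# Rows: teacher's assigned level; columns: moderated level.
# Order in both directions: Below A, A, B, C, D, E, F.
# Cells printed as '-' in Table II.5 are assumed to be zero.
COUNTS = [
    [80, 67, 27, 1, 1, 0, 0],
    [43, 802, 380, 31, 0, 0, 0],
    [9, 157, 1379, 239, 30, 1, 0],
    [2, 35, 265, 1269, 109, 10, 0],
    [2, 17, 94, 334, 922, 73, 0],
    [0, 6, 42, 122, 192, 304, 0],
    [0, 1, 2, 20, 36, 33, 18],
]


def scripts_within(counts, tolerance):
    """Count scripts whose teacher level is within `tolerance` levels
    of the moderated level (tolerance 0 means exact agreement)."""
    return sum(
        n
        for i, row in enumerate(counts)
        for j, n in enumerate(row)
        if abs(i - j) <= tolerance
    )


total = sum(sum(row) for row in COUNTS)
exact = scripts_within(COUNTS, 0)
within_one = scripts_within(COUNTS, 1)

print(f"total scripts: {total}")
print(f"exact agreement: {exact} ({exact / total:.1%})")
print(f"within one level: {within_one} ({within_one / total:.1%})")
```

On these counts roughly two thirds of teacher judgements match the moderated level exactly, and the share within one level is consistent with the 'over 90 per cent' stated in the text.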

II.4 Practical assessment

Introduction

In the 2007 survey there were four types of practical assessment: Science investigation skills, Science literacy, working with others/problem solving and ICT. The practical tasks were administered and pupils' performances assessed by field officers during school visits. Field officers were teachers nominated by their local authorities as having an expertise and/or an interest in Science. Approximately 160 teachers received training as field officers. Working in pairs, they visited between four and ten schools, assessing up to twelve pupils at a stage at each school. Field officers received training in administering and assessing the practical tasks through an optional on-line Virtual Learning Environment and a face to face workshop day.

II.4.a Science investigation skills

Task description

Science practical investigation skills were assessed by field officers taking part in one-to-one conversations with pupils about an investigation which they had recently carried out. To allow this to happen in a consistent way across Scotland, schools were asked to undertake small Science investigations prior to the visit of the field officers.

Sixteen investigation topics were developed for the survey and distributed across eight investigation packs.

| Stage | Pack | Investigations |
|---|---|---|
| P3 | A | 1 Ramp; 2 Marbles; 3 Writing paper; 4 Jelly |
| P3 | B | 2 Marbles; 3 Writing paper; 5 Chocolate; 6 Jumps |
| P5 | C | 1 Ramp; 4 Jelly; 7 Parachutes; 8 Mirror writing |
| P5 | D | 5 Chocolate; 6 Jumps; 9 Paper towels; 10 Ice |
| P7 | E | 7 Parachutes; 11 Pulse rate; 12 Super gripper; 13 Solubility |
| P7 | F | 8 Mirror writing; 9 Paper towels; 10 Ice; 14 Lather |
| S2 | G | 8 Mirror writing; 13 Solubility; 15 Cleaning; 16 Pendulum |
| S2 | H | 11 Pulse rate; 14 Lather; 15 Cleaning; 16 Pendulum |
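
The allocation of topics to packs can be written down as a small data structure and checked for consistency with the text. The sketch below is an illustration, not survey material. It assumes that the pack table is read row by row, with the first pack of each pair in the left-hand column; the names `TOPICS`, `PACKS`, `packs_for_stage` and `topics_covered` are introduced for the example.

```python
TOPICS = {
    1: "Ramp", 2: "Marbles", 3: "Writing paper", 4: "Jelly",
    5: "Chocolate", 6: "Jumps", 7: "Parachutes", 8: "Mirror writing",
    9: "Paper towels", 10: "Ice", 11: "Pulse rate", 12: "Super gripper",
    13: "Solubility", 14: "Lather", 15: "Cleaning", 16: "Pendulum",
}

# pack letter -> (stage, topic numbers); assumed reading of the pack table
PACKS = {
    "A": ("P3", (1, 2, 3, 4)),
    "B": ("P3", (2, 3, 5, 6)),
    "C": ("P5", (1, 4, 7, 8)),
    "D": ("P5", (5, 6, 9, 10)),
    "E": ("P7", (7, 11, 12, 13)),
    "F": ("P7", (8, 9, 10, 14)),
    "G": ("S2", (8, 13, 15, 16)),
    "H": ("S2", (11, 14, 15, 16)),
}


def packs_for_stage(stage):
    """Return the pack letters available at a given stage."""
    return sorted(pack for pack, (s, _) in PACKS.items() if s == stage)


def topics_covered():
    """Return the set of topic numbers that appear in at least one pack."""
    return {t for _, topics in PACKS.values() for t in topics}


print(packs_for_stage("P7"))
print(all(len(topics) == 4 for _, topics in PACKS.values()))
print(topics_covered() == set(TOPICS))
```

Under this reading there are eight packs of four investigations each, and all sixteen topics appear in at least one pack.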

Task development and pre-testing

The tasks were developed by trained task developers and currently practising teachers, to allow pupils to demonstrate their skills in carrying out Science investigations. The topics were selected as being appropriate, interesting and stimulating for pupils at the relevant stages.

All investigation skills material was pre-tested and teachers participating in pre-testing were given the opportunity to comment on the tasks and instructions.

Task administration

Schools were provided with a pack containing four stage-appropriate investigations and asked to choose one which fitted best with their planned programme of study. Schools could choose to use an already planned investigation of their own as long as it could be assessed using the level-based criteria provided in the pack (Table II.6). Teachers were asked not to assess the pupils' performance but to ensure that each pupil kept a record of the investigation for discussion with the field officer. A suggested format for this report was included in the investigation pack.

The investigation may have been carried out with a group which included the selected survey pupil(s), or with an entire class; it was left up to the class teacher to manage this aspect of the conduct of the investigation.

Field officers were provided with a question protocol for discussing with pupils the strands 'Preparing for tasks', 'Carrying out tasks' and 'Reviewing and reporting on tasks' (Figure II.5). The pupils' reports of their investigation were used as the basis for this discussion. The field officers used the Science investigation skills criteria to arrive at a best-fit judgement of pupils' skill levels. While it was likely that a pupil would not meet all the criteria at a level, certain key criteria had to be met in order to award the level. (These key criteria and associated questions are indicated in bold on the protocol and criteria table.)

Table II.6 Criteria for practical science investigations

Level A: in discussion with the teacher.
Level B: in discussion with the teacher as part of a group.
Level C: in discussion with the teacher as part of a group.

Attainment target: Preparing for tasks

Level A

  • can respond to questions about the planning of an investigation
  • can agree on what might happen from several suggestions provided

Level B

  • can help plan a simple approach by making suggestions, asking questions or drawing pictures
  • can respond to the question 'what might happen?'
  • can recognize when a test is unfair

Level C

  • can suggest a question to investigate
  • can identify relevant apparatus/resources/information
  • can plan a sequence of activities
  • can identify at least 2 investigation variables
  • can make suggestions about what might happen

Attainment target: Carrying out the task

Level A

  • can follow simple instructions
  • can use simple techniques and apparatus
  • can measure using non-standard units
  • can observe and identify
  • can make a simple record of the investigation, e.g. drawing pictures, pictorial charts, writing captions and lists

Level B

  • can observe and identify obvious features and events
  • can use simple apparatus/techniques to collect information, e.g. classification
  • can measure using standard units (time/weight/length)
  • can record by completing a 2x2 table with both headings/units and values supplied
  • can complete labelled diagrams, simple charts, picture sequences and databases

Level C

  • can use the selected apparatus with appropriate teacher guidance
  • can complete a 2x2 results table with headings and units supplied
  • can draw a bar graph with axes supplied

Attainment target: Reviewing & reporting

Level A

  • can answer questions about what happened
  • can give an oral account of their part in the work

Level B

  • can present their work in a short unstructured written/oral report
  • can answer questions on the meaning of the findings
  • can identify simple direct relationships including cause and effect

Level C

  • can produce a short report (spoken or written) describing how the apparatus was used to get results
  • report should include a simple diagram of how apparatus was used
  • can answer questions on findings
  • can make links to original predictions
  • can make one suggestion for improvements to the investigation

Note: Key criteria are highlighted in bold.

Table II.6 Criteria for practical science investigations (continued)

Level D: individually or as part of a small group with appropriate teacher input.
Level E: individually or as part of a small group with appropriate teacher input.
Level F: individually or as part of a small group with minimal teacher input.

Attainment target: Preparing for tasks

Level D

  • can suggest two questions to investigate
  • can identify the variable to be changed and at least 2 variables which are kept the same
  • can select appropriate apparatus from a display and give reason(s) for their choice
  • can make a prediction about what will happen and give a reason

Level E

  • can suggest three or more questions to investigate
  • can identify the variable to be changed, the variable to be measured and at least 3 variables which are kept the same
  • can make valid suggestions for the apparatus required to carry out the investigation
  • can predict the effect of changing one variable on the other and give a reason

Level F

  • can suggest a testable hypothesis
  • in addition to Level E: can suggest at least 3 values for the variable to be changed and can identify most of the variables which are kept the same
  • can identify most of the apparatus required to carry out their investigation
  • can discuss the relationship between variables in the original hypothesis

Attainment target: Carrying out the task

Level D

  • can use the selected apparatus with minimal teacher guidance
  • can complete a 2x2 results table with one heading and unit supplied
  • can draw a bar graph providing own axes or a line graph with given axes

Level E

  • can use the identified apparatus with minimal teacher guidance and repeat measurements at least twice
  • can complete a 2x2 results table without help and calculate an average for repeat measurements
  • can draw a bar graph or line graph defining own axes (some teacher guidance allowed with line graph)

Level F

  • can set up and use apparatus with minimal teacher guidance and repeat measurements
  • can produce own results table showing repeat measurements and average column, without assistance
  • can draw a bar graph or line graph defining own axes without help, using the average result in the graph

Attainment target: Reviewing & reporting

Level D

  • can produce a more structured report (spoken or written) describing how the apparatus was used to get results
  • report should include a simple labelled diagram of actual apparatus used
  • can answer questions on findings showing greater understanding
  • can identify shortcoming(s) and make at least one suggestion for improvement

Level E

  • can produce a written structured report describing in more detail how the apparatus was used to get results
  • report should include an accurately labelled diagram of apparatus
  • can suggest a link between the results and original prediction
  • can reflect critically on the investigation approach used and suggest improvement(s)

Level F

  • can produce a written structured report describing in detail the key stages of the investigation; scientific vocabulary should be evident
  • report should include accurately labelled diagram of apparatus and alterations/modifications
  • can identify links between the results and original hypothesis
  • can now evaluate a range of aspects of the investigation including reliability of evidence

Note: At Level F there should be clear evidence of understanding of the science underpinning the investigation

Figure II.5: Field officer protocol

Preparing for tasks

Did you work in a group, on your own or as part of the class?

Can you tell me a bit about it? What were you trying to find out?

Before you started carrying out the investigation, think back, what was the first thing you did?

How did you decide what you were going to do?

Did you have help deciding what to do?

Did you write down your ideas?

Did you discuss your ideas (group/class/teacher)?

What did you think was going to happen? Why did you think that was going to happen?

Think back to the question you were investigating - how would you know/prove/show that what you were going to do would answer that e.g. the person with the longest hand span could hold the most marbles?

Can you tell me what a fair test is?

To make this a fair test what things did you need to keep the same?

Is there anything else you needed to keep the same?

What variable would affect the results of your investigation?

Can you tell me 3 values for the variable you changed?

Carrying out tasks

What did you measure/record to show this?

Can you tell me how you measured it?

How did you make sure your measurements were accurate?

Can you tell me about the apparatus and how you used it?

Was there anything you needed to be careful with when using the apparatus?

Tell me about the investigation: how did you do it… what did you do next… then what… how did you know you were finished?

How did you make sure you and your group were safe?

Which job did you do? How did you decide what each person was doing?

What did you do with the measurements/observations you made?

Did you make up the table/list/drawing yourself?

Reviewing and reporting

Think about the question you were investigating: what did your results tell you?

Is this what you expected to happen? Show me where it tells you that.

Why might your results be different from what you thought would happen?

If you were to carry out this investigation would you do anything differently?

Did you draw a graph/chart to show your results? What does this table show?

Did you make this yourself or did you have some help?

Did you write a report on the investigation?

What did you say in your report?

Thinking about it now are there any changes you would make to it?

If I was to ask you to report to another class on your Science investigation, what would you tell them? Have you missed anything out?

Do you think this was a good question to investigate?

What might you investigate now?

Did you enjoy this activity?

Note: key questions are highlighted

Figure II.6: Example of Science Investigation

II.4.b Science literacy

Task description

Science literacy was assessed by field officers taking part in one-to-one conversations with pupils during school visits. The conversation was based on a topic introduced by the field officer using materials developed for the survey. These materials comprised a picture and a set of questions and statements concerning a topic of current scientific interest.

Twelve topics were used in developing the materials. The topics were:

  1. Big cars
  2. Child obesity
  3. School meals
  4. Zoos
  5. Wind farms
  6. Cheap flights
  7. Climate change
  8. Wolves
  9. Animal testing
  10. Designer babies
  11. GM crops
  12. Space exploration

These topics were then used as follows:

P3 - Topics 1, 2, 3, 4

P5 - Topics 1, 2, 3, 4, 5, 6, 7, 8

P7 - Topics 3, 4, 6, 7, 8, 9, 10, 11, 12

S2 - Topics 3, 4, 6, 7, 8, 9, 10, 11, 12

The topics were entirely independent of the school's programme of study. Although most topics were used at more than one stage, the language used and the questions asked were differentiated from stage to stage.

Task development and pre-testing

The tasks were developed to allow the pupils to demonstrate the extent of their Science literacy. The topics were selected as being interesting and relevant to pupils at the stage concerned. Small scale pre-testing of the Science literacy tasks was carried out. The main purpose of this exercise was to trial the activity (including field officer instructions), and the assessment materials that would be used by the field officers. Teachers participating in pre-testing were given the opportunity to comment on the tasks and instructions.

Task administration

Pupils were assessed by a field officer through a one-to-one conversation with the pupil. Topics were selected on a rotational basis for each pupil. The field officer used performance descriptors to arrive at a best-fit judgement of the pupils' Science literacy skills.
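
Selection 'on a rotational basis' can be illustrated with a simple round-robin over the topics available at a stage. The source does not describe how the rotation was carried out, so the sketch below is an assumption throughout: the function `assign_topics`, the pupil identifiers and the choice to restart the rotation for each group of pupils are hypothetical. Only the stage-to-topic lists are taken from the text.

```python
from itertools import cycle

# Topic numbers used at each stage, as listed in the text
STAGE_TOPICS = {
    "P3": [1, 2, 3, 4],
    "P5": [1, 2, 3, 4, 5, 6, 7, 8],
    "P7": [3, 4, 6, 7, 8, 9, 10, 11, 12],
    "S2": [3, 4, 6, 7, 8, 9, 10, 11, 12],
}


def assign_topics(stage, pupils):
    """Give each pupil the next topic in the stage's list, wrapping
    round to the first topic when the list is exhausted.

    Hypothetical procedure: the survey's actual rotation is not specified.
    """
    rotation = cycle(STAGE_TOPICS[stage])
    return {pupil: next(rotation) for pupil in pupils}


# Hypothetical pupil identifiers, for illustration only
print(assign_topics("P3", ["pupil_1", "pupil_2", "pupil_3", "pupil_4", "pupil_5"]))
```

A round-robin of this kind spreads pupils roughly evenly across the topics at a stage, which is one plausible reason for rotating topics rather than letting the field officer choose.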

Figure II.7: Example of the first page of a Science Literacy task

Table II.7 Assessment grid

| Nuffield descriptor | Code | Judgement |
|---|---|---|
| 1. Appreciates and understands impact of Science on everyday life | 1a | is unaware of issue |
| | 1b | is aware of issue |
| | 1c | is aware of issue and of its impact on everyday life |
| 2. Takes informed personal decisions about matters that involve Science | 2a | no evidence of personal decisions being influenced by knowledge |
| | 2b | some evidence of personal decisions being influenced by knowledge |
| | 2c | clear evidence of personal decisions being influenced by knowledge |
| 3. Reads and understands short text on topical scientific issue in discussion or with support | 3a | understands few/no essential points |
| | 3b | understands main points |
| | 3c | understands all essential points |
| 4. Reflects critically on the information included in or omitted from reports | 4a | makes no (or flawed) judgements about information in text |
| | 4b | makes simple judgements about information in text |
| | 4c | makes judgements about information included in and omitted from text |
| 5. Takes part confidently in discussion with others about issues involving Science | 5a | makes little or no contribution to the discussion |
| | 5b | contributes some ideas & participates freely in the discussion |
| | 5c | contributes several ideas to the discussion |
| 6. Overall assessment | 6a | mostly a |
| | 6b | mostly b |
| | 6c | mostly c |
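
The 'mostly a/b/c' overall assessment in row 6 of the grid can be read as the judgement that occurs most often across descriptors 1 to 5. The sketch below illustrates that reading; it is not the field officers' procedure, which was a best-fit professional judgement. The function name `overall_code` is introduced for the example. With five descriptors and three possible judgements, either one letter occurs at least three times or two letters occur twice each; the source does not say how the second case was resolved, so the sketch returns `None` for it.

```python
from collections import Counter
from typing import Optional, Sequence


def overall_code(judgements: Sequence[str]) -> Optional[str]:
    """Return '6a', '6b' or '6c' for the most frequent judgement across
    descriptors 1 to 5, or None when two judgements tie for first place
    (assumed here to be left to the field officer)."""
    ranked = Counter(judgements).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return "6" + ranked[0][0]


# Hypothetical sets of judgements for descriptors 1 to 5
print(overall_code(["a", "a", "a", "b", "c"]))
print(overall_code(["c", "c", "b", "c", "b"]))
print(overall_code(["b", "c", "b", "c", "a"]))
```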

II.4.c Group discussion (working with others/problem solving)

Task description

This was a group activity assessed by a field officer during the school visits. The task required pupils to work together to rank a number of statements/questions in order of importance/relevance.

The twelve topics developed for this task were:

  1. New trainers
  2. Healthy lunch
  3. Inventions and discoveries
  4. Global warming
  5. Moon base
  6. Saving water
  7. Endangered species
  8. Saving energy
  9. Speed record
  10. Nuclear power
  11. Cloning
  12. Animal testing

The topics were then used as follows:

P3 & P5: Topics 1-9

P7 & S2: Topics 1-12

The topics were entirely independent of the school's programme of study. Although most topics were used at more than one stage, the language used and the support materials provided were differentiated from stage to stage.

Task development and pre-testing

The tasks were developed by trained task developers, currently practising teachers, to allow the pupils to demonstrate their skills in working with others, problem solving and communication through a group activity. The topics were selected as being interesting and stimulating for pupils at the relevant stages. Small scale pre-testing of the group discussion tasks was carried out in order to trial the activity (including field officer instructions), and the assessment materials that would be used by the field officers. Teachers participating in pre-testing were given the opportunity to comment on the tasks and instructions.

Task administration

The tasks were administered and assessed by a field officer working with a group of three or four pupils. The group of pupils comprised a sample pupil and additional pupils (no fewer than two) nominated by the class teacher. The pupils themselves were unaware of their status as a sampled or additional pupil. Topics were selected on a rotational basis. The field officer used performance descriptors to arrive at a best-fit judgement of pupils' skills.

Table II.8 Group discussion assessment grid

Figure II.8: Example of a group discussion task

II.4.d ICT

Task description

The ICT tasks were assessed by field officers observing individual pupils during school visits. The tasks were developed to assess concepts, confidence, knowledge and skills involved in the use of ICT equipment and in the use of ICT as a core skill within a Science context.

Eighteen tasks were developed to allow pupils to demonstrate skills, knowledge and understanding in ICT capability across the 5-14 levels. The tasks were based on 'virtual experiments' carried out using a commercially available CD.

The topics for the tasks were:

  1. Forces and movement
  2. Grouping and changing
  3. Growing plants
  4. Pushes pulls
  5. Sorting and using
  6. Variation
  7. Characteristics of materials
  8. Keeping things hot
  9. Moving and growing
  10. Measuring friction
  11. The structure of plants
  12. Which substances melt
  13. Brightness of bulbs
  14. Exercise and pulse rate
  15. Forces all around
  16. Germination and growth
  17. Growing yeast
  18. Investigating insulators

These tasks were used as follows:

P3 - Tasks 1, 2, 3, 4, 5, 6

P5 - Tasks 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

P7 - Tasks 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18

S2 - Tasks 13, 14, 15, 16, 17, 18

The 5-14 ICT strands assessed were 'Collecting & Analysing' and, to a lesser extent, 'Using the Technology'. The tasks were designed to assess pupils' ability to collect and analyse information by carrying out a virtual experiment on screen, and then to transfer that information into a spreadsheet and if possible, into a graph (bar or line).

Task development and pre-testing

The tasks were developed by trained task developers, currently practising teachers, to reflect the way pupils might use ICT to support learning in Science. The topics were selected as being interesting and relevant to pupils at the relevant stage. The website material was copied to back-up CDs which field officers took out to schools. Tasks were trialled on a small-scale informal basis. The main purpose of this exercise was to trial the activity (including field officer instructions), and the best-fit descriptors that would be used by the field officers. The results provided an indication that the level descriptors were useful in making consistent judgements about pupils' ICT skills. Teachers participating in pre-testing were given the opportunity to comment on the tasks and instructions.

Figure II.9: Example of an ICT Task

Task administration

The tasks were assessed by a field officer through observation of, and in discussion with, the pupil. The field officer used a best-fit descriptor based on the 'Collecting and Analysing' and 'Using the Technology' strands. Pupils carried out the experiment on their school computer by accessing a 'protected' website or using the back-up CD. The final level was awarded by field officers on the basis of their observation of the pupil at work and the pupil's responses to the field officer's questions, matched to the best-fit level descriptor which most accurately described the pupil's performance (Table II.9).

Table II.9 Best-fit level descriptors

These descriptors address the following strands from the 5-14 guidelines:

  1. Collecting and analysing
  2. Using the technology

From observation of the completed task, and in discussion with the pupil, award a level based on the following best-fit descriptors. There may not be evidence to show that the task meets all the criteria fully, but make a decision based on available evidence and your professional judgement.

Level A - Pupil can answer simple questions specific to each activity. The pupil can, with support if necessary, complete the virtual experiment.

Level B - Pupil can enter data into a spreadsheet with prepared column headings and with field officer support. Pupil can answer simple questions specific to each activity. The pupil can save and retrieve the spreadsheet.

Level C - Pupil can enter data into a spreadsheet with prepared column headings. The pupil is able to navigate between the website and the spreadsheet independently.

Level D - Pupil can set up a simple spreadsheet, deciding on labels and entering data. Pupil can use spreadsheet tools to create a bar chart from spreadsheet data. Pupil can interrogate the database (searching and sorting) to answer task specific questions.

Level E - Pupil can use graphing tools to manipulate the bar chart; for example, change the graph title, colours, or add axes headings. Pupil can convert the bar chart to a line graph.

Level F - Pupil can use graphing tools to manipulate the bar chart; for example, change the graph title, colours, or add axes headings. Pupil can convert the bar chart to a line graph. Pupil can use an advanced function, such as the Autosum tool to calculate and display an average for the results column.

Page updated: Friday, June 6, 2008