ANNEX H: DETAILS OF THE 'HIGHEST QUALITY' EVIDENCE
Methodological and 'Quality Control' issues
H.1 Randomised controlled trials are regarded as the 'gold standard' method for the evaluation of interventions for the simple reason that aspects of their design help to control for the effects of error variation and bias. Many current systematic reviews, including the bulk of Cochrane reviews, use this fact to rationalise a decision not to include studies other than RCTs in their evaluation. The majority of systematic reviews, Cochrane or otherwise, even where they include other quantitative designs exclude any qualitative primary study following the reasoning that whilst methods of quality control for quantitative studies are comparatively well established, we have few guidelines with which to evaluate the methodological quality of a qualitative study. This is in fact overstating the case, since many of the principles adhered to by well-conducted qualitative studies have been explored in some detail and are well understood within the research communities employing these methodologies. A case in point is the extensive methodological work which has been undertaken in respect of Grounded Theory (Glaser & Strauss 1967).
H.2 The above notwithstanding, the quality of quantitative and qualitative studies cannot be directly compared in any very straightforward fashion, since quantitative research inherently relies on the 'objectivity' provided by, for example, statistical analysis, whilst qualitative research remains inherently subjective. The ideal use of the methodologies would be in tandem, since they are best suited to addressing quite distinct concerns. In the current context, it is clear that the methodologies have been used interchangeably and this represents something of a lost opportunity. The majority of the qualitative studies included in this review have sought to establish conclusively that changes in the behaviour of individuals or groups have occurred. Since the methods used are inherently subjective, it can never be conclusively determined from these studies that this is actually the case.
H.3 Sadly, very few qualitative studies to date have taken advantage of the main benefits of a qualitative approach to provide an account of the 'lived experience' of interventions from the viewpoint of either clients or practitioners, or to explore the issues raised by intervention (for example, adverse drug-related events, the impact on friends and relatives and so on) at a deeper level. The ideal for future research would be to attach in-depth qualitative research following a well-established methodology to large scale quantitative studies which have the statistical power to establish outcomes objectively but struggle to elucidate either the perceptions of the client or the perceived or actual causes behind the success or failure of an intervention.
H.4 Returning to the issue of measuring 'quality', many of the basic principles which apply to randomised controlled trials in fact apply also to the bulk of quantitative methodologies. This is a point which seems to have been missed by reviewers who choose to exclude quantitative studies other than RCTs on the grounds that their quality cannot readily be evaluated. Whilst, in the specific context of evaluating an intervention, a well conducted single-group follow-up is not on a level playing field with a well-conducted RCT, certain core principles apply equally to both methodologies. Including all designs but controlling for overall quality with reference to the 'core principles' avoids the problem of, in effect, favouring even a very poorly conducted RCT over a well designed study following an alternative quantitative methodology.
H.5 A final point of relevance to the quality control decisions taken in this review relates to the issue of 'quality scores'. It has become increasingly common in systematic review methodology to rate studies on the basis of a 'total quality score', generally derived from existing scales such as the Jadad scale (cf. Moher et al 1996). Despite the 'rule of thumb' purposes for which these scales were developed, they are now commonly treated as if they are equivalent to psychometric scales, in short that a score of '2' is in some real sense twice the quality value of a score of '1'. This entirely misses the point of the nature of the items contained in the scales. The individual items measure qualitatively different aspects of study design. The value of 'blinding' for example (where either the participant or investigator or both are unaware of which intervention a participant is receiving) is that it has the potential to reduce the risk of bias introduced by the preferences or observations/speculations of those conducting or taking part in the trial. In contrast, the benefits of randomisation are that it reduces error variance which may be introduced by, for example, the traits shown by participants which correlate with outcomes but which are not directly connected to the intervention. Summative scores adding points for blinding and points for randomisation are therefore meaningless except as a rough guide to how many potential sources of bias have been controlled for in a study and they should be approached in this way.
H.6 It remains a debatable issue which of the various possible design flaws carries the greatest weight. Where there is a need to inform intervention practice, as in the case of the current review, it is also true that quality judgements are of greatest practical value when they are seen as relative rather than absolute. Reaching the conclusion that no studies meet the ideal design and therefore there is no evidence for any intervention is not helpful, although it may be true in the strictest sense. A more useful approach in pragmatic terms is to identify the messages which can be taken from the best quality studies we currently have available, however poorly designed these may be, with the caveat that their limitations should be recognised.
H.7 Taking the above points into account, we have used the following strategy for evaluating the relative quality of the available material:
Qualitative studies are judged separately from quantitative studies and on the basis of their own individual merit (including whether they follow a well-established methodology) with outcomes presented separately
Quantitative studies are evaluated on the basis of whether the following key aspects of study design are adhered to:
- Adequate sample size to evaluate outcomes
- Low drop-out rates
- Randomisation (of selection and/or allocation of participants)
- Blinding (of participant, investigator or both)
- Control for fidelity of implementation of the intervention
- Baseline evaluation of outcome measures
- Intention-To-Treat analysis
- Number of outcome measures used
- Placebo control (direct or by comparison with equivalent non-participating groups)
- Washout (participants beginning a study having previously been without medication or other treatment for a period preceding the study start point)
For studies involving the direct comparison of two or more groups the following additional quality markers are used:
- Equal group size at baseline
- Equality of groups at baseline on relevant outcome and demographic measures
H.8 To draw comparisons between studies we use a 'total quality score' for pragmatic reasons, since individual in-depth evaluation of every study would not be feasible in a review taking into consideration 200 studies, but we would argue for caution in the over-interpretation of these scores, as they provide a rough guide to the number of potential sources of error in the design of a study only. Taking the scores as a means of identifying which studies are of the 'best' quality relative to the other studies available in each context (evaluation of interventions for suicide versus for self-harm etc.) we compare each study's score to the median total score for that group of studies, and use as a cut-off for 'highest quality' whether or not studies match or exceed the median total score for their group. Hence, the 'highest quality' studies evaluating interventions for self-harm will be those quantitative and qualitative studies matching or exceeding the median 'total quality score' for all studies evaluating outcomes for self-harm and adopting the same broad methodological stance. This having been said, in presenting outcomes for the 'highest quality' quantitative and qualitative studies identified we give greater credence to those studies which provide well conducted statistical analyses supporting the conclusions reached by their authors.
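As an illustration only, the median cut-off described in H.8 might be sketched as follows. The study labels, group names and scores are hypothetical, and the function `highest_quality` is an assumption introduced for this example rather than the procedure actually used in the review.

```python
from statistics import median


def highest_quality(scores_by_group):
    """Return, for each group, the studies whose total quality score
    matches or exceeds the median score for that group."""
    selected = {}
    for group, scores in scores_by_group.items():
        cutoff = median(scores.values())
        selected[group] = sorted(
            study for study, score in scores.items() if score >= cutoff
        )
    return selected


# Hypothetical study labels and scores, for illustration only.
example = {
    "self-harm, group comparison": {"A": 3, "B": 6, "C": 7, "D": 5},
    "suicidal ideation, single group": {"E": 2, "F": 3, "G": 4},
}
print(highest_quality(example))
```

Because the cut-off is taken within each group, a score that qualifies in one group may fall short in another; this reflects the relative rather than absolute reading of quality argued for in H.6.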
Quality profile of the studies as a whole
H.9 To inform future research in this field, we will briefly consider the profile of individual 'quality markers' for the studies taken as a whole. For this overview, we combine together qualitative and quantitative studies, since although the fundamental rationale behind the two methodologies is distinct, in respect of their ability to answer the question of whether or not an intervention works, the same principles apply:
H.10 Sample size: This impacts on the ability to detect changes where these occur. Clearly, the sample size required to detect a difference in the rate of completed suicide is rather different to that required to detect a change in self-harm, since the former behaviour is relatively rare. This aside, the general profile of the included studies suggests that sample size is an issue which the literature has taken on board. The majority (55%) of studies reported initial sample sizes of above 100 and one fifth (20%) reported initial sample sizes of 500+. This again contrasts favourably with equivalent research in the area of other-directed violence (which is subject to the same problem of detecting comparatively 'rare' behaviours). The sample size of included studies also tracks the incidence of the behaviours studied appropriately, with studies using suicide as an outcome measure significantly more likely to report sample sizes of 500+ than studies with alternative outcome measures (χ²=18.3, p<0.001).
H.11 Drop-out: This impacts on the representativeness of final outcomes, in particular where an intention-to-treat (ITT) analysis is not presented. Retention of participants showed a rather split distribution, with a comparatively high proportion of studies reporting 100% retention (38%) but a rather higher proportion losing a third or more of their participants to follow-up (46%). This profile was not a function of either the mode of intervention focussed on or of the participant population or setting. However, drop-out tended to be higher in studies focussed on interventions for suicide (51% of studies lost one third or more of participants to follow-up compared to 43% in other studies, χ²=7.03, p<0.03; this is not a function of loss due to mortality).
H.12 Randomisation: As noted earlier, randomisation controls for error variance relating to participant characteristics. Again, in comparison to the literature on other-directed violence, the literature on suicidal behaviour and ideation seems to have taken on board the need for randomisation, with just under half (43%) of studies reporting random selection and/or allocation of participants. Again, there was no significant association between randomisation and the mode of intervention, population or setting addressed by the studies, but there was an association with a focus on completed suicide, with 72% of studies evaluating interventions for suicide failing to randomise either selection or allocation of participants, compared to 49% of studies addressing other objectives (χ²=9.05, p<0.004). Conversely, studies addressing suicidal ideation appeared more willing to randomise (54% of studies on suicidal ideation versus 34% of studies focussed on other behaviour, χ²=7.69, p<0.001).
H.13 Blinding: Blinding acts to reduce bias introduced by either the participant or the investigator and is an aspect of study design which future studies in this field could improve on. The majority of studies (79%) failed to carry out even single-blind procedures (or to report on these if they had done so). Whilst in part this is a function of the difficulty of 'blinding' for complex interventions in open settings (programme-based interventions in community settings were for obvious reasons the least likely to blind to outcome, χ²=10.5, p<0.005), it can be argued that even in such conditions it is still possible to blind investigators to allocation in order to reduce bias at the data analysis stage. Furthermore, although pharmaceutical studies, in which blinding is a comparatively straightforward procedure, were significantly more likely than other studies to blind (χ²=15.9, p<0.001), the majority of these studies (62%) also failed to blind either participants or investigators. No distinctions on the basis of population or setting were noticed, but, as with randomisation, studies focussed on suicidal ideation were more likely to use blinding (29% versus 13%, χ²=7.0, p<0.009).
H.14 Fidelity of implementation: This ensures that outcomes are due to the impact or otherwise of the intervention as it is intended to operate, rather than due to implementation failure. Few studies in this field appear to address this issue (83% of studies failed to provide any evaluation of the fidelity of implementation). Again, there were no differences between studies evaluating different modes of intervention, or different populations or settings in this respect. However, studies evaluating interventions for suicide were significantly less likely to explore the fidelity of implementation of their interventions than studies addressing other outcomes (6% versus 23%, χ²=9.06, p<0.001).
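The associations reported in this section are chi-squared tests. As a minimal sketch of how such a statistic arises, the following computes a Pearson chi-squared value for a 2x2 table of studies (for example, suicide outcome or not, by sample size of 500+ or not). The counts are hypothetical, and the choice of a 2x2 table with one degree of freedom and no continuity correction is an assumption; the review does not state the exact form of test or the degrees of freedom used.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table
    [[a, b], [c, d]], with its p-value on 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability of the
    # chi-squared distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only:
# rows = suicide outcome / other outcome, columns = 500+ / under 500.
chi2, p = chi_squared_2x2(20, 30, 20, 130)
print(f"chi-squared = {chi2:.2f}, p = {p:.5f}")
```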
H.15 Baseline Evaluation: This provides a check on the base rate of the outcome measure in the participant group(s). As such, it provides both a control for initial differences between groups where more than one group is included in a study and a means of assessing the true clinical impact of the intervention on the outcome measure (a high percentage reduction in an already very rare behaviour gives only a spurious indication of efficacy). The majority of studies (57%) did provide baseline figures for the incidence of the behaviour they addressed. Evaluation of baseline figures was not associated with mode of intervention, population or setting, or the form of suicidal behaviour evaluated.
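The point about percentage reductions in rare behaviours can be illustrated with a short sketch. The rates below are hypothetical and the function `reductions` is introduced purely for this example; it shows how the same relative reduction can correspond to very different absolute changes depending on the baseline rate.

```python
def reductions(baseline_rate, follow_up_rate):
    """Return (absolute, relative) reduction in an outcome rate
    between baseline and follow-up."""
    if baseline_rate <= 0:
        raise ValueError("baseline rate must be positive")
    absolute = baseline_rate - follow_up_rate
    relative = absolute / baseline_rate
    return absolute, relative


# Hypothetical rates: a rare behaviour (2% falling to 1%) and a
# common one (40% falling to 20%). Both halve, but the absolute
# change differs twentyfold.
for label, before, after in [("rare", 0.02, 0.01), ("common", 0.40, 0.20)]:
    absolute, relative = reductions(before, after)
    print(f"{label}: absolute {absolute:.2%}, relative {relative:.0%}")
```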
H.16 Intention-To-Treat (ITT) Analysis: This provides a control for the 'real world' value of the outcomes, as well as reducing certain biases, by ensuring that people who drop out of the study, for whatever reason, are counted as 'treatment failures'. The majority of studies did provide an ITT analysis (53%). No associations were noted between the likelihood of doing so and either the population of interest, the setting or the mode of intervention. However, in respect of the behaviours focussed on, studies addressing attempted suicide were significantly less likely to provide an ITT analysis: 57% failed to provide such an analysis compared to 40% of studies with a focus on the other behaviours considered (χ²=5.03, p<0.03).
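A minimal sketch of the difference an ITT analysis makes is given below, assuming the simple rule described above in which every drop-out is counted as a treatment failure. The figures are hypothetical and `response_rates` is a name introduced for illustration only.

```python
def response_rates(randomised, completed, responders):
    """Compare a completers-only response rate with an ITT rate in
    which every drop-out is counted as a treatment failure."""
    if not 0 <= responders <= completed <= randomised or completed == 0:
        raise ValueError("expected 0 <= responders <= completed <= randomised")
    completers_only = responders / completed
    intention_to_treat = responders / randomised
    return completers_only, intention_to_treat


# Hypothetical trial arm: 100 randomised, 60 completed, 30 responded.
completers_only, itt = response_rates(100, 60, 30)
print(f"completers only: {completers_only:.0%}, ITT: {itt:.0%}")
```

With these illustrative figures the completers-only rate is 50% whilst the ITT rate is 30%, which shows how heavy drop-out can flatter an intervention where no ITT analysis is presented.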
H.17 Number of outcome measures: This is evaluated in an attempt to control for 'fishing trip' approaches to outcome evaluation, in which a diverse range of distinct measures are applied in the hope that at least one measure will provide a positive outcome. Statistical controls can be put in place if it is necessary to use a broad range of measures, but since virtually none of the studies included made any attempt to address this issue, a large number of measures can be taken as a simple estimate of the likelihood of a bias towards positive outcomes. The mean number of measures used in the studies was 4, with a median of 3. The range, however, was quite extensive, stretching from 1 to 39.
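To indicate why a large number of outcome measures raises the likelihood of a spurious positive finding, the sketch below computes the chance of at least one false positive across several measures, together with a Bonferroni-adjusted threshold as one common example of a statistical control. It assumes independent measures and a conventional significance level of 0.05; neither assumption is taken from the studies reviewed, and the report does not name a specific correction.

```python
def familywise_error_rate(n_measures, alpha=0.05):
    """Probability of at least one false positive across n independent
    outcome measures, each tested at significance level alpha."""
    return 1 - (1 - alpha) ** n_measures


def bonferroni_threshold(n_measures, alpha=0.05):
    """Per-measure significance threshold under a Bonferroni correction."""
    return alpha / n_measures


# The median (3), mean (4) and maximum (39) number of measures
# reported in H.17.
for n in (3, 4, 39):
    print(
        f"{n} measures: family-wise error rate "
        f"{familywise_error_rate(n):.2f}, "
        f"Bonferroni threshold {bonferroni_threshold(n):.4f}"
    )
```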
H.18 Nearly half of the studies (44%) used at least four separate outcome measures, suggesting that there is some room for bias towards positive outcomes to creep in here, notably given that, for outcomes other than suicide, it was common for studies to use standardised self-report scales. A not insignificant number of studies used three or more different standardised or 'in-house' scales to measure the same behaviour. The biggest 'offenders' in this context were studies focussed on psychotherapeutic or psychosocial interventions. Evaluations of pharmaceutical interventions on the other hand were significantly more likely to use only one or a small number of outcome measures (69% compared to 50% of other studies, χ²=5.88, p<0.002).
H.19 Population focus also impacted on this measure of bias, with studies carried out on the general population far more likely to restrict the number of outcome measures used to no more than 3 (75% versus 52% in other populations, χ²=5.87, p<0.05). Studies focussed on completed suicide also tended, for obvious reasons, to use far fewer outcome measures, 75% using three or fewer measures (χ²=13.1, p<0.001 in direct comparison with 'any other focus') in comparison to 48% of studies focussed on attempted suicide, 40% focussed on suicidal ideation and 36% on self-harm. Whilst this assessment is based on the number of all main outcome measures (including, for example, depression), even studies focussed exclusively on suicide quite commonly found a number of alternative ways of measuring outcome, for example via different interpretations of the time to completed suicide (from start-point, from leaving hospital, from re-admission etc.).
H.20 Placebo comparators: The true measure of any intervention is whether it achieves more than doing nothing at all. For ethical reasons it is understandable that researchers in this field are reluctant to assign participants to a placebo condition (90% of studies used active comparators only). However, an alternative ethical issue is raised if placebos are not used, notably where interventions may have adverse outcomes. Evaluations based on an active comparator alone preclude researchers from establishing that in fact, as may be the case, clients are better off without either active intervention. This is an issue which urgently needs to be addressed, notably in the context of pharmaceutical trials, where it is well established that the compounds most commonly evaluated (drugs acting on the serotonin or dopamine pathways) can have a broad range of adverse outcomes. Whilst pharmaceutical studies were significantly more likely to use a placebo comparator, fewer than one third did so (28% versus 1% in other studies, χ²=34.5, p<0.001). Providing a convincing placebo comparator for multi-modal or complex interventions is difficult to say the least and in the case of single group studies complex designs are required to compare placebo with active intervention. Nevertheless, where placebo comparison is possible it should more commonly be used.
H.21 Washout: An adequate period without alternative intervention prior to the start of a study avoids the complication that prior (or in the case of many studies in this context ongoing) treatment may be confounding outcomes related purely to the intervention of interest. It is worth noting that a 'washout' period can apply to psychological and other therapies as well as to pharmaceutical interventions. Again, it is understandable that for ethical reasons researchers are unwilling to remove a participant from therapy they are already receiving (95% of studies failed to include a washout period or failed to report on such a period if it was instituted). However, this is a significant problem as without controlling for the impact of other recent or ongoing interventions, outcomes cannot be directly attributed to the intervention being evaluated. Pharmaceutical studies were the only studies included in the current review which addressed this issue and reported including a washout period, yet even in the case of these studies only 15% referred to washout and, of these studies, the washout period was followed, during the course of the study, by a resumption of prior treatment for the participants! The number of studies using washout is too small to allow for any further analysis by study focus, population or setting.
H.22 Equality of group size at baseline (Group comparison studies only): This and, to an even greater extent, the control for key differences between groups at baseline addressed below, are key elements in a strong study design for group comparisons. They ensure that any difference in outcomes is due to the intervention and not to group differences which are unrelated to the intervention. To control for sample size, we evaluated group size differences at baseline as a percentage of total sample size. The majority of studies reported differences in the size of groups at baseline (42% reported differences equivalent to or greater than 10% of total sample size, primarily due to pragmatic constraints on sample recruitment). Pharmaceutical studies were more likely to report comparatively large (greater than 10%, commonly greater than 30%) differences in group size at baseline (52% versus 38% in other studies, χ²=6.05, p<0.05). Similarly, studies addressing completed suicide were more likely than other group comparison studies to report relatively large differences in group size at baseline (64% versus 34% in other studies, χ²=12.3, p<0.003). Studies focussed on suicidal ideation were less likely to do so (33% versus 53%, χ²=6.47, p<0.04). No other differences in terms of population focus or setting were noted.
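The group size measure described in H.22 can be sketched for the two-group case as follows. The function name and the figures are hypothetical, and how the review handled studies with more than two groups is not stated, so this is an illustration under that simplifying assumption.

```python
def group_size_difference_pct(n_group_a, n_group_b):
    """Difference in baseline group sizes as a percentage of the
    total sample size (two-group case only)."""
    total = n_group_a + n_group_b
    if total <= 0:
        raise ValueError("total sample size must be positive")
    return 100 * abs(n_group_a - n_group_b) / total


# Hypothetical study with 60 participants in one arm and 40 in the other.
print(group_size_difference_pct(60, 40))
```

Expressing the difference relative to the total sample allows small and large studies to be compared on the same footing, which is the purpose of the 10% threshold referred to above.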
H.23 Equality in outcome measures and demographic characteristics at baseline (Group comparison studies only): This issue is of particular concern and should be addressed in future research studies. The majority of studies (66%) either failed to evaluate or to report baseline values for the outcome measure used, or evaluated these and found groups to be significantly different on either the main outcome of interest or on demographic variables which may have impacted on outcomes. Few studies went on to control for such differences in their analyses. Removing studies for which the sole outcome measure was suicide (which, except for comparisons between distinct populations, cannot be expected to have a baseline value), 55% of studies for which baseline figures were pertinent still failed to control for baseline outcomes or demographic values. Aside from the association with suicide as an outcome, no other key aspects of study focus were associated with the likelihood of reporting or controlling for baseline values.
Quantitative versus Qualitative studies
H.24 As noted earlier, summative quality scores need to be used with caution, but do provide a 'rule of thumb' for evaluating study quality across large numbers of studies such as the pool of studies considered in the current review. On the basis of the above 'quality control' measures, the total summative score achievable for quantitative studies using group comparisons is 15, the total score achievable for single group studies is 12. Throughout the report, where median quality scores have been compared, like has been compared with like, with medians derived from within each of these study categories separately. Taken across all single group quantitative studies, the achieved median (and mean) quality score was 4. The median for studies using group comparisons was slightly higher at a score of 5 (mean 5.59) but not impressively so. The major failings in study design are as outlined above. Table H.1 below provides comparative figures for studies focussing on the four main outcome measures. The absolute differences between these are not substantial, with the single greatest disparity relating to single versus group comparator studies focussed on self-harm.
Table H.1 Median quality scores by type of behaviour addressed
Focus | Median for Single Group Studies | Median for Comparator Group Studies |
|---|---|---|
Completed Suicide | 4 | 5 |
Attempted Suicide | 4 | 5 |
Self-Harm | 3 | 6 |
Suicidal Ideation | 3 | 5 |
H.25 The above median scores clearly indicate some room for improvement in the design and implementation of quantitative studies, with scores for individual studies ranging between 1-8 for single group studies and 1-11 for group comparator studies. However, a number of studies in each category showed a substantially more robust design than the majority. Fourteen quantitative studies stood out on this basis and these are identified as the 'highest quality' quantitative studies for the purposes of evaluating what the best evidence currently available for intervention is. Summary outcomes from these studies are set out in Table H.2 below and are discussed in the main text of the report in relation to the relevant outcome measures.
H.26 It should be noted that although comparatively few quantitative studies meet stringent criteria for study design, studies within this literature are on the whole better designed and implemented than those in the most readily comparable literature relating to other-directed violence. Study quality is also comparable to the bulk of other public health intervention research. Two key issues have a particular impact on the quality of research in this area. Firstly, actual or perceived pragmatic and ethical constraints on the type of study which can be carried out. Secondly, a lack of funding. In comparison to the emphasis on funding the implementation of interventions, the resources allocated to evaluating these same interventions are small. Improvements in the quality of future research could be made by resolving these issues. In the short term, both issues could to some extent be addressed by improvements in the collection and use of routine data.
H.27 The poorest quality studies in this literature as a whole, both with regard to design and to implementation, are the qualitative studies. Of the 27 qualitative studies included in the review, only three make any attempt to follow a specific qualitative methodology (non-participant observation and content analysis/grounded theory); a fourth follows a methodology which is not, strictly speaking, a qualitative methodology as such, but is an approach which has risen to prominence in this particular field and therefore has comparatively well established principles (psychological autopsy). The remainder of the studies are in effect simply narrative reports of the study author's subjective conclusions and, strictly speaking, could be described as 'failed' quantitative methods rather than studies explicitly adopting a qualitative approach. The poor quality of this aspect of the intervention literature leaves a significant gap in respect of our understanding of intervention for suicidal behaviour and suicidal ideation. There are currently very few reliable, in-depth, exploratory accounts of intervention. Well conducted qualitative studies are urgently needed to inform our understanding of how and why intervention does or does not work.
H.28 Fifteen of the 27 studies adopting a purely discursive approach to evaluation present information taken from case reports, or direct clinical experience. In the main, evidence is taken from only one or a small number of participants. However, some studies followed a survey or audit format, with sample sizes ranging from 14 to 35,077. The four studies which followed a more explicit methodological protocol are by default the 'highest quality' qualitative studies available for analysis. To this rather limited total, we add two studies which follow an experimental case study protocol. Whilst, technically, these are quantitative studies, the authors present them as qualitative accounts and provide considerable additional in-depth detail regarding both the intervention and outcomes. Summary outcomes from these six higher quality qualitative studies are presented in Table H.3 below and discussed in relation to the relevant outcome measures in the main text of the report. Well conducted qualitative studies are of substantial value in exploring and evaluating the lived experience of interventions, in particular in respect of interventions which are anticipated to impact on behavioural outcomes. The commissioning of such studies, ideally designed to run alongside quantitative studies which have the statistical power and the methodological focus to quantify outcomes, must be seen as a priority.
Table H.2 Summary outcomes for the 'highest quality' quantitative studies
Study Identifier | Design | Intervention | Population | Setting | Sample size | Outcome measures | Outcomes |
|---|---|---|---|---|---|---|---|
Bennewith et al 2002 | RCT | GP based intervention consisting of management guidelines for GPs outlining good practice in treating patients who self-harm. Participant GPs then pro-actively offered patients the opportunity of a consultation, with the consultation intended to follow the recommendations of the new guidelines. | Patients registered with GPs who had attended A&E for DSH | Community | 1932 (males and females, mean age 32) | Any form of self-harm (identified via medical records) | No significant difference between intervention and non-intervention groups was noted for any of the three outcome measures of a repeat episode of self-harm, the number of repeats, or the time to repetition. |
Brown et al 2005 | RCT | Cognitive Behaviour Therapy (CBT) | People attending A&E for DSH | Start setting, Community, mixed end settings | 120 (males and females aged 18 to 66) | Attempted suicide (established via medical records); Suicidal ideation (Beck Scale for Suicidal Ideation (Beck et al, 1979) completed by researcher) | At least one repeat suicide attempt from baseline to 18 month follow up was noted in 13 (24.1%) of the cognitive therapy group versus in 23 (41.6%) of the TAU group (z=1.97, p=.049). Suicidal ideation showed no significant differences between groups at any assessment point. |
Carter et al 2005 | RCT | 'Postcards from the Edge': postcards were sent from the Emergency Department which a person had attended for self-harm to the discharged person at 1, 2, 3, 4, 6, 8, 10 & 12 months after admission for self-poisoning. The postcards contained a short message asking how the person was and suggesting they get in touch if they felt they needed further help. | People discharged from hospital after suicide attempt/self-harm | Community | 772 (males and females, mean age 33) | Self-poisoning, established via medical records | No significant differences in the absolute likelihood of further admission for self-poisoning were found. However, the postcard group showed a significantly lower number of repeat episodes. Total N of episodes = 192 in control, 101 in experimental group (incidence risk ratio 0.55, 95% CI 0.35-0.87, Z=-2.56, p=0.01). A subgroup analysis showed that the postcard intervention significantly improved outcomes for women (IRR 0.54, 95% CI 0.30-0.96, Z=-2.09, p=0.037), but not for men. |
Gagiano et al 1995 | RCT | Treatment with moclobemide (an anti-depressant): comparison of most effective dosage - 150mg twice daily, 100mg three times daily or 150mg three times daily | People with major depression | Community | 270 (males and females) | Suicidal ideation (HAM-D: Hamilton Rating Scale for Depression (Hamilton, 1960) interviewer administered) | There was a significant reduction in suicidal ideation in all groups (efficacy ratios of 1.2 and 1.3 respectively). In the absence of a placebo or active comparator other than moclobemide it is not possible to gauge whether these outcomes are better than would be expected on the basis of not treating with moclobemide. |
Gonella et al 1990 | RCT | To assess the clinical effectiveness & tolerability of fluvoxamine compared with imipramine in depressed patients | People with depression | Not stated | 20 (males and females mean age 47) | Suicidal ideation Hamilton Rating Scale for Depression (Hamilton, 1976) completed by researcher | Fluvoxamine was reported to show more marked improvement than imipramine but the statistics reported as supporting this outcome are not presented in either text or tables. |
Kapur et al 2004 | Retrospective cohort study | Emergency Department management after self-poisoning (psychosocial assessment or referral for specialist follow-up versus no follow-up) | People attending A&E for deliberate self-harm (DSH) | Community | 658 (males and females, mean age 30) | Repetition of self-poisoning (established via medical records) | Following adjustment for baseline differences, receiving a psychosocial assessment was not associated with reduced repetition, but being referred for specialist follow-up did improve outcomes (adjusted hazard ratio for repetition 0.49, 95% CI 0.25-0.84, p=0.01). |
Kasper et al 1995 | RCT | Comparison of fluvoxamine vs imipramine in depressed patients | Depressed patients | Mixed start settings, end setting community | 338 (males and females, mean age 43) | Suicidal ideation (HAM-D: Hamilton Rating Scale for Depression (Hamilton, 1960), interviewer administered) | At week 1 there were significant improvements in the fluvoxamine group compared with placebo in respect of suicidal ideation, with no significant improvements in the imipramine group vs placebo (details of the statistical analysis supporting these outcomes are given but are unclear). |
King et al 2003 | Pre-test/post-test | Telephone counselling | People who had made at least one suicide attempt | Community | 1010 (males and females) | Suicidal ideation (scale developed for the study drawing on items from existing scales, primarily the MINI International Neuropsychiatric Interview (Sheehan et al 1998)) | Suicidal ideation decreased significantly from the beginning to the end of the call (t=12.66, p<0.005), as did suicidal urgency (t=8.37, p<0.0005). Considering only items reflecting ideation for 'imminent' suicide, this difference remained significant (t=3.13, p<0.005). Comparable differences were also observed in the raters' views of how suicidal people were (Z=-8.05, p<0.001). |
Lapierre 1991a | RCT | Sertraline vs placebo | Major depression (adults) | Not stated | 369 | Suicidal ideation (HAMD, CGI & POMS scales) | Statistical outcomes for suicidal ideation are not presented separately from reductions in HAMD total scores, which decreased in the sertraline group (p<0.001), reportedly to a greater extent than in the placebo group. This study is included because, by implication in the text, it presents outcomes for younger adults which are similar to the outcomes more explicitly set out in its 'sister' study below. |
Lapierre 1991b | RCT | Sertraline vs amitriptyline | Major depression (elderly adults aged 65+) | Not stated | 448 | Suicidal ideation (HAMD, CGI & POMS scales) | Suicidal ideation was reported to decrease significantly in both groups; statistical analyses are referred to but presented only in graph form. |
Meltzer et al 2003 | RCT | Clozapine vs olanzapine | Schizophrenia | Community | 1065 (males and females, mean age 37) | Completed suicide and attempted suicide (not clear how established) | There were no statistically significant differences in completed suicides between the two groups over a two-year period. There were significantly fewer suicide attempts "as determined by the study monitoring board" in the clozapine than in the olanzapine group (6.9% vs 11.2%, p<0.03, 95% CI 0.01-0.08). Similar outcomes were noted for hospitalizations to prevent suicide (16.7% in the clozapine group vs 21.8% in the olanzapine group, p<0.05, CI 0.00-0.10). A further unspecified measure of outcome, which may refer to patient self-reports of adverse events in respect of attempted suicide, also showed better outcomes for clozapine, with 7.7% reporting an attempt versus 13.8% in the olanzapine group (p<0.002, CI 0.02-0.10). |
Milstein et al 1986 | Retrospective group comparison | Electroconvulsive therapy (ECT) | Adult psychiatric patients | Start setting inpatient open ward, end setting not stated | 1570 (males and females, mean age 37) | Completed suicide (established via triangulation across more than one source) | This study tracked the total population of one adult psychiatric hospital across 7 years. Over this time, 76 people died by suicide (established via families, physicians and death certificates). The study compared this group to a sex- and diagnosis-matched control group and looked for differences in ECT treatment between groups. No significant differences were found: 44% of those who had died by suicide had been treated with ECT, compared to 32% of the matched group (who had died from other causes). |
Tondo et al 1998 | Prospective follow-up | Lithium | People with bipolar disorder | Community | 310 (males and females, mean age 39) | Completed suicide (established via medical records) and attempted suicide (established via narrative report by person other than participant) | Suicide outcomes were not separated from the broader range of behaviours included as 'suicidal acts'. Poisson modelling of risk ratios showed that the incidence of suicidal acts was 5.62-fold greater before lithium treatment than during lithium treatment (95% CI 2.15-14.5, z=3.96, p<0.001). A subgroup analysis of 185 people who discontinued lithium gave a risk ratio after vs during of 9.10 (z=4.57, 95% CI 3.47-23.4, p<0.001), with the rate of suicidal acts in the first year after discontinuing lithium significantly higher than before starting (ratio 3.09). The rate in the first year off lithium was higher than that for the next five years (ratio 4.79, χ²=54.6, p<0.0001, CI 1.83-12.3). A Kaplan-Meier survival analysis of time to the first act in 310 patients also showed significant differences before and during lithium (χ²=19.7, p<0.0001) and also during and after lithium for the subgroup of 185 patients who discontinued lithium (χ²=16.4, p<0.0001); for this group, however, there were no differences between pre- and post-lithium periods. |
Zenere & Lazarus 1997 | Retrospective follow-up | Suicide prevention and school crisis management programme focussed on school-based crisis teams | School students | School | 330,000 (males and females) | Completed suicide, attempted suicide and suicidal ideation (established via national and/or local statistics) | The programme was introduced in 1989. There were 7 reported suicides in this year. Comparable figures for subsequent years were: 1990=5, 1991=6, 1992=4, 1993=3, 1994=5. The authors conclude that outcomes favour the programme; however, the numbers are so small that patterns could be the result of random variation. In respect of suicide attempts, the comparative figures given were as follows: 1989/90=243 attempts, 1990/91=157, 1991/92=120, 1992/3=95, 1993/4=95. In respect of suicidal ideation: 1989/90=641 reports of ideation, 1990/91=511, 1991/92=443, 1992/3=464, 1993/4=640. No statistical analysis is presented, but taken at face value there is evidence, as the authors suggest, for an initial decrease in both outcomes, with a subsequent rise in the case of suicidal ideation back to roughly the initial level (640 vs 641 on the figures given). These patterns provide at best very weak evidence for the reported efficacy of the programme. |
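The yearly figures quoted for Zenere & Lazarus 1997 can be restated as percentage changes from the first year reported. The following is a minimal sketch only: the counts are those given in the table row, the variable and function names are illustrative and not taken from the study, and no statistical inference is implied.

```python
# Counts as quoted in the table row for Zenere & Lazarus 1997.
# All names here are illustrative assumptions, not the authors' own.
suicides = {"1989": 7, "1990": 5, "1991": 6, "1992": 4, "1993": 3, "1994": 5}
attempts = {"1989/90": 243, "1990/91": 157, "1991/92": 120, "1992/3": 95, "1993/4": 95}
ideation = {"1989/90": 641, "1990/91": 511, "1991/92": 443, "1992/3": 464, "1993/4": 640}


def percent_change_from_baseline(series):
    """Return each period's percentage change relative to the first period."""
    baseline = next(iter(series.values()))
    return {
        period: round(100.0 * (count - baseline) / baseline, 1)
        for period, count in series.items()
    }


for label, series in [("suicides", suicides), ("attempts", attempts), ("ideation", ideation)]:
    print(label, percent_change_from_baseline(series))
```

On the figures given, attempts in 1993/4 are about 61% below the 1989/90 figure, whereas reports of ideation are within 1% of it. For completed suicides, a difference of a single case shifts the percentage by roughly 14 points, which illustrates how sensitive such small counts are to random variation.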
Table H.3 Summary outcomes for the 'highest quality' qualitative studies
Study Identifier | Design | Intervention | Population | Setting | Sample size | Outcome measures | Outcomes |
Bloxham et al 1993 | Case study | Behaviour therapy based on token economy and time-out | Borderline personality disorder (adult) | Secure inpatient unit | 1 (35 year old female with previous history of self-harm) | Any form of self-harm identified via hospital records | Extinction of self-injury was achieved by week 26 of admission. Scores on EDI subscales (Eating Disorders Inventory, Garner et al 1983) also improved, although the self-starvation target was not reached. Fluid intake also showed an upward trend, again without reaching target. |
Cowdery et al 1990 | Case study | Differential reinforcement of other behaviour (DRO) | General population (child) | Outpatient unit | 1 (9 year old boy with previous history of self-harm) | Self-mutilation (observed through one-way mirror or recorded in hospital notes) | Self-mutilatory behaviour was suppressed by DRO under a number of different environmental conditions. Objective measures were used, i.e. counting of occasions through a one-way mirror and length of time without Self-Injurious Behaviour (SIB) on each occasion. The % of observed session with SIB decreased from 80% at baseline to virtually zero at endpoint (NB endpoint was 50 sessions over an unspecified time period). Time spent in each session without SIB increased, but absolute values are unclear. |
Kuipers & Lancaster 2000 | Grounded theory/content analysis | Supportive relationships and informal social support | Brain injured patients | Outpatient rehabilitation unit | 14 (males and females, mean age 32) | Attempted suicide; suicidal ideation (both as evaluated by clinician) | Themes drawn from structured interviews identified that past suicide attempts and current suicidal ideation were resolved by restriction of access to means for some participants, but the most common mechanism helping to reduce attempts was informal social support by family, friends and clinicians. |
Mishara & Daigle 1997 | Non-participant observation | Different telephone intervention styles used by helpline staff (directive vs nondirective or 'Rogerian' styles) | Callers to a general population suicide prevention helpline | Community | 263 (males and females, mean age 35) | Suicidal ideation (evaluated by scale-based report by person other than participant; Suicide Urgency scale, Morisette 1984) | Overall, there were no significant differences in ideation from start to end of call based on style of intervention. However, when outcomes were analysed on the basis of whether callers were 'chronic' or 'non-chronic', use of Rogerian techniques improved outcomes in non-chronic callers (F=3.69, p<0.05), although not in chronic callers. |
Owens et al 2004 | Psychological autopsy | Recognition and treatment of mental illness by GPs | People known/thought to have committed suicide | Community | 100 (males and females aged between 18 and 87) | Completed suicide (established via (local) coroner's reports or death certificates) | This study assessed whether there had been adequate detection and treatment of mental illness by GPs in people who had committed suicide. The authors concluded that rates of detection and treatment were high and not therefore responsible for subsequent suicide. Note that detection and treatment rates referred only to those who had consulted with their GP; 30 of 68 with an identified mental illness failed to consult. Detection & treatment rates were 76% in those who consulted, so lack of consultation remains a problem, and lack of presentation (lack of outreach approaches) may therefore have resulted in the adverse outcomes observed. |
Perseius et al 2003 | Content analysis | Dialectical behaviour therapy (DBT) | Borderline personality disorder | Setting not specified | 10 (females, mean age 27, with previous history of self-harm) | Attempted suicide, self-harm and suicidal ideation (all established via narrative self-report by participant) | Themes derived from all patients suggested that they regarded therapy as 'life-saving' in respect of having reduced their suicide attempts, self-harm and suicidal ideation. No further details are given. |