The Risk Management of HAI: A Proposed Methodology for NHSScotland - Consultation Document
ANNEX 1: Human factors and learning from mistakes - the wider context
The term "human factors" refers to the role played by human beings in "complex socio-technical systems". A complex socio-technical system is a set of circumstances in which people (socio-) and machines (technical) interact with each other. Normally, the term would be reserved for situations in which this happened on a large scale (i.e. the system is complex), as in a power-plant, on an oil rig, or within a transport industry where there are clear organisational aspects to the work. The term would not normally be applied to an individual person working e.g. to fix the car in his/her own garage or mend the vacuum cleaner in the kitchen. NHSScotland is a complex socio-technical system.
The term "human error" is used to describe a sub-set of the behaviours collectively called human factors. Human error is said to occur in situations that arise within a complex socio-technical system where a particular human action has, or could have, an unwanted consequence; and where the action in question is deemed with hindsight to have been incorrect. In the popular media, the term "human error" is frequently used incorrectly to describe a mistake made by a front-line operator such as a doctor administering the wrong drug, the driver of a train who failed to stop at a signal, or an operator in a power plant who pressed the wrong button.
This is a quite incorrect use of the term however, because human error can occur at any point in a complex socio-technical system, from the front-line operators through middle-management and supervisory staff, and ultimately to the top management in the boardroom.
Consequently, systems have been devised in recent times which look at human error at three distinct levels. These are the proximal level, where the errors made are defined by the jobs that front-line staff are required to do 'at the coal face'; the intermediate level, which encompasses issues such as staff training, supervision and local procedures; and the distal level, which includes the kinds of errors that management may make concerning decisions such as resource allocation, staffing levels, or recruitment of contract labour. In the context of HAI management, this three-level methodology can be used, for example, to explore and determine why people do not wash their hands:
- an individual refuses or forgets to do it (proximal)
- poor estates planning where there is a lack of wash hand basins in clinical areas, overstretched staff feel 'too busy' to comply, or delay in delivery of a hand hygiene training programme (intermediate)
- lack of a training programme, low organisational priority for HAI control, or low priority for funding hygiene initiatives (distal).
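By way of illustration only, the three levels might be represented as a simple coding structure. The following Python sketch assigns the hand-hygiene causes listed above to the three levels described in the text; the enumeration and every example string are hypothetical assumptions for illustration, not part of any existing NHSScotland scheme.

```python
from enum import Enum

class ErrorLevel(Enum):
    """The three levels at which human error can be analysed."""
    PROXIMAL = "proximal"          # front-line acts 'at the coal face'
    INTERMEDIATE = "intermediate"  # training, supervision, local procedures
    DISTAL = "distal"              # management decisions: resources, staffing, priorities

# Hypothetical coding of the hand-hygiene example above.
HAND_HYGIENE_CAUSES = {
    "individual refuses or forgets to wash hands": ErrorLevel.PROXIMAL,
    "lack of wash hand basins in clinical areas": ErrorLevel.INTERMEDIATE,
    "overstretched staff feel 'too busy' to comply": ErrorLevel.INTERMEDIATE,
    "delay in delivering hand hygiene training": ErrorLevel.INTERMEDIATE,
    "no training programme exists": ErrorLevel.DISTAL,
    "low organisational priority for HAI control": ErrorLevel.DISTAL,
    "low priority for funding hygiene initiatives": ErrorLevel.DISTAL,
}
```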
The final thing to say within this brief description of human error is that the types of errors made by those working in the front line (within NHSScotland this would include doctors, nurses and Allied Health Professionals) are rather different from the types of errors made by those in management and/or administrative positions. Whilst it is difficult to make firm assertions about this issue, some evidence exists suggesting that front-line errors are relatively more common, more likely to be self-detecting (i.e. at this level, the fact that an error has been made is usually fairly obvious) and less likely to have catastrophic consequences for the organisation than errors at the distal level. By contrast, errors at the distal level are more likely to remain dormant for long periods of time [1], more likely NOT to reveal their presence until it is too late and, importantly, more likely to be involved as root causes in major incidents/catastrophes.
It is also the case that errors at the front line can sometimes occur not simply through carelessness on the part of the front-line operator, but because decisions or procedures made higher up in the organisation have inadvertently created the conditions under which certain types of front-line error are more likely to occur. In such a case, human error is said to be due to "error-promoting conditions"; the implication being that any person performing that task would have an increased probability of making that type of error, due to the way the task is configured or the conditions under which it has to be performed.
Motives, intentions and rule violations
Information is needed at two levels if the problem of human error is to be tackled. The first level requires a description of the action(s) that the person performed within the socio-technical system. That would include a description of the machinery or technology involved, any failures of that machinery (e.g. breakdown, or intrinsic design fault), and the action(s) that the person performed. There would also be an attempt to describe the conditions under which the work was being performed, with reference to any error-promoting conditions (e.g. poor illumination; smoke; time-pressure; low or high temperature). The second level concerns the motives and intentions of the person who carried out the act. That is, an attempt has to be made to find out why the person acted as he/she did, and what was in his/her mind when the error was made.
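Purely as a sketch of how the two levels of information might sit together in a single report, the following Python structure separates the 'what happened' fields from the 'why' fields. All field names are illustrative assumptions, not an established reporting format.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class IncidentRecord:
    # Level 1: what happened within the socio-technical system.
    action_description: str             # what the person actually did
    equipment_involved: str             # machinery/technology in use
    equipment_failures: Optional[str]   # breakdown or intrinsic design fault, if any
    working_conditions: list[str] = field(default_factory=list)
    error_promoting_conditions: list[str] = field(default_factory=list)
    # e.g. "poor illumination", "time-pressure", "high temperature"

    # Level 2: why the person acted as he/she did.
    stated_intention: Optional[str] = None  # what the person meant to achieve
    stated_motive: Optional[str] = None     # the reason, in his/her own words
```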
Sometimes an error may be due to simple distraction, lack of attention or lack of skill; sometimes it may be due to an error of judgement, where a decision results in an outcome that was not anticipated, expected or desired. Rule violations, however, deserve a few special comments. Where rule-breaking is concerned, some organisations have a policy based on the ill-defined idea of "the genuine mistake". A "genuine mistake" is deemed to happen where an operator breaks a rule "accidentally"; that is, the rule is broken without the operator intending to break it. Within such a system, a rule which is broken deliberately is therefore deemed NOT to be a "genuine mistake", and the perpetrator is hence subject to disciplinary action. However, deliberate rule-breaking often takes place as a response to circumstances and conditions in the workplace, and consequently knowing why a person broke a rule can be of extreme importance for an organisation. If, for example, a person used the wrong equipment to perform some task, it is important to know whether this was due to personal indolence on the part of the person concerned, or whether he/she was simply endeavouring to get the job done with the resources made available, in the absence of the correct equipment.
Avoidance of acts which lead to unwanted outcomes
Major catastrophes tend to be obvious, and thus to 'report themselves', and they tend to be investigated in great detail by an external body. This tends to be a lengthy process in which people search diligently for the precursors and root causes of the catastrophe. As a result, the outcome of such investigations can sometimes be a lengthy and highly complex list of unprioritised 'causal factors'. Some inquiries have come up with lists of over one hundred such factors on which action is deemed necessary. However, such long lists are not necessarily helpful where they leave an organisation overwhelmed and 'not knowing where to start'. It may, of course, be the case that a particular organisation does have multiple failings; but there is also another reason why investigations of catastrophes produce such long lists, namely that such investigations allocate a great deal of time to finding root causes, and the more time one spends looking for them, the more one tends to find.
However, the issue here is not major catastrophes, which require investigation by an outside body, but the collection of information about things that happen within an organisation which have less-than-catastrophic outcomes, and which therefore do not 'report themselves'.
Events which have limited, minor or even no consequences can often give clear indications of where there are gaps in safety systems and procedures. The problem is that such minor events often go unreported, and thus useful preventative lessons are not learned. This is particularly the case where human error is involved and where the organisation in question operates within a 'blame culture'. In such a culture, when a person makes an error, perhaps due to system features which create an error-promoting condition, he/she is likely to say "I'm glad no-one saw me do that" and keep quiet about it. In this way, lessons about gaps in safety systems are not learned, and the opportunity for system improvement is lost.
One aim of the present report is to open a discussion about how data-collection systems might be developed within NHSScotland which could capture this type of information, so that major incidents can be avoided in the future. The topic of Healthcare Associated Infection is suggested as an area where such a system could be of particular value.
Adverse event recording
What would be the main features of a system designed to capture this type of information? Firstly, such a system would depend on reports provided by people working in the organisation, and so a number of preliminary decisions have to be made about its nature. These include deciding whether the system is to be anonymous, confidential or 'total disclosure'. The decision will affect both the nature and the number of the reports received. Anonymous systems produce the most reports, but they also produce the highest number of trivial, poorly motivated and mischievous reports. Total disclosure systems, on the other hand, produce the fewest reports; these tend to focus on technical failures or problems that are highly visible (that 'report themselves'), and they contain very little human factors information or self-disclosure. By contrast, confidential systems involve reporting to a third party who can, if necessary, carry out follow-ups to gather more detail; but when the report is processed, all identifiers are removed and none appear on the database. This produces a manageable flow of well-motivated, high-quality reports. A confidential system is therefore recommended here. It is also necessary to decide who can send reports in, how and to whom the reports are made, how they are to be analysed and coded, and how they will be turned into concrete actions and implemented. Consideration must also be given to how such a system will be publicised and communicated to those potentially involved.
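To make the confidential model concrete, the following minimal Python sketch shows the essential processing step: the third party holds the full report (and may follow up with the reporter), but strips all identifiers before the record reaches the database. The field names and the particular set of identifiers are assumptions for illustration only.

```python
import uuid

def process_confidential_report(raw_report: dict) -> dict:
    """Sketch of the confidential model: a trusted third party receives the
    full report, may follow up with the reporter for more detail, and only
    then removes all identifiers before the record enters the database."""
    # Contact details are retained separately by the third party for any
    # follow-up; they never accompany the record onto the database.
    identifiers = {"reporter_name", "contact_details", "staff_ids"}
    deidentified = {k: v for k, v in raw_report.items() if k not in identifiers}
    deidentified["report_id"] = str(uuid.uuid4())  # non-identifying reference
    return deidentified
```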
However, whatever decisions are made with respect to the organisational aspects of the system, the system itself will probably have some of the following general features.
- It will need to capture descriptions of unwanted events in a useful level of detail. 'Useful level of detail' implies a level of description that enables targeted action to be taken, as opposed to non-specific general action such as putting up notices saying "Take Care" or "Stop, Think, Act, Review" (i.e. the so-called STAR system). On the other hand, there is a limit to the amount of detail that can be handled by a local system staffed by people who already have full-time jobs. 'Useful level of detail' thus implies a compromise between the need for fine-grained data that enables highly specific action, and the need for simplicity and usability in the interests of local pragmatic considerations. Nevertheless, it seems reasonable that event reports would include not merely an event description but also details of staff involved, technology/machinery involved, times, dates, circumstances and other 'objective' features of the event, subject to the need for confidentiality. A coding system for events is also required if individual events are to be entered on a cumulative database (a minimal sketch of how such a record might look follows this list).
- Secondly, the event description (above) will need to be used as the basis for a risk assessment. The preferred approach (as in the present document) is the 'risk matrix' method (AS/NZS 4360:1999), which assesses risk on two dimensions, namely the rated consequences (severity) of the event and the likelihood (probability) of its occurrence. This system has been widely adopted, with detailed modifications, in other industries and organisations. At this level, actual consequences or rated potential consequences will also probably be included in the analysis. Action may then be indicated on the basis of this overall assessment if no important human factors are felt to be involved.
- In most cases, however, there will be a human factors component. The problem thus remains of tapping into human factors and human error, and the motives and intentions of the 'actors' involved, within the limitations of local systems. A full taxonomy of human error can involve scores of codes and require specialised knowledge to operate reliably. Within the proposed system, such an approach is probably neither appropriate nor justified. It is proposed instead that a basic human error taxonomy be devised, using a minimum of codes at the three levels previously described (i.e. proximal, intermediate and distal), providing some basic human factors information where at present there is a virtual absence of systematised human factors information of any kind. Simplicity and local usability, rather than comprehensiveness and theoretical coherence, would be the primary requirements at this stage. In practice, a short human factors coding sheet with a minimum of categories might be made available to appropriate staff. Using their knowledge and experience, they would use the event description (see the first point above) and the risk assessment to identify events requiring action; at that point they might invite personnel involved in the event for an informal chat in order to obtain additional human factors information, especially with respect to the motives, intentions, decisions and expectations of those involved, and how these might have contributed to any human error. On the basis of this interview, they would complete the simple human factors protocol, and enter the data alongside the event description and risk assessment data. The risk assessment might in some cases be modified in the light of the incoming human factors information.
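Drawing the three points above together, the following minimal sketch shows how an event description, a consequence/likelihood risk rating and a short three-level human factors code might coexist as fields in one database record. The five-point scales, the risk bands and all code names are illustrative assumptions made in the spirit of AS/NZS 4360:1999; they are not the standard's own tables, nor a proposed NHSScotland coding scheme.

```python
# Illustrative five-point scales for the two risk-matrix dimensions
# (assumed values, not those of AS/NZS 4360:1999 itself).
CONSEQUENCE = {"negligible": 1, "minor": 2, "moderate": 3, "major": 4, "catastrophic": 5}
LIKELIHOOD = {"rare": 1, "unlikely": 2, "possible": 3, "likely": 4, "almost certain": 5}

def risk_rating(consequence: str, likelihood: str) -> str:
    """Combine the two rated dimensions into a single banded rating."""
    score = CONSEQUENCE[consequence] * LIKELIHOOD[likelihood]
    if score >= 15:
        return "high"
    if score >= 8:
        return "medium"
    return "low"

# A deliberately short human factors coding sheet: a handful of codes at
# each of the three levels, in the interests of local usability
# (the codes themselves are hypothetical examples).
HF_CODES = {
    "proximal": ["slip/lapse", "rule violation", "judgement error"],
    "intermediate": ["training gap", "supervision gap", "procedure unclear"],
    "distal": ["resource allocation", "staffing level", "organisational priority"],
}

# One record carrying the event description, the risk rating and the
# human factors codes side by side on the same database.
event = {
    "description": "wrong equipment used for decontamination",
    "risk": risk_rating("moderate", "likely"),  # score 12 -> "medium"
    "hf_codes": {"proximal": "rule violation",
                 "intermediate": "procedure unclear"},
}
```

The point of the sketch is simply that the risk rating and the human factors codes can sit alongside the event description in the same record, so that the rating can be revised if later human factors information warrants it.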
The proposals above outline, in the most general terms, the basic geography of a system which might capture information about unwanted events relating to Healthcare Associated Infection in a systematic way, and also go some way towards making a start on the human factors front. The system is geared towards local needs, and to a large extent could be interpreted locally at the detailed level. It is also focussed on usability, taking into account the reality that such a system would in all probability be operated by people already fully occupied in their existing jobs, so that simplicity and speed of use would be the main requirements. A more comprehensive and discriminating system could be devised, but it is believed that the additional complexity of such a system would lead to its non-adoption by people already carrying a full workload; a more complex system would be seen as an additional burden, and would therefore meet with resistance. Despite this qualification, a system such as the above, if adopted widely at the local level, would offer a degree of systematised data collection on unwanted events substantially greater than that currently evident, the possibility of comparing data between localities, and a starting point for the collection of human factors information and its integration with event descriptions and risk assessments on the same database.
[1] According to James Reason, these types of errors can lurk within an organisation for extended periods, and have been termed organisational 'latent pathogens'. They often only come to light at times of crisis, and/or can precipitate major disasters.