From Micro to Macro: Safety Science Special Issue - Is Safety a Science?

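The Safety Science special issue below, which will not appear until 2014, asks many big questions (or big disasters, as far as the senior figures in our own country are concerned).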

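The ideas below come from the gatekeeping masters of this field and distil a lifetime of expertise; they deserve careful reflection (note how they define the scope of the discipline and which common problems and fallacies they criticise).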

Is safety a subject for science?

Author

Erik Hollnagel (Danish, background in psychology)
http://www.erikhollnagel.com/CV.html

Safety science is therefore taken to refer both to what we know about safety and to the ways we have built and continue to build this knowledge. In other words, to how we study the subject matter, which in this case is safety itself.

If the common definitions are accepted, then a science must have a more or less well-defined topic, focus, or object (phenomenon) that can be studied. It must have a paradigm, as argued by Kuhn (1962).

What are the (concrete) scope, domain, and paradigm of "safety" as a field of (scientific) study?

Astronomy is a science because it studies celestial objects (such as moons, planets, stars, nebulae, and galaxies); that chemistry is a science because it studies the composition, properties and behaviour of matter; that psychology is a science because it studies the mental functions and behaviours of humans; that organisational studies is a science because it examines how organisational structures, processes, and practices shape social relations and influence performance.

According to this way of reasoning, safety science is the study of safety. But unlike the celestial objects, unlike matter, even unlike mental faculties, organisations, goods and services, safety does not represent an agreement on what it is that should be studied, nor can it be said to exist in any concrete or material sense, or to be real (Westenhoff, 2011). Because of this we cannot resolve disputes about what safety is by referring to something that exists independently of our thinking of it, as if it was an object (as the term is used in semiotics). Yet we need to be able to refer to what safety is in a way that is open to intersubjective verifiability, we need to have a common agreement on what we should focus on, to avoid falling into the trap of solipsism.

Below are a number of definitions of safety.

Safety is often, indeed nearly always, defined as a condition where nothing goes wrong (injuries, accidents/incidents/near misses) or more cautiously as a condition where the number of things that go wrong is acceptably small. Examples of this definition are easy to find. The International Civil Aviation Organisation, for instance, defines safety as "the state in which harm to persons or of property damage is reduced to, and maintained at or below, an acceptable level through a continuing process of hazard identification and risk management" while the U.S. Agency for Healthcare Research and Quality defines safety as the "freedom from accidental injury". More indirect definitions can also be found. As an example, Transport Safety Victoria defines a major incident as "an incident or natural event that poses a serious and immediate risk to safety and includes a derailment of rolling stock, a collision, a fire or explosion". From this one may conclude that if accidents and incidents are a risk to safety, then safety is marked by the absence of accidents and incidents.

Such definitions of safety are, however, indirect rather than direct since safety is defined by what happens when it is absent or missing. Properly speaking, they are therefore definitions of lack of safety (or unsafety) rather than of safety. One consequence of this is that safety management relies on measurements that refer to the absence of safety rather than to the presence of safety. Because the focus is on things that go wrong, there will be something to measure when safety is absent, but paradoxically nothing to measure when safety is present. (We can usually observe only what is "unsafe" (chemicals, human behaviour, process conditions, equipment, and environment); we cannot measure "safety" directly. Safety = no deficiencies, or no latent hazards present.)

The focus on situations where things go wrong, on the absence of safety, is theoretically and scientifically suspect but makes eminent practical sense.

Seeing technology as the predominant – and mostly also the only – source of both problems and solutions in safety was maintained with reasonable success until 1979, when the accident at the Three Mile Island nuclear power plant (TMI) demonstrated that safeguarding technology was insufficient. The TMI accident forced safety professionals to consider the role of human factors – or even of the human factor – and made it necessary to include human failures and malfunctioning as potential risks, first in operation but later also in design, construction, and maintenance (Swain and Guttman, 1983; Dougherty, 1990). (After Three Mile Island, blind faith that technology alone could prevent safety problems ended => attention turned to human factors => the level of operations and operating personnel.)

In 1986, 7 years later, the loss of the space shuttle Challenger, together with the accident in Chernobyl, made yet another extension necessary. This time it was the influence of the organisation, captured by terms such as organizational failures (Reason, 1997) and safety culture (Guldenmund, 2000). => the level of safety culture and the organisation

The general concern for safety management has always been to find a cause, or a set of causes, both in order to explain what has happened and in order to propose remedial actions. This way of thinking corresponds to a causality credo, which can be formulated as follows: (1) adverse outcomes (accidents, incidents, etc.) happen when something goes wrong; (2) adverse outcomes therefore have causes, which can be found, and (3) treating – and preferably eliminating – the causes will increase safety by preventing future accidents (e.g., Schröder-Hinrichs et al., 2012).

An alternative approach would, of course, be to challenge or change the basic underlying assumption of causality, but few have entertained that. We have therefore through centuries become so accustomed to explaining accidents in terms of cause-effect relations – simple or compound – that we no longer notice it. And we cling tenaciously to this tradition, although it has become increasingly difficult to reconcile with reality. (Do not become fixated on interpreting the cause-and-effect sequence of an accident.)

2.1. Safety as an epiphenomenon => safety is merely a by-product phenomenon

This way of defining safety indirectly, namely as that which is missing when something goes wrong, sees safety as an epiphenomenon rather than as a phenomenon. (An epiphenomenon is defined as an incidental product of some process, that has no effects of its own.) The primary phenomena are the adverse outcomes and how they come about, and safety is simply a name for the condition that exists when the adverse outcomes do not happen. In relation to the question addressed by this paper, the subject matter of safety science is therefore the occurrence – or rather, the nonoccurrence – of adverse outcomes (accidents, incidents, and near misses) and their aetiology (the philosophical study of causation), but not safety as such. The subject matter is the lack of safety rather than safety. This raises the interesting question of whether it is possible to have a science about something that is not there? In other words, can the object of a science be nothing?

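The absence of accidents does not mean safety. We can only study how dangerous a company (or a situation) is; we cannot say for certain that the company (or situation) will have an accident. Only after an accident occurs do the causes and effects get strung together and given a forced (explanatory) fit.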

2.2. Safety as a non-event => safety is not a single event but a dynamic process

The problem alluded to above has been accentuated by the suggestion that safety should be defined as a ‘dynamic nonevent’ (Weick, 2001, p. 335). (Weick actually talked about ‘reliability as a dynamic non-event’ but the similarity to safety is unmissable.) The meaning of a ‘non-event’ is, of course, that safety is present when there are no adverse events, i.e., when nothing goes wrong. The meaning of ‘dynamic’ is that the condition of nothing happening, meaning that nothing goes wrong, cannot be achieved by passive means, by adding layer upon layer of defence and protection, but requires constant attention.

The focus on non-events does obviously not mean that nothing happens. Indeed, many things happen, but they succeed rather than fail. This becomes clear if it is rephrased so that safety is defined as ‘a dynamic lack of failures’. If we go one step further and replace the ‘lack of failures’ with ‘successes’, we arrive at Safety-II as a proper alternative to Safety-I, cf., below.

Safety-I represents the established understanding that has been described above, which means that safety is defined as a condition where the number of adverse outcomes (accidents/incidents/near misses) is as low as possible.

Safety-II is consequently defined as the ability to succeed under expected and unexpected conditions alike, so that the number of intended and acceptable outcomes (in other words, everyday activities) is as high as possible. (The astute reader may notice that this is a paraphrase of how resilience engineering defines resilience, cf. Hollnagel et al., 2011).

Following this definition, safety science changes from being the study of why things go wrong to become the study of why things go right, which means an understanding of everyday activities. All everyday activities are clearly events rather than non-events, which solves Weick’s problem, so to speak. Safety – or more precisely Safety-II – thus becomes an aspect or a characteristic of how systems function, and its presence can be confirmed by looking at well-defined categories of outcomes and by understanding how they came about. The purpose is no longer to avoid that things go wrong, but instead to ensure that things go right.

This new understanding of safety explicitly acknowledges that systems are intractable rather than tractable. While the reliability of technology and equipment in such systems may be high, workers and managers frequently trade-off thoroughness for efficiency, the competence of staff may vary and may be inconsistent or incompatible, and effective operating procedures may be scarce.

The author therefore argues that safety science should study "how people are able to provide the required performance under expected and unexpected conditions alike" rather than the occurrence of accidents.

-------------------------------------------------------------------------

What is safety science?

Author: Terje Aven

http://www.sraeurope.org/home.aspx?pag=1045#bl2192

Looks at so-called safety from the perspective of risk and probabilistic uncertainty.

----------------------------------------------------------------------------------

Issues in safety science

Author: Andrew Hopkins

https://researchers.anu.edu.au/researchers/hopkins-ap#top

Background and expertise in sociology and public policy.

Abstract

This paper deals with three issues. First, the question of the boundaries of safety science – what is in and what is out – is a practical question that journal editors and reviewers must respond to. I have suggested that there is no once-and-for-all answer. The boundaries are inherently negotiable, depending on the make-up of the safety science community.

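Different reviewers hold different views. Perhaps, as global climate change comes to affect and threaten human safety, that topic too will be brought within the scope of so-called safety science?!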

The second issue is the problematic nature of some of the most widely referenced theories or theoretical perspectives in our inter-disciplinary field, in particular, normal accident theory, the theory of high reliability organisations, and resilience engineering. Normal accident theory turns out to be a theory that fails to explain any real accident. HRO theory is about why HROs perform as well as they do, and yet it proves to be impossible to identify empirical examples of HROs for the purpose of either testing or refining the theory. Resilience engineering purports to be something new, but on examination it is hard to see where it goes beyond HRO theory.

The third issue concerns the paradox of major accident inquiries. The bodies that carry out these inquiries do so for the purpose of learning lessons and making recommendations about how to avoid such incidents in the future. The paradox is that the logic of accident causal analysis does not lead directly to recommendations for prevention. Strictly speaking recommendations for prevention depend on additional argument or evidence going beyond the confines of the particular accident.

Below, see how he critiques the three most commonly cited theories.

3.1. Normal accident theory

The theory of normal accidents is propounded by sociologist Charles Perrow (1999) in his book, Normal Accidents. It offers an explanation for why major accidents in many hazardous technical systems appear to be inevitable. He argues that where a system is characterised by both complexity and tight coupling, accidents are inevitable, no matter how well the system is managed (Perrow, 2011:172).

The question I want to ask is: how useful has this theory been in explaining the major accidents of our time? The answer is: not at all. Perrow (1994:218) himself acknowledges that few if any of the high profile accidents of recent decades are normal accidents. They were the result of poor management, cost pressures and the like, not the inevitable result of complexity and tight coupling. Most recently he conceded that the Gulf of Mexico blowout of 2010 was not a normal accident.

Given all this, the question that arises is: why has the theory of normal accidents proved so enduring? Perrow’s political purpose is relevant here. He saw his theory as a way of combating the ubiquitous tendency to blame accidents on front line operators: if complexity and tight coupling were the real culprits then it was clearly inappropriate to blame the people who made mistakes on the day. That is a laudable purpose, but there are many other theories that do this, not the least of which is Turner’s theory of sloppy management.

I suspect the fact is that while people continue to make reference to the theory, this is no more than lip service. We are dealing here with one of the more unfortunate aspects of academic practice. People refer to the works of others not necessarily because that work supports their arguments or are in any other way relevant to what is being said, but simply to establish that they are aware of the relevant literature. Such citations amount to little more than academic name dropping. I have myself been cited by people who seem unaware that my point is quite the reverse of theirs and that my work undermines their own conclusion, rather than supporting it. I suspect that this process of catch-all citation is part of the reason the theory of normal accidents continues to be cited.

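I feel this theory can still be salvaged, because the two camps above are arguing at the individual (micro) level over whether some fault lies behind an accident => but if there were no fault, would accidents really not happen? From the aggregate (macro) point of view, accidents are a "normal" phenomenon grounded in probability.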

3.2. The theory of high reliability organizations

HROs manage the unexpected through five processes:

(1) preoccupation with failures rather than successes,
(2) reluctance to simplify interpretations,
(3) sensitivity to operations,
(4) commitment to resilience and
(5) deference to expertise, as exhibited by encouragement of a fluid decision-making system.
Together these five processes produce a collective state of mindfulness. (Heh, foreign scholars are also quite good at big talk and bluster.)

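The so-called HRO, the high reliability organisation, is a normative ideal that does not exist in real life; even NASA has been criticised as not being a high reliability organisation. (The winners of our national industrial safety award are merely a case of selection bias; time will test the reliability of these companies' safety performance.) This theory is precisely the counterpart of the previous one (normal accident theory).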

3.3. Resilience engineering

The author argues that so-called resilience and the high reliability organisation above are really the same construct: "A resilient organization . . ., seems indistinguishable from a high reliability organization . . . . I hope that resilience theorists will someday explain the difference, if there is any, between these two ideas."

Resilience is one of the five cardinal features of HROs identified by Weick and Sutcliffe, in the quotation above. According to Weick and Sutcliffe (2001:14), resilient organizations are not disabled by errors or crises but mobilise themselves in special ways when these events occur, so as to be able to deal with them. A commitment to resilience is actually a commitment to learn from error.

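Note: the same doubt can be raised against the safety culture and safety climate schools (constructs that lack rigour and rest on forced interpretation).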

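Seen this way, the industrial safety field has no theory that really stands up, and its theoretical foundations are weak. No wonder the scientific community does not take it seriously or recognise it as a rigorous discipline (it takes in drifters from other fields who come to stir things up and scrape a living)?!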

4. Major accident analysis

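Many of the occupational safety and health departments in this country should be renamed departments of accident and occupational disease investigation => because they mainly discuss lessons learned from accidents and occupational diseases that have already happened; for the broader scope of safety and health, they have little theory and few (scientifically validated) effective measures to offer.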

4.1. Accident causation (I learned a lot from the analysis in this part)

Major accident inquiries are implicitly or explicitly inquiries into cause. We must therefore begin with some observations about causation. For present purposes, I distinguish two distinct meanings of cause. The first is sufficient cause, meaning a factor or set of factors that is sufficient to produce the outcome. This is a strong sense of causation. It is not however the most useful meaning of cause, because to identify the sufficient cause of an accident, that is, the entire set of factors that went into producing the accident, is impossible, practically speaking.

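Sufficient-condition cause
Example: "combustible material" or "human error" is often treated as a sufficient condition for "a fire", but the presence of combustible material or the occurrence of a human error does not necessarily lead to a fire, so neither is sufficient on its own.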

The second meaning of cause is a factor that was necessary for the outcome to occur. Such a factor can be called a but-for cause, in the following sense: but for this factor (had it been otherwise), the accident would not have happened. Most accident analyses implicitly adopt this second meaning. They aim to identify a relatively small set of necessary causes, in the absence of any one of which, the accident would not have happened.

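Necessary-condition cause
Example: without water there is no life, so water is a necessary condition for life; in an arson fire, the act of arson is a necessary condition for that accident.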

Wikipedia: necessary and sufficient conditions

http://zh.wikipedia.org/wiki/%E5%85%85%E5%88%86%E5%BF%85%E8%A6%81%E6%9D%A1%E4%BB%B6
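The two meanings of cause above can be sketched in a few lines of Python. This is only a toy illustration, not anything taken from Hopkins's paper: the three-factor fire model (fuel, oxygen, ignition) and the helper names `fire`, `is_sufficient`, and `is_but_for_cause` are all assumptions introduced here. The sketch checks sufficiency by enumerating every state and checks a but-for cause by flipping one factor in the actual state.

```python
from itertools import product

# Hypothetical toy model: a fire occurs only when fuel, oxygen and an
# ignition source are all present. Factor names are illustrative only.
FACTORS = ("fuel", "oxygen", "ignition")


def fire(state):
    """Return True when the toy outcome (a fire) occurs in this state."""
    return state["fuel"] and state["oxygen"] and state["ignition"]


def all_states():
    """Enumerate every combination of present/absent factors."""
    for values in product((False, True), repeat=len(FACTORS)):
        yield dict(zip(FACTORS, values))


def is_sufficient(factor_set, outcome):
    """A factor set is sufficient if the outcome occurs in every state
    in which all factors of the set are present."""
    return all(outcome(s) for s in all_states()
               if all(s[f] for f in factor_set))


def is_but_for_cause(factor, actual_state, outcome):
    """But-for test: the outcome occurred in the actual state and would
    not have occurred had this single factor been otherwise."""
    counterfactual = dict(actual_state)
    counterfactual[factor] = not actual_state[factor]
    return outcome(actual_state) and not outcome(counterfactual)


actual = {"fuel": True, "oxygen": True, "ignition": True}
print(is_sufficient({"fuel"}, fire))       # fuel alone
print(is_sufficient(set(FACTORS), fire))   # the complete set
print([f for f in FACTORS if is_but_for_cause(f, actual, fire)])
```

In this toy model no single factor is sufficient, only the complete set is, yet every factor passes the but-for test. That matches the point in the quoted text: the full sufficient set is rarely identifiable in practice, so investigations settle for a short list of necessary factors.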

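This view of accident analysis in terms of sufficient and necessary conditions can effectively puncture and overturn the forced links that domino theory makes at each step of the chain. Compared with fault tree analysis, with its AND/OR gates plus occurrence probabilities, it still looks thin. For fault tree analysis, however, a major accident is the intersection and consequence of many events occurring together, and it is often only afterwards that we learn that the failure probabilities of these units were not as low as imagined and that the failures were correlated (recall the earlier point from normal accident theory: where a system is characterised by both complexity and tight coupling => accidents are inevitable, no matter how well the system is managed).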
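To make the fault tree remark concrete, here is a minimal sketch of AND/OR gate arithmetic. It compares an AND gate under the usual independence assumption with the same gate when a shared cause can fail all units at once. The shared-cause model, the function names, and the numbers are assumptions made for illustration, not something from the papers discussed: with probability c the shared cause fails every unit, and otherwise each unit fails independently with a probability q chosen so that each unit's overall failure probability stays at p.

```python
def and_gate_independent(probabilities):
    """Top-event probability of an AND gate, assuming independent inputs."""
    result = 1.0
    for p in probabilities:
        result *= p
    return result


def or_gate_independent(probabilities):
    """Top-event probability of an OR gate, assuming independent inputs."""
    none_fail = 1.0
    for p in probabilities:
        none_fail *= 1.0 - p
    return 1.0 - none_fail


def and_gate_shared_cause(p, n, c):
    """AND gate over n identical units with marginal failure probability p,
    where a shared cause (probability c, with c <= p) fails all units at once.

    Illustrative model: P(unit fails) = c + (1 - c) * q = p, so
    q = (p - c) / (1 - c), and P(all fail) = c + (1 - c) * q ** n.
    """
    if not (0.0 <= c <= p < 1.0):
        raise ValueError("require 0 <= c <= p < 1")
    q = (p - c) / (1.0 - c)
    return c + (1.0 - c) * q ** n


p = 0.01  # hypothetical failure probability of each of two redundant units
independent = and_gate_independent([p, p])
coupled = and_gate_shared_cause(p, n=2, c=0.001)
print(independent, coupled, coupled / independent)
```

With these made-up numbers, a shared cause that accounts for only a tenth of each unit's failures raises the probability of both failing together by roughly a factor of ten. This is the sense in which failures that were assumed to be rare and unrelated turn out, after the fact, to have been neither.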

So deep down I am a supporter of normal accident theory?!

The same idea can also be used to critique the (accident map) method of accident analysis developed by Rasmussen (1997) and Rasmussen and Svedung (2000).

http://www.dedale.net/images/fulltext.pdf

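It is not quite appropriate to treat the linkages between different events, or across the boundaries of organisational and managerial responsibility, as causal relations!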

The reasoning involved in identifying necessary factors is counterfactual – making an argument about what would have happened had this factor been otherwise. This raises the question: how can we know what would have happened had this particular factor been otherwise?

For more remote, organisational causes, it becomes a matter of expert judgement, and the causal connections become probabilistic statements rather than logical deductions. For example, ATSB provides the following example, where the arrows can be read as "contributed to".

In this situation the analyst cannot be certain that a better shift roster would have prevented the accident. Perhaps the first mate was dealing with some other issue that distracted him and, even in the absence of fatigue, this would have caused him to forget to change course. The best the analyst can do is make a judgement, based on assessment of all the facts of the case, that a better roster would probably have prevented the accident.

How can we establish that two events "really have" a (causal) connection? By logic, or by expert judgement?

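Take the ATSB example (change to the crew shift roster => operator fatigue => (fatigued) crew member fails to attend to the ship's course => ship runs aground). The roster change counts as a factor that "contributed to" the accident; perhaps it should instead be counted as a background contextual factor = a premise or assumption (a boundary condition). Blaming the accident on these premises and demanding that they change amounts to breaking the rules of the game.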

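The more we push responsibility for an accident onto these contextual factors, the harder it becomes to confirm => would the accident not have happened had the context been different? Are these contextual factors necessary or sufficient conditions for the accident? (An unanswerable puzzle => there is no way to set up a treatment group and a control group to confirm the difference and the causal relation!)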

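Reflections and notes:

1. I have learned something. It is a pity my English writing is poor and I have not read enough papers; otherwise I could jump in and join this war of words (I am exactly the kind of netizen who likes to push problems onto context).
2. Thinking back on the disaster investigation series I have watched on the National Geographic Channel, the narration now feels too sensational and too dogmatic => the human brain really is used to listening to stories rather than to rigorous step-by-step analysis.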

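If we cannot even sort out the causal relations behind an accident, or which factors are sufficient and which are necessary conditions, how can we propose "reasonable and appropriate" recommendations for improvement?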

Perhaps the causes of an accident and the recommendations for improvement and prevention can be treated separately.

Saturday, December 28, 2013