The Investigation Process Research Resource Site. A Pro Bono site with hundreds of resources for Investigation Investigators.
INVESTIGATING INVESTIGATIONS to advance the State-of-the-Art of investigations through investigation process research. Launched Aug 26 1996.
The knowledge and skills required to develop recommended actions following investigations have been assumed uncritically in the past. Research in this area is sparse. The dialogue that follows is beginning to disclose the close relationship between predictive investigations (conducted to identify problems before they occur) and retrospective investigations (conducted after they occur). Read on, then see the end note.
12/20/96
Subject: Recommendation development
Date: Mon, 16 Dec 1996 06:47:16 -0800
From: "Charles P. Hoes"
To: luben@patriot.net
Dear Ludi,
I have just read your article on recommendation development and found it to bring up a number of very troubling areas related to the field of system safety. The issue of recommendation development and verification of effectiveness is one that has been nagging at me since I first entered the field, and has only gotten worse the more I think about it. The problem from the point of view of developing recommendations based upon investigations of major accidents (whatever "major" might mean) is more difficult, but closely related to the problem that I face doing system safety.
I find that when I work with a moderately complex system, such as a semiconductor etcher, I typically identify between 50 and 100 significant "hazards" that I want to work on. Why I select these is unknown. Sometimes it is because of known problems in that particular area (accident history), other times it is because the management and customers are sensitive to an issue, and at other times it just seems like the right thing to do. I stop at this number of issues because with many more I become overwhelmed. I try to "chunk" the issues together so that the work is more or less manageable.
I then develop recommendations for controlling each of these hazards. The number of recommendations for each seems to depend upon the "chunkiness" of my hazard descriptions, but the recommendations almost always contain a set that includes (1) hardware design, (2) procedures, (3) warning notices in the manuals, and (4) maybe warning labels. Sometimes the set of recommendations gets more complex and includes inspections, quality control measures, maintenance requirements, special training, warning devices, and on and on.
The problem that I have always had is that I have no way of knowing when enough is enough, and whether or not what I have suggested makes any real difference to the safety of the system. I can easily guess at the value of each recommendation, or the entire set of recommendations, but I can't measure it. This is especially a problem with new, state-of-the-art systems which have no previous accident history. I can't very well demand that some systems are made without these features to act as "controls" so I can evaluate my recommendations. I can't even seem to figure out how to compare "my" system with similar ones because the accidents seem to be in the details and I really can't compare the details in any sensible way. The next problem that I have in developing these recommendations is that it is becoming more and more clear to me that the real safety benefit comes from very detailed, and therefore costly, evaluation of the system. It is useful to develop general classes of recommendations for hazards, but the detailed implementation, or failure of the implementation, is where the accidents come from. I always have the very uncomfortable feeling that I missed the one important element, and I have found no way to resolve that problem. I agree with you that we are really missing training, and knowledge, about how to develop recommendations. This problem exists all the way from what you are talking about as the global recommendations that are created by accident investigations by the NTSB to the detailed recommendations that I develop to control a very specific problem. I can, and do, develop several hundred recommendations for each system that I work on, but I have very little to base my opinions on and almost no evidence that I am adding value to the system. I don't even have any way to get accident history for systems that I have worked on to see where I have missed things. 
I have been rolling this problem around in my head a lot lately, and expect to expand this effort significantly because of new work that I have entered into with UL. We are working together to combine UL listings methodology with System Safety methodology to evaluate systems for the semiconductor industry and for the European Union's CE marking requirements. This problem of integrating the two approaches has really sent my mind spinning, with the issue of determining how to create and evaluate recommendations being the foremost problem. I will let you know if anything of interest pops up from all of this.
I think that maybe our problems are one and the same, just at a different level of detail (maybe they are even at the same level of detail; I'm not sure that I can tell). Thanks for the interesting paper. It helps to know that there are others who are struggling with similar problems.
Charles P. Hoes, PE, CSP
Hoes Engineering, Inc.
System Safety Engineering
(916) 756-3999 FAX (916) 756-3970
Editor of the SSS's "Hazard Prevention" journal
C. Hoes's message precipitated subsequent exchanges, which may be of general interest to the investigation research community in that they address the investigation of undesired incidents both before and after they occur.
Subject: Re: Recommendation development
Date: Mon, 16 Dec 1996 20:53:54 -0400
From: Ludwig Benner
Organization: LB&A
To: "Charles P. Hoes"
References: 1
Thank you VERY much for your comments about recommendation development issues. I don't know what the next step might be, but I would be delighted to share experiences in whatever area you wish to explore in detail.
My perspective was initially from the NTSB investigation point of view, but after I left there I did a number of complex system safety hazard analyses and realized the very, very close linkages between investigating accidents before they happen (system safety analyses) and after they happen (accident investigations). There are several helpful experiences that I would be happy to share when you want to do so.
Again, thanks for your really introspective comments. If we are going to make progress we first have to overcome what Boorstin called the ILLUSION OF KNOWLEDGE that sometimes impedes our perspectives.
Best regards,
Ludi
Subject: Re: Recommendation development
Date: Tue, 17 Dec 1996 11:34:04 -0800
From: Charles Hoes
To: luben@patriot.net
At 08:53 PM 12/16/96 -0400, you wrote:
> If we are going to make progress we first have to overcome what Boorstin called the ILLUSION OF KNOWLEDGE that sometimes impedes our perspectives.
This illusion is certainly an interesting problem. It seems that the idea that we are working under this illusion (or delusion) isn't very popular with many of our colleagues. They seem to prefer working under the illusion that they actually know what is going to happen, and therefore that they have the correct solution. I think this is much of the reason for the development of all of the design based standards floating around. It is relatively easy to have a good knowledge of what is in the standards; it is much more difficult to know how to proceed in the absence of such guidance, or with the more open ended performance style requirements ("control the risk of such and such") promoted by the European Union.
In my system safety work I have settled into a pattern of identifying a potential problem, and then mixing hardware and people related controls.
This has resulted in a situation where I often develop two or three hardware based recommendations along the lines of "reduce the energy", and "provide shields" and "provide interlocks" with person related controls such as provide "warning labels" and "warning devices" and "written procedures" and "training" and "periodic inspections" and "quality control". While this sometimes seems like I am smothering the hazards with too many belts and suspenders, I often feel that it is appropriate. More difficult questions arise from deciding issues such as whether or not shields and interlocks are both needed, and whether I can actually expect formal training. Unfortunately, I don't have any method for determining whether my list of controls is necessary and sufficient (although I do consider that to be the goal). I think I have settled into a more well defined method of developing the necessary recommendations than the serial use of the "order of precedence". There seems to be another level of thinking that is needed to make the decision about whether the controls are "good enough". This is somehow based upon an understanding of the entire hazard scenario, including foreseeable considerations of the person(s). I have started an extremely interesting project with UL. We have formed an "informal team" (which I am not allowed to advertise) to evaluate products (machinery) to SEMI S2 (a semiconductor product safety standard) and the CE marking requirements ("machinery directive", "low voltage directive", etc.). They are doing their normal compliance to UL standard thing, and I am doing a hazard analysis to develop other requirements to check against. I am using my normal system safety analysis approach. 
As you might guess, this has resulted in some extremely interesting discussions concerning whether my recommendations are really needed, and whether or not the items in the UL standards are needed given that we have evaluated the hazards and done other things to control the risks. We are doing a lot of head scratching about this problem.
I have offered, and the offer has been accepted, to provide a series of seminars on the topic of hazard based safety for the semiconductor industry. This should be quite an experience because it will bring together the two paradigms of traditional "product safety through compliance" and "system safety through analysis". At this point in time my biggest concern is with how to address this issue of recommendation development and risk evaluation. I think I will be walking into some pretty hot water because of the vested interests of the "third parties" involved in the compliance assessment process. I would be interested in knowing more about this topic, and any "war stories" or advice that you might have to offer.
Charles P. Hoes, PE, CSP
Hoes Engineering, Inc.
System Safety Engineering
(916) 756-3999 FAX (916) 756-3970
Editor of the SSS's "Hazard Prevention" journal
Date: Tue, 17 Dec 1996 16:47:36 -0800
From: Charles Hoes
To: luben@patriot.net
At 04:20 PM 12/17/96 -0400, you wrote:
> Dear Charles,
> 1. Would it be OK to publish your prior e-mail on the IPRR site? I'd like to give it prominence - i.e., provide the topic with your comments a page of its own.
I have no known personal problem with your posting my e-mail. I suppose it might be blasphemous enough to cause some interesting responses. I guess I have enough conviction in my opinion to show the world how much I don't really know. Actually, as I think about it, maybe we should allow this whole thought process to mature a bit first so it isn't quite so one sided.
While I have many concerns about the process that we use to develop recommendations, I don't in any way think that it is a waste of time or is somehow wrong. I have great confidence that we do in fact accomplish much good with our efforts. The concern isn't so much about whether or not we are doing the right thing, it is more along the lines of being more clear about how it is accomplished and judged. I am convinced that I help products become much safer than they would have been if someone hadn't facilitated the process that I help to create. (I am convinced of this based upon examples of similar products which were created by the same design teams before I arrived on the scene. The "before" and "after" configuration in terms of enhanced safety is often quite striking.) System safety works, but I think the reason that it works is quite different than most folks realize. My engineers have taken to jokingly describing our work as that of "holistic facilitators". While this is a bit of a "new age" joke, it is also closely related to what I perceive our job to be, with the emphasis on "good works" such as safety, the environment, and human factors. (We get hired fairly regularly for our system engineering support without the emphasis on safety.) It is this aspect of our work that I think is most important, and it is important because it gets the recommendation creation process to the designers and "risk accepters" (as you indicated is needed in the accident investigation field). My way of working is to become the safety lead for a diverse group of involved project personnel (and customers/users when practical) who use the system safety process within the concurrent engineering effort. (Actually, we often have to facilitate the creation of the concurrent engineering effort as part of our work because it often isn't in place when we first show up.) We help identify the potential hazards and then help figure out how to resolve them. 
Once that has occurred and everyone has agreed upon how to get the risks under control, we help make sure that the controls are implemented and verified. I believe that it is the process of working with the design team to figure out the problems and solutions in terms that are appropriate for the project that makes system safety so powerful and valuable. The questions of how you make sure that the hazards are properly identified and controlled are important and interesting, but we help make things better even if we never do figure out objective answers to recommendation development.
> As I pondered it, the thought occurred to me that if some smartA lawyer gets ahold of it, litigation may shake up the SSS troops.
How do you envision this discussion causing a litigation problem? I am not quite sure what I said that might get lawyers interested. I think I said that we don't really have much hard and fast evidence to base our decisions on. I suppose that means that we could be in a position of basing claims on the subjective opinions of "experts" in court. I think we already do that. There are of course the existing rules, regulations, laws, standards and "best industry practices" that need to be complied with when appropriate. I believe the questions that we have been worrying about are those that fall outside of these known practices, or worse, those that arise when the existing standards are not necessarily the best safety practices (even though they may be the state of the art).
> 2. Thanks for your message about the dilemmas you described in your last e-mail. It gets more and more interesting.
And more complicated! By the way, it appears that the Japanese are moving to incorporate something like system safety into their normal way of designing and developing products. If this turns out to be true, it should be an extremely interesting experiment in implementing system safety in a commercial environment.
> 3.
> You wrote - I think I have settled into a more well defined method of developing the necessary recommendations than the serial use of the "order of precedence". There seems to be another level of thinking that is needed to make the decision about whether the controls are "good enough". This is somehow based upon an understanding of the entire hazard scenario, including foreseeable considerations of the person(s).
That is very true. The choice of controls needs to include an understanding of how the system (including people) will behave after the controls have been put into place. The EU has a requirement to perform after-market evaluations of the actual use of the system to ensure that the assumptions (foreseeability considerations) are in fact correct. (I believe that this is in the product liability directive, but I am not positive about that.) They have been explicit in their understanding that of necessity we make guesses, but then they say that we need to do something to validate our guesses based upon actual use. I haven't heard of anyone actually doing this after-market evaluation, but it is certainly an interesting concept.
> To predict effectiveness of any recommendations, you need to understand how the system to be changed will operate in the future both without and with the proposed changes, so you don't get into the air bag kind of fiasco. I have observed that understanding is dependent on the system description and the methodology used to define the system operation. Many design and operating people don't really understand their system - as evidence, when we used an MES-based approach to describe who or what did what in what sequences in a system, it usually took a minimum of three of typically five revisions to our system description flow charts before we could get the designers and users to agree that it was valid.
> Only then did we start to do the hazard analyses to define undesired potential occurrences (including human actions) and discover options to control the risks of those occurrences.
I agree with you. If you don't know the system, you can't do much with it. Unfortunately, this step seems to be the most likely to be skipped. I have never been able to figure out why, except that it might be embarrassing to admit that you don't understand how the thing works. My experience agrees with your statement: once you ask the question, it is amazing how little anyone knows about the system. I have found that the confusion is often greatest among the designers themselves, especially across the different specialties such as hardware design, software, and electronics. They seem to work on such different mental models of the system that it is difficult for them to determine that they don't understand each other.
> I don't know if this is applicable in your cases or not.
It does indeed apply. I have never spent as much time hammering out the operation of a system as you seem to have, but it sounds like a good idea. Next time I get an appropriate project I will try mapping the operations on a flow chart type of representation and see what happens. I bet it will be enlightening to all concerned.
> More to follow when I get a chance.
I look forward to your next response.
Subsequent exchanges on this topic are compiled in the indexed files provided to Active Participants in the Experimental On-line Investigation Research Project. To become a participant, see the Notice at the home page.