1. The Need for a Structured Expression of Failure
When we look at the recent high incidence of a diverse variety of failure in Japan, we could be forgiven for thinking that we are witnessing failure on parade. What are the reasons for this frequency of failure? What can we do to prevent the repeated occurrence of failures? A variety of organizations and individuals are active in well thought-out initiatives to prevent the recurrence of failures. The most common activity is the production of compilations of case studies of events such as failures, problems, and accidents. Businesses serious in this field have been working hard to produce compilations of failure case studies. But these compilations are hardly ever made use of. From the point of view of the people who produce such compilations, whose aim is to see their work put to use to prevent unnecessary repetition of failure, it must be particularly frustrating to see their work ignored and the same failures repeated. One of the reasons why such compilations are not used is that failure knowledge is not being effectively communicated.
In fact, one of the purposes of the Japan Science and Technology Agency in creating the Failure Knowledge Database was precisely to provide a means of communicating failure knowledge. The main reason why the many diverse compilations of failure case studies are not being applied in practice is that knowledge of past failures is not being communicated to the very people interested in applying this knowledge to prevent failure. If we can properly communicate failure knowledge, and if the people receiving that knowledge can use it correctly, then the needless repetition of failure can be halted. Accordingly, we should ask the question "What is the best way to communicate failure knowledge?" The answer is to give people who encounter failure a clear idea of how to structure failure knowledge, and have them describe a case study of the failure in terms of that structure. This then makes that failure knowledge easier to find and access for anybody wishing to learn from that experience, and, as the procedure of structuring failure knowledge is repeated, that procedure becomes better known and more widely established. The most important concept here is the structuring of failure knowledge.
When structuring failure knowledge, the most important aspects are breaking the failure down into its component parts and expressing that breakdown clearly. This is shown in Fig. 1 below.
Fig. 1 Analysis and Expression of the Occurrence of Failure
Most people, when they analyze a failure, see it in terms of "cause and result" (Fig. 1, Caption (a)): First the cause takes place, followed by its inevitable effect, or result, as we refer to it. This is a simple, easy-to-grasp analysis that most people have no problem taking in, but it has one significant drawback. It assumes that where a cause exists, a result will always follow, but does not allow for cases where a cause exists and no result follows. According to the currently prevalent way of thinking, if the cause is removed, the result will not follow. But is this really the case? Is it not really the case that, on many occasions when failure has been dealt with by applying this way of thinking, it has merely led to the repetition of failure?
Let us take a look at the chronological progression of a failure (Fig. 1, Caption (b)). In fact, all we can really see are the events that take place in front of our eyes. We cannot see the cause or the background. As a developing failure becomes evident as a failure, a person takes action to deal with the unfolding sequence of events. We can think of these events and the action taken in response to them as results. In addition, a variety of related developments take place. The best term to describe such developments is "sequels".
Let us now consider how to express the occurrence of failure in the easiest possible way for people wishing to make use of failure knowledge to take in (Fig. 1, Caption (c)). First we have the "incident" (in other words the failure that has occurred); next we have the "sequence" of the failure (how the failure develops with the passage of time); then we have the "cause" (what the presumed cause is and whether or not it is noticed at the time of failure); finally we have the "response" (the action taken in response to the failure). These four stages are combined to give the "overview". Expressing an occurrence of failure in terms of these five items makes the task of obtaining a comprehensive understanding of that failure much easier for people wishing to learn from it. Failure knowledge - what we can actually learn from the failure - must then be extracted from this information and formulated in such a way (as a "knowledge formulation") that it can be readily communicated to people wishing to make use of it. The six items identified in this paragraph comprise the minimum requirements of a coherent expression of failure.
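The six items above can be pictured as a simple record. The following is only a minimal sketch of that idea, not the schema actually used in the Failure Knowledge Database: the class name `FailureRecord`, the field names, the helper `missing_items`, and the sample text are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass
class FailureRecord:
    """One case study expressed with the six items (hypothetical layout)."""

    incident: str               # the failure that occurred
    sequence: str               # how the failure developed over time
    cause: str                  # presumed cause, and whether it was noticed
    response: str               # action taken in response to the failure
    overview: str               # summary combining the four items above
    knowledge_formulation: str  # what can be learned, phrased for reuse

    def missing_items(self) -> list[str]:
        """Return the names of items that are still blank."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Illustrative placeholder content only, not a real database entry.
record = FailureRecord(
    incident="A fire started.",
    sequence="A switch was left on and a fire followed.",
    cause="Carelessness.",
    response="",  # not yet recorded
    overview="Through carelessness, a switch was left on and a fire started.",
    knowledge_formulation="",  # not yet extracted
)
print(record.missing_items())
```

A completeness check of this kind reflects the point that all six items are the minimum for a coherent expression of failure; a record lacking any of them is harder to learn from.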
We can also view failure as a unified system (Fig. 1, Caption (d)). In the fields of science and engineering, the occurrence of an event is viewed in its entirety as a single system, through which input leads to output. These three elements (input, system, and output) could otherwise be termed trigger, characteristics, and result. In an actual engineering system, all three elements can be seen, but in a "failure system" only the result is visible - the trigger and characteristics that lead to it are not. In order to view failure as a system, it is necessary to "look back" from the result and guess or estimate the trigger and characteristics, then to verify the accuracy of these guesses or estimates to positively identify the trigger and characteristics. Once the characteristics have been identified, it is possible to view the failure in terms of characteristics, with a trigger leading to a result. The joining of trigger and characteristics as described here is equivalent to the "cause" in the "cause and result" mentioned above, by which most people commonly define failure.
When failure actually occurs, and the trigger and characteristics are determined by "working back" from the result, it becomes clear that one more important element is missing. The occurrence of failure is tempered by certain restrictions (Fig. 1, Caption (e)). In any system with given characteristics, a trigger works on those characteristics, but those characteristics are in fact also subject to certain restrictions, and these restrictions can also play a part in the occurrence of failure. In this way of thinking, the trigger and characteristics represent the "cause" in the "cause and result" mentioned above, and the restrictions are often equated with the "background".
2. Structure and Sequence of Failure
It is often said that all failure is the result of human error and, of course, "we all make mistakes". When a person experiences failure, it often seems that only the "result" is apparent, particularly to those not directly concerned with or affected by the failure. However, once we investigate and analyze that failure, we understand that a failure consists of a cause, in response to which a person takes action, leading to the resulting failure. In this reasoning, action can be regarded as the human intervention that links the cause and result of the failure. Neither cause alone nor action alone will lead to failure; failure can only result when both cause and action exist. If we create a sequence of events based on this way of thinking, we first have a human cause, followed by human action, leading to a result (Fig. 2, Caption (a)). In the Failure Knowledge Database, we refer to the cause --> action --> result sequence that leads to failure as a "scenario". As an example, look at the following two chains of events:
- Through my own carelessness, I forgot to turn off the switch and a fire started.
- Because I was looking to one side, I didn't turn in time and I ran into a wall.
The similarity of the two scenarios is quite clear: The first phrase of each case ("Through my own carelessness" and "Because I was looking to one side") represents a cause that we can identify as carelessness; the second phrase ("I forgot to turn off the switch" and "I didn't turn in time") represents an action (in fact an action that should have been taken but was not); the third phrase ("a fire started" and "I ran into a wall") represents a resulting failure. We can represent this graphically as a "failure tree" (Fig. 2, Caption (b)). The tree consists of the trunk (cause), the boughs (actions), and the branches (results) all of which are bound together in the whole. Taking the analogy further, we can say that the case studies hang from the branches like the fruit of a tree. A branch from which many case studies hang represents a scenario in the real world that often results in failure. Using this tree expression, it becomes easier to determine what kind of scenario only rarely results in failure and what kind of scenario results in deadly failure. Taking the analogy still further, we can say that a group of failure trees in proximity represents a "failure forest" (Fig. 2, Caption (c)). We identified ten "human" causes of failure: "Carelessness", as described above, "Unknown Cause", "Ignorance", "Misjudgment", "Ignorance of Procedure", "Insufficient Analysis or Research", "Poor Response to Change in Environment", "Poor Concept", "Poor Value Perception", and "Organizational Problems". If we create one tree per cause, these ten trees together form the failure forest, through which we can express any failure.
Fig. 2 Structure and Sequence of Failure
Next, we look at a three-dimensional expression of the occurrence of failure (Fig. 3). As mentioned above when describing the failure tree, failure consists of "human" causes that form the basis of subsequent human actions. The result of those actions leads to the occurrence of failure. These three components of failure are shown in Fig. 3, Caption (a). Cause is represented as the bottom layer, action as the middle layer, and result as the top layer. By using solid lines to represent the connections between these components, we can produce a three-dimensional expression of the scenario of a failure. Fig. 3, Caption (b) shows the sequence reversed, with the cause positioned at the top and the result at the bottom. If we apply this sequence to the first of the examples mentioned above ("Through my own carelessness, I forgot to turn off the switch and a fire started"), we can draw a line from the cause "Carelessness" to the action "Non-Regular Movement", and on to the result "Bad Event". If we apply the same sequence to the second of the examples mentioned above ("Because I was looking to one side, I didn't turn in time and I ran into a wall"), we again see "Carelessness" as the cause, followed by the lack of action, Non-Regular Movement, leading to the Bad Event that is the collision. By applying this three-dimensional expression to a variety of case studies, we can determine not only what kind of scenario develops in what kind of situation, but also, because a scenario with many case studies is represented with a concentration of connecting lines, the frequency with which a given scenario leads to failure. Such information can then be used to predict and prevent failure.
Fig. 3 Three-Dimensional Expression of the Occurrence of Failure
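The idea that a scenario with many case studies shows up as a concentration of connecting lines can be sketched as a simple count of cause --> action --> result triples. This is an illustrative sketch only: the first two entries encode the two examples discussed above, the third entry is a hypothetical case added for contrast, and the function name `failure_tree` is an assumption.

```python
from collections import Counter

# Each case study reduced to a (cause, action, result) scenario.
case_studies = [
    ("Carelessness", "Non-Regular Movement", "Bad Event"),  # switch left on, fire
    ("Carelessness", "Non-Regular Movement", "Bad Event"),  # looked aside, hit wall
    ("Ignorance", "Regular Operation", "Bad Event"),        # hypothetical case
]

# Frequency of each scenario: the "thickness" of a bundle of connecting lines.
scenario_counts = Counter(case_studies)
most_common_scenario = scenario_counts.most_common(1)[0]


def failure_tree(cases):
    """Group cases as trunk (cause) -> bough (action) -> branch (result) -> count."""
    tree = {}
    for cause, action, result in cases:
        branch = tree.setdefault(cause, {}).setdefault(action, {})
        branch[result] = branch.get(result, 0) + 1
    return tree


forest = failure_tree(case_studies)  # one tree per cause
print(most_common_scenario)
print(forest)
```

In this picture each top-level key of `forest` is one failure tree, and the counts at the leaves play the role of the case studies hanging from a branch.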
3. Expressing the Elements of Failure through Mandalas
In this section, we explore the hierarchical relationship between the elements that make up the components of failure, taking as an example one of the elements of "cause". (In Fig. 4 and subsequent diagrams this element is "Carelessness".) If we assume Carelessness to be a "parent" concept, we can identify a variety of "child" concepts (such as "by accident" or "through distraction") which can themselves have child concepts. If we represent this idea in pyramid form, we have a structure such as in Fig. 4, Caption (a). If we represent the same information as a cladogram, we have a structure such as in Fig. 4, Caption (b). If we then collect all of the "parents" (along with their "children") that are identified as causes of failure and combine them in a single diagram, with lines drawn to show links as in a node diagram, we have a structure such as in Fig. 4, Caption (c). This type of diagram maps all of the elements of cause of failure and illustrates their hierarchical relationship.
Specializing one step further, if we represent each ring of nodes as a concentric circle, we have a structure such as in Fig. 4, Caption (d), which we refer to as a "mandala". (Mandalas are Buddhist representations of the universe and Buddhist teaching that we took as inspiration for our innovative method of expressing failure knowledge.) At the heart of the mandala is the central concept. The inner ring contains what we refer to as the top level elements. Whether for cause, action, or result, it eases understanding to divide this level into 10 key phrases. The outer ring contains what we refer to as the second level elements, of which there are 20 to 30 for each mandala. In the classification system used for the Failure Knowledge Database, we have arranged a third level of elements outside the second level ring. The third level elements are designed to be more specifically tailored to particular fields and case studies. The top and second levels are designed so that they can be applied to case studies in any field, but the third level classifications are designed to be field-specific.
We have created three mandalas: one each for Cause, Action, and Result. Hereafter the three mandalas are referred to as "Failure Mandalas".
Fig. 4 Type and hierarchy for the Elements of Failure
Next, we look at creating a three-dimensional version of one of the mandalas (in this case, cause; see Fig. 5). With many failures, analysis reveals the elements of cause to be factors such as carelessness or lack of preparation. By analyzing these factors, we can extract and define classifications of types of cause, such as the top level concepts "Ignorance" and "Misjudgment".
However, if we view the elements of cause from a different standpoint, we can see that there are other means of classification. For example, "Carelessness" can be classed as a cause for which the individual is responsible, "Poor Concept" can be classed as a cause for which the organization is responsible, and "Change in Environment" can be classed as a cause for which society is responsible. In other words, there is another thought process that involves relating causes to the expansion of the selected range from the individual, through the organization, to society. By following this train of thought, we see that, in the process of investigating the elements of cause, even as there is a toing and froing of thoughts between the abstract and the concrete, or between the individual and society, viewed overall, there is a "rise" toward abstract, top level concepts starting from individual, concrete elements and passing through organizational and societal elements. This way of thinking is illustrated in Fig. 6, Caption (a), which shows a mandala opened out into a spiral staircase that spirals clockwise, inward, and up, gradually rising to the center. In Fig. 6, Caption (b), the spiral staircase is removed, showing the thought path only as a whole. This path spirals inward and upward to the top and center, and it is the idea that we find here that we can regard as the genuine top level concept, and thus communicate to others as failure knowledge.
The communication of failure knowledge is illustrated in Fig. 7. As illustrated on the left side of Fig. 7, a person who wants to communicate information on a failure must process that information from concrete details to abstract "top level concept (failure scenario) of cause, action, and result" along the rising thought spiral in order to formulate workable failure knowledge. Only then, as failure knowledge, can the information be successfully passed on to others. In similar fashion, the right side of Fig. 7 illustrates how a person who wants to receive information on a failure first receives that information in abstract form as a failure scenario. The recipient then processes that scenario via a descending thought spiral in order to extract concrete, practically applicable failure knowledge.
Fig. 5 Hierarchy of Elements and Correlation to Mandala (for Cause)
Fig. 6 Train of Thought
Fig. 7 Communicating Failure Knowledge According to Train of Thought
4. Explanation of Failure Mandalas
When we first created the Failure Mandalas, we thought that the idea of rotating in order and heading toward the center would only be applicable for a Cause Mandala in the field of Mechanical Engineering. But gradually, as discussions on the creation of the Failure Knowledge Database progressed, we came to realize that all of the Cause, Action, and Result Mandalas could be applied to any field. From then on, we approached the creation of the Failure Knowledge Database with a mentality which could be summed up as "Not perfect, but close".
(a) Cause Mandala (Fig. 8)
There are 10 top level key phrases and 27 second level key phrases. These key phrases can be applied to any of the four fields (Mechanical Engineering, Material Science, Chemicals and Plants, and Civil Engineering) that are used in the creation of the database. At the top level, we divide the classifications into four broad groups depending on "who" or "what" is to blame: Nobody is Responsible, Individual is Responsible, Organization is Responsible, and Neither Individual nor Organization is Responsible.
- Nobody is Responsible is represented in the Mandala as Unknown Cause.
- Individual is Responsible is split into five subclassifications: Ignorance, Carelessness, Ignorance of Procedure, Misjudgment, and Insufficient Analysis or Research.
- Neither Individual nor Organization is Responsible is represented in the Mandala as Poor Response to Change in Environment.
- Organization is Responsible is split into three subclassifications: Poor Concept, Poor Value Perception, and Organizational Problems.
Note that we consider the second level key phrases Change in Environment and Change in Economic Factors of the top level key phrase Poor Response to Change in Environment to be essentially the result of external factors that cannot satisfactorily be classed as the responsibility of either an individual or an organization.
Fig. 8 Cause Mandala
The top level key phrases and the second level key phrases under them are listed and described below.
Individual is Responsible
- Ignorance : Insufficient Knowledge - Disregard of Tradition
Individuals not knowing the standard way to prevent or solve failure, even if such knowledge is well known.
- Insufficient Knowledge
An individual or the people around him/her not knowing about established, known technical information.
- Disregard of Tradition
An individual not knowing about an industry or enterprise's conventions.
- Carelessness : Insufficient Understanding - Insufficient Precaution - Fatigue or Poor Health
Failure that could have been prevented by paying attention. Situations of being extremely busy or in poor physical condition causing a person to not pay sufficient attention.
- Insufficient Understanding
Failure caused by an individual having only superficial understanding of the fundamental issues of a procedure.
- Insufficient Precaution
Failure caused by an individual not paying sufficient attention because he/she is busy or cannot be bothered. Failure caused by an individual not taking adequate, known precautions.
- Fatigue or Poor Health
Fatigue or poor physical condition causing an individual to not pay sufficient attention.
- Ignorance of Procedure : Insufficient Communication - Disregard of Procedure
Failure caused by an existing procedure or rule not being followed.
- Insufficient Communication
Failure caused by information being inadequate or not being communicated adequately to those that need it.
- Disregard of Procedure
Failure caused by established procedures or methods, whether formal or informal, being ignored.
- Misjudgment : Narrow Outlook - Misunderstanding - Misperception - Misjudgment of Situation
A situation is not understood correctly, leading to an error of judgment. Evaluation criteria are incorrectly applied, the procedure by which decisions are made is not correctly followed, or elements normally expected to be considered in the decision-making process are missing.
- Narrow Outlook
Failure caused by an individual only considering a single aspect of a situation or ignoring possible and pertinent relationships between things and/or events.
- Misunderstanding
Failure caused by an individual understanding neither the situation nor its background principles and structures. For example, in the case of a leaking container of combustible gas, an individual may think that the way to turn off the valve is to turn it in a clockwise direction, as is normal. However, some valves turn off in the opposite direction, and turning such a valve clockwise would increase the leak.
- Misperception
Failure caused by individuals believing they are acting in the correct manner when in fact they are not, because of a misapplication of their knowledge. For example, in the case of a leaking container of combustible gas, an individual may know that the correct way to turn off the valve is to turn it in an anti-clockwise direction (that is, not the "normal" direction), but through a lapse in thought turn the valve in the opposite direction, thus increasing the leak.
- Misjudgment of Situation
Failure caused by an individual not understanding what is happening. For example, an individual may discover a fire and, believing it to be fueled by wood, spray water on it. But if the fire is in fact caused by burning cooking oil, spraying water will cause the fire to spread.
- Insufficient Analysis or Research : Insufficient Practice - Insufficient Prior Research - Insufficient Environment Study
Failure caused by lack of preparation in the decision-making process. Higher up the decision-making ladder, failure caused by not considering appropriate action to take if problems occur.
- Insufficient Practice
Failure caused by lack of practice or because practice conditions do not match actual conditions. Hazard and Operability Study (HAZOP) and Fault Tree Analysis (FTA) are examples of virtual testing systems that are certified by industry.
- Insufficient Prior Research
Failure caused by insufficient research into component parts and chemicals of products, including controls over production methods regarding adherence to safety rules, functions, and characteristics. For example, there are many cases of the reactivity of chemicals not being adequately researched, leading to failure.
- Insufficient Environment Study
Research into the environment in which a substance or product is to be used, or into the economic environment, is inadequate, or conditions change after the research is complete.
Neither Individual nor Organization is Responsible
- Poor Response to Change in Environment : Change in Environment - Change in Economic Factors
Failure due to a change in circumstances with insufficient time to adapt accordingly.
- Change in Environment
The environment at the time of initiation of a project changes by the time of completion of the project, but there is insufficient time to adapt the project appropriately.
- Change in Economic Factors
The economic environment at the time of initiation of a project changes by the time of completion of the project, but there is insufficient time to adapt the project appropriately (for example, due to sudden fluctuations in interest rates or exchange rates).
Organization is Responsible
- Poor Concept : Poor Authority Structure - Poor Organization - Poor Strategy or Concept
Failure caused by problems at the planning stage or in the plan itself. Individuals working on plans conceived by predecessors or superiors may sometimes take responsibility for the failure of those plans.
- Poor Authority Structure
Failure caused by not obtaining the necessary permission or rights (such as patents) to complete a project.
- Poor Organization
Failure caused by flawed or inflexible organization structure.
- Poor Strategy or Concept
Failure caused by poor strategy or concept.
- Poor Value Perception : Difference in Culture (for example, failure to adjust to different customs) - Poor Organizational Culture - Poor Safety Awareness
Failure caused by individuals holding a different outlook or values from the people around them. Failure when a company gives more priority to profit than to observing rules.
- Difference in Culture
Failure caused by differences in culture and failure to adjust to or understand the surrounding culture. Technology from one culture may be misunderstood in another culture or have different standards or measurement units.
- Poor Organizational Culture
The rules of the corporation override the rules of society so that societal responsibility is neglected. Covering up a relatively minor failure to avoid negative consequences, leading to more substantial failure. Recent examples include the failures that occurred at Snow Brand Milk LTD, Nippon Meat Packers, Inc, and Mitsubishi Motors.
- Poor Safety Awareness
Lax safety standards and awareness due to inadequate control by the body responsible for safety, for example because a safety supervisor assumes that somebody else will take responsibility for safety, or because costs are cut at the expense of safety. A notorious example of this is the pesticide plant disaster in Bhopal, India.
- Organizational Problems : Inflexible Management Structure - Poor Management - Poor Staff
Failure caused by organizational shortcomings preventing the smooth flow of operations. Executives and managers realize their responsibility for guaranteeing the flow of operations, but neglect that responsibility, which may make problems worse.
- Inflexible Management Structure
In organizations with vertically structured management systems or unclear assignment of authority and responsibility to personnel, relatively minor decisions may need to be taken by high-level management and confrontation of problems is easily put off. In such cases, if a problem occurs, the organization and its staff cannot respond quickly enough or lack the authority to respond.
- Poor Management
Failure caused by problems of management such as top-level management decisions not being communicated throughout the organization, management not being fully aware of the true situation at lower levels, and superiors not adequately supervising subordinates.
- Poor Staff
Failure caused at "shop-floor" level, for example by subordinates failing to raise a problem with superiors, by individual selfishness affecting decision-making, or by lack of willingness to learn. Apart from failure explicitly caused by laziness or sabotage, blame for this kind of failure usually rests equally with the management and the managed.
Nobody is Responsible
- Unknown Cause : Occurrence of Unknown Phenomenon - Occurrence of Abnormal Phenomenon
Failure caused by a previously unknown phenomenon. The thread of human history can be said to be a catalogue of advances in science and technology brought about as countermeasures to failures caused by phenomena encountered for the first time. Such failures can be learned from and can be said to be blessings in disguise.
- Occurrence of Unknown Phenomenon
Failure caused by conditions or events that cannot be foreseen using current knowledge or experience, as they have never happened before.
- Occurrence of Abnormal Phenomenon
Failure caused by phenomena that can be understood using current knowledge, experience, theories, or thinking, but have never been reported or experienced in this context.
Note that the third level items are not included in the Cause Mandala. This is because this level is meant for field-specific items, to be created as relevant and required and positioned under the appropriate second level item.
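The two-level classification listed above can be written down as a nested mapping, which also makes it easy to check the stated totals of 10 top level and 27 second level key phrases. The key phrases are taken from the list above; the variable name `CAUSE_MANDALA` and the helper `find_cause` are assumptions made for this sketch and are not part of the database itself.

```python
# Responsibility group -> top level key phrase -> second level key phrases.
CAUSE_MANDALA = {
    "Individual is Responsible": {
        "Ignorance": ["Insufficient Knowledge", "Disregard of Tradition"],
        "Carelessness": ["Insufficient Understanding", "Insufficient Precaution",
                         "Fatigue or Poor Health"],
        "Ignorance of Procedure": ["Insufficient Communication",
                                   "Disregard of Procedure"],
        "Misjudgment": ["Narrow Outlook", "Misunderstanding", "Misperception",
                        "Misjudgment of Situation"],
        "Insufficient Analysis or Research": ["Insufficient Practice",
                                              "Insufficient Prior Research",
                                              "Insufficient Environment Study"],
    },
    "Neither Individual nor Organization is Responsible": {
        "Poor Response to Change in Environment": ["Change in Environment",
                                                   "Change in Economic Factors"],
    },
    "Organization is Responsible": {
        "Poor Concept": ["Poor Authority Structure", "Poor Organization",
                         "Poor Strategy or Concept"],
        "Poor Value Perception": ["Difference in Culture",
                                  "Poor Organizational Culture",
                                  "Poor Safety Awareness"],
        "Organizational Problems": ["Inflexible Management Structure",
                                    "Poor Management", "Poor Staff"],
    },
    "Nobody is Responsible": {
        "Unknown Cause": ["Occurrence of Unknown Phenomenon",
                          "Occurrence of Abnormal Phenomenon"],
    },
}


def find_cause(second_level):
    """Return (responsibility group, top level key phrase) for a second level phrase."""
    for group, top_levels in CAUSE_MANDALA.items():
        for top_level, children in top_levels.items():
            if second_level in children:
                return group, top_level
    return None


top_level_count = sum(len(tops) for tops in CAUSE_MANDALA.values())
second_level_count = sum(
    len(children) for tops in CAUSE_MANDALA.values() for children in tops.values()
)
print(top_level_count, second_level_count)
print(find_cause("Fatigue or Poor Health"))
```

Field-specific third level items could be attached under each second level phrase in the same way, for example by replacing each list with a further mapping.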
(b) Action Mandala (Fig. 9)
There are 10 top level key phrases and 24 second level key phrases. These key phrases can be applied to any of the four fields (Mechanical Engineering, Material Science, Chemicals and Plants, and Civil Engineering) that are used in the creation of the database. At the top level, we can divide the classifications into two broad groups: failure through the actions of individuals on objects (Action on Objects) and failure through the actions themselves (Human Action).
When we consider the actions of individuals, we can broadly classify them into two groups: Regular and Non-Regular. Regular action refers to action taken by an individual that leads to failure regardless of whether or not there are any changes in external factors related to the situation. Non-regular action refers to action taken by an individual that leads to failure when changes in external factors occur and the individual's response to those changes is lacking in some way. Whereas, in the case of regular action, it is possible to implement preventive measures such as training and precautionary procedures, such measures are difficult to identify in the case of non-regular action, leaving the individual to bear the brunt of the blame. Note that, from their experience, many specialists attribute almost all failure to changes in objects and circumstances, but in our classification such failures correspond to failures caused by non-regular actions of individuals. (These failures are sometimes referred to as failures at the point of change.)
- Action on Objects is subdivided into Planning and Design, Production, and Usage.
- Human Action is subdivided into Regular Operation, Non-Regular Operation, Regular Movement, Non-Regular Movement, Incorrect Reaction, Malicious Act, and Non-Regular Action.
Note that some key phrases, such as Poor Planning and Nonobservance of Procedure, can be thought of as both cause and action, and so are included in both the Cause Mandala and the Action Mandala.
Fig. 9 Action Mandala
The top level key phrases and the second level key phrases under them are listed and described below. These cover all aspects in the lifespan of "objects" from planning, through production and use, to disposal.
Action on Objects
- Planning and Design : Poor Planning - Design Misuse
This also includes such aspects as copycat design and licensed production.
- Poor Planning
Failure caused by a plan that is inadequate or impossible to execute. This includes plans for construction, construction management plans, operational plans, and scheduling.
- Design Misuse
Failure caused by using a design for other than its original purpose without modifying or understanding the design or its purpose. (This includes copycat designs and licensed production.) This applies, for example, to operation methods, instrumentation systems, software systems, and office management.
- Production : Hardware Production - Software Production
This includes the manufacture of tools, machinery, and materials, as well as architecture and construction works.
- Hardware Production
Failure caused by badly produced machinery, tools, and other equipment. However, if the fault lies with the software controlling the hardware, that is considered a software failure.
- Software Production
Failure caused by badly produced software not performing well. This includes software design, as well as the selection and purchase of electrical goods and instruments that use the software.
- Usage : Operation/Use - Maintenance/Repair - Transport/Storage - Disposal
- Operation/Use
Using machinery beyond the limits of its design or not according to the instructions. For example, reckless driving.
- Maintenance/Repair
Failure caused by inadequate maintenance or repair. For example, using the wrong lubricant in machines with moving parts or repairing by the wrong method.
- Transport/Storage
Failure caused by inadequate transportation or storage methods. For example, transporting at room temperature chemicals that require refrigeration or transporting sensitive measuring equipment in a truck with inadequate suspension.
- Disposal
Failure rooted in the preparation for, location of, or method of disposal. Note that violations of ethics/morality by persons involved in the disposal procedure are classified under Malicious Act below.
Human Action
Human actions are divided into three categories: Operation, Behavior, and Action.
Operation covers actions taken to initiate the actual use of tools, machinery, and other equipment. For example, the opening and shutting of valves and the driving of motor vehicles.
Behavior covers the physical behavior of individuals operating or preparing to operate tools, machinery, and other equipment. This includes accidents caused by colliding with objects, dropping things, stumbling, and falling down.
Action covers the intentional or willful actions by individuals and interaction between individuals, but excludes actions on physical objects (which are covered by Operation and Behavior).
The second level Human Action categories are listed and described below.
- Regular Operation : Nonobservance of Procedure - Erroneous Operation
Situations in which people operate tools, machinery, and other equipment in the normal way, including periods when they are not actually operating the tools, etc.
- Nonobservance of Procedure
Failures caused by individuals not following proper procedure when operating tools, etc.
- Erroneous Operation
Using the wrong settings during operation or misusing tools, etc. For example, entering incorrect values in a control panel or indicating left when turning right in a car.
- Non-Regular Operation : Change in Operation - Emergency Operation
People operating tools, machinery, and other equipment in a way other than normal. This includes starting and stopping in an emergency.
- Change in Operation
Operational procedures are occasionally changed. Individuals may use an old procedure by mistake, leading to failure.
- Emergency Operation
Failure caused by using the wrong procedures or methods due to the need to take urgent action, including failure to take evasive action. For example, not noticing a traffic jam ahead and failing to brake in time.
- Regular Movement : Careless Movement - Dangerous Movement - Wrong Movement
Actions or movement by the operator when operating tools, machinery, and other equipment normally, including collisions, falling, dropping, and inaction.
- Careless Movement
Movement without taking into account the immediate situation. For example, standing up while working in a confined space and hitting one's head.
- Dangerous Movement
Movement without heed to safety. For example, riding a bicycle on a crowded pavement.
- Wrong Movement
Failure due to movement guided by misunderstanding, misconception, or lack of appropriate knowledge. For example, making a left turn when one should have turned right to reach a destination.
- Non-Regular Movement : Movement During Transition - Movement During Poor Health
People operating tools, machinery, and other equipment in a way other than normal.
- Movement During Transition
Failure caused by individuals not knowing about changes in circumstances. This includes individuals panicking because of an unforeseen event and causing a failure.
- Movement During Poor Health
Failure caused by a decrease in an individual's judgment or ability because of bad physical condition.
- Incorrect Reaction : Poor Communication - Self Protection
This includes when information is misrepresented, concealed, or ignored.
- Poor Communication
Failure because required information (including instructions and reports) was not communicated. This does not include cases where individuals withhold information for reasons of self-protection.
- Self Protection
Failure caused by individuals acting to protect themselves and/or their relatives. This includes putting off decisions, misrepresenting, concealing, and ignoring information, presenting false information, and buck-passing.
- Malicious Act : Ethics Violation - Rule Violation
Failure due to an incorrect or wrong act. An act contrary to the law or to current society's expectations of correct behavior.
- Ethics Violation
Failure caused by violations of standards, including non-codified standards. For example, violations of ethics, morality, religion, common law, and agreements.
- Rule Violation
Violations of public law, a corporation's ordinances or bylaws, or of design standards such as the Japanese Industrial Standards (JIS) or those of the American Society of Mechanical Engineers (ASME). This includes breaking a contract.
- Non-Regular Action : Change - Emergency Action - Inaction
This is a broad classification for failure caused outside of normal situations and conditions. This includes organizational changes, changes to plans, and failure to act due to panic brought about by such changes.
- Change
Failure caused or partly caused by change.
- Emergency Action
Failure caused by reacting to an emergency in a way that is not normal behavior. This includes panicked reactions.
- Inaction
Failure caused by necessary action not being taken.
Note that the third level items are not displayed in the Action Mandala. This is because this level is meant for field-specific items, to be created as relevant and required and positioned under the appropriate second level item.
In the Action Mandala, the key phrases are broadly divided into two groups: "Action on Object" and "Human Action". Only "Human Action" is further divided into "Regular" and "Non-Regular". Although the distinction between "Regular" and "Non-Regular" is not explicit for "Action on Object", such a distinction can nonetheless be made. However, as mentioned previously, "Action on Object" by definition involves a sequence leading from initial planning of some sort, followed by creation or production, use, and finally disposal of the object. For ease of understanding, this is more suitably expressed as a sequential pattern than in terms of "Regular" and "Non-Regular" events, and the Action Mandala is designed accordingly.
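As a rough illustration, the Action Mandala key phrases listed above can be held in a nested mapping of group, top level key phrase, and second level key phrases. The following Python sketch is only one possible representation and is not taken from the Failure Knowledge Database itself; the names `ACTION_MANDALA`, `locate`, and `is_regular` are assumptions introduced for this example.

```python
# Action Mandala key phrases as listed in this section:
# group -> top level key phrase -> second level key phrases.
# The dictionary layout and all function names are illustrative assumptions.
ACTION_MANDALA = {
    "Action on Object": {
        "Planning and Design": ["Poor Planning", "Design Misuse"],
        "Production": ["Hardware Production", "Software Production"],
        "Usage": ["Operation/Use", "Maintenance/Repair",
                  "Transport/Storage", "Disposal"],
    },
    "Human Action": {
        "Regular Operation": ["Nonobservance of Procedure", "Erroneous Operation"],
        "Non-Regular Operation": ["Change in Operation", "Emergency Operation"],
        "Regular Movement": ["Careless Movement", "Dangerous Movement",
                             "Wrong Movement"],
        "Non-Regular Movement": ["Movement During Transition",
                                 "Movement During Poor Health"],
        "Incorrect Reaction": ["Poor Communication", "Self Protection"],
        "Malicious Act": ["Ethics Violation", "Rule Violation"],
        "Non-Regular Action": ["Change", "Emergency Action", "Inaction"],
    },
}


def locate(second_level):
    """Return (group, top_level) for a second level key phrase, or None."""
    for group, tops in ACTION_MANDALA.items():
        for top, seconds in tops.items():
            if second_level in seconds:
                return group, top
    return None


def is_regular(top_level):
    """True/False where the key phrase itself is labeled Regular or
    Non-Regular; None where no explicit distinction is made."""
    if top_level.startswith("Non-Regular"):
        return False
    if top_level.startswith("Regular"):
        return True
    return None


print(locate("Inaction"))
print(is_regular("Usage"))
```

Third level, field-specific items are deliberately left out of this sketch; they could be added as a further nested level under the appropriate second level key phrase.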
(c) Result Mandala (Fig. 10)
There are 10 top level key phrases and 29 second level key phrases. These key phrases can be applied to any of the four fields (Mechanical Engineering, Material Science, Chemicals and Plants, and Civil Engineering) that are used in the creation of the database. At the top level, we divide the classifications into six broad groups depending on "who" or "what" is affected by the failure: Results on Objects, Results with External Consequences, Results with Human Consequences, Results with Consequences for Organization/Society, Results that will Occur, and Results that may Occur.
- Results on Objects is split into three subclassifications: Malfunction, Bad Event, and Failure.
- Results with External Consequences is represented in the Mandala as Secondary Damage.
- Results with Human Consequences is split into two subclassifications: Bodily Harm and Psychological Harm.
- Results with Consequences for Organization/Society is also split into two subclassifications: Loss to Organization and Damage to Society.
- Results that will Occur is represented in the Mandala as Future Damage.
- Results that may Occur is represented in the Mandala as Possible Damage.
Fig. 10 Result Mandala
The top level key phrases and the second level key phrases under them are listed and described below.
Results on Objects
- Malfunction : Specifications Not Met - Poor Hardware - Poor Software - Poor System
Failure caused by products not performing as they should.
- Specifications Not Met
Failure caused by a product not performing according to its specifications.
- Poor Hardware
Failure caused by hardware not performing as it should.
- Poor Software
Failure caused by software not performing as it should.
- Poor System
Failure caused by a system not performing as it should.
- Bad Event : Mechanical Event - Thermo-Fluid Event - Chemical Phenomenon - Electrical Failure
Occasionally, what could be thought to be a negligible phenomenon triggers matters of great importance. Examples of such phenomena include vibration, wear, heat generation, combustion, and leakage. Thermo-Fluid Event is a category here because recently there have been many problems that cannot be categorized as heat only or fluids only.
- Mechanical Event
Mechanical defects caused by such occurrences as vibration or wear, but excluding actual damage.
- Thermo-Fluid Event
Failure caused by problems concerning fluids, heat, or both (generally the two are inseparable), such as heat transfer, temperature gradient, turbulent flow of gas or liquid, or flow at too high or low a speed.
- Chemical Phenomenon
Failure caused by chemical phenomena, such as combustion, chemical reactions, runaway reactions, and ignition.
- Electrical Failure
Failure caused by electrical events, such as short circuits, static electricity, and power failure.
- Failure : Degradation - Abrasion - Deformation - Fracture/Damage - Large-Scale Damage
Failure that involves objects of any scale breaking because of such phenomena as heat damage, corrosion, creep, sinking, and crashing.
- Degradation
The structural integrity of an object deteriorates because of, for example, heat, stress, or a chemical reaction.
- Abrasion
The abrasion of material such as iron plates because of phenomena such as wear, erosion, corrosion, or oxidation leading to a loss of strength or rupture.
- Deformation
A partial or complete change in the shape of such objects as piping or tools, leading to a reduction in functionality.
- Fracture/Damage
Partial or complete breakage of such objects as piping or tools due to fatigue, cracking, stress, or corrosion.
- Large-Scale Damage
Large-scale damage or complete destruction of an object. For example, an airplane crash, a ship sinking, or an explosion at a chemical or power plant.
Results with External Consequences
- Secondary Damage : External Damage - Damage to Environment
Relatively large-scale failure that causes subsequent damage or destruction to other objects. For example, pollution caused by fires or explosions, or leaking or emitting toxins into the environment.
- External Damage
Events such as fires and explosions that are caused by failure of one sort and in turn cause damage to other things.
- Damage to Environment
Failure that causes environmental destruction (such as air or water pollution) either directly or indirectly.
Results with Human Consequences
- Bodily Harm : Harm to Physical Well-being - Sickness - Injury - Death
Any ill effects on the human body.
- Harm to Physical Well-being
Failure that causes damage to the human body. This category is used when it is not known if the result is sickness, injury, or death.
- Sickness
Failure that causes illness. The sickness can be acute or chronic. This includes such phenomena as miscarriages.
- Injury
Failure that causes injury.
- Death
Failure that causes death.
- Psychological Harm : Mental Trauma
This is a third level category including causes of fear, memory loss, loss of confidence, and grief.
- Mental Trauma
Failure that causes mental anguish such as fear, memory loss, loss of confidence, or grief.
Results with Consequences for Organization/Society
- Loss to Organization : Economic Loss - Social Loss
Failure that causes a loss to an organization, directly or indirectly. For example, compensation for damages, loss of confidence, or bankruptcy.
- Economic Loss
Tangible losses to an organization such as direct losses caused by accidents, rehabilitation costs, loss of income, and compensation for damages.
- Social Loss
Negative effects on an organization's standing in society, such as loss of sales, loss of confidence, or lawsuits.
- Damage to Society : Social Systems Failure - Change in Perception
Far-reaching negative effects on citizens and consumers, including failure of infrastructure, loss of faith in administrative organs and businesses, and changes in spending habits.
- Social Systems Failure
Disruption to social functions, such as by congestion of lifelines and damaging rumors.
- Change in Perception
General changes in the public's perceptions such as an increase in distrust of administrative organs and businesses and a rise in defensive instincts.
Results that will Occur
- Future Damage : Results to Happen - Foreseeable Results - Unforeseeable Results
The publicizing of a potential problem, such as global warming, which may not be apparent as a critical problem at present but will certainly be in the future. This type of failure is often made known as a result of in-house whistle-blowing.
- Results to Happen
A failure that is not such a large problem at present but will be a large problem in the future. For example, global warming or the collapse of a public pension scheme.
- Foreseeable Results
An event that, although it is not a big problem in the present, is predicted to have large consequences unless action is taken to counter it. For example, worldwide water shortages and climate change caused by the Earth tilting on its axis.
- Unforeseeable Results
A failure that we know nothing of at present and have no way of predicting, but will be a large problem in the future.
Results that may Occur
- Possible Damage : Near Miss - Potential Hazard
Negative effects that may or may not happen. Events with a low probability of occurring are not differentiated from those with a high probability of occurring.
- Near Miss
An event where a person is close to experiencing a danger, but no accident or failure actually occurs. A situation where a person realizes that a certain (quite feasible) combination of events could have led to failure.
- Potential Hazard
A situation of potential danger, where a certain (quite feasible) combination of events could cause an accident or failure (of which the person concerned may or may not be aware). Includes plans, decisions and actions taken without prior knowledge of future potential risk.
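The Result Mandala hierarchy listed above can be written down in the same nested form and checked against the stated totals of six groups, 10 top level key phrases, and 29 second level key phrases. The Python sketch below is an illustration only; the names `RESULT_MANDALA` and `count_levels` are assumptions and do not come from the database.

```python
# Result Mandala key phrases as listed in this section:
# group -> top level key phrase -> second level key phrases.
# The dictionary layout and function name are illustrative assumptions.
RESULT_MANDALA = {
    "Results on Objects": {
        "Malfunction": ["Specifications Not Met", "Poor Hardware",
                        "Poor Software", "Poor System"],
        "Bad Event": ["Mechanical Event", "Thermo-Fluid Event",
                      "Chemical Phenomenon", "Electrical Failure"],
        "Failure": ["Degradation", "Abrasion", "Deformation",
                    "Fracture/Damage", "Large-Scale Damage"],
    },
    "Results with External Consequences": {
        "Secondary Damage": ["External Damage", "Damage to Environment"],
    },
    "Results with Human Consequences": {
        "Bodily Harm": ["Harm to Physical Well-being", "Sickness",
                        "Injury", "Death"],
        "Psychological Harm": ["Mental Trauma"],
    },
    "Results with Consequences for Organization/Society": {
        "Loss to Organization": ["Economic Loss", "Social Loss"],
        "Damage to Society": ["Social Systems Failure", "Change in Perception"],
    },
    "Results that will Occur": {
        "Future Damage": ["Results to Happen", "Foreseeable Results",
                          "Unforeseeable Results"],
    },
    "Results that may Occur": {
        "Possible Damage": ["Near Miss", "Potential Hazard"],
    },
}


def count_levels(mandala):
    """Return (number of groups, top level phrases, second level phrases)."""
    top = sum(len(tops) for tops in mandala.values())
    second = sum(len(seconds)
                 for tops in mandala.values()
                 for seconds in tops.values())
    return len(mandala), top, second


print(count_levels(RESULT_MANDALA))  # (6, 10, 29)
```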
The component elements of cause, action, and result described above are grouped according to the structure indicated in Fig. 3, Caption (b) to create the Failure Mandalas indicated below in Fig. 11. Viewed on-screen in the Failure Knowledge Database, the "flow" down from Cause through Action to Result that makes up the scenario can be easily understood.
Note that, while the Committee for the Promotion of the Failure Knowledge Database believes that the Failure Mandalas are now the best available, it does not regard them as a perfect, finished product. As more and more data is accumulated, analyzed, and added to the database, and as more and more users' opinions are employed to refine the database content and structure, the Committee will update the Failure Mandalas accordingly to keep them as close to perfection as is possible.
Fig. 11 Visual Expression of Failure Scenario
5. Expression by Scenario
In order to better express and explain the structure of a failure case study, we have created what we call a "diagonal line drawing" based on Fig. 11. See Fig. 12 below.
Fig. 12 Diagonal Line Drawing of a Scenario
In this diagram, "Time" is represented on the horizontal axis and "Step" is represented on the vertical axis. Items are numbered in descending order from top left to bottom right. By means of the diagonal line drawing, the context that leads to the occurrence of failure (what we call the "scenario") can be clearly expressed. We call this a "diagonal scenario". It begins at top left with the cause. This consists of top level and second level key phrases (common to all failures) written out in order. These are supplemented, as required, by field-specific third level key phrases (and, if required, fourth, fifth, and "deeper" levels). A double underline is inserted to separate the cause and action, and then the action is written out in the same way as the cause.
Note that, although it is possible to insert appropriately named key phrases at the third level and deeper, there is no need to add an underline before each key phrase. In a case where a single cause is followed by two actions, those two actions (or, more specifically, the key phrases that represent them) are separated by a single underline. So, in Fig. 12, the action is represented by two key phrases at points 4 and 7. Note that the numbering continues in a single unbroken sequence - it does not start afresh or branch off at key events in the scenario. As with cause and action, a double underline is inserted to separate the action and result, and then the result is written out in the same way as the cause and action. Just as progression from top to bottom represents the different steps involved in the failure, progression from left to right represents the passage of time. We feel that this diagonal line drawing not only best represents the unfolding scenario of a failure, from cause through action to result, but also renders the details involved extremely easy to grasp, analyze, and remember.
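The layout rules described above (one continuous numbering, a double underline between cause, action, and result, and a single underline between two actions that follow the same cause) can be approximated in plain text. The following Python sketch is a simplified rendering under our own assumptions, not the drawing routine used for Fig. 12. The function name `render_diagonal`, the data layout, the placeholder cause key phrases, and the third level item are all hypothetical; the action and result key phrases are taken from the lists in the previous section.

```python
def render_diagonal(scenario, indent=2):
    """Render a scenario as a text approximation of a diagonal line drawing.

    scenario: list of (phase_name, chains); each chain is a list of key
    phrases ordered from the top level downward. Each step is placed one
    line lower (Step axis) and further right (Time axis) than the last.
    """
    lines = []
    step = 0
    for p, (_phase, chains) in enumerate(scenario):
        if p > 0:
            # Double underline separates cause/action and action/result.
            lines.append(" " * (step * indent) + "=" * 10)
        for c, chain in enumerate(chains):
            if c > 0:
                # Single underline separates two chains in the same phase.
                lines.append(" " * (step * indent) + "-" * 10)
            for phrase in chain:
                step += 1  # numbering never restarts or branches
                lines.append(" " * ((step - 1) * indent) + f"{step}. {phrase}")
    return "\n".join(lines)


# Hypothetical scenario: cause phrases are placeholders, and the third
# level action item is invented purely for illustration.
SCENARIO = [
    ("cause", [["Cause top level (placeholder)",
                "Cause second level (placeholder)",
                "Cause third level (placeholder)"]]),
    ("action", [["Usage", "Maintenance/Repair",
                 "Wrong lubricant (hypothetical third level)"],
                ["Regular Operation", "Erroneous Operation"]]),
    ("result", [["Failure", "Fracture/Damage"]]),
]

print(render_diagonal(SCENARIO))
```

With this sample data the two actions begin at steps 4 and 7, mirroring the arrangement described for Fig. 12.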
A pictograph depicting the case study and a Knowledge Comment summarizing the case study can be inserted in the space on the top right or bottom left of the diagram, offering an almost complete description of the failure in one simple illustration.
The pictographs are simple drawings intended to convey a straightforward, instantly recognizable summary of the type of failure that has occurred. Using detailed illustrations such as excerpts from design blueprints or combinations of computer icons is not permitted as the extra detail tends to blur and delay understanding.
Finally, a note on searching the Failure Knowledge Database. Traditional database search methods employ inverted tree logic in an enumerative method, but the system used in the Failure Knowledge Database allows the user to search using terms that are of direct consequence to the user's understanding of a failure case study and reason for searching for that knowledge (i.e., for use in preventing future failures).
We have also found that using words only as search terms is not always the most efficient way of searching, and for this reason we have found it extremely useful to include a pictograph in the list of search results. This means that, if a particular search produces a large number of results, it is possible to determine which case studies are relevant by glancing at pictures rather than by reading numerous lines of text. We also believe that it is very important to be able to show the diagonal scenario and pictograph in the same window.
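The idea of searching by key phrases rather than by walking down a classification tree can be sketched as a simple set-containment filter that returns each matching case together with its pictograph. The Python example below is a minimal sketch under our own assumptions; it does not describe how the Failure Knowledge Database is actually implemented. The records, identifiers, file names, and the function `search` are all hypothetical.

```python
# Hypothetical case study records, invented purely for illustration.
# Each record stores the key phrases of its scenario, regardless of level.
CASES = [
    {"id": "case-001", "pictograph": "case-001.png",
     "key_phrases": {"Usage", "Maintenance/Repair",
                     "Failure", "Fracture/Damage"}},
    {"id": "case-002", "pictograph": "case-002.png",
     "key_phrases": {"Regular Operation", "Erroneous Operation",
                     "Bodily Harm", "Injury"}},
    {"id": "case-003", "pictograph": "case-003.png",
     "key_phrases": {"Usage", "Transport/Storage",
                     "Failure", "Deformation"}},
]


def search(cases, *terms):
    """Return (id, pictograph) for every case containing all search terms.

    The terms may come from any level of any Mandala; no tree traversal
    is needed because matching is done directly on the key phrases.
    """
    wanted = set(terms)
    return [(case["id"], case["pictograph"])
            for case in cases
            if wanted <= case["key_phrases"]]


print(search(CASES, "Failure"))
print(search(CASES, "Usage", "Deformation"))
```

Returning the pictograph alongside each identifier reflects the point made above: a long result list can then be scanned by picture rather than by reading each entry.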