FMEA (Failure Mode and Effects Analysis) Quick Guide

FMEA — failure mode and effects analysis — is a tool for identifying potential problems and their impact.

Problems and defects are expensive. Customers understandably place high expectations on manufacturers and service providers to deliver quality and reliability.

Often, faults in products and services are detected through extensive testing and predictive modeling in the later stages of development. However, finding a problem at this point in the cycle can add significant cost and delays to schedules. The challenge is to design in quality and reliability at the beginning of the process and ensure that defects never arise in the first place.

One way that Lean Six Sigma practitioners can achieve this is to use failure mode and effects analysis (FMEA), a tool for identifying potential problems and their impact.

FMEA: The Basics

FMEA is a qualitative and systematic tool, usually created within a spreadsheet, to help practitioners anticipate what might go wrong with a product or process. In addition to identifying how a product or process might fail and the effects of that failure, FMEA also helps find the possible causes of failures and the likelihood of failures being detected before occurrence.

Used across many industries, FMEA is one of the best ways of analyzing potential reliability problems early in the development cycle, making it easier for manufacturers to take quick action and mitigate failure. The ability to anticipate issues early allows practitioners to design out failures and design in reliable, safe and customer-pleasing features.

Figure: Example of FMEA, excerpted from “Leverage Six Sigma to Manage Operational Risk in Financial Services”

Finding FMEA Failure Modes

One of the first steps to take when completing an FMEA is to determine the participants. The right people with the right experience, such as process owners and designers, should be involved in order to catch potential failure modes. Practitioners also should consider inviting customers and suppliers to gather alternative viewpoints.

Once the participants are together, the brainstorming can begin. When completing an FMEA, it’s important to remember Murphy’s Law: “Anything that can go wrong, will go wrong.” Participants need to identify all the components, systems, processes and functions that could potentially fail to meet the required level of quality or reliability. The team should not only be able to describe the effects of the failure, but also the possible causes.

The sample shown in Table 1 can be used as an example when learning how the FMEA works. The team in this case is analyzing the tire component of a car.

Table 1: FMEA for Car Tire

Function or Process Step	Failure Type	Potential Impact	SEV	Potential Causes	OCC	Detection Mode	DET	RPN
Briefly outline function, step or item being analyzed	Describe what has gone wrong	What is the impact on the key output variables or internal requirements?	How severe is the effect to the customer?	What causes the key input to go wrong?	How frequently is this likely to occur?	What are the existing controls that either prevent the failure from occurring or detect it should it occur?	How easy is it to detect?	Risk priority number
Tire function: support weight of car, traction, comfort	Flat tire	Stops car journey, driver and passengers stranded	10	Puncture	2	Tire checks before journey. While driving, steering pulls to one side, excess noise	3	60

Recommended Actions	Responsibility	Target Date	Action Taken	SEV	OCC	DET	RPN
What are the actions for reducing the occurrence of the cause or improving the detection?	Who is responsible for the recommended action?	What is the target date for the recommended action?	What were the actions implemented? Now recalculate the RPN to see if the action has reduced the risk.
Carry spare tire and appropriate tools to change tire	Car owner	With immediate effect	Spare tire and appropriate tools permanently carried in trunk	4	2	3	24

Criteria for FMEA Analysis

An FMEA uses three criteria to assess a problem: 1) the severity of the effect on the customer, 2) how frequently the problem is likely to occur and 3) how easily the problem can be detected. Participants must set and agree on a ranking between 1 and 10 (1 = low, 10 = high) for the severity, occurrence and detection level for each of the failure modes. Although FMEA is a qualitative process, it is important to use data (if available) to qualify the decisions the team makes regarding these ratings. A further explanation of the ratings is shown in Table 2.

Table 2: FMEA Severity, Occurrence and Detection Ratings

	Description	Low Number	High Number
Severity	Severity ranking encompasses what is important to the industry, company or customers (e.g., safety standards, environment, legal, production continuity, scrap, loss of business, damaged reputation)	Low impact	High impact
Occurrence	Rank the probability of a failure occurring during the expected lifetime of the product or service	Not likely to occur	Inevitable
Detection	Rank the probability of the problem being detected and acted upon before it has happened	Very likely to be detected	Not likely to be detected

After ranking the severity, occurrence and detection levels for each failure mode, the team will be able to calculate a risk priority number (RPN). The formula for the RPN is:

RPN = severity x occurrence x detection

In the FMEA in Table 1, for example, a flat tire severely affects the customer driving the car (rating of 10), but has a low level of occurrence (2) and can be detected fairly easily (3). Therefore, the RPN for this failure mode is 10 x 2 x 3 = 60.
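The calculation can also be sketched in a few lines of Python. This is a minimal illustration rather than part of any FMEA standard: the function name `rpn` and the range check are assumptions added here, based on the 1 to 10 rating scale described above.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Return the risk priority number for one failure mode."""
    ratings = (("severity", severity),
               ("occurrence", occurrence),
               ("detection", detection))
    for name, rating in ratings:
        # Each rating is expected on the 1-10 scale agreed on by the team.
        if not 1 <= rating <= 10:
            raise ValueError(f"{name} must be between 1 and 10, got {rating}")
    return severity * occurrence * detection


# Flat tire example from Table 1: severity 10, occurrence 2, detection 3.
flat_tire_rpn = rpn(10, 2, 3)
print(flat_tire_rpn)  # 60
```

Rejecting out-of-range ratings early is a design choice for this sketch; a spreadsheet would typically handle the same check with cell validation.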

Setting FMEA Priorities

Once all the failure modes have been assessed, the team should sort the FMEA so that failures are listed in descending RPN order. This highlights the areas where corrective actions should be focused. If resources are limited, practitioners should address the highest-ranked problems first.

There is no definitive RPN threshold for deciding which areas should receive the most attention; this depends on many factors, including industry standards, legal or safety requirements, and quality control. However, a starting point for prioritization is to apply the Pareto rule: typically, 80 percent of issues are caused by 20 percent of the potential problems. As a rule of thumb, teams can focus their attention initially on the failure modes whose RPN scores rank in the top 20 percent.
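The sorting step and the 20 percent rule of thumb might be sketched as follows. Only the flat tire ratings come from Table 1; the other four failure modes and their ratings are hypothetical values invented for illustration, and the names `FailureMode` and `prioritize` are likewise assumptions of this sketch.

```python
import math
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 = low impact, 10 = high impact
    occurrence: int  # 1 = not likely to occur, 10 = inevitable
    detection: int   # 1 = very likely to be detected, 10 = not likely

    @property
    def rpn(self) -> int:
        return self.severity * self.occurrence * self.detection


def prioritize(modes, top_percent=20):
    """Return (all modes in descending RPN order, the top slice to focus on)."""
    ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
    # Round up, and select at least one mode whenever the list is not empty.
    focus_count = max(1, math.ceil(len(ranked) * top_percent / 100))
    return ranked, ranked[:focus_count]


modes = [
    FailureMode("Flat tire", 10, 2, 3),      # ratings from Table 1
    FailureMode("Worn tread", 7, 4, 2),      # hypothetical
    FailureMode("Slow leak", 5, 3, 6),       # hypothetical
    FailureMode("Valve failure", 6, 2, 5),   # hypothetical
    FailureMode("Sidewall bulge", 8, 1, 4),  # hypothetical
]

ranked, focus = prioritize(modes)
for mode in ranked:
    print(f"{mode.name}: RPN {mode.rpn}")
print("Focus first on:", [mode.name for mode in focus])
```

With five failure modes, the top 20 percent is a single item, so the cutoff is best treated as a starting point for discussion rather than a fixed threshold.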

Making FMEA Corrective Actions

When the priorities have been agreed upon, one of the team’s last steps is to generate appropriate corrective actions for reducing the occurrence of failure modes, or at least for improving their detection. The FMEA leader should assign responsibility for these actions and set target completion dates.

Once corrective actions have been completed, the team should meet again to reassess and rescore the severity, probability of occurrence and likelihood of detection for the top failure modes. This will enable them to determine the effectiveness of the corrective actions taken. These assessments may be helpful in case the team decides that it needs to enact new corrective actions.
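As a rough illustration of rescoring, the snippet below compares the RPN before and after the corrective action recorded in Table 1, where carrying a spare tire lowered severity from 10 to 4 while occurrence and detection stayed the same. The helper name `rpn` and the percentage calculation are assumptions of this sketch, not part of the FMEA method itself.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk priority number: severity x occurrence x detection."""
    return severity * occurrence * detection


# Ratings from Table 1, before and after the spare tire is carried.
before = rpn(severity=10, occurrence=2, detection=3)
after = rpn(severity=4, occurrence=2, detection=3)

reduction = before - after
percent_reduction = 100 * reduction / before

print(f"RPN before: {before}, after: {after}")
print(f"Reduction: {reduction} points ({percent_reduction:.0f}%)")
```

A comparison like this shows whether an action moved the score, but the team still has to judge whether the remaining risk is acceptable.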

The FMEA is a valuable tool that can be used to realize a number of benefits, including improved reliability of products and services, prevention of costly late design changes, and increased customer satisfaction.