iSixSigma

Process Maps and FMEA Help Prepare Utility for Disaster

The Six Sigma methodology contains many tools that can be used successfully throughout an organization to improve processes and prevent failures, regardless of whether the full DMAIC (Define, Measure, Analyze, Improve, Control) roadmap is used. Demonstrating the efficacy of these tools for preventing or solving real business problems is a powerful way to market Six Sigma. Two of these tools, process maps and the failure mode and effects analysis (FMEA), were applied in the following case study. A team at a wastewater utility used them to prepare for problems that could arise as experienced staff retired, and in the process helped ready the facility to face a powerful natural disaster.

Case Study: Virginia Wastewater Utility

A northwestern Virginia wastewater utility with a relatively small workforce was concerned about the potential impact of pending retirements of experienced operators. Industry statistics demonstrate why utility management was concerned:

  • Average age of wastewater utility employees – 47 years
  • Average age of lead operators – 52 years
  • Average retirement age – 56 years
  • Average years of service – 24 years

Experienced operators who were approaching retirement age were very knowledgeable, but much of what they knew about operations came from years of experience and was not always well documented. This was particularly true of knowledge about infrequent but potentially high-risk events such as floods at the treatment plant.

To address this concern, the utility’s engineering consultant designed and facilitated a workshop to document critical knowledge about utility operations. A by-product of the workshop was an assessment of plant flooding risk, which helped the utility identify and implement preventive measures. Coincidentally, a major hurricane and subsequent flooding hit the area a few weeks after the workshop. The measures put in place through the workshop enabled the wastewater utility to operate continuously during the disaster.

Workshop Objectives

The primary objectives of the workshop were to map critical utility operating knowledge, identify the flow of work that directly addressed critical operating parameters, and prepare the groundwork for future knowledge capture and dissemination efforts.

The first task was to identify target processes and define basic process parameters, including the following:

  • Outputs
  • Customer(s) for outputs
  • Process owners
  • Inputs
  • Suppliers of inputs
  • Process boundaries
  • Quality characteristics of the process outputs

Workshop Process

Before the workshop, the consultants worked with utility management to identify teams and several processes. These included wastewater collection, water distribution, wastewater treatment and water treatment. In the workshop, the team mapped several critical subprocesses within each process. For example, for the wastewater treatment process, they mapped the subprocess for responding to a flood event. Figure 1 captures a SIPOC (suppliers, inputs, process, outputs, customers) summary of this process. Figure 2 is a more detailed process map.

Figure 1: SIPOC for Wastewater Treatment Plant Flood Response

Figure 2: Process Map for Flood Response

Identifying Risks with FMEA

After the team mapped the flood response process, they compiled an FMEA to identify process failure risks. The FMEA is used to evaluate the nature and impact of a failure event, including the severity of the failure effect, the expected frequency of occurrence, and the likelihood that the current process will prevent or detect the failure. The team rated each attribute on a 1-to-10 scale (Table 1).

Table 1: FMEA Rating Scale

| Rating | Severity | Frequency of Occurrence | Detectability (Likelihood of Prevention) |
|---|---|---|---|
| 1 | Be unnoticed and not affect the performance | Once every 7+ years | Certain that potential failure will be prevented before it impacts productivity or schedule |
| 2 | Be unnoticed; minor effect on performance | Once every 3-6 years | Almost certain potential failure will be prevented before it impacts productivity or schedule |
| 3 | Cause a minor nuisance; can be overcome with no loss | Once every 1-3 years | Low likelihood potential failure will be prevented before it impacts productivity or schedule |
| 4 | Cause minor performance loss | Once per year | Controls may prevent the potential failure from impacting productivity or schedule |
| 5 | Cause a loss of performance; likely to result in a complaint | Once every 6 months | Moderate likelihood potential failure will impact productivity or schedule if undetected |
| 6 | Result in partial malfunction | Once every 3 months | Controls are unlikely to prevent the potential failure from impacting productivity or schedule |
| 7 | Cause customer dissatisfaction | Once per month | Poor likelihood potential failure will be prevented before impacting productivity or schedule |
| 8 | Render the product or service unfit for use | Once per week | Very poor likelihood potential failure will be prevented before impacting productivity or schedule |
| 9 | Be illegal | Once every 3-4 days | Current controls will probably not even detect the failure |
| 10 | Injure a customer or employee | More than once per day | Absolute certainty that current controls will not detect the failure |

After applying the rating scale, the team assigned each potential failure a risk priority number (RPN), calculated as the product of the severity, frequency of occurrence and detectability scores. For example, a failure with a severity of 6, an occurrence of 4 and a detectability of 3 has an RPN of 6 x 4 x 3 = 72. The team set an RPN of 60 as an initial threshold value to determine whether drilling down on any step of the process was necessary to further define the response to a failure. In Table 2, for example, the RPN scores for the risks deemed worthy of further mitigation (either by the RPN scores or the process knowledge of the experienced operators) are highlighted in yellow or red. After the workshop, the plant manager and his experienced operators used the results of the FMEA to clarify the response procedure and guide communications to the plant staff.

Table 2: Excerpt of FMEA Worksheet on the Flood Response Process
| Process Step | Potential Failure Mode | Potential Failure Effects | Sev. | Potential Causes | Occ. | Current Controls | Det. | RPN |
|---|---|---|---|---|---|---|---|---|
| Monitor weather and flows | Flows exceed filter capacity | Backup clogs filters | 6 | Not paying attention to weather reports | 4 | Filter alarm; SOP checklist if high flows; operator knowledge | 3 | 72 |
| | | Backwash water will overflow clarifiers | 7 | Not paying attention to weather reports | 4 | Filter alarm; SOP checklist if high flows; operator knowledge | 3 | 84 |
| | | Permit violation | 9 | Not paying attention to weather reports | 4 | Filter alarm; SOP checklist if high flows; operator knowledge | 3 | 108 |
| Decision to prepare plant for flood | Flow surge basins not activated | Flood damage, backups | 8 | Poor judgment; lack of experience | 1 | Filter alarm; SOP checklist if high flows; tacit knowledge | 2 | 16 |
| | | Permit violation | 9 | Poor judgment; lack of experience | 1 | Operator knowledge | 2 | 18 |
| Secure plant for high flows | Flow surge basins not activated | Backups | 7 | Poor judgment; lack of experience | 4 | Filter alarm; SOP checklist if high flows; tacit knowledge | 3 | 84 |
| | | Permit violations | 9 | Poor judgment; lack of experience | 4 | Operator knowledge | 2 | 72 |
| Shut power to rotors | Power stays on, rotors keep turning, shut off wrong rotor | Ruin gear reducers; rotor bearings; solids blowout | 8 | Operator inattention, inexperience, lack of training | 3 | Training; operator tacit knowledge; SOP on rotor shut-off | 6 | 144 |
| Kill power to preliminary treatment building | Operator does not kill power | Trip breaker when water covers motor, other power failures, electrocution | 9 | Operator inattention, inexperience, lack of training | 2 | EOP on flood conditions | 6 | 108 |
| Decision – preliminary treatment building 1st floor flooded? | Building is flooded | Motor failure, loss of paddle, grit machine, manual bar screen, safety issues | 10 | Happens overnight; pump station failure; pressure transducer failure; redundant float system failure | 2 | Safety SOPs; training; tacit knowledge; high water wet well alarm | 6 | 120 |
| Monitor dry well water level | Dry well failure | Dry well floods, lose pump station | 8 | Operator inattention, inexperience, lack of training | 2 | Operator tacit knowledge | 6 | 96 |
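
For readers who keep an FMEA worksheet in a script rather than a spreadsheet, the RPN arithmetic behind Table 2 can be sketched in a few lines of Python. This is only an illustration, not the tool the workshop team used. The names FmeaRow, ROWS, RPN_THRESHOLD and rows_needing_review are hypothetical, the four sample rows are copied from Table 2, and treating a score equal to 60 as "at the threshold" is an assumption because the article does not say whether the cutoff is inclusive.

```python
from dataclasses import dataclass

# Initial threshold from the case study. Treating "equal to 60" as flagged
# is an assumption; the article does not state whether the cutoff is inclusive.
RPN_THRESHOLD = 60


@dataclass(frozen=True)
class FmeaRow:
    """One line of an FMEA worksheet (hypothetical structure for illustration)."""
    step: str
    effect: str
    severity: int    # 1-10, per Table 1
    occurrence: int  # 1-10, per Table 1
    detection: int   # 1-10, per Table 1

    def __post_init__(self):
        # Reject scores outside the 1-to-10 rating scale.
        for name in ("severity", "occurrence", "detection"):
            value = getattr(self, name)
            if not 1 <= value <= 10:
                raise ValueError(f"{name} must be between 1 and 10, got {value}")

    @property
    def rpn(self) -> int:
        # RPN is the product of the three scores.
        return self.severity * self.occurrence * self.detection


# Four sample rows copied from Table 2.
ROWS = [
    FmeaRow("Monitor weather and flows", "Backup clogs filters", 6, 4, 3),
    FmeaRow("Decision to prepare plant for flood", "Flood damage, backups", 8, 1, 2),
    FmeaRow("Shut power to rotors",
            "Ruin gear reducers; rotor bearings; solids blowout", 8, 3, 6),
    FmeaRow("Monitor dry well water level",
            "Dry well floods, lose pump station", 8, 2, 6),
]


def rows_needing_review(rows, threshold=RPN_THRESHOLD):
    """Return rows at or above the threshold, highest RPN first."""
    flagged = [row for row in rows if row.rpn >= threshold]
    return sorted(flagged, key=lambda row: row.rpn, reverse=True)


if __name__ == "__main__":
    for row in rows_needing_review(ROWS):
        print(f"{row.rpn:>4}  {row.step}: {row.effect}")
```

A score-based filter like this covers only part of what the team did. As noted above, the experienced operators could also mark a risk for mitigation based on their process knowledge, regardless of its RPN.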

Post-workshop Implementation

Process mapping and FMEA provided a structured approach to identifying and mitigating risk for the flood response process. Utility supervisors showed that they valued the exercise by requesting, without any prodding, the process documentation produced during the workshop. The utility staff then refined their emergency response plans to address the risks highlighted by the FMEA. This updated response plan proved invaluable when, three weeks after the workshop, Hurricane Isabel ravaged the eastern seaboard.

The hurricane was at the time the costliest natural disaster in the history of Virginia. Strong winds affected 99 counties and cities, downing thousands of trees and leaving about 1.8 million people without power. Interior Virginia bore the brunt of the heavy rains and flooding, with a maximum rainfall total of 20 inches in the Shenandoah Valley, not far from the site of the treatment plant. The director of the utility credited the preparation inspired by the workshop and the FMEA exercise with ensuring that the treatment plant experienced only minor flooding and no interruption of operations.
