Developing countermeasures proved to be a difficult step to complete. The team consisted of multiple SMEs from across the maintenance complex, and their individual traits, reasons for involvement and resistance to change made this step challenging. Interpersonal communication skills and change management techniques played a large role in assisting the facilitator. In the end, the team decided on five major changes to the existing way of conducting business:
Each of the countermeasures was based on root causes identified during root cause analysis and tested the team’s problem-solving abilities. Once again, the team was able to be creative and involved in providing a solution. The tools used to develop the countermeasures were brainstorming and affinity diagrams; once the countermeasures were developed, a PICK (possible, implement, challenge, kill) chart was employed to evaluate impact versus ease of implementation. Figure 7 depicts the results.
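The impact-versus-ease evaluation behind a PICK chart can be sketched in a few lines of code. This is an illustrative sketch only; the countermeasure names and ratings below are hypothetical, not the team's actual data.

```python
# Hypothetical PICK chart classification: each countermeasure is rated on
# impact (payoff) and difficulty (ease of implementation), then placed in
# one of the four PICK quadrants.

def pick_quadrant(impact, difficulty):
    """Return the PICK quadrant for a countermeasure.

    impact:     'high' or 'low' payoff
    difficulty: 'easy' or 'hard' to implement
    """
    if impact == "high":
        return "Implement" if difficulty == "easy" else "Challenge"
    return "Possible" if difficulty == "easy" else "Kill"

# Illustrative entries only (not from the actual event).
countermeasures = {
    "Standardize the induction trigger": ("high", "easy"),
    "Realign weekend shift manning":     ("high", "hard"),
    "Purchase new tracking software":    ("low",  "hard"),
}

for name, (impact, difficulty) in countermeasures.items():
    print(f"{name}: {pick_quadrant(impact, difficulty)}")
```

The quadrant logic is the standard PICK convention: high-impact/easy items are implemented first, while low-impact/hard items are dropped.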
Once the countermeasures were evaluated on impact and ease of implementation, the team began to develop an action plan. The action plan consisted of multiple courses of action (COAs), ranging from small internal efforts such as just-do-its (JDIs) and rapid improvement events (RIEs) to full projects. It was important to identify the level of involvement for each COA since the resulting improvement could vary from an internal LO flight policy change to inter-command resource allocation. The action plan assigned each volunteer a task with an estimated completion date. In the post-improvement phase, the document also provides a level of accountability, as the team knows who to contact when a problem arises that requires attention. Figure 8 depicts the results.
Confirming results and processes is the next step in the 8-step model. To do this, the team needed to compare current-state performance to performance after the LO production machine stand-up. Prior to standing up the machine, the 509th MXG wanted to reduce WIP variability and reduce average WIP from approximately five aircraft to three. It also wanted to reduce flowday variability and reduce the flowday average from a peak of 45 days to 21 days. Finally, the ultimate goal was to provide a fleet health posture able to meet wartime requirements and flying hour program goals. Figures 9 and 10 provide a graphical representation of the results over the first three months of implementation. It is important to note that the basis of this event was a deep dive into how the 509th MXS operates and how it can perform better. While the results confirm that the objectives of reduced WIP and WIP variability have been met, the team continues to find new areas in which it can improve the machine and the quality of life of the 509th MXS LO maintainers.
In essence, the LOPM was based on a standardized, proven process: the team adopted an aircraft depot methodology and applied it to a field-level maintenance activity. This application will not only benefit the B-2 program but can apply to other airframes with unique requirements and similarly large downtime commitments. With the B-21 Raider (long-range stealth strategic bomber) contract awarded, the 509th MXG has the opportunity to reach out to future enterprises and share lessons learned with programs that may run into the same problems. As a measure of success, the LO production machine scorecard (Figure 11) will be used to show other units the benefits of standing up a production machine with a gated concept.
Perhaps one of the most important aspects of the AFSC production machine concept is the use of visual displays. By implementing elements such as standard work and visual displays, the AFSC is able to accurately measure and monitor throughput. Paramount to a production machine is the element of standard work and designing repeatable methods of accomplishing tasks. In most cases, standard work is inherent to aircraft maintenance in the form of technical data publications, which specifically detail the steps of a repair process. However, the AFSC (and the LOPM) takes standard work a step further by creating a script that communicates when a particular task is performed. Paramount to this idea is information sharing. Two levels of visual displays are used, easily understood by everyone from floor workers to top managers and senior leaders. By creating visual displays that adhere to the basic elements of relevancy, simplicity and accuracy, the AFSC has created an information delivery system that is simple and straightforward.
Perhaps one of the most effective forms of visual display is the production status display, or “waterfall” chart. The waterfall chart (Figure 12) expands on the production machine approach by tracking the progress of individual pieces and reporting status according to their performance along the predetermined critical path. A basic waterfall chart assigns one of three statuses: green (on time), yellow (potentially late) and red (late). When a task is green, the machine is behaving normally and business is good. When yellow is encountered, the team begins to examine the causal factors behind the slip. Nothing is changed during a yellow status, as the piece still performs according to the critical path; however, preparations are made to understand why. During a red scenario, constraints to the critical path are examined and a CPI event may be generated to resolve the situation. It is important to note that a red status is not considered detrimental or negative; it is simply a trigger point at which it is understood that an opportunity exists to improve and become more effective.
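The green/yellow/red logic described above can be sketched as a simple status function. The one-day yellow margin below is an assumption chosen for illustration; an actual waterfall chart derives its thresholds from the predetermined critical path.

```python
def task_status(planned_day, actual_day, yellow_margin=1):
    """Classify a critical-path task as green, yellow or red.

    green  -- completed on or ahead of schedule; the machine is healthy
    yellow -- slipping within the margin; investigate causes, change nothing
    red    -- past the margin; examine constraints, possibly trigger a CPI event
    """
    slip = actual_day - planned_day
    if slip <= 0:
        return "green"
    if slip <= yellow_margin:
        return "yellow"
    return "red"

print(task_status(10, 9))    # -> green (ahead of schedule)
print(task_status(10, 11))   # -> yellow (one day behind)
print(task_status(10, 14))   # -> red (well past the margin)
```

Note that, as in the article, a red result is a trigger for analysis rather than a failure flag.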
Through the power of reduced variability and CPI methodology, it is possible to create a system that produces unprecedented results and an improved quality of life for those involved. While the production machine philosophy was first realized by the AFSC in an aircraft depot environment, its application to field-level repair and restoration provides an operational wing with the ability to better use resources and apply manpower. Furthermore, leadership sponsorship of CPI events builds a culture of commitment and continuous improvement. In summary, the efforts of the LOPM have provided an increased readiness posture for aircraft critical to national security and improved working conditions for more than 250 airmen.
The B-2 bomber is one of the most sophisticated, technologically advanced weapons systems in the world, and the technicians who maintain it require the best processes available to ensure the B-2 is ready to execute its mission – nuclear deterrence. A major point of emphasis with the B-2 is the coatings applied to the skin of the aircraft. Low observable (LO) maintenance – that is, keeping up the stealth signature of the aircraft – demands significant manpower and man-hours. To ensure efficiency and effectiveness within the LO maintenance process, the 509th Maintenance Group (MXG) held a continuous process improvement (CPI) event to stand up the LO production machine. The LO machine is the culmination of months of hard work and research by a team of subject matter experts and key decision makers within the maintenance complex. This article will discuss the problems with previous processes, the need for an LO production machine, the road to goal, the CPI event, and action plan implementation.
The B-2 LO flight is one of the largest and most complex entities within the United States Air Force aircraft maintenance community. More than 250 airmen, Department of Defense civilians and contractors perform scheduled and unscheduled LO maintenance on 20 assigned B-2 aircraft. Perhaps the largest issue in performing maintenance on these aircraft is unpredictability in aircraft scheduling and manpower distribution. There is a vast array of jobs to be performed on almost every square inch of the external skin of the aircraft, with the biggest concern being heavy restoration projects. These heavy maintenance actions require the aircraft to be down for extended periods while they are restored to pristine LO condition.
Maintaining all 20 B-2 aircraft in pristine condition would be easily attainable if the B-2 weren’t dual-hatted. When planning maintenance for the B-2, the 509th Bomb Wing (BW) must consider two major driving factors: the annual flying hour program (FHP) – which comprises the hours needed to attain and maintain combat readiness and capability for its aircrews, to test weapon systems and tactics, and to fulfill collateral requirements such as air shows – and wartime posture for projecting nuclear deterrence. The 509th BW’s mission is to have a fleet of aircraft and flight crews ready to meet strategic objectives, but it must also train those crews so they are ready when called upon. To do so, it must find the appropriate balance of war readiness posturing to meet global mission demands while simultaneously meeting annual flying hour objectives. It doesn’t have the luxury of tipping the scales in one area over the other.
To appropriately balance resources and meet both the FHP requirements and wartime posture, the 509th BW has to balance aircraft availability (AA) to the utmost of its abilities. To accomplish this seemingly unachievable task, a production machine was developed specifically targeting the heavy LO restoration process. The production machine provides an element of predictability within the LO restoration and scheduling process while also meeting both the FHP and wartime posture. The production machine concept was borrowed from the Air Force Sustainment Center (AFSC), specifically its 2015 foundational text The Art of the Possible. Paramount to the development of a production machine is a theory known as Little’s Law.
The B-2 bomber has been stationed at Whiteman AFB for more than 20 years, and the heavy LO restoration process has remained the same since its inception. For decades, maintainers used the 200-flying-hour hourly post-flight (HPO) inspection as the trigger point for induction into heavy LO restoration. This was viewed as an opportunity, as the inspection requires an aircraft to remain down for several days and provided a simple mechanism for scheduling restoration. Depending on operational tempo, however, the HPO scheduling method produced situations in which some aircraft flew 200 hours in a few weeks while others took six months or longer. This unpredictability caused forecasting and planning nightmares, which ultimately led to an average of 4.75 B-2 aircraft in heavy LO maintenance at any given time during fiscal year 2015, as indicated in Figure 1.
Additionally, the number of aircraft in LO restoration varied from one to nine during the same period. This variability made it difficult to project manpower and resource requirements on a daily basis. Given the small fleet, this scheduling method reduced the AA rate from 100 percent to 75 percent for scheduled heavy LO restoration alone. To fulfill programmed depot maintenance (PDM) and contracted technology modification requirements, the AA rate dwindled to 60 percent as three more aircraft were unavailable. This situation left only 12 aircraft to complete a 6,000+ hour per year FHP contract, contend with unscheduled maintenance, and maintain a 24/7/365 strategic deterrence posture.
To combat the unpredictability of heavy LO restoration, Little’s Law was applied to the scheduling process. Little’s Law (as adapted by the Air Force Sustainment Center) states that work in progress (WIP) = throughput x flowtime. In the case of heavy LO restoration, flying schedule analysis determined that an appropriate WIP for LO restoration would be three aircraft, increasing AA by 10 percent, or two additional aircraft available for the flying schedule. Using a WIP of three as the customer requirement and a throughput (takt time) of one aircraft input every seven work days, the flowtime requirement equates to 21 days. Figure 2 depicts Little’s Law applied to the LO production machine (LOPM). Once the framework of the production machine was developed, the focus became reducing the variable current-state flowtime of 18 to 45 days to a consistent 21 days.
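The arithmetic above follows directly from Little's Law, with throughput expressed as one aircraft per takt interval. A minimal sketch using the figures in the text (seven-day takt, 21-day flowtime):

```python
def little_wip(flowtime_days, takt_days):
    """Little's Law with throughput written as 1/takt: WIP = flowtime / takt."""
    return flowtime_days / takt_days

# One aircraft input every 7 work days; each aircraft spends 21 days in
# heavy LO restoration, so three aircraft are in work at any given time.
print(little_wip(21, 7))   # -> 3.0
```

The same relation explains the old process's WIP swings: with flowtimes varying from 18 to 45 days at the same input rate, steady-state WIP varies with flowtime rather than holding at three.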
The application of Little’s Law spawned several issues that needed to be addressed before the LOPM could be launched. Most importantly, the 21-day flowtime requirement suggested that the entire LO restoration process required attention to reduce waste and build a critical path. In order to accomplish this feat, a CPI event was called. Team members for the event included subject matter experts (SMEs), senior leaders and customers from throughout the 509th MXG. The event was led by an Air Force CPI Black Belt and facilitated by the authors of this text. The event followed the Air Force CPI 8-step problem-solving model:
To clarify and validate the problem, the team brainstormed what needed to be accomplished during the CPI event. The first task was to fully understand what was expected of Whiteman AFB in the context of carrying out mission requirements. This was stated in the form of a question: “How do we balance combat commitments with the flying hour program?” Once they had a basic understanding of what the requirements were, the team could begin to formulate a quantifiable problem statement and goal(s) that would guide them to ultimate realization of their process. Tools used during formulation of the problem statement were brainstorming, value stream mapping and a SIPOC (supplier, inputs, process, outputs, customer) analysis.
Problem statement: The 509th MXG averaged 4.6 aircraft WIP, ranging from one to nine aircraft WIP for heavy LO restoration. On average, 25 days were required to correct 140 defects. There is no standardized trigger point for heavy LO restoration induction. Goal: Maintain three scheduled aircraft WIP in heavy LO restoration while accomplishing 140 jobs in 21 days and maintaining the appropriate strategic deterrence posture.
During the CPI event, the team spent the majority of its time collaborating in this step to understand where potential waste elimination opportunities existed in order to reduce flowtime to 21 days. The entire heavy LO restoration process was mapped using current state mapping to see and understand the process as it existed. As the team progressed through current state mapping, it was also able to identify improvement areas as each process step was labeled: 1) value added, 2) non-value added or 3) non-value added but required. Aside from creating a visual model of the current process, the value stream mapping exercise offered an opportunity to understand motivations and requirements of the many outside agencies that are involved in the LO restoration process. Tools used in this step were current state mapping and a gap analysis. It was determined that the process was hindered by overproduction, waste in motion, and waste in waiting. Figure 4 depicts the results of Step 2.
Once the problem was dissected into value-added and non-value-added activities, it was time to focus the creative minds of the team on what the LO production machine should resemble. Step 3 was easy to execute because of the vision, goals and problems presented to the team in Step 1 – three aircraft WIP with a 21-day flowtime and seven-day takt time. The team’s mission at this point was to execute the vision and design a functioning, maintainable machine focused on eliminating waste in the process. To do so, ideal state mapping and future state value stream mapping techniques were employed to direct the team toward a creative goal and provide a measurable indication of waste elimination. Once the team mapped the desired future state, it was clear that a closer look at manpower distribution was required, since a consistent workload would be presented every seven days.
Once a detailed manpower study was completed, the team realized the existing schedule provided a source of waste: it mandated 24/7 coverage, with a large portion of manpower reserved for smaller weekend shifts. By adjusting manning requirements to a Monday-through-Friday shift schedule, it became evident that a specific (consistent) number of personnel must be available per shift and per aircraft to make the LO production machine a reality. As a result, the weekend duty team was eliminated and absorbed into the new process. The team was then able to increase the workload on each of the three aircraft to push an aircraft out of the machine at the 21-day mark. Tools used in this step were ideal state mapping, future state mapping, SMART (specific, measurable, attainable, results focused, time bound) objectives and course of action (COA) analysis. Figure 5 depicts the results of Step 3.
Step 4 was when the challenge began – determining the root cause. To understand whether it had addressed the source(s) of the problem, the team needed quantifiable measures to verify it was on the right track. This is an important step because the team needed something on which to base its corrective measures. (A person must understand the root cause to be able to measure success after implementation.) Once again, brainstorming and Pareto analysis led the team to a common understanding of the problem, while the 5 Whys led it to dig deeper into the issues. As a result, the team determined the root causes to be lack of communication/planning, poor scheduling, lack of support and poor personnel management. The tools used to determine the root cause(s) were current state mapping, brainstorming, affinity diagrams, the 5 Whys and a Pareto chart. Figure 6 depicts the results.
Part 2 of this case study looks at steps 5 through 8 of the continuous process improvement approach.
This article provides a brief overview of the intricacies involved in a survey design – without getting into complex statistical theories.
Designing a survey is an iterative process as shown in Figure 1.
The critical aspects of any survey design are the underlying construct, framing of questions, validity and reliability, and the sampling methodology. A survey is done to measure a construct – an abstract concept. Before designing a survey, the construct must be clearly defined. Once the construct is clear, it can be broken down into different dimensions and each dimension can then be measured by a set of questions.
Consider an example of a human resources (HR) department that is trying to study the attitude of employees toward a newly launched appraisal process. Assume that with some research, the HR department finds that the major dimensions of the attitudes toward the appraisal policy are “transparency,” “evaluation criteria,” “workflow” and “growth potential.”
After determining the dimensions, a set of questions needs to be written to measure said dimensions. Questions can be categorized into two groups: classification and target. Classification questions cover the demographic details of the respondent, which can be used for grouping and identification of patterns during analysis. Target questions refer to the construct of the survey. Table 1 includes tips for avoiding common mistakes while wording the questions and selecting their order.
Table 1: Tips for Question Development and Ordering

Content
1. The question must be linked to a dimension of the construct.
2. The question should be necessary, in that it helps the decision maker make a decision.
3. The question should be precise and should not seek multiple responses.
4. The question should contain all the information needed to elicit an unbiased response.
5. The question should not force the participant to respond regardless of knowledge and experience.
6. The question should not lead the respondent to generalize or summarize something inappropriately.
7. The question should not ask for something sensitive or personal that the respondent may not wish to reveal.

Wording
1. The question should not include jargon, technical words, abbreviations, symbols, etc. Use simple language with a shared vocabulary.
2. The question should be worded from the respondent’s perspective, not the researcher’s.
3. The question should not assume prior knowledge or experience inappropriate for the given situation.
4. The question should not lead the respondent to provide a biased response.
5. The question should not provoke the respondent by using loaded or critical words.

Sequencing
1. Target questions should be asked at the beginning, followed by classification questions at the end.
2. Questions should be grouped logically; within each group, the wording and scale should be similar.
3. Complex, sensitive and personal questions should not be asked at the beginning.
Another important aspect of a survey questionnaire is the response format. Questions can be classified at two levels with regard to response format; at the first level, they are either structured or unstructured (Figure 2).
Structured questions provide closed-ended options for the respondent to choose from, while unstructured questions give the respondent a free choice of words. While structured questions are easy to analyze, the provided choices must be mutually exclusive and collectively exhaustive. Unstructured questions, by contrast, are difficult to analyze, so limit their use in the questionnaire. At the second level, structured questions can be further classified by measurement scale. The choice of measurement scale depends on the objective of the question and, in turn, influences the analysis and interpretation of the response. Table 2 describes the different types of measurement scales, the characteristics of the data they generate, their purpose and their impact on analysis.
Table 2: Questions Defined by Measurement Scale

Nominal
- Characteristics of data: Discrete data with no sense of magnitude (can be binary or multinomial).
- Purpose (when to use): Classification of a characteristic, event or object; can be dichotomous (only two choices) or multiple choice.
- Implications for analysis: Only the mode can be calculated as a measure of central tendency; cross-tabulation and chi-square tests can be used; arithmetic operations are not possible on a nominal scale.

Ordinal
- Characteristics of data: Discrete data with a sense of order/rank.
- Purpose (when to use): Rating or ranking a particular factor on a numeric or verbal scale where the distance between alternatives is immaterial.
- Implications for analysis: Median, quartiles, percentiles, etc. can be used for central tendency; various nonparametric tests can be used; arithmetic operations are not possible.

Interval
- Characteristics of data: Continuous data with a sense of order and distance.
- Purpose (when to use): Rating a particular factor on a numeric or verbal scale where the distance between alternatives is important. A Likert scale is a type of interval scale with a neutral value in the middle and extreme values at both ends.
- Implications for analysis: Mean or median can be used as the measure of central tendency depending on the skewness of the data; parametric tests such as the t-test, F-test and ANOVA can be used; all arithmetic operations are possible except multiplication and division.

Ratio
- Characteristics of data: Continuous data with a sense of order, distance and origin.
- Purpose (when to use): Questions pertaining to specific measurements such as “number of incidents per month.”
- Implications for analysis: All statistical techniques and all arithmetic operations are possible on data generated by a ratio scale.
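The analysis constraints in Table 2 can be encoded as a small lookup, which is handy as a guard in survey-analysis scripts. The code below merely restates the table's central-tendency column; the function name and structure are illustrative, not a standard API.

```python
# Which measures of central tendency each measurement scale supports
# (a restatement of Table 2, not an exhaustive statistical rulebook).
CENTRAL_TENDENCY = {
    "nominal":  {"mode"},
    "ordinal":  {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio":    {"mode", "median", "mean"},
}

def check_statistic(scale, statistic):
    """Raise ValueError if a statistic is not meaningful for the given scale."""
    allowed = CENTRAL_TENDENCY[scale]
    if statistic not in allowed:
        raise ValueError(
            f"{statistic!r} is not meaningful on a {scale} scale; "
            f"use one of {sorted(allowed)}"
        )
    return True

check_statistic("ordinal", "median")   # fine
# check_statistic("nominal", "mean")   # would raise ValueError
```

A guard like this catches a common survey-analysis mistake early: averaging codes from a nominal question as if they were numbers.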
After designing the questionnaire, the next step is to establish its validity and reliability. Validity is the degree to which a survey measures the chosen construct whereas reliability refers to the precision of the results obtained. Table 3 provides a brief description of the considerations of validity and reliability, and how they can be evaluated. The table also states when a particular evaluation technique can be applied.
Table 3: Validity and Reliability

Validity

Representational validity – the degree to which a construct is adequately and clearly defined in terms of its underlying dimensions and their corresponding operational definitions. In the appraisal-process example, representational validity can be established by having HR experts objectively evaluate the questionnaire: face validity by assessing the suitability of the questions, their wording, order and measurement scale; content validity by assessing the adequacy of the dimensions and corresponding questions in measuring the attitude. (When: before administering the survey)

Criterion validity – the correlation between the result of a survey and a standard outcome of the same construct. In the example, criterion validity can be established by comparing an employee’s attitude score with his or her performance rating. If the scores are compared to a current performance rating, concurrent validity has been established. The other part of criterion validity, predictive validity, can be established only in the long run by comparing survey results with the long-term performance of employees. (When: after the results are obtained)

Construct validity – the extent to which the questionnaire is consistent with existing ideas and hypotheses about the construct being studied. In the HR example, construct validity can be established by two means: 1) correlating the scores obtained with those of another survey on attitudes toward appraisal (convergent validity) and 2) correlating the scores obtained with those of another survey on attitudes toward the earlier appraisal policy (discriminant validity). Factor analysis can also verify whether the results support the theory-based selection of dimensions and corresponding questions. (When: after the results are obtained)

Reliability

Stability – the consistency of results obtained by administering the survey to the same respondents repeatedly. In practice, stability is difficult to establish; the main problem is choosing the interval for re-administering the survey, which may result in respondents answering from memory or in results confounded with actual change in the construct. Choose the interval with all the factors that influence the construct in mind. (When: after the results are obtained)

Equivalence – the degree of consistency among observers or interviewers. This form of reliability is most appropriate for interviews and telephone surveys. (When: after the results are obtained)

Internal consistency – the consistency among the questions used to measure the construct. It can be measured by dividing the questionnaire into two equal halves and measuring the correlation between their scores. (When: after the results are obtained)
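The split-half approach to internal consistency can be sketched as follows. This is an illustrative implementation assuming the halves are formed from odd- and even-numbered items and that the Spearman-Brown correction is applied to step the half-test correlation up to full-test length; the sample responses are invented.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def split_half_reliability(responses):
    """Split-half internal consistency with the Spearman-Brown correction.

    responses: one list of item scores per respondent. Items are split into
    odd- and even-numbered halves, the half-scores are correlated, and the
    correlation is stepped up to full-test length.
    """
    odd = [sum(r[0::2]) for r in responses]
    even = [sum(r[1::2]) for r in responses]
    r = pearson(odd, even)
    return 2 * r / (1 + r)   # Spearman-Brown prophecy formula

# Hypothetical 6-item attitude scores for four respondents.
scores = [[4, 5, 4, 5, 3, 4], [2, 2, 3, 1, 2, 2],
          [5, 4, 5, 5, 4, 5], [3, 3, 2, 3, 3, 2]]
print(round(split_half_reliability(scores), 3))
```

In practice coefficient alpha is often preferred, since it is equivalent to the average of all possible split-half estimates rather than depending on one particular split.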
After the questionnaire is designed, the next step is to determine the appropriate sampling methodology. Sampling saves time and cost in situations where it is not feasible to reach the entire population, and it is part of both enumerative and analytical studies. An enumerative study measures the characteristics of the population under study, while an analytical study aims to reveal the root cause of an observed pattern. In other words, an enumerative study asks, “How many?” while an analytical study asks, “Why?” This distinction influences the sample design as well as the interpretation of the results. For an enumerative study, a random sample is taken repeatedly from a population; for an analytical study, a random sampling frame is selected repeatedly from a population and a sample is then selected randomly from the sampling frame. During analysis and interpretation of results, scores of analytical studies include standard errors while those of enumerative studies do not.
Sampling can be broadly classified into two categories: probability and non-probability. Probability sampling means that each element of the population or sampling frame has a known, non-zero probability of being selected in the sample (equal for all elements in the case of simple random sampling). Only within probability sampling can the standard error – the measure of variation in a sample statistic due to sampling error – be estimated, and only then can a confidence statement be made for a sample statistic. It is also noteworthy that as sample size increases, the standard error decreases, since the calculation of standard error uses sample size in the denominator. Non-probability sampling means the sample is selected based on judgment or convenience. It is important to keep this difference in mind while selecting and implementing a survey methodology: a probability sampling design can inadvertently lead to selection on a non-probability basis. Suppose a marketing team conducts a survey in which team members select people at various locations and ask them to complete the questionnaire. Bias can result both from the team members’ selection of respondents and from respondents choosing not to respond. These situations must be thought through while planning data collection, before it comes time to interpret the results.
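The relationship between sample size and standard error noted above is easy to verify numerically. A minimal sketch of the standard error of the mean, SE = s/√n:

```python
import math

def standard_error(sample_sd, n):
    """Standard error of the mean: SE = s / sqrt(n)."""
    return sample_sd / math.sqrt(n)

# Quadrupling the sample size halves the standard error.
print(standard_error(10, 25))    # -> 2.0
print(standard_error(10, 100))   # -> 1.0
```

Because n sits under a square root, precision gains come slowly: cutting the standard error in half always requires four times as many respondents.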
After deciding on the sampling methodology, the survey is administered. The survey is also reassessed in terms of validity and reliability and is further refined. When time and budget allow, it is recommended to pre-test a survey on a small group before rolling it out on a large scale; this helps refine the survey before launch. Once the results are received, the data is analyzed and interpreted using various descriptive and inferential statistical techniques.
It is highly recommended that practitioners understand the nuances of surveys before embarking on such an assignment. Use this article as a starting point for refining survey designs in ever-shorter spans of time.
Contact:
Barbara A. Cleary, PhD
barb@pqsystems.com
(800) 777-3020
September 5, 2017 (Dayton, Ohio) – With improved alerting possibilities, new chart types, and a host of other new features to enhance ease of use and immediate access to information, the latest version of SQCpack® will save time and improve data analysis for process and discrete manufacturing facilities.
SQCpack is a real-time statistical process control solution that enables informed decision making and improves quality results. Its ease of use allows operators and quality team members to quickly leverage results to improve their quality performance.
Alerting capabilities have been enhanced by an expanded notification approach that renders immediate access to critical information. Unlike other solutions, alerts can be generated by data entered in SQCpack or from virtually any data source. In addition, new messaging capabilities mean that communications with SQCpack users make information immediately available for effective decision making.
Cusum and Exponentially Weighted Moving Average (EWMA) charts in SQCpack detect process changes and even small or gradual shifts in data, supporting accuracy in managing production data. Matt Savage, vice president of product development, says the newest version of the SQCpack solution “revolutionizes SPC capabilities for all organizations.”
SQCpack is the easiest SPC solution for helping organizations utilize the power of data analysis to reduce variability, enhance productivity, and improve profitability. For more than 30 years, it has provided support for process data analysis and quality improvement for both process and discrete manufacturing facilities throughout the world.
About PQ Systems: PQ Systems (www.pqsystems.com) is a privately held company headquartered in Dayton, OH, with representation in Europe, Australia, Central and South America, Asia, and Africa and customers in more than 60 countries. For more than 30 years, the company has been helping businesses drive strategic quality outcomes by providing intuitive solutions to help manufacturers optimize process performance, improve product quality, and mitigate supply chain risk. The company’s scalable solutions include SQCpack® for data analysis and statistical process control and GAGEpack for measurement system management. PQ Systems’ world-class consulting, training, and support services ensure that clients receive the maximum return on their software implementations.
###
To be fair, much of this dialogue is healthy. Six Sigma will never solve all problems and a smart leader knows he or she needs a complete set of quality approaches – not just one. Any continuous improvement process must also be customized to the environment in which it is employed and must evolve as that environment changes. By far, however, the main reason for the clamor calling for new problem-solving methods is that Six Sigma has become more complex than necessary.
What follows in this article is a description of Six Sigma reduced to its fundamental assumptions, or theorems. If these simple concepts are understood, all the tools, all the tollgate deliverables, and all the statistics and jargon are put in their proper supporting roles. Rather than mastering all the tools and attempting to build the program from the bottom up, this approach depends on a top-down, theoretical foundation. Rather than a cookbook approach, Six Sigma should be seen as a mathematical proof.
Customers only pay for value.
It seems so simple. Of course the customer only pays for value, but most businesses define value incorrectly. The products and services being sold are not what confer value to the customer. Products and services are vehicles to deliver value. Value is only created when a specific need the customer has is fulfilled. If a customer need is not met, even if the product or service is perfect, no value is created. Quality is not a measure of perfection, but of effect.
Failing to understand customers and their needs is the biggest driver of cost in most businesses. Quality for many products is defined by whether engineering specifications are met rather than whether the product delivers to actual customer needs. Similarly, in many service environments, service quality is defined by what the customer wants or complains about rather than how the customer uses the service. This distinction is important because it is possible to deliver everything a customer asks for and do so perfectly while not satisfying his or her needs. When this happens, the costs to deliver go up but the revenue from delivering does not.
First Theorem of Six Sigma: Changes in critical to quality (CTQ) parameters – and only changes in CTQ parameters – alter the fiscal relationship an organization has with its customers.
Those product or service qualities that alter the way a customer behaves with regard to purchase decisions are CTQs. Changes in CTQs, be they good or bad, drive customer loyalty. When CTQs are altered, a company’s ability to create customer value is altered. Improving the customer’s perception of value ultimately affects the fiscal relationship with that customer, either in terms of price or cost to deliver.
The problem with managing to customer CTQs is not a question of intent. No business intentionally fails to deliver to CTQs. Companies fail to deliver to CTQs because of gaps in process knowledge. These gaps manifest themselves in process variation and poor process capabilities.
If a business has a profound and complete process knowledge, the products and services being delivered can be controlled so as to always create customer value. Gaps in process knowledge are the primary causes of failure and defect.
Process control focuses on ensuring that the process is managed and executed in a consistent manner. If there are gaps in the understanding of how processes work or gaps in the understanding of how the customer ascribes value to the products and services being generated, process control is simply not possible. Six Sigma (and all continuous improvement processes) is fundamentally focused on closing these knowledge gaps.
Second Theorem of Six Sigma: Process outputs are caused by process, system and environmental inputs.
Y = ƒ(X_{1}, …, X_{n})
The “holy grail” of Six Sigma is the process transfer function. Once this is properly defined, managers have all the tools they need to make the process perform in any manner desired. The transfer function is never really perfect, but were a company to ever have complete transfer functions for all their processes, optimizing costs, production and customer value would be a simple matter of arithmetic.
The big myth is that Six Sigma is about quality control and statistics. It is all that – but a helluva lot more! Six Sigma ultimately drives leadership to be better by providing the tools to think through tough issues. —Jack Welch
Good leaders strive to make good decisions all of the time. If, however, there are gaps in understanding, incorrect assumptions or false assumptions, what looks like a good decision will result in bad outcomes. Six Sigma is about reducing the probability of these bad outcomes.
Most processes have subject matter experts and institutional knowledge. In most cases, these subject matter experts and institutional knowledge allow businesses to create transfer functions that are 90 percent to 95 percent correct, which gives the illusion of expert knowledge. It is this illusion of expert knowledge that makes improvement difficult. Since most of the rules for how to run processes are known and since the knowledge gaps are subtle, it is believed that the issues are execution-related, not understanding-related. This is a dangerous situation as these small process knowledge gaps, compounded over a multistep process, can result in significant losses even when people attempt to execute the process to the best of their abilities.
All variation is caused.
Often the only clue to gaps in process knowledge is the degree of variability in the process. Processes always have variation (remember – entropy increases!) but it does not spontaneously occur. It must be caused. Variation is just another output of systems. Transfer functions can be written to describe it; the more complete those functions are the better variation can be controlled. There is no myth or magic. When a system does not perform in exactly the same manner for a static set of process inputs, that simply means there are additional factors that are not yet understood that influence those processes.
Third Theorem of Six Sigma: Variation in process outputs is caused by process, system and environmental inputs.
∂Y/∂X = ƒ′(X_{1}, …, X_{n})
If Taguchi’s loss hypothesis (the cost of a system increases as it diverges from the performance expectations of its customer) is accepted, then process variation is the leading cause of customer dissatisfaction and operating costs. The factors that drive the process output and the process variation can be defined and controlled. Processes can then perform at the optimum balance between delivery and stability, thus driving the lowest possible cost and the highest possible customer satisfaction. In other words, if enough “profound knowledge” is added to the system, maximum value can be produced. The goal of Six Sigma, therefore, is always to increase process understanding. The goal is to populate transfer functions to the degree needed, and warranted, in order to best serve customers.
Given the right process knowledge and the ability to deliver products and services that satisfy the customers’ CTQs, management will always make the decision that most benefits the customer and achieves the highest possible return on investment.
The number one assumption when doing any type of continuous improvement is that if leadership is provided with the know-how, if knowledge gaps are filled in, leadership will make (or allow others to make) decisions that are best for the long-term well-being of the business. Some people are afraid to test this assumption; it is a trust issue. If power is held because of the ability to solve a recurring problem, if teams are rewarded for firefighting rather than preventing problems, then this process collapses. While it is tempting to let experience and tradition supersede structured problem solving, in the long term when overall systems understanding is improved, and leadership employs that learning for the advantage of all parties in the supply chain, the business prospers.
Given a choice between long-term sustainable growth and short-term profit, long-term growth will always outperform the short-term gain.
This is the final and most critical assumption. It is cornerstone to total quality management, the Toyota Production System (Lean) and Six Sigma. If people are helped with controlling their own destinies, and if they are provided with the wherewithal to achieve self-determination, they will naturally do what profits them the most. When educated about the long-term benefits of creating sustainable customer relationships, most people will choose to create value and maximize their payback on the relationship. This is a question of ethics.
There are three foundational theorems and a simple set of postulates or assumptions that these theorems are based upon. All the tools and processes of Six Sigma – both DMAIC (Define, Measure, Analyze, Improve, Control) and DFSS (Design for Six Sigma) – are grounded in these simple foundations. Get the assumptions correct and all else is commentary.
The question was posed to me: “I have five samples to test from my population. From that data, how can I estimate capability against our specifications?” Of course, the brutally honest answer is, “Poorly.”
But Black Belts do not survive by being sarcastic, so a better answer might be: “Any estimate of sigma from a small sample will have very large confidence intervals, giving you little knowledge of the actual population.” The question, however, pointed me toward an interesting avenue of exploration into the various ways that we may estimate the sigma in small sample sizes.
First, let’s review some of the more common methods of estimating sigma (or standard deviation, SD):
A. Using the average difference between an observation and the mean adjusted by sample size (the classic formula for sigma).
B. Using the range of the data divided by a factor, C, where C varies with sample size. Common values of C include 4 and 6, depending on sample size.
SD = Range/C
C. Using the moving range (MR) of time ordered data (where we subgroup on successive data points), divided by a factor. This is the method used in individual moving range control charts.
SD = MR/1.128
D. Using the interquartile range (IQR) of the data divided by a factor (D) where D = 1.35 is the most commonly proposed value.
SD = IQR/D
E. Using the mean of successive differences (MSSD) to estimate variance.^{1}
F. Using the mean absolute deviation (MAD) as an estimate for SD.^{2}
G. Also, you can use the data minimum, maximum, and median to calculate an estimate of variance for small sample sizes.
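Methods A through D above are simple enough to sketch directly; a minimal implementation (the constants are the ones quoted in the text, and the sample data is illustrative):

```python
import numpy as np

def sigma_classic(x):
    # Method A: classic sample standard deviation (n - 1 denominator)
    return np.std(x, ddof=1)

def sigma_range(x, c=4.0):
    # Method B: SD = Range/C, where C is commonly 4 or 6
    return (np.max(x) - np.min(x)) / c

def sigma_moving_range(x):
    # Method C: average moving range of time-ordered data, divided by 1.128
    return np.mean(np.abs(np.diff(x))) / 1.128

def sigma_iqr(x, d=1.35):
    # Method D: SD = IQR/D, with D = 1.35 the commonly proposed value
    q1, q3 = np.percentile(x, [25, 75])
    return (q3 - q1) / d

x = np.array([0.23, 0.37, 0.40, 0.59, 0.78, 0.85, 0.97])  # illustrative sample
print(sigma_classic(x), sigma_range(x), sigma_moving_range(x), sigma_iqr(x))
```

Note that Method C assumes the values are in time order; the others are order-free.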
For the purposes of my study I chose to evaluate all of the methods except for F. I chose methods to study that were easy to calculate (B, C, D), were available in Minitab (A, E) and sparked my interest (G).
A successful search for a better estimate of sigma, when the sample size is N ≤ 10, would meet the following two criteria:
1. The estimates of sigma center about the true population sigma.
2. The spread of the estimates is tighter than that of the classic formula for standard deviation.
In other words, if I repeatedly take random samples from a normally distributed population and calculate sigma, then all my samplings of sigma should begin to form a distribution with the average estimate of sigma equaling the true sigma of the data. Improved estimates of sigma will have a tighter distribution of estimates from this repeated sampling than other methods.
(Note: I will be primarily using a visual analysis of dot plots of the distribution of estimates of sigma to evaluate each method of estimating sigma.)
I also calculated the absolute deviation from true sigma (ADTS) using a formula similar to the mean absolute deviation (Method F) in order to numerically gauge each method’s performance against the criteria:

ADTS = ( Σ |Xi − σ| ) / N

where Xi is each individual estimate of sigma generated in the simulation and sigma (σ) is the true population standard deviation from which the random normal data was calculated. In this case, N is the number of total estimates in the simulation. This formula sums the absolute value of the difference between each estimate of sigma and the true sigma of the population and divides by the number of estimates. The higher the ADTS factor, the worse the estimation of sigma. (Note: The distribution of standard deviation is a chi-square distribution and is, therefore, not evenly balanced about the mean. This unbalance can be seen in some of the simulations, but was not always clearly observed.)
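The ADTS metric described above is a one-line computation; a minimal sketch:

```python
import numpy as np

def adts(estimates, true_sigma):
    # Mean absolute deviation of the sigma estimates from the true sigma;
    # higher ADTS means worse estimation.
    estimates = np.asarray(estimates, dtype=float)
    return np.mean(np.abs(estimates - true_sigma))

print(adts([0.8, 1.1, 1.3], 1.0))  # ≈ (0.2 + 0.1 + 0.3) / 3 = 0.2
```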
Minitab’s Random Data calculator was used to create random normal data for testing of the various methods. I decided to study only normally distributed data sets for calculating sigma for the purposes of this article.
I used Minitab to create thousands of random normal data points. These thousands of pieces of data were then sampled using varying sample sizes, from 5 to 50 samples. In most of my explorations that follow, I used 5,000 pieces of random normal data.
Out of the 5,000 random normal data points, I used the following:
From all these samplings of the random data, I could then calculate the sigma using each of the methods. This gave me a set of data for each method of calculating sigma. A dot plot clearly shows the spread and frequency of each of the estimates of sigma. I could have used other displays of this data such as the classic histogram, but I felt that the dot plot improved the visual clarity of the data.
For example, the dot plot shown below in Figure 1 was created from selecting 1,000 five-piece samples out of 5,000 pieces of random normal data with mean = 0 and sigma = 1. The sigma was calculated for each of these 1,000 five-piece samples using the classic formula (Method A).
We observe from the dot plot the range of all 1,000 estimates of sigma. As the population was created using a sigma of 1, we can add that line to the dot plot and observe how all of the 1,000 estimates of sigma gathered about the “true” value. In real life we would never know that true value unless we had some previous knowledge from a larger sample. What this shows us is that if we take five random samples from a population where the actual sigma is 1.0, we might get a sigma anywhere from as low as 0.2 to as high as 2.3.
Calculating the ADTS factor for this set of sigma estimations gives us a value of 0.28. Values of ADTS can be compared to each other as long as the estimates of sigma come from the same population (same population mean and sigma).
A randomly generated set of 5,000 normally distributed data points with mean = 0 and SD = 1 was used to evaluate the various candidates for estimating sigma.
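The repeated-sampling experiment can be reproduced outside Minitab; a sketch using NumPy (the seed and sample counts are illustrative, so exact numbers will differ from the article’s figures):

```python
import numpy as np

rng = np.random.default_rng(1)  # seed chosen arbitrarily
population = rng.normal(loc=0.0, scale=1.0, size=5000)

# Draw 1,000 five-piece samples and estimate sigma with the classic formula
estimates = np.array([
    np.std(rng.choice(population, size=5, replace=False), ddof=1)
    for _ in range(1000)
])

print(estimates.min(), estimates.max())  # spread of the 1,000 estimates
print(np.mean(np.abs(estimates - 1.0)))  # ADTS against the true sigma of 1.0
```

A dot plot or histogram of `estimates` reproduces the kind of picture shown in Figure 1.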
Moving Range Evaluation (Method C)
Using the moving range (average MR/1.128) for an estimate of sigma is problematic as it requires that the data is in time order. If we have a small sample size of data and if the order of this data is either unknown or not relevant, then using MR to estimate sigma is not valid.
It is sometimes claimed that, when N < 10, the MR is a better estimate of sigma than the classic formula. I performed a test of this using a smaller subset of my normal population dataset and calculated sigma using both the classic formula and the MR. Figure 2 below shows that sigma calculated from MR gives no obvious advantage over the classic formula. The calculated ADTS values support this conclusion.
MSSD Evaluation (Method E)
From my 5,000-piece random normal data set, I calculated sigma using the classic formula (Method A) and compared that estimate to the SQRT (MSSD) method. Using Minitab’s Store Descriptive Statistics function along with the calculator function, I compared these two methods for sample sizes of 5, 10 and 50.
As can be seen in the dot plot comparison and ADTS values in Figure 3, the two methods have nearly identical centers and spread. Note that the value of ADTS becomes smaller as the sample size increases. This is as expected since the estimate of sigma improves as sample size increases. Given that the MSSD method provides nothing new over the classic formula, I dropped MSSD from the rest of my evaluations.
The “Hozo” Method (Method G)
In their article, “Estimating the mean and variance from the median, range, and the size of the sample,” mathematicians Stela Pudar Hozo, Benjamin Djulbegovic and Iztok Hozo proposed Method G for estimating variance and, thereby, SD involving sample size, minimum value, maximum value and the median.^{3}
The authors used simulations and “determined that for very small samples (up to 15) the best estimator for the variance is the formula” shown here as Method G.
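The small-sample (n ≤ 15) estimator from the cited Hozo et al. paper can be sketched as follows; the variance formula in the comment is taken from that paper, and the sample data is illustrative:

```python
import numpy as np

def hozo_sigma(x):
    # Variance estimate for small samples (n <= 15) per Hozo et al.:
    #   variance ≈ (1/12) * [ (a - 2m + b)^2 / 4 + (b - a)^2 ]
    # where a = minimum, m = median, b = maximum of the sample.
    a, m, b = np.min(x), np.median(x), np.max(x)
    variance = ((a - 2 * m + b) ** 2 / 4 + (b - a) ** 2) / 12
    return np.sqrt(variance)

x = np.array([0.23, 0.37, 0.40, 0.59, 0.78, 0.85, 0.97])  # illustrative sample
print(hozo_sigma(x))  # ≈ 0.214 for this sample
```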
I tested this formula using my 5,000-piece population of random normal data and compared to the classic sigma formula as shown in Figures 4 and 5.
When the sample size is five, the Hozo method shifts the range of estimates to the left. Even though the spread of estimates is better for the Hozo method, the ADTS values show the classic standard deviation formula is slightly better. At the sample size of 10, the shift is not as great and the ADTS values indicate little difference between the classic sigma and Hozo methods.
However, in fairness to Hozo et al, their study parameters were different from mine. Instead of a mean of 0 and sigma of 1 (which I used to create normal data), they “drew 200 random samples of sizes ranging from 8 to 100 from a normal distribution with a population mean [of] 50 and sigma [of] 17.” This is a large sigma when compared to the mean (17/50). I am not sure that is practical in engineering.
Therefore, I created 5,000 random normal data points using mean = 50 and SD = 17 to see how the Hozo method compared to the sigma formula for a sample size of five and 10. This simulation still shows that with N = 5 the estimate of sigma is shifted away from the actual population sigma of 17. For N = 10 the shift is less but still present. The ADTS values bear out this conclusion.
In their study, Hozo et al found that “when the sample size increases, range/4 is the best estimator for the sigma until the sample sizes reach about 70. For large samples (size more than 70)[,] range/6 is actually the best estimator for the sigma (and variance).”
Range/C (Method B) and IQR/D (Method D) Evaluation
While all the other formulae are definitive in their variables, the range and IQR methods require some way to decide what to use for the values of C and D.
SD = Range/C
SD = IQR/D
From Hozo et al I found that commonly used values for C are 4 and 6, and that the most commonly used value for D is 1.35.
I decided to test these values against alternate values of C and D in the hopes of finding an improved range or IQR formula for small sample sizes (e.g., <10). After preliminary modeling using a wide selection of values for C and D, I settled on testing the following factors: C = 2.5 and C = 4 for the range method, and D = 1.35 and D = 1.55 for the IQR method.
Once I determined the values of C and D to be evaluated, I used the 5,000 pieces of random normal data to calculate the spread of estimates of sigma using values of C and D as defined above and compared them to the classically calculated formula for sigma along with their ADTS factor.
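That head-to-head ADTS comparison can be scripted; a sketch (the C and D values are the ones under test, while the seed and repetition counts are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)  # seed chosen arbitrarily
population = rng.normal(0.0, 1.0, size=5000)

def compare(n, reps=1000):
    # ADTS for each candidate estimator at sample size n (true sigma = 1.0)
    results = {"classic": [], "R/2.5": [], "R/4": [],
               "IQR/1.35": [], "IQR/1.55": []}
    for _ in range(reps):
        x = rng.choice(population, size=n, replace=False)
        r = x.max() - x.min()
        q1, q3 = np.percentile(x, [25, 75])
        results["classic"].append(np.std(x, ddof=1))
        results["R/2.5"].append(r / 2.5)
        results["R/4"].append(r / 4)
        results["IQR/1.35"].append((q3 - q1) / 1.35)
        results["IQR/1.55"].append((q3 - q1) / 1.55)
    return {k: np.mean(np.abs(np.array(v) - 1.0)) for k, v in results.items()}

for n in (5, 10, 25):
    print(n, compare(n))
```

With small n, R/4 tends to sit well below the true sigma (its ADTS suffers accordingly), which is consistent with the left shift discussed for Figure 8.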
In the dot plots that follow I left out the plots of the classic sigma formula as I am focusing on the centering and spread of the various estimates using R/C and IQR/D.
The code of labeling is as follows:
Range Evaluation
Figure 8 shows that R/4 is greatly left-shifted when N < 10 and that R/2.5 is centered – although the spread of the R/2.5 data is large. At values of N > 10, using R/4 centers around the true sigma much better than R/2.5.
Plotting all the ADTS factors by sample size and sigma calculated by R/C and classic standard deviation shows how R/2.5 and R/4 change with sample size (see Figure 9). We also can see how these estimates of sigma compare to the classic standard deviation formula.
IQR Evaluation
In Figure 10, the IQR/D estimates show that when the sample size is 5 or 10, then IQR/1.55 is more centered and has less spread of estimates than when compared to IQR/1.35. With sample sizes greater than 10, however, this pattern shifts and IQR/1.35 slightly improves.
As with the range data, I plotted the ADTS values for each IQR/D method and compared them to the classic formula for standard deviation (Figure 11).
A summary chart of these two methods is shown in Figure 12. This chart shows that all methods are almost equal when N = 5. When N = 10, the IQR/1.55 is better than R/2.5 but not as good as the classic formula for standard deviation.
Given that the best estimates for sigma appear to be IQR/1.55, R/4 or R/6 (depending on sample size), I created a new set of 5,000 pieces of random normal data and re-ran all of the calculations of ADTS for each combination.
The graph in Figure 13 is interesting in that it shows how IQR/1.55 is actually pretty robust over sample size. The IQR/1.55 method would be a good choice if picking a method for estimating sigma (that was not the classic formula).
The IQR/1.55 method has another advantage. Both the R/C method and the classic sigma method are prone to outliers, especially with small sample sizes. The IQR/1.55 method is not affected by an extreme outlier in a small sample of data.
For example, let’s look at a set of seven data points to see how an outlier affects our estimates of sigma. Below are two seven-piece random and normal samples of data from a population with a known sigma of 1.0. The first set of data does not have an outlier; the second set does have an outlier (2.0), as confirmed by Grubbs’ test for outliers.
Table 1: Sample Data | |
Data 1 | Data 1_1 |
0.229762 | 0.22976 |
0.370426 | 0.37043 |
0.402137 | 0.40214 |
0.589118 | 0.58912 |
0.776588 | 0.77659 |
0.845852 | 0.84585 |
0.969874 | 2.00000 |
From this data we can calculate the following estimates of sigma and see how the IQR method is robust to an outlier. The classic method and the R/2.5 method change significantly with the presence of an outlier.
Table 2: Comparing IQR and R to Classic Sigma | |||
Classic Sigma | IQR/1.55 | Range/2.5 | |
No Outlier | 0.276 | 0.307 | 0.296 |
With Outlier | 0.596 | 0.307 | 0.708 |
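Table 2 can be verified directly; this sketch follows the article’s quartile convention for N = 7 (Q1 = 2nd and Q3 = 6th ordered value):

```python
import numpy as np

data = np.array([0.229762, 0.370426, 0.402137, 0.589118,
                 0.776588, 0.845852, 0.969874])
data_outlier = data.copy()
data_outlier[-1] = 2.0  # replace the largest value with the outlier

def estimates(x):
    x = np.sort(x)
    iqr = x[5] - x[1]  # Q3 = 6th and Q1 = 2nd ordered value for N = 7
    return (np.std(x, ddof=1),      # classic sigma
            iqr / 1.55,             # IQR/1.55
            (x[-1] - x[0]) / 2.5)   # Range/2.5

print(estimates(data))          # ≈ (0.276, 0.307, 0.296)
print(estimates(data_outlier))  # ≈ (0.596, 0.307, 0.708)
```

The IQR/1.55 estimate is unchanged by the outlier because neither the 2nd nor the 6th ordered value moves.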
For ease of calculations, if you are given a choice, a sample size of N = 7 allows the IQR to be easily solved. For N = 7, the third quartile is the sixth data point in ordered data, and the first quartile is the second data point in the ordered data. When N = 11, then the third quartile is the ninth point and the first quartile is the third point.
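Quartile conventions vary between tools, so IQR/D results depend on the method used; NumPy’s (n + 1)p positioning (method="weibull") reproduces the positions quoted here:

```python
import numpy as np

x = np.sort(np.array([0.23, 0.37, 0.40, 0.59, 0.78, 0.85, 0.97]))  # N = 7
q1 = np.percentile(x, 25, method="weibull")  # position (7 + 1) * 0.25 = 2
q3 = np.percentile(x, 75, method="weibull")  # position (7 + 1) * 0.75 = 6
print(q1, x[1])  # both give the 2nd ordered value
print(q3, x[5])  # both give the 6th ordered value
```

NumPy’s default method ("linear") interpolates differently and will not land exactly on the 2nd and 6th points.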
A successful search for a better estimate of sigma centers the estimates about the true population sigma and has a tighter spread of estimates than given by the classic formula for standard deviation.
In this study, we examined several candidate formulae for sigma when N ≤ 10. We also hypothesized two new formulae (R/C, where C = 2.5 and IQR/D where D = 1.55).
Other methods of estimating sigma (MSSD, Hozo and MR) do not appear to offer any advantages over the classic formula for sigma. The Hozo method also seems to shift the center of the estimates of sigma to the left of the “true” sigma.
Estimates of sigma using IQR/1.55 appear to be good when the population data is normal – regardless of sample size – although the classic formula appears to give lower ADTS scores at every sample size.
It is recommended that when the sample size is small, a test for outliers (e.g., Grubbs’ test) be performed. If an outlier exists and the reason for the outlier cannot be determined, then the IQR/1.55 method is recommended.
The decision by an investigator to use the IQR/1.55 method, Range/C method or the classic standard deviation formula is situational. But it may be argued from this data that the classic formula of standard deviation is the best estimator of sigma – regardless of sample size.
C_{pk} = Z_{min} / 3, where Z_{min} is the smaller of Z_{upper} = (USL − X̄)/estimated sigma* and Z_{lower} = (X̄ − LSL)/estimated sigma*
USL, upper specification limit; LSL, lower specification limit.
*Estimated sigma = average range/d2
Common understanding includes the fact that C_{pk} climbs as a process improves – the higher the C_{pk}, the better the product or process. Using the formula above, it’s easy to calculate C_{pk} once the mean, standard deviation, and upper and lower specification limits are known.
But what if you have only one specification or tolerance – for example, an upper, but no lower, tolerance? Is C_{pk} advisable under these circumstances?
When faced with a missing specification, consider one of the following three options:
1. Do not calculate C_{pk} at all.
2. Substitute an artificial value (such as 0) for the missing specification.
3. Treat the specification as truly missing and calculate C_{pk} from the single remaining Z value.
Examining a specific situation may clarify the outcome of each of these possibilities. A customer of a plastic pellet manufacturer has specified that the pellets should have a low amount of moisture content. The lower the moisture content, the better, but no more than 0.5 units is allowed; too much moisture will create manufacturing problems for the customer. The process is in statistical control.
This customer would undoubtedly not be satisfied with option 1 as C_{pk} has been specifically requested. With option 2, it could be argued that the LSL is 0, since moisture levels below zero are impossible. With a USL at 0.5 and LSL at 0, the C_{pk} calculation would be as follows.
Assume the X-bar = 0.0025 and estimated sigma is 0.15. Then Z_{lower} = (0.0025 − 0) / 0.15 = 0.017, so C_{pk} = Z_{min} / 3 = 0.017 / 3 ≈ 0.005.
The customer is not likely to be satisfied with a C_{pk} of 0.005, and that number does not represent the process capability accurately.
Option 3 assumes that the lower specification is missing. Without an LSL, Z_{lower} is missing or nonexistent. Z_{min} becomes Z_{upper} and C_{pk} becomes Z_{upper} / 3.
Z_{upper} = 3.316 (from above)
C_{pk} = 3.316 / 3 = 1.10
A C_{pk} of 1.10 is more realistic than one of 0.005 for the data given in this example, and is more representative of the process itself.
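Both options can be checked numerically with the example’s values; a sketch (the function name and signature are illustrative):

```python
def cpk(mean, sigma, usl=None, lsl=None):
    # Cpk = Z_min / 3, computed only from the specification limits that exist
    z = []
    if usl is not None:
        z.append((usl - mean) / sigma)
    if lsl is not None:
        z.append((mean - lsl) / sigma)
    if not z:
        raise ValueError("at least one specification limit is required")
    return min(z) / 3

mean, sigma = 0.0025, 0.15
print(cpk(mean, sigma, usl=0.5, lsl=0.0))  # ≈ 0.006: misleading (option 2)
print(cpk(mean, sigma, usl=0.5))           # ≈ 1.11: one-sided (option 3)
```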
As the example demonstrates, setting the lower specification to 0 results in a lower, misleading C_{pk}. In fact, as the process improves (here, moisture content decreases), the C_{pk} should have increased. If 0 was used as the LSL, however, the C_{pk} would have decreased. This is one clue that entering an arbitrary specification is not advised.
“What should be done when only one specification exists?” The (only) specification you have should be used, and the other specification should be left out of consideration or treated as missing. In these cases, use only Z_{upper} or Z_{lower}.
C_{pk} can and should be calculated when only one specification exists, provided only the remaining valid specification is used. As the example demonstrates, the missing specification should remain missing and not be artificially inserted into the calculation.
While one of the statistical methods widely used in the Analyze phase is regression analysis, there are situations that warrant the use of other nonparametric methods. Violation of the basic assumptions of normally and independently distributed residuals, and the presence of nonlinear relationships, are the most common situations where using a nonparametric method, such as a classification and regression tree (CART), is more appropriate. In addition, CART can be appropriate in service industries such as banking and healthcare where many potential causes of variation and defects are categorical in nature (e.g., geographical locations, products, channels, partners). The problem with using regression or generalized linear models (GLM) in such cases is that a lot of dummy variables make it difficult to interpret the results. CART is a useful nonparametric technique that can be used to explain a continuous or categorical dependent variable in terms of multiple independent variables. The independent variables can be continuous or categorical. CART employs a partitioning approach generally known as “divide and conquer.”
Assume there is a set of credit card transactions labeled as fraudulent or authentic. There are two attributes of each transaction: amount (of transaction) and age of customer. Figure 1 displays an example map of fraudulent and authentic transactions.
The CART algorithm works to find the independent variable that creates the best homogeneous group when splitting the data. For a classification problem where the response variable is categorical, this is decided by calculating the information gained based upon the entropy resulting from the split. For numeric response, homogeneity is measured by statistics such as standard deviation or variance. (For more information on this please refer to Machine Learning with R by Brett Lantz.)
Two important parameters of the CART technique are the minimum split criterion and the complexity parameter (C_{p}). The minimum split criterion is the minimum number of records that must be present in a node before a split can be attempted. This has to be specified at the outset. C_{p} is a complexity parameter that avoids splitting those nodes that are obviously not worthwhile. Another way to consider these parameters is that the C_{p} value is determined after “growing the tree” and the optimal value is used to “prune the tree.”
In this example, Figure 2 shows that the first rule formed is x2 > 35 → fraudulent transaction. Similarly, other rules are formed as shown in Figures 3 and 4.
In this way, the CART algorithm keeps dividing the data set until each “leaf” node is left with the minimum number of records as specified by minimum split criterion. This results in a tree-like structure as shown in Figure 5. The C_{p} value is then plotted against various levels of the tree and the optimum value is used to prune the tree.
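A comparable classifier can be sketched with scikit-learn; note that scikit-learn prunes via cost-complexity (ccp_alpha) rather than an rpart-style C_{p} table, and the data-generating rule below is invented for illustration:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 600
amount = rng.uniform(10, 500, n)  # transaction amount
age = rng.uniform(18, 80, n)      # customer age
# Invented labeling rule for illustration only
fraud = ((age > 35) & (amount > 200)).astype(int)

X = np.column_stack([amount, age])
tree = DecisionTreeClassifier(
    min_samples_split=20,  # the minimum split criterion
    ccp_alpha=0.01,        # complexity penalty, analogous in spirit to Cp
    random_state=0,
).fit(X, fraud)

print(export_text(tree, feature_names=["amount", "age"]))
```

The printed rules should recover splits close to the generating thresholds (age near 35, amount near 200).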
The following example contains a hypothetical dataset of 600 dispatch transactions of a bank.
The dependent variable is the attribute “defective,” which is a categorical variable with two classes (yes and no). Each transaction is labeled either “yes” or “no” based on whether there is any printing error in the deliverable. The independent variables are “amount,” “channel,” “service type,” “customer category” and “department involved.” The first step in applying any analytical method is to explore the data using descriptive statistics. Assume that in exploring the data all of the independent variables seem to have a significant relationship with the dependent variable. In order to carry out the CART analysis, the dataset is randomly split into two sets, the training and testing sets. Because nonparametric studies are not based upon theoretical probability distributions, it is widely accepted practice to build a model on one set of data and test it on another. This helps in ascertaining the accuracy of the model on unknown future records.
The CART model is used to find out the relationship among defective transactions and “amount,” “channel,” “service type,” “customer category” and “department involved.” After building the model, the C_{p} value is checked across the levels of tree to find out the optimum level at which the relative error is minimum. The optimum C_{p} value is then used to prune the tree.
Post-pruning, the “final” tree can be created as shown in Figure 8. The model can also be validated against test data to ascertain its accuracy.
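The grow-then-prune, train/test workflow described above can be sketched as follows (the synthetic data and thresholds are illustrative; scikit-learn’s cost_complexity_pruning_path serves as the analogue of the C_{p} table):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
n = 600
amount = rng.uniform(10, 500, n)
channel = rng.integers(0, 3, n)  # a label-encoded categorical variable
defective = ((amount > 350) | (channel == 2)).astype(int)  # invented rule

X = np.column_stack([amount, channel])
X_tr, X_te, y_tr, y_te = train_test_split(X, defective,
                                          test_size=0.3, random_state=0)

# "Grow the tree": compute candidate alphas (the analogue of a Cp table)
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X_tr, y_tr)

# "Prune the tree": pick the alpha that scores best on held-out test data
best = max(path.ccp_alphas,
           key=lambda a: DecisionTreeClassifier(ccp_alpha=a, random_state=0)
                         .fit(X_tr, y_tr).score(X_te, y_te))
pruned = DecisionTreeClassifier(ccp_alpha=best, random_state=0).fit(X_tr, y_tr)
print(pruned.get_depth(), pruned.score(X_te, y_te))
```

Selecting the pruning level against the test set mirrors the validation step mentioned above.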
As with other nonparametric techniques, CART does not require any assumptions for underlying distributions. It is easy to use and can quickly provide valuable insights into massive amounts of data. These insights can be further used to drill down to a particular cause and find effective, quick solutions. The solution is easily interpretable, intuitive and can be verified with existing data; it is a good way to present solutions to management.
Like any technique, CART has limitations to take into account before doing the analysis and making decisions. The biggest limitation is that it is a nonparametric technique; it is not recommended to make generalizations about the underlying phenomenon based on the results observed. Although the rules obtained through the analysis can be tested on new data, remember that the model is built from the sample without making any inference about the underlying probability distribution. Another limitation of CART is that the tree becomes quite complex after seven or eight levels, and interpreting the results at that point is not intuitive.
Conclusion
CART can be used efficiently to assess massive datasets and can provide quick solutions in the Analyze phase of DMAIC. CART can be one of the quickest and most effective tools in the bag of any process improvement practitioner. CART should not, however, replace the corresponding parametric techniques, which are always more powerful in explaining a phenomenon owing to their use of an underlying distribution.
Contact
Barbara A. Cleary, PhD
800-777-3020
Dayton, Ohio, April 14, 2017—An update to GAGEpack 12 from PQ Systems reflects the developers’ commitment to responding to customer needs with enhanced ease of use and an improved user interface.
In this updated version, visual improvements streamline the window view by collapsing the navigation panel into the ribbon, making filter information more readily available, and moving less-used commands to buttons.
Additional improvements to the user experience include:
Customers with maintenance agreements will receive the updated solution automatically, allowing for a smooth transition to the use of new features.
GAGEpack is a powerful gage calibration solution that maintains complete histories of measurement devices, instruments, and gages. To guarantee timely calibration, the software provides a variety of tools, such as:
o Calibration schedules and reports
o Alerts about failed and past due calibrations
o Gage location and status tracking
o Gage repair records
o Audit trail for traceability
o A Task tab with a “To do” list
o Gage event alert system
About PQ Systems: PQ Systems (www.pqsystems.com) is a privately held company headquartered in Dayton, OH, with representation in Europe, Australia, Central and South America, Asia, and Africa and customers in more than 60 countries. For more than 30 years, the company has been helping businesses drive strategic quality outcomes by providing intuitive solutions that help manufacturers optimize process performance, improve product quality, and mitigate supply chain risk. The company’s scalable solutions include SQCpack® for data analysis and statistical process control and GAGEpack® for measurement intelligence. PQ Systems’ world-class consulting, training, and support services ensure that clients receive the maximum return on their software implementations.
###
There are two types of metrics to consider when selecting KPIs for a project: outcome metrics and process metrics.
Outcome metrics provide insight into the output, or end result, of a process. Outcome metrics typically have an associated data lag, because time passes before the outcome of a process is known. The primary outcome metric for a project is typically identified by project teams early in their project work. For most projects, this metric can be found by answering the question, “What are you trying to accomplish?”
Process metrics provide feedback on the performance of elements of the process as it happens. It is common for process metrics to focus on the identified drivers of process performance. Process metrics can provide a preview of process performance for project teams and allow them to work proactively to address performance concerns.
Consider an example of KPIs for a healthcare-focused improvement project:
In the example above, the project has one primary outcome metric and four process metrics that compose the KPIs the team is monitoring. Well-crafted improvement project KPIs will include both outcome metrics and process metrics. Having a mix of both provides the balance of information the team needs to successfully monitor performance and progress toward goals.
Teams should develop no more than three to six KPIs for a project. Moving beyond six metrics can dilute the message of the data and make it more challenging to effectively communicate the progress of a project.
Common questions coaches can use with teams to generate conversation about potential KPIs include:
Coaches should keep the three Ms of crafting KPIs in mind when working with teams.
Remember that successful KPIs:
Crafting KPIs is an important step to guide teams through a continuous improvement process. A coach needs to keep the team focused on what success looks like and how best to measure it.