Language Data: The 'Other Data' of Six Sigma, Part 1 of 2: Nature of Language Data

Part 1 of this article focuses on helping practitioners understand the role language data plays in Six Sigma work and how they can benefit from understanding its nature, including the types and refinement of the data as well as how to gather it in the most effective way.

Part 2 – Using Language Data is about the tools used for processing language data.

There is a kind of data that gets less press than the numbers so visible in almost all Six Sigma work – language data. The fact that it receives less exposure doesn’t mean it is less worthy of exploration.

Isn’t Six Sigma About the Numbers?

Everyone involved in Six Sigma, from leaders and champions through belts and team members, learns the importance of fact-based communication traveling in two directions: input (listening, gathering facts and distilling) and output (reporting, convincing and motivating). Six Sigma has done so much to improve the way quantitative and graphical tools strengthen communication that it is easy to think the numbers must be the key to success. In practice, though, the numbers without the story will not drive all the learning and reporting necessary over the course of most DMAIC (Define, Measure, Analyze, Improve, Control) and DFSS (Design for Six Sigma) projects. This is especially true as the number of people involved increases (Figure 1).

Figure 1: Points of Contact Increase Communication Risk

A few people working closely together on the same project can manage communication rather informally among themselves. As the number of points of contact grows through the size of the team, stakeholders, customers, suppliers, etc., the need for additional lines of communication also grows. In a typical project, the varied perspectives, motivations and levels of focus across these lines make it clear that communication cannot be left to informal means.
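To give a rough sense of how quickly that need can grow, the short sketch below counts the possible one-to-one lines of communication among a group. It rests on a simplifying assumption that is not taken from Figure 1: every pair of contacts may need its own line, which gives n(n-1)/2 lines for n points of contact. The function name `communication_lines` is hypothetical and used only for illustration.

```python
def communication_lines(points_of_contact: int) -> int:
    """Count distinct pairwise lines among n points of contact.

    Assumption (for illustration only): any two contacts may need to
    communicate directly, so the count is n * (n - 1) / 2.
    """
    if points_of_contact < 0:
        raise ValueError("points_of_contact must be non-negative")
    return points_of_contact * (points_of_contact - 1) // 2


# Compare a small core team with progressively larger groups.
for n in (3, 6, 12):
    print(f"{n} points of contact -> {communication_lines(n)} possible lines")
```

Under this assumption, doubling the number of people roughly quadruples the number of possible lines, which is one way to see why informal communication stops scaling as a project widens.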

A fundamental drive to better understand language data is built into the problem-solving process that is the foundation of the DFSS and DMAIC methodologies. During project selection, even before getting to the numbers, one must formulate the problem or opportunity using information from customers, stakeholders and the target environment. That information is largely in the form of language data.

Figure 2 illustrates the use of data to drive progress in problem solving. This builds on Jiro Kawakita’s ‘W’ model and Shoji Shiba’s ‘WV’ model, providing unique insights into the role of language data. The vertical axis depicts an individual or team moving back and forth between thinking, where they reflect on data previously acquired and distill it to gain understanding, and experience, where they become immersed in the data at its source. Thinking involves planning for the gathering of any new data deemed useful, while experience implies making the best use of limited time to gather the most useful and accurate data possible.

Figure 2: The Role of Language Data in Problem-solving

A key point in Figure 2 is that, at the front end, language data is often all there is to work with. DFSS projects in the early stages, for example, deal with customer and business environment data that consists predominantly of statements about requirements and observations about a target environment. Distilling that data helps a team focus on the important aspects of the opportunity or problem in order to think about the next, more detailed set of data to collect. The region labeled “II” is where DMAIC projects typically begin, with a strong focus on some aspect of a potentially broader issue.

Managing Language Data Measurement Variation

While language is different from numbers in many ways, there are some parallels, in that language data is sometimes tainted with unwanted variation. This variation is often imposed by the measurement system used to collect the data. Language in its raw form (conversations, email, etc.) is often colored with emotion, judgment, inference and unclear measures. While these may convey some meaning, they often represent noise and make the meaning unclear. To improve the accuracy and repeatability of the measurement system, semanticist S. I. Hayakawa distinguishes the value of “report language,” which focuses on the traceable facts (Figure 3).

Problem-solvers generally operate more effectively when qualitative data is gathered and processed in the form of report language. That doesn’t always come easily, as people tend to generate the emotion, judgment, etc. reflected at the top of Figure 3. It takes energy and effective probing to acquire the data needed.

Figure 3: Removing ‘Measurement Noise’ During Language Data Acquisition

Without giving it a name, people routinely exercise the powerful skill of distilling common aspects in a number of specific events or things, classifying them at the appropriate detail layer in their mental database. This is called abstraction, which is a core communication and thinking skill. Hayakawa developed the useful notion of the “ladder of abstraction” (Figure 4), which helps one understand that abstraction isn’t about being vague. It involves precise generalization, the process of finding the right rung on the ladder with enough detail for clarity yet not so much that the detail gets in the way. People would find it difficult to converse or think without the seamless ability to move up and down the ladder. No one says, “I’m going to the apple, banana, grape, strawberry stand,” when “fruit stand” conveys the idea with appropriately less detail. On the other hand, most people don’t ask someone to bring them back “some food” for lunch. While these examples seem trivial, they show how automatic the process of abstraction is. Figure 4 provides a view of abstraction at three levels. Part of the challenge is finding the right level for a particular use.

Noting the distinction between “context data” and “needs data” can help anyone gather better information and enhance the way the data is processed. Needs data conveys something valuable to have or be able to do. Context data, on the other hand, refers to observations or statements about an environment. Each of these kinds of data carries important information that can help in articulating a complex problem, developing stated and latent requirements, or discovering factors important to robust design. Companies that understand the value hidden in context data pay a lot of attention to gathering and processing that data.

Focused Discussions for Better Language Data

A good approach to gathering language data that contains the right level of abstraction, report language if appropriate, and the right mix of useful context and needs detail is a focused, open-ended interview. As illustrated in Figure 3, moving from data “as we find it” (often an interviewee’s first response) to the level of detail and clarity needed usually involves probing, since much of the richest data is found in the answers to follow-up questions. More detail and many useful tips in this area are available in Edward F. McQuarrie’s classic text, Customer Visits.

Part 2 – Using Language Data

About the Author

David L. Hallowell