iSixSigma

TaaG Analysis – Fast and Easy for Comparing Trends in Large Data Sets

TaaG (trends at a glance) analysis is a fast way to compare trends of subsets of data across large data sets. It is an ideal tool to use in the Measure and Control phases of DMAIC (Define, Measure, Analyze, Improve, Control) projects.

The value of TaaG analysis is best understood by way of example. Suppose you have data for three distinct groups, tracking the number of negative incidents each group worked on each month over a rolling 13-month period (Table 1).

Table 1: Example of TaaG Analysis
         Year 1                                                                  Year 2
           Jan.  Feb. March April   May  June  July  Aug. Sept.  Oct.  Nov.  Dec.  Jan.
Group 1      34    44    77    70   101   117    95   117   114   137   168   180   192
Group 2     106    91   118   105    95   111    96   105   114   100    92   110   107
Group 3     195   187   195   152   145   116   117   105   101    88    86    93    82

One might wish to make three run charts, each with a trend line representing the linear regression for that group. This view gives a quick visual depiction of which groups are trending upward, which are roughly flat and which are trending downward.

Figure 1: Group 1 – Number of Incidents by Month

Figure 2: Group 2 – Number of Incidents by Month

Figure 3: Group 3 – Number of Incidents by Month

What happens, however, if there are 100, 200 or 500 groups that require a similar type of analysis? The task grows dramatically more difficult. Creating 500 run charts with trend lines is a daunting chore, and after the charts are complete there is still the task of comparing all 500 to see which have the most pronounced trends. A much easier and faster solution can be found in TaaG analysis and the construction of a TaaG table.


Building a TaaG Table

As you look at the charts above, notice that each chart can be represented by two key values:

  1. The arithmetic mean of the data points
  2. The slope of the linear regression

Although the averages of the three groups are fairly similar, the slopes of the three trend lines are different. Group 1 has a positive slope, Group 2 has a slope of almost zero and Group 3 has a negative slope.

A TaaG table is simply a table (like Table 1) that lists all the data points for each group, with two added columns: the arithmetic mean and the slope. Although the equations for these two numbers are listed below, the easiest way to calculate them is to use the functions already built into your favorite spreadsheet or statistics software.

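Equation for Arithmetic Mean

$$\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i$$

Equation for Slope of Linear Regression

$$b = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Here $y_i$ is the value for period $i$, $x_i$ is the period index (1 through $n$), $\bar{x}$ and $\bar{y}$ are the means of the $x_i$ and $y_i$, and $n$ is the number of periods (13 in the examples here).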
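For readers working outside a spreadsheet, the two values can be sketched in a few lines of Python. This is a minimal illustration rather than the author's implementation: the function names `mean` and `slope` are hypothetical, the data are the three groups from Table 1, and the months are assumed to be evenly spaced and indexed 1 through 13.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def slope(values):
    """Least-squares slope of the values against period index 1..n."""
    n = len(values)
    xs = range(1, n + 1)
    x_bar = sum(xs) / n
    y_bar = mean(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


# Monthly incident counts from Table 1 (Jan. of Year 1 through Jan. of Year 2).
table_1 = {
    "Group 1": [34, 44, 77, 70, 101, 117, 95, 117, 114, 137, 168, 180, 192],
    "Group 2": [106, 91, 118, 105, 95, 111, 96, 105, 114, 100, 92, 110, 107],
    "Group 3": [195, 187, 195, 152, 145, 116, 117, 105, 101, 88, 86, 93, 82],
}

for name, values in table_1.items():
    print(f"{name}: mean={mean(values):.2f}, slope={slope(values):.2f}")
```

Under these assumptions the sketch gives slopes of roughly 12.19, 0.08 and -10.30 for Groups 1, 2 and 3, which matches the positive, near-zero and negative trends described above.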

The following is an excerpt of an example TaaG table with 500 groups.

Table 2: TaaG Table with 500 Data Groups
Group       Monthly values, Jan. to Jan. (run together)    Mean   Slope
Group 1     4136322427881                                  3.92    0.22
Group 2     6754445653267                                  4.92   -0.05
Group 3     4783331722                                     3.08   -0.49
Group 4     17865611741                                    4.31   -0.94
Group 5     4121615841012                                  4.15    0.77
Group 6     2143334761015                                  4.54    0.66
Groups 7-496 not shown
Group 497   11137141771171465610                           9.85   -0.43
Group 498   11986424745                                    4.62   -0.66
Group 499   128435652131                                   3.85   -0.71
Group 500   11642173118                                    3.38    0.45

Once the data is in this format, it is easy to sort by slope to see which groups have the most prominent trends. For example, below is the sorted table showing the 10 groups with the steepest upward trends in the whole data set.
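The build-and-sort step can also be sketched in code. The example below is an assumption-laden illustration, not the author's tooling: `build_taag_table` and the row layout are hypothetical names, and the input is the monthly data for five of the groups shown in Table 3, deliberately listed out of order.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def slope(values):
    """Least-squares slope of the values against period index 1..n."""
    n = len(values)
    xs = range(1, n + 1)
    x_bar = sum(xs) / n
    y_bar = mean(values)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def build_taag_table(groups):
    """Return one row per group holding its name, mean and slope."""
    return [
        {"group": name, "mean": mean(vals), "slope": slope(vals)}
        for name, vals in groups.items()
    ]


# Five groups from Table 3, intentionally unsorted.
groups = {
    "Group 190": [1, 8, 8, 11, 13, 12, 17, 14, 19, 25, 17, 28, 43],
    "Group 316": [34, 44, 77, 70, 101, 117, 95, 117, 114, 137, 168, 180, 38],
    "Group 235": [1, 3, 8, 5, 17, 15, 16, 14, 14, 27, 33, 29, 32],
    "Group 160": [1, 4, 6, 15, 12, 14, 26, 26, 32, 33, 32, 37, 48],
    "Group 115": [3, 10, 7, 16, 17, 16, 22, 29, 23, 20, 32, 40, 53],
}

table = build_taag_table(groups)

# Sort by slope, steepest upward trend first, and keep the top three.
top = sorted(table, key=lambda row: row["slope"], reverse=True)[:3]
for row in top:
    print(f'{row["group"]}: mean={row["mean"]:.2f}, slope={row["slope"]:.2f}')
```

With 500 groups the only change would be a larger `groups` dictionary and a larger slice; the sort itself stays a single call.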

Table 3: TaaG Table Sorted by Slope
Group      Jan.  Feb. March April   May  June  July  Aug. Sept.  Oct.  Nov.  Dec.  Jan.    Mean  Slope
Group 316    34    44    77    70   101   117    95   117   114   137   168   180    38   99.38   7.12
Group 160     1     4     6    15    12    14    26    26    32    33    32    37    48   22.00   3.61
Group 115     3    10     7    16    17    16    22    29    23    20    32    40    53   22.15   3.23
Group 135     2     4     5    12    11    13    16    15    26    30    35    29    42   18.46   3.14
Group 310     3     5     5    10    15    17    17    23    15    26    28    34    43   18.54   2.92
Group 225     4     3     6     9    11    19    14    18    19    26    22    37    42   17.69   2.90
Group 275     2     5     6    10     8     9    14    15    28    30    23    32    31   16.38   2.65
Group 340     2     5     7    12     9    13    14    20    22    13    30    30    40   16.69   2.64
Group 235     1     3     8     5    17    15    16    14    14    27    33    29    32   16.46   2.61
Group 190     1     8     8    11    13    12    17    14    19    25    17    28    43   16.62   2.44

In this view you can see that far and away the most significant trend is from Group 316. The slope of 7.12 for this group signifies that, on average over the past 13 months, its incident count has increased by just over seven incidents each month. Since these are negative incidents, this represents a worsening trend. Why is this group having a greater impact than the others? The insight from a TaaG table will help guide a Six Sigma practitioner toward the subsets of data that are driving the greatest need for improvement.

It is easy to quickly sort the data using normal spreadsheet software. The analysis is easily repeatable if data is gathered on a regular basis (for example, monthly reporting). This makes TaaG analysis ideal for the Control phase of improvement efforts as well as the Measure phase.

Studying Relative Slope

Another aspect of TaaG analysis that can be valuable is the study of relative slope. With large groups of data, it is likely that some subgroups will trend more significantly than others. A subgroup's trend relative to its own normal performance, however, may be better than that of other subgroups.

For example, consider the data in Table 3 above. Group 316 has the most significant trend, with a slope of 7.12 and a mean of 99.38, whereas the second group (Group 160) has a “better” slope of 3.61 and a mean of 22.00. Although the absolute trend of the first group is worse, its trend relative to its mean performance is better. To highlight this point, consider Table 4, where the relative slope equals slope/mean.

Table 4: Subgroups of Relative Slopes
Group        Mean   Slope  Relative Slope
Group 316   99.38    7.12          0.0716
Group 160   22.00    3.61          0.1641

While Group 316 has a “worse” trend, its slope represents about 7 percent of its mean monthly performance, whereas the slope of Group 160 represents about 16 percent of its mean monthly performance. At times it is valuable to sort a TaaG table by relative slope instead of slope to see which subgroups are trending most poorly relative to their own performance. Table 5 is an example from the same data set used in Table 4, sorted by relative slope; it shows the 10 subgroups out of the 500 that have the greatest relative slopes.
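The relative-slope calculation is a single division per group, as the following sketch shows. It uses the mean and slope values quoted for Groups 316 and 160; the function name `relative_slope`, the row layout and the choice to skip groups whose mean is zero are all assumptions made for illustration.

```python
def relative_slope(slope, mean):
    """Slope as a fraction of the group's mean; None when the mean is zero."""
    if mean == 0:
        return None
    return slope / mean


rows = [
    {"group": "Group 316", "mean": 99.38, "slope": 7.12},
    {"group": "Group 160", "mean": 22.00, "slope": 3.61},
]

for row in rows:
    row["relative_slope"] = relative_slope(row["slope"], row["mean"])

# Rank by relative slope, largest first, skipping groups with an undefined value.
ranked = sorted(
    (row for row in rows if row["relative_slope"] is not None),
    key=lambda row: row["relative_slope"],
    reverse=True,
)

for row in ranked:
    print(f'{row["group"]}: relative slope={row["relative_slope"]:.4f}')
```

This gives roughly 0.0716 for Group 316 and 0.1641 for Group 160, in line with the "about 7 percent" and "about 16 percent" figures, so Group 160 ranks first once the table is sorted by relative slope.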

Table 5: Subgroups with Greatest Relative Slopes
Group       Monthly values, Jan. to Jan. (run together)    Mean   Slope  Relative Slope
Group 425   312535129510                                   4.23    0.86          0.2039
Group 440   23149178810                                    4.08    0.83          0.2035
Group 5     1111435479712                                  4.23    0.85          0.2013
Group 465   1573611111517183226                           11.69    2.31          0.1978
Group 245   1122333555912                                  3.92    0.77          0.1975
Group 80    2154525881211                                  4.85    0.93          0.1916
Group 90    114443765712                                   4.15    0.77          0.1865
Group 185   257655149131222                                7.69    1.43          0.1864
Group 320   12135323515710                                 4.38    0.81          0.1842
Group 210   11324464777                                    3.54    0.65          0.1832

Note that there is no overlap between the two tables of top 10 subgroups. Although the subgroups with the highest relative slopes do not appear to have as significant an effect on the total data set as a whole, taking notice of relative slopes can often bring insight to poorly trending subgroups. That notice, in turn, can result in movements toward improvement.

Trending – Better or Worse

Note that in the examples above, the higher the positive slope, the “worse” the trend, because the data set counts negative incidents. If a subgroup has an increase in incidents, this is an increase in negative impact. However, if the data set is something like stock prices, the higher the slope the better, because it represents increasing stock value.

Therefore, when considering the use of a TaaG table in analysis, take note of the data set and whether positive slopes are “good” or “bad” and sort the TaaG table accordingly.
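One way to make the sort direction explicit is to carry it as a flag, as in the sketch below. The function name `sort_worst_first` and the `higher_is_worse` parameter are hypothetical; the three slopes are taken from the tables above, and treating them as stock-price slopes in the second call is purely illustrative.

```python
def sort_worst_first(rows, higher_is_worse=True):
    """Order rows so that the least favorable trend comes first.

    higher_is_worse=True suits counts of negative incidents;
    higher_is_worse=False suits measures where growth is good.
    """
    return sorted(rows, key=lambda row: row["slope"], reverse=higher_is_worse)


rows = [
    {"group": "Group 2", "slope": -0.05},
    {"group": "Group 316", "slope": 7.12},
    {"group": "Group 4", "slope": -0.94},
]

# Incidents: a rising slope is bad, so the steepest increase leads the list.
print([row["group"] for row in sort_worst_first(rows)])

# Same numbers read as a "higher is better" measure: the steepest decline leads.
print([row["group"] for row in sort_worst_first(rows, higher_is_worse=False)])
```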

Summary

TaaG analysis can be useful if there is a need to quickly and easily evaluate large amounts of data and compare trends across subsets of data. In whatever industry the improvement opportunity may exist, TaaG analysis provides a rapid first look at a data set to guide the practitioner into deeper analysis toward the root causes of issues and then enable corrective actions.


Comments 6

  1. Lew Yerian

    I wouldn’t make a decision on this data without knowing the R-squared values. Assuming linearity could lead to incorrect decisions.

  2. Nik

    Brad,

    Thanks for posting the article. I may not use this so much for analysis, but it might have metric/visual management applications. For example, my company manages hundreds of products of varying size and complexity. At the supervisor level, we were looking at charts of individual product performance (<50 products each), however senior management felt this was too much information for their level (<1000 products each). TaaG could be a way to summarize some key statistics at a higher level so senior management can still see where to focus without looking at each of a thousand charts and identify where to focus.

  3. Filiep Samyn

    So this is really nothing more than a simple linear regression which Excel and Minitab will create without using any formulas. Why do we have to introduce a new name for a standard tool?
    I do not see how this will steer to root causes, possible causes maybe.
    What do you suggest when we have multiple x’s that influence the y? TaaG will only work with a single or two x’s, beyond that graphical analysis breaks down. You stated large data sets. I would definitely want some software to crunch the numbers for me.
    When comparing slopes between regression lines you will need to use hypothesis testing, you cannot compare point estimates for the slopes to determine differences. As a minimum you will need CIs for the betas.

