# Mean or Median? When to Use


- This topic has 9 replies, 10 voices, and was last updated 6 years, 8 months ago by Jeremiah Lewis.

- April 27, 2010 at 9:34 am #53431
Hi

I am currently working on a project where the data type is discrete, i.e. number of errors. The error data has been collected for the past 7 months.

Can anyone let me know which measure I should use: mean or median?

Could you also please explain the reason? Thanks in advance.

- April 28, 2010 at 7:31 pm #190081

I use the median when my data is not normal and by its nature has outliers.

Unlike the mean, the median is not affected by outliers (extreme points in the data).

An example of a process with outliers would be salary by associate at my company: if someone asks me how much the people who work in my plant are paid and I use the mean, the salaries of a very few people, such as directors, would inflate the metric. Instead, I would use the median, which is not sensitive to the outliers, and I would have a more reasonable idea of how much money most of the associates make.

Hope this helps you.
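The salary example above can be sketched in a few lines of Python. The figures are made up for illustration (eight associates and two directors); only the standard-library `statistics` module is used.

```python
from statistics import mean, median

# Hypothetical annual salaries (made-up figures): eight associates, two directors.
salaries = [30_000] * 4 + [32_000] * 4 + [150_000, 200_000]

mean_salary = mean(salaries)      # pulled upward by the two director salaries
median_salary = median(salaries)  # stays at a typical associate salary

print(f"mean:   {mean_salary}")
print(f"median: {median_salary}")
```

With these numbers the mean lands well above what any associate earns, while the median sits in the associate range.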

Regards

Sergio

- April 30, 2010 at 4:25 am #190096

Can you really use the mean as a measure of central tendency with discrete data? A mean is used for continuous data.

- April 30, 2010 at 11:07 am #190098

Hi,

The use of mean/median/mode or any other statistic depends on the type of data, i.e. whether it is nominal, ordinal, interval, or ratio. Ratio data is data for which a true zero exists; errors are a count, and a count has a true zero, i.e. it starts at zero. For more information on data types, please check the link below.

http://en.wikipedia.org/wiki/Ratio_data

For ratio data all statistics can be used.

One small clarification. I am assuming that your project BIG Y is Count of Errors.

One can take the count as the metric when the process volume is constant, for example when you process the same number of documents every month and these result in, say, data entry errors. Obviously, as the number of data entry documents increases, the errors increase too (directly proportional). Hence we should take % errors per documents processed as the metric. When one drills down the BIG Y by, say, employee or document type, the errors all depend on the underlying number of documents processed in that category, so raw counts may lead to wrong analysis.

Just a thought based on experience. Correct me if I am wrong.
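A minimal sketch of why the rate matters, using made-up monthly figures for the 7 months mentioned in the question. The variable names and numbers are assumptions for illustration only.

```python
# Hypothetical monthly data (made-up numbers): documents processed and errors found.
documents = [1000, 1200, 800, 1500, 1100, 900, 1300]
errors = [20, 24, 16, 30, 22, 18, 26]

# Raw counts swing with the workload...
count_range = max(errors) - min(errors)

# ...while errors as a percentage of documents processed stay flat.
error_pct = [100 * e / d for e, d in zip(errors, documents)]

print("range of raw error counts:", count_range)
print("error % by month:", error_pct)
```

Here the raw count nearly doubles between the quietest and busiest month, yet the process is performing identically in every month once volume is taken into account.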

Venu

- May 4, 2010 at 2:57 am #190111

abadi samosir, Participant (@abadi-samosir)

I agree with Venu,

Mean/median/mode are used to measure the central tendency of data, especially data gathered by measurement (continuous data). But if you are dealing with a number of errors (count data), you will likely have Poisson or binomial distributed data, so use a proportion or a count: if your sample size is constant, use the count of errors; if the sample size varies, use the proportion.

Abadi

- June 18, 2010 at 8:01 am #190350

As pointed out in earlier posts, do not go for mean and median here. Instead, check the sigma level of the process. Also, decide what you want to focus on: defects or defectives. Check the different types of activities within the process; this may help you narrow things down. Use appropriate control charts. Localization is important…

Remember: questions lead, tools follow. Why do you want to look at the mean/median? If this is for gauging the potential or current state of your process, then there are other, more appropriate methods for discrete data.
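As one possible reading of "appropriate control charts" for error counts with a varying number of documents per month, here is a sketch of a u-chart using the usual 3-sigma limits. It assumes the errors are roughly Poisson-distributed; the data and the helper name `u_chart_limits` are made up for illustration.

```python
from math import sqrt

# Hypothetical monthly data (made-up numbers).
documents = [1000, 1200, 800, 1500, 1100, 900, 1300]
errors = [22, 20, 19, 33, 18, 21, 23]

# Center line: overall errors per document across all months.
u_bar = sum(errors) / sum(documents)

def u_chart_limits(n):
    """Return (LCL, UCL) for a month with n documents, using 3-sigma limits."""
    half_width = 3 * sqrt(u_bar / n)
    return max(0.0, u_bar - half_width), u_bar + half_width

# A month is flagged when its error rate falls outside its own limits.
in_control = []
for n, c in zip(documents, errors):
    lcl, ucl = u_chart_limits(n)
    u = c / n
    in_control.append(lcl <= u <= ucl)
    print(f"n={n:5d}  u={u:.4f}  LCL={lcl:.4f}  UCL={ucl:.4f}")
```

The limits widen for months with fewer documents, which is why a single fixed threshold on the raw count would be misleading here.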

- September 18, 2013 at 4:13 pm #195947

@katiebarry How did this get in here? If M updated this 1 day ago, how come the date is from 2010?

- September 18, 2013 at 4:17 pm #195948

Katie Barry, Keymaster (@KatieBarry)

@Darth, there was a spam message posted to the thread today that I removed.

- December 1, 2013 at 11:18 am #196304

Kimmy Burgess, Participant (@cashinasnap)

I wish to point out that one of the common complaints against Six Sigma is that it searches for variation and for the significant factors that lead to variation. It does not address the robustness of a process, which would eliminate the need to search for variation. Therefore I suggest walking through the process, identifying the weak links, and then using the statistical techniques. You will achieve what you set out to achieve.

- December 30, 2013 at 7:26 am #196433

Jeremiah Lewis, Participant (@jerestat)

Statistical tests using the median are usually considered nonparametric statistics. The drawback is that they have less statistical power than their parametric counterparts.

I side with one of the best of the best (Dr. Doug Montgomery) when I say that it would be easier and all-around better to find an appropriate transformation if your data is not normal.

Also, while the F-test is not robust to non-normality and outliers, the t-test is quite robust.
