# Sample Size

Mohammed Majam
I am currently in the Measure phase of my first Green Belt Project. I am very new to this and seeking some advice. This project is for certification.

Industry: – Medical => Clinical Trial Pathology laboratory.
Project: Data capturing errors and test requisition form variability during patient registration onto the Lab Information System.

In our planning, we were going to measure for 5 full days, approx 1000 requisition forms. However, after 3 full days of measurement we have done 400 forms and have already found a 26% data capture error rate and 75% variability in the actual forms. We are measuring variability on six parameters. We believe the high variability might be the cause of the errors; however, we are not there just yet. The lab does approx 3500 – 4000 forms a month.

We feel that we have gathered sufficient data for analysis. Is my sample size of 400 sufficient?

Suresh
Consider the high end of the range, 4000 forms/month. That adds up to 48,000 forms/year. Assuming each form is treated as discrete data (Good or Bad), the sample size would be 382 forms at a 95% confidence level and a 5% margin of error.
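The 382 figure is consistent with the usual sample size formula for a proportion with a finite population correction. The post does not say which formula was used, so the sketch below is an assumption: it takes z = 1.96 for 95% confidence and the worst-case proportion p = 0.5, and the function name is only illustrative.

```python
import math

def sample_size_proportion(population, z=1.96, margin=0.05, p=0.5):
    """Sample size for estimating a proportion, with finite population correction.

    Assumptions (not stated in the post): z = 1.96 for 95% confidence and
    p = 0.5 as the most conservative guess for the true proportion.
    """
    # Sample size for an effectively infinite population
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # Shrink it to account for the finite population
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)

# 4000 forms/month over 12 months
print(sample_size_proportion(48000))  # 382
```

Note that the result barely depends on the population size once it is in the tens of thousands; the margin of error and the assumed proportion drive it.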

Caution: the 382 forms must be a representative sample. Use a data segmentation technique to make sure the forms you pull are representative of what the lab actually handles.

Hope this helps …

Robert Butler
As written there really isn’t an answer to your question. The question is – sufficient for what?

You have indicated you are interested in looking at errors and you have an overall percentage for errors in the sample you have taken. That sample error rate is what it is but, other than some kind of overall count, I don’t see that it tells you much.
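For reference, the precision of that overall rate can be estimated with a normal-approximation margin of error. This is only a sketch: it assumes the 400 forms behave like a simple random sample, that 26% corresponds to 104 forms with errors, and a 95% confidence level (z = 1.96). None of these assumptions is confirmed in the thread, and the function name is illustrative.

```python
import math

def proportion_margin(p_hat, n, z=1.96):
    """Normal-approximation margin of error for an observed proportion."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# Observed in the thread: 26% error rate on 400 forms
moe = proportion_margin(0.26, 400)
print(f"26% +/- {moe * 100:.1f} percentage points")
```

Under those assumptions the overall rate is pinned down to within roughly four to five percentage points, which says nothing about where on the form, when, or why the errors occur.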

You have also indicated you are “measuring variability…we are measuring it on six parameters.” What exactly does this mean?

Questions:

You are looking at the result of path lab reports. Based on the numbers you have provided it looks like the lab is turning out 133 reports per day and at 4000 a month this would suggest a 24/7 kind of operation.

1. Are all of the reports exactly the same thing?
2. If they are not, do the field entry mistakes occur in the exact same fields, and are those fields physically located in the same place on each form (i.e. top right hand side of form – first X number of lines/boxes)?
3. Is the incoming flow of work constant across the week or does it vary?
4. Does any mistake anywhere on the form have the same weight in terms of severity of consequences as any other mistake anywhere else on the form?
5. If it is a 24/7 effort are there different shifts of workers?
6. If the reports are different – what kind of frequencies of occurrence are you facing?

If you can provide answers to these questions perhaps I or someone else could offer additional comments.

