# Estimating Duration from Binary Data (momentary time series)


- This topic has 2 replies, 2 voices, and was last updated 9 years, 1 month ago by Pace.

- February 22, 2011 at 3:06 am #53739
Hello there,

I am evaluating the efficiency of our workshop. The more time our machines spend cutting, the better. My question is how to **statistically estimate “how much time (or % of time) each machine spends cutting every day?”, i.e. duration**.

In each machine there is a program which polls every 3 minutes to check whether the machine is cutting material **(1)** at that point in time or not **(0)** **(momentary time sampling)**. I have a lot of data, as my logging runs daily from morning to night. As an example, the data below was collected from 9:00 AM onwards. At 9:06 my program logged that my machine was busy cutting at that particular moment in time.

9:00,0

9:03,0

9:06,**1**

9:09,**1**

9:12,**1**

9:15,**1**

9:18,0

9:21,0

9:24,**1**

9:27,**1**
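For the ten readings listed above, the share of polls that found the machine cutting, together with a rough interval around it, can be sketched as follows. The Wilson score interval used here is an illustrative choice rather than anything proposed in the thread, and it assumes the polls are independent. Consecutive 3-minute readings are probably correlated, so the true uncertainty is likely wider than this interval suggests.

```python
import math

# The ten readings from the post: (time, state), 1 = cutting, 0 = not cutting.
samples = [
    ("9:00", 0), ("9:03", 0), ("9:06", 1), ("9:09", 1), ("9:12", 1),
    ("9:15", 1), ("9:18", 0), ("9:21", 0), ("9:24", 1), ("9:27", 1),
]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%).

    Assumes independent observations, which is an idealisation for
    readings taken a few minutes apart on the same machine.
    """
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


cutting = sum(state for _, state in samples)  # number of polls that read 1
n = len(samples)
fraction = cutting / n
low, high = wilson_interval(cutting, n)

print(f"fraction of polls cutting: {fraction:.2f}")
print(f"approximate 95% interval: {low:.2f} to {high:.2f}")
```

With only ten readings the interval is very wide; a full day of 3-minute polls would narrow it, subject to the independence caveat above.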

…

Initially I thought answering my above question would be as easy as **calculating the average of each day**, and that would be a % estimate of duration. So for example the above would be 6/10 and I would say “the machine cut for 60% of the time today”, or 0.6 × 8 = 4.8 hours. But then I asked myself, **“what’s the confidence interval?”** and realised this couldn’t be the right track.

I thought a **control chart** would be better, i.e. get the average of every hour and then put those values in a control chart. However, this can logically never be in statistical control, as each machine does not cut just one type of part. It varies widely with each sequential part.

**Which model is most appropriate** for this kind of data, so that I may answer my above question and later demonstrate improvement over time? Perhaps **some sort of logit function?**

Lastly, if anyone has an **effective/creative way of graphically communicating** this binary data to a non-mathematical audience, I’d really appreciate your suggestion(s).

Thank you in advance,

Michelle

February 24, 2011 at 10:06 am #191293

Eric Maass (Participant, @poetengineer)

Hi Michelle,

One way to handle this is to extract the duration of each sequence of 1’s bookended by zeros before and after. So, for the short set of data you provided, if I understand correctly, it seems that the equipment started cutting material sometime between 9:03 and 9:06 AM, and stopped cutting sometime between 9:15 AM and 9:18 AM.

In the reliability evaluation world, this would be referred to as interval censored data.

In terms of your efforts, this sequence shows a processing time that is most likely around 12 minutes, but with a range between 9 minutes and 15 minutes.

Can you extract these numbers for the full set of data you have?
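A script for this kind of extraction might look like the sketch below. It assumes the log has already been reduced to a list of 0/1 readings taken every 3 minutes; the function name `cutting_runs` and the decision to skip runs that touch the edge of the log (where one bound is unknown) are illustrative choices, not part of the method described above.

```python
from itertools import groupby

POLL_MINUTES = 3  # polling interval stated in the original post


def cutting_runs(states, poll_minutes=POLL_MINUTES):
    """Return (shortest, midpoint, longest) duration in minutes for each
    run of 1s that has a 0 immediately before and after it."""
    runs = []
    index = 0
    for state, group in groupby(states):
        length = len(list(group))
        start, end = index, index + length  # run occupies states[start:end]
        index = end
        if state != 1 or start == 0 or end == len(states):
            continue  # skip idle runs and runs cut off by the edge of the log
        # k consecutive 1s span (k - 1) polling intervals at the least, and
        # up to one extra interval on each side at the most.
        shortest = (length - 1) * poll_minutes
        longest = (length + 1) * poll_minutes
        runs.append((shortest, (shortest + longest) / 2, longest))
    return runs


# The ten readings from the original post, 9:00 to 9:27.
states = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1]
print(cutting_runs(states))
```

For the sample data this yields one complete run with bounds of 9 and 15 minutes and a midpoint of 12 minutes; the final pair of 1s is left out because the log excerpt ends before the machine is seen to stop.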

If so, you can probably estimate the overall mean from the data, but estimating the standard deviation would probably be best done using a technique from the field of digital signal processing / digital filtering: a method called dithering. If the data is normally distributed, those steps would probably be sufficient. If the data is not normally distributed, you would want to try to identify the type of distribution. I’d probably suggest using Crystal Ball (an Excel add-in) for that purpose.

How about this – if you can extract the most likely times from the data following the example I did at the start of this, and if you end up with at least 50 values that vary a fair amount, you can email me that data, and I’ll show you how to analyze it step-by-step.

Best regards,

Eric Maass

Master Black Belt and Senior Program Manager, DFSS / DRM

Medtronic

[email protected]

February 25, 2011 at 3:49 am #191298

**poetengineer wrote:** In the reliability evaluation world, this would be referred to as interval censored data… If so, you probably estimate the overall mean from the data, but estimating the standard deviation would probably be best done using a technique from the field of digital signal processing / digital filtering – a method called dithering.

Can you extract these numbers for the full set of data you have?

Hi Eric, that sounds great. I have not heard of any of it before, so even better (so much for my stats background, which keeps proving the extent of the vast gaps in both my memory and knowledge!).

I have not been able to think of a way to extract the data in either Minitab or Excel. I’ll write a small script tonight and send the data through to your email address. Just bear in mind, however, that an actual solid 30 minute cutting event can look like this: 1, 1, 1, 1, 0, 1, 1, 0, 1, 1

A big thank-you, I really look forward to learning about this technique and the results of the data. Until this evening then, Michelle
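One possible way to handle the caveat about isolated zeros inside a solid cutting event is to bridge short gaps before extracting runs. The sketch below is an assumption layered on top of the thread, not something either poster proposed: the function name `bridge_short_gaps` and the `max_gap` parameter are hypothetical, and whether a lone zero is a genuine pause or a sampling artefact is a judgment call for the workshop.

```python
def bridge_short_gaps(states, max_gap=1):
    """Return a copy of states in which any run of 0s no longer than
    max_gap, with a 1 on both sides, is replaced by 1s.

    max_gap is a hypothetical tuning parameter (in polls, not minutes).
    """
    bridged = list(states)
    n = len(states)
    i = 0
    while i < n:
        if states[i] == 0:
            j = i
            while j < n and states[j] == 0:
                j += 1
            # The zeros occupy states[i:j]; bridge them only if the gap is
            # short and does not touch either end of the log.
            if i > 0 and j < n and (j - i) <= max_gap:
                bridged[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return bridged


# The 30 minute cutting event from the post, as logged.
event = [1, 1, 1, 1, 0, 1, 1, 0, 1, 1]
print(bridge_short_gaps(event))
```

With `max_gap=1` the two isolated zeros are filled and the event becomes a single run of ten 1s. A larger `max_gap` merges longer pauses as well, at the risk of overstating cutting time.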
