Performance – Regression
- May 15, 2009 at 1:58 pm #26865
MrMHead (Participant)
I am an in-house trained BB, and my role is now in IT metrics, to put it simply.
I’m thinking of using Minitab’s regression tools to find what’s important to application response time.
I’ll have the app mapped out through all the systems it touches, set up the data collection, etc.
The one thing I am not sure about is the data collection.
With several different systems, several different measurements, they are not all going to be collecting data the same way.
Some questions are:
Do all metrics have to report at the same frequency?
If we cannot get to the same frequency of collection, do I use subgroup means to reach a common interval?
What is the impact if there is “drift” among the times of measurement? (i.e., component A measures at T=00:01 and component B measures at T=00:02)
Does anyone have experience with this type of analysis?
Any other suggestions?
M
- August 7, 2009 at 6:06 pm #65299
Remi (Participant)
I’m not in IT but in assembly production. We often do analysis with data gathered at different times.
Not the same frequency: possible but not recommended. The consequence is that you can only investigate the C&E relations separately and not together. Try to perform a short experiment in which the frequencies are made equal for that period (or use only the data at the times where the measurements occur together: the lcm of the intervals). When you use subgroup means, the stdevs will change (averaging makes them smaller), so be very careful there.
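To illustrate the warning about subgroup means, here is a minimal sketch. The metric, its 10-second sampling rate and its values are made up, and `subgroup_means` is just a helper name for this example. For independent readings the stdev of the means shrinks by roughly the square root of the subgroup size.

```python
import random
import statistics

def subgroup_means(values, size):
    """Average consecutive, non-overlapping subgroups of `size` readings."""
    usable = len(values) - len(values) % size  # drop an incomplete last subgroup
    return [statistics.fmean(values[i:i + size]) for i in range(0, usable, size)]

random.seed(1)
# Hypothetical metric sampled every 10 s for one hour (360 readings).
fast_metric = [random.gauss(200.0, 20.0) for _ in range(360)]

# Average 6 readings at a time to reach a common 1-minute interval.
per_minute = subgroup_means(fast_metric, 6)

print("readings:", len(fast_metric), "-> subgroup means:", len(per_minute))
print("stdev of raw readings:  ", round(statistics.stdev(fast_metric), 1))
print("stdev of subgroup means:", round(statistics.stdev(per_minute), 1))
```

So a regression on the 1-minute means sees much less variation in that X than the raw readings actually have.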
Drift: measuring at successive times is not much of a problem; record the time and use it as an extra X in the analysis.
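One possible way to handle the drift, sketched below: pair each reading of component A with the closest reading of component B and keep the time offset, so it can go into the analysis as an extra X. The timestamps (in seconds) are invented and `nearest_match` is a hypothetical helper, not something from Minitab.

```python
from bisect import bisect_left

def nearest_match(times_a, times_b):
    """For each time in times_a, return (index into times_b, offset in seconds)
    of the closest reading in times_b. times_b must be sorted and non-empty."""
    pairs = []
    for t in times_a:
        i = bisect_left(times_b, t)
        # Only the neighbours just before and just after t can be the closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times_b)]
        best = min(candidates, key=lambda j: abs(times_b[j] - t))
        pairs.append((best, times_b[best] - t))
    return pairs

# Made-up measurement times: A reports on the minute, B drifts a little.
comp_a = [60, 120, 180]
comp_b = [61, 122, 179]

pairs = nearest_match(comp_a, comp_b)
for t, (j, offset) in zip(comp_a, pairs):
    print(f"A at {t}s paired with B at {comp_b[j]}s, offset {offset}s")
```

The offset column can then be added as one more predictor; if it turns out not to be significant, the drift is probably harmless for that pair of metrics.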
Example from physical production: a product has 3 process steps: 1-forming, 2-finishing and 3-painting. Total production time = 1 hour. At the end the Y=CTQ is measured; for each process step an X is measured.
We want to investigate to which of the 3 X’s the Y is most sensitive. Evaluation tool = Multiple Regression.
Way of working: We gather data from 33 products, and for each product we have the values of X1, X2, X3 and Y but also T1, T2, T3 and Tend (the times at which the measurements were done). The T-parameters have different values for each product because the X’s and Y are measured at different moments due to the consecutive process steps. So we generate a data block of 33 rows (the products) and 8 columns (all the measurements on each product).
First a screening of the data is performed.
- It is seen that the last 4 products had a big gap between T2 and T3 (> 1 week). So apparently for some reason they were not following the standard process flow. We take these products out because they show a different production pattern than standard production.
- Also 1 product has a very large X1 value (>10S from Mu) so we take that flyer out.
- The X-space is covered well (the X-values from the dataset cover the area of X-values that we encounter during production).
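The screening step could look something like the sketch below. The three products and the thresholds are invented for illustration, and I assume Mu and S for X1 come from historical process data: computed from the same 33 values including the flyer, a point cannot be more than about 5.6 S away from the mean.

```python
WEEK = 7 * 24 * 3600  # seconds

def screen(products, mu_x1, sigma_x1, max_gap=WEEK, max_sigmas=10):
    """Split products into (kept, removed) using the two screening rules."""
    kept, removed = [], []
    for p in products:
        gap_too_long = p["T3"] - p["T2"] > max_gap             # not standard flow
        flyer = abs(p["X1"] - mu_x1) > max_sigmas * sigma_x1   # extreme X1 value
        (removed if gap_too_long or flyer else kept).append(p)
    return kept, removed

# Made-up rows; times are in seconds since the start of the study.
products = [
    {"id": 1, "X1": 5.1, "T2": 1200, "T3": 2400},
    {"id": 2, "X1": 9.8, "T2": 1300, "T3": 2500},              # flyer on X1
    {"id": 3, "X1": 5.0, "T2": 1400, "T3": 1400 + 8 * 86400},  # gap > 1 week
]

kept, removed = screen(products, mu_x1=5.0, sigma_x1=0.2)
print("kept:", [p["id"] for p in kept])
print("removed:", [p["id"] for p in removed])
```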
1> The 5 products taken out are investigated for the cause of their different behavior. If possible, the causes are prevented from ever occurring again, or a detection measure (= see that it is happening again) and a repair program are implemented.
2> The other 28 products are investigated for relations between the X’s and Y. With Regression we can see that X1 and X2 are the only significant ones, so we get a Regression Model Y = c0 + c1*X1 + c2*X2. The residuals show nice noise behaviour and the R-sq is >80%, so we don’t have a reason to look for another X; X1 and X2 predict the Y very well already.
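For anyone who wants to see what the model Y = c0 + c1*X1 + c2*X2 amounts to outside Minitab, here is a minimal least-squares sketch in plain Python. The six data points are made up (not the 28 products above), and `solve` and `fit_linear` are helper names for this example only. It gives the coefficients and R-sq but no p-values, so it does not replace the significance check.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_linear(xs, y):
    """Least-squares fit of y = c0 + c1*x1 + ... ; returns (coefficients, R-sq)."""
    rows = [[1.0] + list(r) for r in xs]  # leading 1.0 is the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)  # normal equations (X'X) c = X'y
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1.0 - ss_res / ss_tot

# Made-up data: Y = 2 + 3*X1 + 0.5*X2 plus a little noise.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
noise = [0.2, -0.1, -0.2, 0.1, 0.1, -0.1]
y = [2 + 3 * x1 + 0.5 * x2 + e for (x1, x2), e in zip(xs, noise)]

coef, r_sq = fit_linear(xs, y)
print("c0, c1, c2 =", [round(c, 2) for c in coef])
print("R-sq =", round(r_sq, 3))
```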
Note: in IT maybe an R-sq of 80% is not good enough, OR it is automatically 100% because there are no random noise influences?? I don’t know enough about this, so check it with somebody who knows.
hope this helps,
The forum ‘Software/IT’ is closed to new topics and replies.