Remi
@Remi
Member since October 8, 2008
Forum Replies Created


June 15, 2010 at 2:51 am #190341
Hai gsx,
Why did you do this? I.e. why do you want to make the data ‘Normal’?
It doesn’t help you solve the problem.
Counted data will in general not be normally distributed.
And no: I would not trust any data manipulated in this way.
Remi

May 13, 2010 at 6:11 pm #190150
Hai Awylan,
whenever I have a data set that contains a lot of 0’s I split the capability analysis of the CTQ in two.
CTQ1 = % nonzeros; the target is very often 0; the goal of project part 1 is to reduce the average value and the variation of CTQ1.
CTQ2 = the CTQ value if the CTQ is nonzero; target = 0; the goal of project part 2 is to improve the Cpk and reduce the values.
Splitting this up into two problems makes the data easier to investigate.
Good luck,
Remi

February 1, 2010 at 4:20 pm #188912
Hi BMWSAND,
your formula assumes that the only variation in the data is the variation caused by the measuring process.
Remi

January 29, 2010 at 12:49 pm #188849
Hai The Kid,
You can’t. A gage R&R study is an experiment you do to investigate the quality of your data collection/measurement. Unless you can perform that experiment you cannot do a gage R&R.
This doesn’t mean that your old data is worthless, only that you can’t prove that it can be trusted.
So the only way forward is: write down in your report that no gage R&R was done and that all conclusions from any analysis of these data have to be verified further by an experiment (for which a gage R&R has proven that the data from the experiment can be trusted).
Remi

January 22, 2010 at 7:47 am #188576
Hi Mark,
as a first approximation you could sum them up.
More correct is to compensate for products that have more than one characteristic failing. If the characteristics fail independently of each other you get the following calculation (if not, a similar formula can be derived, but you have to count the multiple occurrences).
Pi = P(char i failed). Then P(reject) = P(at least one char failed) = P1 + P2 + P3 + P4 – SUM(Pi*Pj) + SUM(Pi*Pj*Pk) – P1*P2*P3*P4, where the sums run over all pairs i<j and all triples i<j<k.
Note that if P1 = P2 = 600 ppm then P1*P2 = 0.36 ppm, so ‘forgetting’ to count the products that have more than one failing characteristic stays within the resolution of your calculation.
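The calculation above can be sketched in a few lines of Python. This is only an illustration under the independence assumption; the four 600 ppm rates and the function names are made up for the example. It compares the simple sum, the inclusion-exclusion formula, and the equivalent shortcut 1 - (1-P1)(1-P2)(1-P3)(1-P4).

```python
from itertools import combinations
from math import prod

def p_reject(p_fail):
    """P(at least one characteristic fails), assuming independent failures."""
    return 1.0 - prod(1.0 - p for p in p_fail)

def p_reject_inclusion_exclusion(p_fail):
    """Same probability via the alternating sum over pairs, triples, ..."""
    total = 0.0
    for k in range(1, len(p_fail) + 1):
        sign = 1 if k % 2 else -1
        total += sign * sum(prod(combo) for combo in combinations(p_fail, k))
    return total

# Hypothetical failure rates: four characteristics at 600 ppm each.
rates = [600e-6] * 4
simple_sum = sum(rates)
exact = p_reject(rates)
print(f"sum: {simple_sum * 1e6:.2f} ppm, exact: {exact * 1e6:.2f} ppm")
```

With rates this small, the simple sum and the exact value differ by only a couple of ppm.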
Remi

January 18, 2010 at 1:44 pm #59659
Hi outlier,
No, Boxplot does not use normality; it is distribution free. What happens is: IF the data is Normally distributed THEN (the 1.5*IQR fences lie at about 2.7 Stdev from the mean, so close to 3 Stdev) and (Median and Mean give the same value). So in that case points beyond the 1.5*IQR fences happen to lie close to the +/- 3S line.
For detecting potential outliers mathematicians have calculated what a threshold could be: the 1.5*IQR of the boxplot and the UCL/LCL of the control chart are just these. But remember: the threshold only indicates that “an occurrence like this is very unlikely to happen (in a stable situation), so when it happens we should investigate if there is a special cause for it”.
If you have multiple populations (e.g. batches with shifted Mu) or a variation that changes over time (e.g. heavy wear) the calculation goes wrong because the situation is not stable.
There is no formula for ‘is this an outlier?’
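As an illustration of the 1.5*IQR threshold, here is a minimal Python sketch. The data are made up, and the quartile method is the default of Python's statistics module, which may differ slightly from what your statistics package uses. It only flags candidates; the WHY still has to be investigated.

```python
from statistics import quantiles

def boxplot_outlier_candidates(data):
    """Return the points that lie outside the 1.5*IQR fences of a boxplot."""
    q1, _, q3 = quantiles(data, n=4)  # quartiles, default method of the module
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Hypothetical measurements, made up for illustration.
measurements = [10, 11, 12, 12, 13, 13, 14, 15, 30]
suspects = boxplot_outlier_candidates(measurements)
print(suspects)
```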
Remi

January 18, 2010 at 12:48 pm #188455
Hai Mike,
You already wrote a solution in your message: the problem (as seen by you) is created by Sales having to reach its target in Q4.
A simple solution (unacceptable to Mgmt?) is to arrange for quarterly Sales targets. Then your Year problem changes into a Quarter problem. If that is still too coarse, change it into a Month problem, etc.
Better is: make a description of the Current and the Wished situation (production). Make visible what the difference is in Money (or another CTQ/KPI that Mgmt is interested in). Then show them that Sales is part of the cause of the Current situation and that other solutions cost more.
Good luck, Remi

January 14, 2010 at 12:12 pm #65366
Anshu,
for a Six Sigma project you generally only use the tools that you need (for making the project a success). Since nobody knows beforehand which those will be, a lot of them are covered in the GB course.
BUT
if you are doing a six sigma project to get certified, the certifier wants you to show him/her that you are Capable of doing Six Sigma projects. And then the EXTRA requirement is that ‘several of the (statistical) tools are used in that project’.
So if you use ‘none’ of the statistical tools it can still be a good GB project, but it will not satisfy the certifier (and therefore will not help you get certified).
Remi

January 11, 2010 at 9:40 am #188207
Hai MBBinWI,
thanks for the advice. I didn’t see that Oops until you mentioned it. You are right.
Remi

January 8, 2010 at 2:22 pm #188123
Brody,
if your supplier would supply you their components in a bimodal form would you be happy? Even if all products were within specs?
At the least you would want to get them in two batches. In that case you could improve the process by taking into account the move of the Mu (= between-batches variation). For each of the batches a standard Cp(k) calculation would be satisfactory. BUT YOU MAY NOT COMBINE THOSE Cp(k) VALUES!
Later on you would want that supplier to remove the variation source that caused that move of the Mu.
So talk with your customer and ask him how much it is worth to him that you improve in this way. Our customer was very happy when we could supply him the components in batches with smaller variation, and he had no problems at all with a moving Mu because he could correct for it by simply resetting the temperature of his oven, as long as he knew the Mu of each batch. Later on we were even able to fix the Mu and he was even happier.
Remi

December 30, 2009 at 11:23 am #59654
Hai Outlier,
distributions don’t have outliers, whether normal or not.
If you mean “how can I see if a datapoint is an outlier in a data set that does not follow a normal distribution”:
– make a graph (dotplot, histogram,…)
– if you see a point ‘far away’ (purely subjective ‘eyeball mark 1’ measurement) investigate WHY that point is different (find the physical reason, not a mathematical one). Hint: the boxplot works especially well for this; points ‘far away’ are marked as separate stars in the graph. If you know WHY, you can decide whether or not to remove the point from the data set, and redo the analysis with the reduced data set.
– check if the data set fits any known non-normal distribution (with and without the suspicious point). Look up for that distribution when points are suspected to be outliers, and again find out WHY.
Hope this helps,
Remi

December 29, 2009 at 12:46 pm #57787
Hai Choochor,
make a process map (= flow chart) of the process steps.
There is 1 input location (= entry). Put 1,000,000 products (parts) in the process entry. Write down for every process step how many parts (of the 1,000,000) enter that process step (calculated from the parallel split-up) and how many leave the process step (calculated from the RTY of that process step). In this way you go through all the possible process steps from start to end.
There is 1 output location (= exit). The number of good products that arrive there is your RTY in ppm.
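A small Python sketch of this bookkeeping, for a made-up process: step A, then a 60/40 parallel split over steps B1 and B2, then step C. All yields and the split ratio are hypothetical and only serve to show the arithmetic.

```python
# Hypothetical process: A -> (B1 | B2 in parallel) -> C.
# Every yield and the 60/40 split below are invented for illustration.
parts_at_entry = 1_000_000

after_a = parts_at_entry * 0.98      # step A passes 98% of what enters

into_b1 = after_a * 0.60             # 60% of the flow goes through B1
into_b2 = after_a * 0.40             # 40% of the flow goes through B2
after_b1 = into_b1 * 0.95            # B1 passes 95%
after_b2 = into_b2 * 0.99            # B2 passes 99%

into_c = after_b1 + after_b2         # the parallel branches merge again
good_at_exit = into_c * 0.97         # step C passes 97%

print(f"good parts at exit per 1,000,000 entered: {good_at_exit:.0f}")
```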
Remi

December 29, 2009 at 10:25 am #187792
Hai KD,
the correct answer is: it depends.
If y = air usage, it does not matter how the holes are placed. The air usage is the Area of all the holes together.
If Y = cooling strength on a small area, then situation with holes all near the centre gives a different result than holes on the border.
So the question is: what is your Y=CTQ and how is it influenced by the X’s. The function itself is just a mathematical model for this relation.
Remi

December 29, 2009 at 10:08 am #187791
Hai Darth_X,
here is a way to do it.
Input is LSL, T(arget), USL.
1] you have to choose a formula for the loss function. THE LOSS FUNCTION does not exist (as far as I know); an often mentioned one is the quadratic form L = K*(Y-T)*(Y-T), where K is the loss factor. If you have T = (USL+LSL)/2 (i.e. in the middle) you can determine K once you have decided what the value of L should be for Y = LSL. If you want a different loss value for USL than for LSL you could add a term C*(Y-T).
2] you have to choose the interval [a,b] for your graph; a is often chosen smaller than LSL and b is often chosen larger than USL
3] In XL create a column with Y values: a + k*d for k = 0, …, (b-a)/d. Choose d small (d is the distance between the Y values, i.e. the resolution of the dots that form the graph). I often start with d = 0.1 and lower it if the resolution is too coarse.
4] Calculate the column of L values from the column of 3], based on the formula chosen in 1].
5] make a line plot from both columns with 3] as X-axis and 4] as Y-axis value
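The same steps can be sketched outside XL as well. The Python below is only an illustration of steps 1] to 4] for the quadratic form; the spec limits, the loss value at LSL and the interval are invented numbers, and the two resulting lists are what you would feed into a line plot for step 5].

```python
def quadratic_loss(y, target, k):
    """Quadratic loss L = K*(Y-T)*(Y-T)."""
    return k * (y - target) ** 2

# Hypothetical inputs: LSL, T(arget), USL and the loss chosen at Y = LSL.
lsl, target, usl = 9.0, 10.0, 11.0
loss_at_lsl = 4.0
k = loss_at_lsl / (lsl - target) ** 2    # step 1]: solve L(LSL) for K

a, b, d = 8.5, 11.5, 0.1                 # step 2]: interval, and resolution d
n_steps = round((b - a) / d)
y_values = [a + i * d for i in range(n_steps + 1)]               # step 3]
loss_values = [quadratic_loss(y, target, k) for y in y_values]   # step 4]
```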
Hope this helps, Remi

December 22, 2009 at 9:13 am #187688
Hai Anthony,
Ask your experts to give a score on the quality of the mark on a scale of 1 to 10: a “5” is considered perfect marking; a “1” is for ‘burned’ and a “10” is for ‘no mark’. In this way you have a ranking of the laser marking from ‘too much’ (1) through ‘good’ (5) to ‘too little’ (10).
Use this score as the Y of your DoE analysis and optimize to a Target of “5” (you don’t want the maximum or minimum Y). Don’t forget to check the residuals.
For finding the perfect optimal setting this method is probably too crude, but for a screening it will work all right. Maybe you can even fine-tune with decimal points if all products are between “4” and “6”.
Good luck, Remi

December 22, 2009 at 7:22 am #187686
Hai Boon,
I don’t know your template but in general the answer is NO.
First a warning: be careful with the sentence “to measure the Gage R&R of the QC technicians”. The operator is only one of the variation sources. A gage R&R investigates the consequences of all sources of measurement variation (that occur during the experiment). So if you see that Operator 6 shows more variation, the reason for that could be a bad SOP and not the person!
Back to your question. I expect that if you measure the R&R 4 times (4 different operators) this will show up as repeatability in the calculations, and that is wrong.
Generally one uses a sample of 3 operators from the population of 12. How to choose the 3 is up to you but try to cover the variation of the operators as best as you can (i.e. if it is a visual test use one operator with glasses/lenses and one without; use an old/experienced and a new operator; use a woman and a man; …). The idea is that the sample of 3 is ‘representative’ for all 12.
Remi

December 10, 2009 at 2:48 pm #187460
Hai Darth,
in my answer I mentioned that the residuals should be normally distributed. Y = f(x) + error; the ANOVA analysis can be used IFF ((error is normally distributed) AND (equal variances test is passed)).
I realize that “error is Normal” is not exactly the same as “each of the groups is normal” but I don’t remember enough of my statistics to say which is more correct. Most of the time there is no difference, but once I encountered data where the residuals together were Normally distributed but the two groups were not. We found out that this was caused by how the line was split up into two parallel lines (the 2 groups) based on the weights of the products. We corrected for this and solved the problem.
Remi

December 8, 2009 at 11:34 am #187396
Susan,
So you have 2884 lengths of fishes divided over 8 groups.
First make a picture (boxplot) and decide if you want to perform the analysis on all the data or if some of the points should be analysed differently.
Then do a 1-Way ANOVA.
The data itself does not have to be normally distributed (and often is not if there is a large difference in means), but the residuals should be (this is ‘the check for Normality’).
You also have to check for equal variances. If the variances are not comparable the p-value outcome of the 1-Way ANOVA should not be used. I generally recommend postponing the mean issue until you have resolved the sigma issue (but some others on this forum do not agree with me).
Remi

November 26, 2009 at 2:35 pm #187090
Sorry Sathya, but your explanation of boxplot conclusions is WRONG.
Duplicate the data sets several times (make the sample size artificially high). The boxplots will not be different; the Mean and StDev hardly change; but the p-values will get arbitrarily low: the larger you make the N the smaller the p-value gets.
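A quick way to see this effect is the Python sketch below. It uses invented data and, to stay within the standard library, a normal approximation to the two-sample test instead of the exact t-test, so the p-values are approximate. The point is only the trend when the same data are copied 5 and 25 times.

```python
from math import erfc, sqrt
from statistics import mean, stdev

def approx_p_value(x, y):
    """Two-sided p-value for a difference in means (normal approximation)."""
    standard_error = sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    z = abs(mean(x) - mean(y)) / standard_error
    return erfc(z / sqrt(2))

# Hypothetical measurements of two groups, made up for illustration.
group_a = [9.8, 10.1, 10.4, 9.9, 10.3]
group_b = [10.0, 10.5, 10.2, 10.6, 10.1]

# Copy the same data 1, 5 and 25 times: means stay the same, p-values shrink.
p_values = [approx_p_value(group_a * k, group_b * k) for k in (1, 5, 25)]
for copies, p in zip((1, 5, 25), p_values):
    print(f"{copies:>2} copies: p = {p:.2g}")
```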
Remi

November 26, 2009 at 9:21 am #187085
Euh Max,
I expect that all your managers will have moved to a new job before you have enough data with only 4 data points per year.
Forget this approach and put all your effort into finding a way to get data more often (and of the continuous type). The good news is that you have at least 3 months before the next data point makes your question a little bit more urgent.
By the way: you cannot calculate a sample size if there is no variation. So you have to choose how many zeros there have to be behind the decimal point of the percentage.
Remi

November 26, 2009 at 8:43 am #187084
Hai K,
yes and no. You could do it, but the 3rd factor may influence the result as a lurking factor.
Example.
Suppose your ‘real (unknown) model’ is: Y = X1 + X2 + 20*X3 + 100*X1*X3
If you leave out the significant X3 two things happen (both bad for conclusions):
– Your SSE (error = noise) gets large due to 20*X3, so your Rsq will drop to a low value
– Your X1 influence may look large due to the X1*X3 interaction, so X1 will have a low p-value as a result
I recommend that you do the analysis on the 2 X’s (costs only PC time) and check how the residuals look and how large the Rsq has become. If everything is OK the p-value of X1 can be used; otherwise it is only a weak indication (and you already have that now).
Remi

November 25, 2009 at 2:24 pm #187049
Hai marife,
you don’t.
For deciding which distribution fits your data best you use the p-values that you get from that same investigation (H0: my data distribution is equal to the tested distribution). In general it is the same as with a Normality test: if the p-value is high the data gives no reason to reject the distribution; if the p-value is low it doesn’t fit.
Since you probably have several high p-values the next question is: who is the winner?
A mathematical answer would be: the one with the highest p-value, because that one fits best.
A more practical answer: among the high p-values, the distribution that you like the most AND/OR that agrees with the ideas on how the data should behave.
Remi

November 25, 2009 at 2:15 pm #187046
Hai K,
Remark: since you didn’t include centerpoints you can not check if the linear model is correct (maybe a nonlinear model would be more correct).
My assumption is that for each run you performed a break test and measured the load value at the point of breaking.
Based on physics we (= I) can assume that for the no-breaks the load value at the moment of breaking would have been higher than the value measured in the DoE. So for these points the Y value in the DoE is an underestimate. As a consequence: a significant X would stay significant if we did the correct DoE. You can simulate this by replacing the no-breaks with a load value Y that is 10% higher than the one measured. This simulates that you just missed the breaking point. You can then see what happens with the DoE results. WARNING: when you do this you are inventing results. So only do this to get a feeling for the behaviour of the X’s on the Y and not to get real results.
There is also the possibility that an X that you found to be insignificant is significant in reality, but that you cannot see this anymore due to the problems during the DoE.
Remi

November 25, 2009 at 1:55 pm #187045
Hai UKB,
Question: are the intermediate weights important for you or the customer? If not then the standard paired test is applicable. You could also perform the paired test for each pair of months (1,n) for n=2..6 in order to see on which month the average weight loss is significant.
But remember: with statistics you only will prove that the difference is significant not that it is relevant for the customer.
Other ideas:
– Perform Regression on each Customer to see the time influence and possible predictions. You could do a paired t-test on the coefficients of the linear model (but you will then partially ignore the within-customer variation).
– Recalculate to coded values / recalculate the weight to %weight (relative to start weight): %Weight(n,m) = weight(n,m)/weight(1,m) for n=1..6 and m=1..43.
Good luck, Remi

November 25, 2009 at 7:59 am #187039
Hai Danial,
like Darth said: what are you talking about? In your message there is a contradiction: Paired comparison can not be done with different sample sizes. Look up the definition of paired comparison.
I can guess that you might have meant that you have several subgroups of paired data and the subgroups have different sizes and different means. But that’s just guessing on my part.
So state your question more clearly and maybe we can help (post the data set if it is not too big, or an example).
Remi

November 24, 2009 at 9:12 am #187011
Hai MB,
first of all make all the pictures/graphs (create an extra Y column with 0/1 for good/bad for the Matrixplot) to get a feeling for the data. Don’t forget to use time as an extra X.
Decide which data you want to analyse based on the info from these pictures. Check if the data of the X’s cover their possible values well (i.e. the sample is representative of the population). And maybe some X’s are correlated, which could be visible from the Matrixplot or found by Principal Component Analysis. It is also possible to try a so-called reverse boxplot (be careful when drawing conclusions because Cause and Effect are shown in reverse): fill in an X as the Y in the Boxplot and use your Y for subgrouping.
Analysis: use either Chi-squared analysis if your X has few different values (up to 10) or Binary Logistic Regression on all X’s.
Remi

November 23, 2009 at 9:56 am #57781
Hai Ajay,
there is no reason that a forecast should be normally distributed. And analysing a forecast on its own makes no sense to me. Why would you want to do that?
For Forecast Robustness: look at the Prediction Interval and Confidence Interval that you get from the regression analysis. They give you info about robustness.
The general answer is always: the further your forecast is away from your data points, the more uncertainty you will have. And of course the main assumption is that your model of the past (data) is a good prediction of what will happen in the future (that is why it doesn’t work on stocks).
Remi

November 17, 2009 at 11:25 am #65357
Hai Sri,
No problem; I am not an IT expert so I don’t know what CMMI is.
CTQ can be translated into ‘variable that is important to measure’. It generally is a parameter of which your customer has decided that it is important to know what it is doing.
Explanation by example.
I want to measure Delays.
Each time there is a delay I record the # of (micro)seconds of delay; when there is no delay the recorded result is 0. Values smaller than 0 are not expected.
I will see two groups of data: group1 where there is no delay (the 0’s) and group2 where there is a delay (values greater than 0). The better my process is the more data will be in group1 instead of group2.
As quality measurement I can evaluate what is happening inside group2 and try to improve the amount of delay (reduce the mean delay time). So I improve INSIDE group2. (CTQ2)
Inside group1 I have no info (all values are 0). So I cannot improve there (and there is no need). But what I can try is to change the sizes of both groups relative to each other (improve such that more products end up in group1 instead of group2). This is what I mean with CTQ1. So the parameter I measure is, over a certain time period, the size of group1 (and/or group2). So for each time period “# data points in group2” / “# data points total” would also be a measurement of my quality. This measurement cannot be smaller than 0 (LB = Lower Bound = 0). But my Target would also be 0 (I want no delays at all). Maybe for your project you would choose a small value instead because ‘no delays at all’ is not possible. Your customer generally will not want to give you an upper spec on delays (USL), because if he did he would be telling you that any delay smaller than that USL is acceptable.
Does this help you?
Remi

November 16, 2009 at 12:07 pm #186834
Tom,
It is the other way round.
Whenever you take an S (st.dev of a Sample), the Confidence Interval around that S is large if the sample size n is small.
A trade-off between ‘what is needed (based on statistical calculations)’ and ‘what is possible (based on expense and reasonableness decisions)’ leads to recommendations on n that students often remember as ‘laws’.
One is: if you use an Xbar-S chart then n should be at least 10.
Another one is: if you calculate Cp(k) then n should be at least 30.
In your case: if n10 the Xbar S is a trustworthy file.
Hope this helps.

November 12, 2009 at 3:03 pm #186776
fara
this is the end of a chi-square test which gave as outcome a chi-square value of 2.258 with 3 degrees of freedom (so you had 1 column with 4 entries OR 1 row with 4 entries) and a calculated p-value = 0.521 (sig stands for significance level).
Look up the Chi-squared test or proportion testing to see what it all means.
Remi

November 12, 2009 at 2:59 pm #186775
Hai Jeff,
Chisq value = SUM over all cells of {(the square of (observed value – expected value)) divided by (expected value)}.
You can see this also in the Help when you choose the CHITEST function in XL (click on the fx-button and then on ‘help with this function’).
But for this you need the original data.
If your input is pvalue and df use the function CHIINV instead.
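For reference, the sum over all cells can be sketched in Python as follows. The 2x2 table of counts is invented, and the expected values are computed the usual way for a contingency table (row total times column total divided by grand total), which is an assumption about the kind of table you have.

```python
def chi_square_statistic(observed):
    """Chi-square value and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_sq += (obs - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi_sq, df

# Hypothetical 2x2 table of counts, made up for illustration.
table = [[10, 20],
         [20, 10]]
chi_sq, df = chi_square_statistic(table)
print(f"chi-square = {chi_sq:.3f}, df = {df}")
```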
Remi

November 11, 2009 at 1:56 pm #59646
Hai Arturo.
Such a test will only divide your products into 2 groups: passed (1) and failed (0). The test is not very good for evaluating your products and process behaviour because you can only analyse groups of samples (how many 0’s and 1’s you have over time).
I recommend that you change the test in the following (easy?) way:
Put a stress of 400 psi on the product and measure the moment in time at which it fails. Stop the test at t = 5 seconds. In this way you will generate a continuous measurement that is right-censored (tmax = 5 because after 5 seconds you always stop measuring). This data can be analysed like Reliability data.
For a measurement system analysis you have to investigate 2 things: first how well the setting of 400 psi is reproduced each time a product is tested, and second how accurate your time measurement is.
Remi

October 26, 2009 at 2:29 pm #186375
Zubdah:
you are thinking wrong and missing the point of what Cp(k) is intended to do.
Cp = “Voice of Customer” / “Voice of Process”.
In order to get a value you fill in USL-LSL for “Voice of Customer” and 6S for “Voice of Process”.
If you choose USL = Mu + 2S (and LSL = Mu - 2S) then you choose the specs based on the “Voice of the Process”. THIS IS WRONG!
As a result Cp and Cpk will ALWAYS be 2/3.
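A short Python check of that 2/3, using the standard definitions Cp = (USL-LSL)/(6S) and Cpk = min(USL-Mu, Mu-LSL)/(3S). The two processes below are invented; whatever Mu and S you plug in, specs set at Mu +/- 2S give the same result.

```python
def cp(usl, lsl, s):
    """Cp = (USL - LSL) / (6 * S)."""
    return (usl - lsl) / (6 * s)

def cpk(usl, lsl, mu, s):
    """Cpk = min(USL - Mu, Mu - LSL) / (3 * S)."""
    return min(usl - mu, mu - lsl) / (3 * s)

# Two hypothetical processes (Mu, S); specs derived from the process itself.
results = []
for mu, s in [(10.0, 1.0), (50.0, 0.2)]:
    usl, lsl = mu + 2 * s, mu - 2 * s
    results.append((cp(usl, lsl, s), cpk(usl, lsl, mu, s)))
print(results)
```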
You should instead ask your Customer what according to him are the correct USL and LSL and calculate the Cp for him based on his choices (each customer gets his own Cp !!!).
If the value is too low to your liking you have 2 options: talk to the customer if maybe larger specs can be allowed (he solves the problem) OR improve your process by making S much smaller (you solve the problem).
Remi

October 6, 2009 at 3:20 pm #185946
Hai Fairy,
something is strange in your output: the point estimate is not inside the CI. And the CI is inverted: (3,26) would normally be written as (26,3) (= all real values bigger than 26 and smaller than 3).
Explanation of the output (this can be found for any Minitab output by clicking in the output and then clicking on the special help button next to the ?-button; it is the purple-and-yellow SIGMA button):
With Eta is meant the population Median.
The MW test looks if the data complies with the H0: ETA1 - ETA2 = 0.
It gives a p-value as outcome: the value after “significant at”. So here the p-value = 0.0000.
In general that is smaller than any alpha level you would have chosen. So the H0 has to be rejected: the data does not agree with the H0.
So Median1 is not equal to Median2 (population values).
I worry about the CI though; if a mistake happened which gave that strange CI this same mistake could give you a wrong pvalue and thus a wrong conclusion.
Did you graph the data to see what is going on?
Remi

September 30, 2009 at 8:08 am #185816
Lynn,
you can also use a Dotplot: it shows individual points.
Use a Boxplot if you are only interested in potential outliers/flyers.
Use a time series graph if you want the order of the data to be included. Use the I-MR control chart if you want info about control.
Remi

September 22, 2009 at 2:22 pm #57772
Hai Jody,
Ah now things are clearer.
The # of data points appears to be large enough. In the Scatterplot you should see a point (#18) far away from the regression line.
Maybe the analysis improves when this point is removed from the data. But this may only be done if you can find out why this point shows different behaviour than the other points.
Both Correlation and Regression analysis say that there is a statistically significant relation between Y and X (p-value = 0.005).
But r = 0.367 only, so the relation is not strong. In Industry -0.5 < r < 0.5 is often considered useless.
The Regression agrees because it says Rsq = 13.5%.
So this X is NOT THE ONLY X that influences the Y (assumption: the X influences the Y at all). This X only explains 13.5% of the variation in Y and 86.5% is still missing.
Next: check if Rsq goes up when not using all 65 points (visual screening of the data: do they form several subgroups that should be analysed separately?). Find out what other X could possibly be missing (this is why DMAIC6 is thinking about X’s first and analysing patterns later).
If you have that X: check whether you have measurements for it and perform the Regression again with the new X. If you’re unlucky the X you started with shows up in a Pareto as #1 and you have a lot of other X’s with even smaller Rsq. Note: if you have >1 X analyse them all together because interactions can play a role.
So you did nothing wrong (as far as I can tell) you just happen to have an X with a weak influence on Y.
Good luck, Remi

September 22, 2009 at 1:37 pm #57770
Hmm, Jody;
If you mean that the Correlation disagrees with the Regression analysis, I think you must have made a mistake somewhere.
If one has 2 columns of data in Minitab and performs Correlation (Stat => BasStat => Corr) and Regression (Stat => Regr => Regr) one will get the same p-value (because essentially the same H0 analysis is done). And also the square of the Correlation value (r*r) will be equal to the Rsq from the Regression.
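The r*r = Rsq relation for a simple linear regression with one X can be checked outside Minitab with a few lines of Python. The two columns below are invented; both quantities are computed from their textbook definitions.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Correlation coefficient r of two columns."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def regression_r_squared(x, y):
    """Rsq = 1 - SSresidual/SStotal for the least-squares line y = b0 + b1*x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    intercept = my - slope * mx
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - my) ** 2 for b in y)
    return 1 - ss_residual / ss_total

# Hypothetical columns, made up for illustration.
x_col = [1, 2, 3, 4, 5, 6]
y_col = [2.1, 1.9, 3.7, 3.2, 5.1, 4.8]
r = pearson_r(x_col, y_col)
r_squared = regression_r_squared(x_col, y_col)
```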
It is possible that you know from Expertise that there is no (known physical) reason for a Cause and Effect Relation and that the Correlation Analysis shows otherwise, but that is something ELSE.
Apparently the data shows a relation anyway. Now you ‘only’ have to find out why (sometimes it’s your data collection plan; sometimes it happened because of special causes and sometimes it only looks that way (often your sample is too small)). And sometimes you get a new insight: the theoretical expertise was wrong in this situation (most often because one of the assumptions was wrong).
How many datapoints (2 datapoints show always a strong correlation)? What are the Minitab outputs?
Remi

September 22, 2009 at 1:18 pm #65331
Hi ZH,
see the articles on the G control chart (search the Main page)
Another option is: define 2 CTQ’s.
CTQ1 = % of nonzeros over a longer time period; treat it as you would a reject level (no USL, LB = T = 0)
CTQ2 = count (or %) if >0 over the time/amount already chosen; analyse this as a reject level.
Both together give an indication of quality. The danger is in analysing CTQ2 over time because of gaps in the time axis (each time there was a 0 in your original data).
Remi

September 22, 2009 at 11:53 am #57768
Hi jody,
How do you know they are correlated? If from Data, then the same data should give you a good Regression relation (mathematically true).
If from Expertise, then find out why the data does not correspond with what you know ‘has to be true’.
Be aware that your way of data collection can kill any theoretical relations.
Check the data against Expertise by making a PICTURE (scatterplots)!
Remove all data that is ‘wrong’, but only if you know why!
Can you trust the data to be correct (gage R&R, …)?
Check that the data covers the whole range of possible values (too much zooming in or out, or taking a specific subsample, can give totally different conclusions).
If the relation is not linear then you have to think about how to redefine your parameters (Box-Cox maybe).
If 2 X’s use a 3D Scatterplot; if >2 X’s it gets complicated.
Remi

September 22, 2009 at 8:28 am #185591
Hi James,
if your measurement system is perfect:
with Power = 0.5 you need 11
with Power = 0.9 you need 26
Minitab: Stat => Power&SampleSize => 2 proportions; fill in 0.5 and 0.1 (worst case) and Power.
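For those without Minitab, here is a rough Python sketch of a two-proportion sample size calculation. It uses the common normal-approximation formula and assumes a two-sided alpha of 0.05 (not stated in the post); Minitab’s exact method may differ, although under these assumptions the sketch gives 11 and 26 per group for the two power values.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p1, p2, power, alpha=0.05):
    """Sample size per group for comparing two proportions (normal approx.)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)

for power in (0.5, 0.9):
    print(power, n_per_group(0.5, 0.1, power))
```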
Remi

August 21, 2009 at 2:05 pm #184956
Hi Edward,
Combine drugs with work and you will be replaced by your boss.
Remi

August 20, 2009 at 11:52 am #184935
Hi William,
an alternative you could use comes from a case where the measurement resolution was much coarser than what production made. As a result 90% of the data had value 0.
We used 2 Measures to describe quality: “% of 0” and “Cpk of the non-0’s”. Both are analysed separately and both together are necessary to show the quality (so no single value shows it all).
We made a p-chart of the first and an I-MR chart of the second to see the behaviour in time.
Warning: be sure that when the measurement device gives an FID signal that this is a ‘perfect’ product and nothing else.
Clarification:
The CTQ (Leakage) had 0 as Target AND Lower Bound, and 10 as USL.
Theory said there was always some (minimal) Leakage, so 0 was theoretically impossible.
The meas. equipment had a resolution of 0.1 (a factor 100).
Production made products with value <0.1 (so measured value = 0) when everything was OK, but sometimes made products with values 0.1 … 7 and the occasional disaster failures when a disturbance had happened of 8100.
The meas. system was not good enough to distinguish between all products, but good enough to measure for the customer. It was judged acceptable on the Tol option and not on %SV because both we and the customer were not interested in distinguishing products with a value <0.1.
Hope this helps,
Remi

August 20, 2009 at 11:28 am #184934
Hi Carlos,
you draw the right conclusion based on a wrong reasoning. (you lucked out; if you want to keep on gambling buy lottery tickets)
You should conclude: the p-value is high (> alpha = 0.05) and therefore the data gives no proof of a difference (in short: not significant).
ANOVA explained (look it up in your course material):
SSFactor = “Strength of the signal of difference in the data”
SSError = “Strength of the noise in the data”
Each can take any possible value (depending on the data). Each on its own is not enough.
F = (SSFactor/dfFactor) / (SSError/dfError) = “the signal-to-noise ratio in the data”. This is a measure for significance.
p-value = “probability of getting this F-value or higher in case of No Difference (= H0 true)”.
“Old-methodology” students have learned to work with the F-value; “new-method” students learned that the software will give a p-value and to draw conclusions from that.
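To make the signal-to-noise idea concrete, here is a small Python sketch that computes F for a one-way layout from the sums of squares. The three groups are invented; turning F into a p-value needs the F distribution, which is left to your statistics software.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = (SSFactor/dfFactor) / (SSError/dfError) for a one-way layout."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    # Signal: how far the group means lie from the grand mean.
    ss_factor = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Noise: how far the individual values lie from their own group mean.
    ss_error = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_factor = len(groups) - 1
    df_error = len(all_values) - len(groups)
    return (ss_factor / df_factor) / (ss_error / df_error)

# Hypothetical data: three groups of three measurements.
example_groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_value = one_way_anova_f(example_groups)
print(f_value)
```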
Remi

August 13, 2009 at 8:08 pm #184841
Hai mike,
if it is a 3-level design (cube with middle of faces/ribs): analyse with Stat>DoE>Response Surface Design. The analysis is comparable to using Stat>DoE>Factorial, only ABC and higher order terms will not be present and A^2 (written as A*A) etc. will be. The analysis will give coefficients for each of the terms in the quadratic model, and p-values for the significance of the terms if extra d.f. are present. Removing terms to simplify the model also works the same. The optimizer can find an optimum inside the area investigated (top of curve). Residuals are analogous.
If at least one term has >3 levels: bad luck. No higher order polynomial model can be calculated. Only values at the investigated points will be shown and straight lines will connect the values. The analysis can be done via Stat>DoE>Factorial but you have to choose General Design when defining the design.
‘Dirty’ trick. Even when you have >3 levels you could analyse it as either a quadratic model (use RSM) or a linear model (use Factorial). Just define the data as such a design. The difference is whether A*A or ABC will be analysed as terms (you could try them both to see which one you like best).
When analysing it this way Minitab will ‘complain’ that the design is not balanced, about botched runs etc., but the analysis will be OK.
In fact you force Minitab to fit the data to a quadratic or linear model. So if your data is in reality very different from such a model, the model that you find will have no predictive value (= be of no use).
Good luck,
Remi
August 13, 2009 at 7:37 pm #184838
hai Angus,
with a gage study you investigate the gage system (gage eqt + persons that use it + environment +..).
You only need parts because otherwise you can not measure. But the parts are not under investigation (at this moment in the project).
Most people see a lot of problems because they are measuring parts that change during the gage study.
Solution: don’t use those parts; use other ‘parts’.
Example: I have to measure oven temperatures. Those vary in production. During my gage study I take a diverse set of materials with a known melting or cooking point (to ensure that I know the real value the measurement should give). On those materials I perform the gage study. If the outcome is good then clearly I can measure my oven temperatures well.
Necessary conditions:
the way the measurement is performed is the same (reality and study)
the range of values investigated should be representative
the influence factors that differ between reality and study should be investigated (e.g. placing the thermocouple in the oven is ‘difficult’ to do consistently while on my materials this is easy)
good luck
Remi
August 7, 2009 at 6:11 pm #65300
Hai reinaldo,
yes you can. But Tony already mentioned that if you do it on ‘unimportant’ parameters you are probably wasting time and resources.
And of course check that you are using the right one (e.g. an XbarR chart on # rejects is not such a good idea in general).
Remi
August 7, 2009 at 6:06 pm #65299
Hi MrMhead,
I’m not in IT but in assembly production. We often do analysis with data gained at different times.
Not the same frequency: possible but not recommended. The consequence is that you can only investigate the C&E relations separately and not together.
Try to perform a short experiment in which the frequencies are made equal for the time (or use data on the frequency where they occur together: the lcm).
When you use subgroup means, the stdev’s will be changed!! So be very careful there.
drift: measuring on subsequent times is not much of a problem; register the time and use it as an extra X in the analysis.
Example of physical production: a Product has 3 process steps: 1 forming; 2 finishing and 3 painting. Total production time = 1 hour. At the end the Y=CTQ is measured; for each process step an X is measured.
We want to investigate to which of the 3 X’s the Y is most sensitive.
Evaluation tool = Multiple Regression.
Way of Working:
We gather data from 33 products and for each product we have the value of X1, X2, X3 and Y but also T1, T2, T3 and Tend (the time that each measurement was done). The T-parameters have different values for each product because the X’s and Y are measured at different moments due to the consecutive process steps. So we generate a datablock of 33 rows (the products) and 8 columns (all the measurements on each product).
First a screening of the data is performed. It is seen that the last 4 products had a big gap between T2 and T3 (> 1 week). So apparently for some reason they were not following the standard process flow. We take these products out because they show a different production pattern than standard production. Also 1 product has a very large X1 value (>10S from Mu) so we take that flyer out. The X-space is covered well (the X-values from the dataset cover the area of X-values that we encounter during production).
Analysis:
1> the 5 taken out products are investigated for the cause of their different behavior. If possible the causes are prevented from ever occurring again or a measure (= see that it is happening again) and repair program is implemented.
2> the other 28 products are investigated for relations between X’s and Y. With Regression we can see that X1 and X2 are the only significant ones so we get a Regression Model Y = c0 + c1*X1 + c2*X2. The residuals show nice noise behaviour and the Rsq is >80% so we don’t have a reason to look for another X; X1 and X2 predict the Y very well already.
Note: in IT maybe an Rsq of 80% is not good enough OR it is automatically 100% because there are no random noise influences?? I don’t know enough about this so check this with somebody who knows.
hope this helps,
Remi
July 31, 2009 at 11:54 am #184650
Hai persian.
Of course carelessness is not the only RC. But it is the main RC people find when they have the following mindset: “typing text over from a piece of paper is easy to do”. Example: picking up a 10 kilo bag is easy; doing it continuously for 8 hrs, long term, is not easy but backbreaking.
tips:
1} use the Ishikawa (Fishbone) to decide what sorts of RC’s can be found in the other 5 bone’s.
2} Brainstorm with others about RC’s: other people look at things differently so they can help you see things you haven’t thought of
3} Ask the operators: why do they find it difficult to perform faultless continuously
3b} For one week take over the job of an operator And complete that week (40 hrs), not just a short time. Then you really know how it is to be an operator and what the difficulties of the job are.
Examples of other RC’s:
Difficulty of the form (# fields to fill in; location of the fields; font, ..)
Readability of text
Difference between forms
Environmental disturbances (light, noise, ..)
Variable and interesting work
Good luck,
Remi
July 28, 2009 at 8:41 am #184588
hai Trew,
my answer is: Yes and No.
On “Production level”: only the pieces you are working on are used in the calculation for WIP/CT.
But on “Sales level” or “Company level” (sometimes called metaproduction level): all pieces that were ordered are used in the calculation.
So depending on who you talk to (somebody from production or some “white collar”) the view of what WIP entails may differ.
Maybe OIP and OCT could be used as alternate names to make the distinction more clear?
Where OIP = Orders in Progress and OCT = Order Cycle Time.
Remi
June 10, 2009 at 10:07 am #65293
Hai Yan,
yes: 80%
Agreement > 80% : good;
Agreement < 60% : doubtful
In between : use carefully
Remi
June 10, 2009 at 9:19 am #57753
Hai lost123,
remember: for predictions of Regression it is recommended to only interpolate (i.e. only make predictions in the area where you have data).
The high p-value of the constant says nothing else than that your prediction around the origin will be ‘bad’.
You can correct this by extending your data collection to the area around the origin.
Remi
June 5, 2009 at 9:49 am #184490
Hai navin,
A question: are you still interested in the difference in means if there is such a (large) difference in variation?
From experience I know that if you tackle the problem of making the variation equal, the means will change in value. So generally I attack Sigma first and afterwards try to equalize the Mu’s.
Also by investigating WHY the sigma’s are so different I get a better understanding of what is going on.
And Yes the means should become equal in the end (with emphasis on the last 3 words).
Remi
June 5, 2009 at 9:39 am #184489
hai Tom,
it is related to the difference between ‘fixed’ and ‘random’ analysis in GLM.
For ‘fixed’ you are interested in the means and their differences (the delta of the Mu’s); for ‘random’ you are interested in the between variation of the means (the Sigma of the Mu’s).
So Mtab (and other S/W as well) does not give the possibility to compare the means if the variable is chosen as ‘random’ (because by choosing this you tell the S/W you are not interested in that info).
So if you want to see those means anyway: do an extra analysis where you choose that parameter ‘fixed’ also.
Example explanation: Suppose you have 2 X’s that you want to investigate.
X1 = Operator. X1 has 2 values since there are only 2 Operators (O1 and O2).
X2 = BatchID. X2 has theoretically an infinite number of values if you stay in business infinitely. But for the investigation you have decided to investigate 4 ‘representative’ batches.
In GLM X1 is considered a ‘fixed’ parameter because it has only 2 values and both are investigated. In general for X1 you are interested in the difference between the 2 means of O1 and O2.
X2 is considered a ‘random’ parameter because you are investigating only 4 of the many values of X2. You are not so much interested in what the difference between those 4 is but more in the sort of variation that is caused by producing with a new/different batch (the between-variation caused by the batches).
Hope this helps,
Remi
May 26, 2009 at 11:08 am #184390
Hai Lorax,
your method sounds sound but it isn’t. You have 2 risks that can garbage your conclusions.
Risk 1: the X’s are not independent.
Risk 2: the X’s have interactions (Y feels X1 and X2 together differently than separately because they strengthen their influence on Y)
The recommended method:
1: Check if the X’s are independent;
2: Make a regression model of Y as function of all X’s together (also interactions)
How to:
1: Matrix plot of X’s (only 1-on-1 independency) or use Principal Components Analysis (BB-level knowledge)
2: (in MINITAB) use GLM or analyse as a linear DoE; p-values tell you which X’s or interactions to remove; Rsq how much you still are missing.
Good luck,
Remi
May 26, 2009 at 10:26 am #184389
Nancy,
if “I have no USL” means “that dimension can be as big as it wants” then you will have no rejects due to too big in that direction. So as far as quality is concerned only LSL is important. Use the Cpk or Ppk calculation instead of the Cp calculation (i.e. calculate the Z-value only on LSL and forget USL). All S/W packages use this option automatically if you only fill in an LSL and no USL.
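A small sketch of the one-sided calculation, assuming stable and roughly normal data. The function name and the five measurements are hypothetical; a software package may estimate the standard deviation differently (e.g. within-subgroup for Cpk).

```python
from statistics import mean, stdev


def cpk_lower_only(data, lsl):
    """Capability index when only a lower spec limit (LSL) exists."""
    z_lower = (mean(data) - lsl) / stdev(data)  # distance to LSL in StDev steps
    return z_lower / 3


# Hypothetical measurements with LSL = 7
sample = [9, 10, 11, 10, 10]
cpk = cpk_lower_only(sample, lsl=7)
print(round(cpk, 3))
```

The USL never enters the calculation, which is the point made above.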
Remi
May 25, 2009 at 9:34 am #184373
Hai Socrates,
the answer to your question is “No”.
Just like nobody can empty the seas; count all the grains of sand or walk to the moon.
Remi
May 14, 2009 at 8:56 am #184111
Rick,
I (don’t) agree.
Cp and Cpk have value also for non-normal distributed data (as long as the data is stable in time (= statistically in control)).
BUT
The Cp(k) should be calculated with the correct formula and not with the Default formula (i.e. do not use Cp = (U-L)/6S ).
I agree with you that when people use the Default formula in the case of a non-normal dataset the calculated value does not have the right meaning (i.e. a value of 2 could still mean 50% rejects). The main reason is that Sigma can always be calculated but (often) has no connection to the ‘width of the data’, and so to variation, in the case of non-normality.
Remi
May 6, 2009 at 8:26 am #183915
Hai Damaris;
A number that is (almost) of no use to anybody.
Other examples are
#Gu * #Ru
2^(#Gu)
Sin(#Ru)
etc…
All mathematically possible to calculate but not much use to anybody.
It is better to think the other way around:
what am I interested in?
what would be possible ways to calculate this from numbers I have?
which of these possible ways is the most useful to me?
Example:
ad 1, the difference between two data sets A and B
ad 2, A-B, B-A; A/B, B/A; (A-B)/A, (A-B)/B, (A-B)/(A+B); …. (absolute difference; proportion; proportional difference; ..)
ad 3, choose the one that fits best
Remi
May 6, 2009 at 8:02 am #183914
Yan,
Mathematically: No. But the idea behind it has a common basis.
Realize: if you are measuring on a non-continuous scale you can not expect to be accurate/precise! (And that is a final stop.)
Both FMEA and Kappa calculation have as basis an estimated value made by persons. So it is very subjective. But since it sometimes is the best you have, that’s what you have to live with.
And if you translate mathematically 1-5 into % you get multiples of 20%. And since a 5 is never absolute a score of 80% is considered high. You would like to have a 100% Kappa score but that is inherently impossible. If you have that you (implicitly) have a continuous measurement.
Remi
May 4, 2009 at 12:37 pm #57745
Hai James,
Box-Cox Transformation (BCT) tries to find a best Lambda (L) between -5 and 5 such that Y**L is as close as possible to a normal distribution (exception: for L=0 the Log(Y) is taken).
This ‘goes wrong’ if:
data contains negative or 0 values: calculation is not possible; BCT refuses an outcome {add a constant to all data to make all values positive; stay on the safe side for the next sample}
BCT finds that the best L is not within [-5,5]: data is ‘not transformable’. This can often happen with data that is measured ‘too roughly’ like the other poster mentioned {measure better}
BCT finds L but data is still not Normal distributed. This can happen if you have several populations (camel hump) {split populations}.
Make a picture of the raw data first (so you see what is going on). Look for outliers and combined populations.
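The transformation itself is simple to sketch. The function below follows the Y**L description in the post (with Log(Y) for L=0) and refuses non-positive data; it does not search for the best Lambda. The function name is mine, and textbook Box-Cox is often written as (Y**L - 1)/L instead, which differs only by a shift and scale.

```python
import math


def power_transform(values, lam):
    """Apply Y**lam to every value, with log(Y) when lam == 0."""
    if any(y <= 0 for y in values):
        # Same situation in which the transformation refuses an outcome
        raise ValueError("all values must be positive; add a constant first")
    if lam == 0:
        return [math.log(y) for y in values]
    return [y ** lam for y in values]


# Hypothetical right-skewed data; L = 0.5 is the square-root transform
print(power_transform([1.0, 4.0, 9.0], 0.5))
```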
If data is not normal distributed:
if 1 population: find out what a fitting distribution is (in minitab Stat > Qual Tools > IDI) and calculate Cpk (use in minitab: Stat> Qual Tools > Cap Ana > nonnormal > choose correct function)
Replace 3*Sigma in the Cpk calculation with (P[0.9985] - P[0.0015])/2 where P[k] is the location where k*100% of the datapoints lie to the left of that location.
if more than one population: split up and calculate each separately. Do not combine / average the Cpk’s!!!
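A sketch of that percentile replacement, assuming the quantile function of the fitted distribution is available. The function name and the numbers are hypothetical. The mean is kept as centre because the post only replaces the 3*Sigma term. A normal distribution is used here purely as a sanity check: the result should then be close to the usual Cpk.

```python
from statistics import NormalDist


def cpk_percentile(center, lsl, usl, quantile):
    """Cpk with 3*Sigma replaced by (P[0.9985] - P[0.0015]) / 2.

    quantile(k) returns the location with a fraction k of the
    fitted distribution to its left.
    """
    half_width = (quantile(0.9985) - quantile(0.0015)) / 2
    return min(usl - center, center - lsl) / half_width


# Sanity check with a normal distribution: Mu = 10, Sigma = 2, specs 4 and 20
fitted = NormalDist(mu=10, sigma=2)
cpk = cpk_percentile(10, lsl=4, usl=20, quantile=fitted.inv_cdf)
print(round(cpk, 3))  # the classic formula gives (10 - 4) / (3 * 2) = 1.0
```

For a non-normal fit one would pass that distribution’s own quantile function instead of `fitted.inv_cdf`.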
Good luck
Remi
April 27, 2009 at 9:31 am #183714
Hai Jorge,
Be extra careful. The data analysis assumes correct values (to an infinite number of digits; 100 is 100.00000000). But the original data looks to have a Resolution of 1. So all but one of the differences could be because of rounding off?
Note: Did you investigate what sort of difference between the 2 values is the interesting one? The one chosen by you is the linear (or absolute) difference Y-X. Other options are Y/X, X/Y, (Y-X)/X, (Y-X)/(Y+X), … Each has another physical interpretation.
April 27, 2009 at 8:53 am #183713
And don’t forget the confidence intervals when you make predictions with that function y=f(x).
Tip: make sure that the (x,y) data covers the whole range of ‘possible measurements’ so that you have interpolations and no extrapolations.
Remi
April 3, 2009 at 10:29 am #183117
BTDT your values are correct.
To all: Monte Carlo is not necessary; you can just calculate the value (only Median).
If you have a data set that is symmetric around 0 (both X and Y data satisfy this) AND you want to square the values, the New Median = Q1 squared = Q3 squared.
For X = U[-0.01, 0.01] this gives (0.0050)^2 and for Y = N(0, 0.0036) this gives (0.04047)^2 (around 2/3 StDev).
Then Square, Sum and take Sqrt gives Median = 0.04078. And this calculation does not use the exact distribution, only that they are symmetrical (around 0 just makes the calculation easier)
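The arithmetic above can be reproduced in a few lines. This only retraces the calculation as described (quartiles, square, sum, square root); it does not compare the result against a simulation. Variable names are mine.

```python
import math
from statistics import NormalDist

# Q3 of U[-0.01, 0.01]: three quarters of the way from -0.01 to 0.01
q3_x = 0.005
# Q3 of N(0, 0.0036): variance 0.0036, so StDev = 0.06
q3_y = NormalDist(mu=0, sigma=math.sqrt(0.0036)).inv_cdf(0.75)

median_estimate = math.sqrt(q3_x ** 2 + q3_y ** 2)
print(round(q3_y, 5), round(median_estimate, 5))
```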
Remi
April 1, 2009 at 12:43 pm #183040
Hai SN,
I missed the mistake you made the first time around, sorry.
You calculated the StDev of 1000 products with 996 perfectly on target and 4 exactly 1 cm off, with a spec area of +/- 0.2.
1} can you see a difference between products of size 0.1, or is the smallest difference you can see 1 cm? If you see worse than 0.04 your measurement system has a resolution problem.
2} your data set is strongly not-normal distributed. A StDev can be calculated but has ‘no meaning’. You can not calculate the z-value using the formula like you did (the formula is only correct for close to normal distributed data). So of course you are surprised by the high outcome of the Z-value.
Actions: measure at least 10 * more precisely (one extra digit (that has meaning)) and the data will probably follow a normal distribution and then you can use the formula.
Good luck.
April 1, 2009 at 12:21 pm #183039
Hai ck,
sounds ok to me.
Be careful that you don’t give too much value to the mathematically calculated outcomes in the case of real products. You will seldom have perfect normal distributions. And if you base the calculations on only 30 products of each then a simpler method is to calculate the number of “overlapping” (or differing) products. That has as extra advantage that it is distribution independent but as a disadvantage predictions are difficult.
An advantage of p-value reasoning is that you not only look at the difference as such (absolute comparison) but also at how large the found difference is compared to a difference caused by Chance (relative comparison, H0 principle).
Remi
April 1, 2009 at 12:09 pm #183038
Hai Bower Chiel,
13.11 is close but not correct.
Assumption: data is perfect N(Mu, Var).
Then N(Mu1,Var1)=N(Mu2,Var2) collapses into
(X-Mu1)/StDev1 = +/- (X-Mu2)/StDev2.
In this case (x-12)/0.5 = +/- (x-15)/1 gives X = 9 and X = 13.
I just realized that you have to use both values because only between 9 and 13 N(12,.25) > N(15,1) (one can almost ignore the area left of X=9 because it has value < 0.00001)
If data is not perfect of course all these calculations are just an approximation to a value that will never be known.
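A small sketch for checking where the two curves meet. Equating the standardized distances, as done above, gives 9 and 13. If “meet” is taken as equal density height, the 1/StDev factor in front of each normal density also enters, and solving that equation gives about 8.89 and 13.11. Function names are mine, and the density solver assumes the two StDevs differ.

```python
import math
from statistics import NormalDist


def equal_z_points(mu1, sd1, mu2, sd2):
    """Solve (x - mu1)/sd1 = +/- (x - mu2)/sd2."""
    same_sign = (mu1 * sd2 - mu2 * sd1) / (sd2 - sd1)
    opposite_sign = (mu1 * sd2 + mu2 * sd1) / (sd2 + sd1)
    return sorted((same_sign, opposite_sign))


def equal_density_points(mu1, sd1, mu2, sd2):
    """Solve pdf1(x) = pdf2(x) for two normal densities with sd1 != sd2."""
    # Taking logs of both densities leaves a quadratic a*x^2 + b*x + c = 0
    a = 1 / sd1 ** 2 - 1 / sd2 ** 2
    b = -2 * (mu1 / sd1 ** 2 - mu2 / sd2 ** 2)
    c = mu1 ** 2 / sd1 ** 2 - mu2 ** 2 / sd2 ** 2 - 2 * math.log(sd2 / sd1)
    root = math.sqrt(b * b - 4 * a * c)
    return sorted(((-b - root) / (2 * a), (-b + root) / (2 * a)))


z_points = equal_z_points(12, 0.5, 15, 1.0)
pdf_points = equal_density_points(12, 0.5, 15, 1.0)
print(z_points, [round(x, 2) for x in pdf_points])

# Check: both densities have the same height at the upper crossing point
n1, n2 = NormalDist(12, 0.5), NormalDist(15, 1.0)
print(n1.pdf(pdf_points[1]), n2.pdf(pdf_points[1]))
```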
Remi
March 31, 2009 at 1:37 pm #183003
Hi Sigma Novice,
like Cobb said: for a continuous parameter it is equal to 3 * Cpk (if calculated correctly).
For a non-continuous parameter there is the so-called dpmo-formula. Here you calculate the z-value according to certain steps. One of the last steps is the much discussed “+1.5” trick: Sigma level = calculated z-value + 1.5 (‘because you have used production data’).
The idea behind the “+1.5” trick is that there will always be quality loss in the long term due to extra variation sources. So if you have a z-value of 4.5 calculated on long-term data (where those variation sources are present) we will call this a (short-term) sigma level of 6 (six sigma).
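A minimal sketch of the last steps of that calculation, assuming the z-value is read from the standard normal distribution as the point that leaves dpmo/1,000,000 in the upper tail. The function name is mine; the earlier counting steps (defects, opportunities) are not shown.

```python
from statistics import NormalDist


def sigma_level_from_dpmo(dpmo, shift=1.5):
    """Long-term z-value from DPMO, plus the conventional 1.5 shift."""
    z_long_term = NormalDist().inv_cdf(1 - dpmo / 1_000_000)
    return z_long_term + shift


# 3.4 DPMO is the figure usually quoted for "six sigma"
print(round(sigma_level_from_dpmo(3.4), 2))
```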
Hope this helps
Remi
March 31, 2009 at 1:17 pm #183000
Hi ck,
wherever would you need that for? Are you sure that this is what you need to be calculated?
Also the 2 values of N are non-informative because they have no influence on the outcome of the question like you stated it.
Here is a way to calculate:
0] draw a picture of how it looks
1] calculate at which location X the 2 graphs meet (not 9 but the 13 is the correct outcome (see 0])).
2] calculate for N(12,0.25) the area to the right of X: P(x>X)
3] calculate for N(15,1) the area to the left of X: P(x<X)
4] sum up 2] and 3] for the outcome.
Steps 3 and 4 are easy to do in Minitab (Calc > Prob distr> Normal>….).
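The tail areas and their sum can also be sketched outside Minitab. This assumes X = 13 as the meeting point and reads N(12,0.25) as variance 0.25, i.e. StDev 0.5. Variable names are mine.

```python
from statistics import NormalDist

dist_a = NormalDist(mu=12, sigma=0.5)  # N(12, 0.25): StDev = sqrt(0.25)
dist_b = NormalDist(mu=15, sigma=1.0)  # N(15, 1)
x_meet = 13.0

right_tail_a = 1 - dist_a.cdf(x_meet)  # P(x > X) for N(12, 0.25)
left_tail_b = dist_b.cdf(x_meet)       # P(x < X) for N(15, 1)
overlap = right_tail_a + left_tail_b
print(round(right_tail_a, 5), round(left_tail_b, 5), round(overlap, 4))
```

Both tails lie 2 StDev from their own mean, so each contributes about 2.3%.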
Good luck, Remi
March 26, 2009 at 10:39 am #182766
Doesample,
you can use as an approximation the sample size calculation for ANOVA where “Size of Cat X” = “# cornerpoint runs in your DoE”.
Of course only if Y is continuous and measured well.
If you really need the formula: google it or buy a good statistics book.
Remi
March 19, 2009 at 1:03 pm #182514
Hai PK,
translate Life issues into Business aspects and go for it. It will work. Whether you can do it quicker by jumping to a solution is of course debatable.
GB project on time to work:
1> Define: what is the problem and why is it one. How important is it; what and how much do you want to change?
2> Measure: CTQ is Arrival time. Specs are … Measurement system is…. Prove that you can measure consistently well.
3> Analyse: CAPA: measure for 6 weeks the current situation. Brainstorm on X’s that cause change in Y (example: Departure time; travelling time; transportation method; going-to-bed time of the day before; time for breakfast; interruptions on the way; party the night before; broken transport;…)
4> Improve: Determine how the Y feels those X’s and what you can do to improve. Test it and implement it; keep on improving until satisfied.
5> Consolidate and Control. Check for 6 weeks if the solution is working. Check for 1 year if seasonal changes give extra X’s that have to be taken into account (if yes go back to Improve).
That’s it.
And yes if you don’t commit to the improvement you will some days still be too late (personal experience).
March 19, 2009 at 12:50 pm #182513
Hi Mike,
Best tool depends on what you (really) want (why-why-why-why-why).
Start with a graph (dotplot or boxplot, with grouping). This will give you for each value of the categorical X a plot of what the Y-values look like.
ANOVA will not tell you about correlation but about difference in Mu: can we prove that (at least) two of those plots are located differently from each other (low p-value = Yes, high p-value = No). The outcome of ANOVA is ‘only’ valid if the residuals are normal distributed and the sigma’s are equal (enough): look at the Residuals and perform an equal variance test. If one of the two is not true replace ANOVA with Kruskal-Wallis (test on medians).
Remi
March 16, 2009 at 3:59 pm #182412
Hi Don R,
no problem. Was just curious.
Hope your team wins, Remi.
March 16, 2009 at 10:11 am #182403
Hai Don R,
I’m not a D, I know what basketball is; I played it myself.
But you can’t expect a European guy to know all those US team names.
I would see myself more as an E: Who cares what they do..
Remi
March 13, 2009 at 3:51 pm #65252
Hi kumar: for question #1: yes: google Monte Carlo and study
Remi
March 13, 2009 at 3:49 pm #182359
Hai Chris,
The boxplot was meant for continuous data when it was ‘invented’.
But like any tool you can use it differently if it helps.
You can make a boxplot of discrete data. No problem. The software will work. Whether the picture has any meaning to you is something else. What does Median, Q1 and Q3 mean for you?
Other possibilities:
- The Y is measured as continuous data for 3 machines. See machine as a discrete X and make 3 boxplots (X on X-axis and Y on Y-axis)
- If the discrete data has > 10 different values (often > 7 works already in practice) and the values are really discrete (not names that are shown as numbers) so that calculation has a meaning: just treat it as continuous data with a measurement system that has a bad resolution.
Remi
March 13, 2009 at 3:44 pm #182358
Hai Stevo,
the answer is of course “YES”.
With Six Sigma methodology you can improve anything. So also the performance of 1 of the teams, which would make the difference between them larger. Possibly with sample size you could predict how many improvement projects you would need.
On the other hand: if your goal is to reduce variation, you could enlarge the # of overtimes by making the difference between the 2 teams smaller.
So whatever you want.
By the way, what sport do they play? You could then see each team member action as a process step where the result would be a score change (a product). You probably would have to go for the new Six Sigma version where you not only have process steps that (try to) improve a product but also steps that try to make it worse (use Markov-chain language to describe)
Remi
March 13, 2009 at 3:31 pm #182355
Hai RK,
sorry.
1> Use the categorical CTQ as an external CTQ and the continuous one as an internal CTQ. Work with the internal one (Measure, Analyse, Improve) and show end results (to bobo’s) on the external one.
2> Use a p-chart to show the time behaviour of your categorical CTQ. The p-chart is for %. 1 DPMO is just the same as 0.000100 % DPO. The definition is different but the statistical usage is the same.
Remi
March 13, 2009 at 3:23 pm #182354
Hai PB,
not only ipod but also mobile phones (target customer = Mgr; biggest customer (after introduction) = teenage girl) and CD-ROM.
So just as you described a company can use a team of ‘smart guys’ to come up with ideas of new products that are on some aspect Better than current products (and optionally by new technology possibilities). This gives a list of ideas and then this is sent to a marketing Brainstorm team to evaluate possibilities for the customer. So the Designer + Marketeer invent a potential Customer and decide what would be the customer needs.
The risk of course is that there is in reality no consumer that plays that customer role. This happens sometimes with small companies built around a designer with “a perfect idea for a product”.
The QFD can be used as a tool here too. You only have to redefine (rename) the first house and add a risk indicator per feature (= functionality) of the product.
Remi
March 13, 2009 at 1:35 pm #182353
March 11, 2009 at 3:33 pm #182251
Hai huang yanwei,
the idea of QFD is to start at the high level and go down to lower levels. So Customer is First. They ‘give’ you a list of requirements/expectations/needs/wishes. Then you TRANSLATE this into design requirements. The customer requirements are often in their ‘language’ and you have to translate that into your ‘language’ (i.e. design parameters). How ‘good’ you are at this depends on your expertise. Often with new products this is done in several loops because of the learning track.
Check in the matrix that each row is connected to at least one column (otherwise you do not cover that Customer Req) and each column to at least one row (otherwise nobody requires it). The emptier the matrix is the less conflict of interest you have and the less you have to use compromises.
Example:
A metal plate will be used in a cupboard. Customer Requirements: 1> Strong; 2> no sharp ends; 3> doesn’t bend too much (buckle); 4> looks nice
The team translates this in design parameters (their choice) with which they expect to cover the customer expectations (Check with customer if they agree).
Example of translation (design parameters): 1a> Material type; 1b> Thickness of plate; 2a> sharpness at edges; 2b> # of burrs; 3a> 3b> same as 1a> 1b>; 3c> extra supports; 3d> size of plate; 4a> Finishing; 4b> Visual Evaluation Score.
Hope this helps
March 9, 2009 at 2:49 pm #182164
Hai RK,
didn’t you learn that a continuous metric is preferable to a categorical metric? And now you take a ‘continuous’ metric and change it into a categorical one. Just one piece of advice: DON’T!!!!
Use the metric “# of days” with Upper Spec = 90. This makes life much easier. Be a little careful with some calculations because this metric is not normal distributed. If you improve (=lower) Mean and Sigma of this metric your dpmo will go down.
Tip: weekly 1 measurement will make your analysis, improve and control phase a long term endeavour. Just try to get the measurement per product. It is there somewhere otherwise the weekly summary could not be made.
Remi
March 9, 2009 at 2:28 pm #182163
Hai Arun,
your confusion I have encountered many times in training.
The basis lies in ‘bad’ communication. What happens is that people often say ‘sigma’ when they mean ‘sigmalevel’.
Sigma:
Word in the Greek language that describes their letter “s” or “S”.
mathematics: it is the name of the standard deviation of a population (often estimated by the st.dev calculated from a dataset that is a sample of the population). St.Dev is one of the measures for variation, spread or deviation.
a lazy alternative for ‘sigma level’
Sigma level: the quality level of a process. Calculated from data on the process and the specs from the customer. It represents for a normal distribution the number of steps of size 1 Sigma between Mu and the closest Spec OR the number of steps of size 1 Sigma between both Specs (related to Cpk and Cp respectively).
Tip: To determine Sigma you don’t need the customer specs, it is a property of a process; to determine ‘sigma level’ you need the value of the sigma of the process and the specs of the customer.
So if you are making products that you sell to different customers, you have only one Sigma but for each customer a Sigma level.
Hope this helped, Remi
March 2, 2009 at 3:25 pm #181855
And you suppose that we will know what you are talking about? We are not mindreaders.
C1 is a brand of Binoculars
C4 is either a Car type or a kind of explosive or a textshortcut.
If your C1 and C4 are different then maybe the link on the left (New to Six Sigma) can help you
Remi
March 2, 2009 at 3:07 pm #59565
Yes you can,
but it is on the level of “stop with your job because it costs too much time” and replace it with “buy lottery tickets to have an income”.
That MSA is too time consuming and costly is in the eye of the beholder. Calculate how much it can cost you if you have a bad measurement system or if you don’t know how good it is (Sending bad products to customers + complaint handling +… )
And suddenly an MSA looks very cheap.
Good luck, Remi
March 2, 2009 at 2:06 pm #181851
February 19, 2009 at 4:46 pm #181456
Hai Petros,
depends on the person you talk to.
‘lazy’ person: they are the same
otherwise: Gage r&R is part of MSA: the study on repeatability and Reproducibility. Before that you should have checked: Validity; Resolution; Bias (also Linearity and Stability).
Remi
February 19, 2009 at 3:15 pm #181442
Hai Stan,
I agree with you that my ‘solution’ does not address the issue of ‘separating the product changing from the measurement variation’.
But I didn’t (at first) conclude from his post that this was what he wanted. If he only needs to ‘prove’ that he has a good MSA then maybe my answers have helped him enough.
Now I have read the whole string of posts (I skipped most of them the first time) and then I have to agree with you.
Remi
February 19, 2009 at 2:56 pm #181440
Hebbat:
Customer first.
You don’t make output because you can but because somebody is willing to pay for it. Then you decide with which inputs your process can realise this output.
So Output is first; Inputs are later.
February 19, 2009 at 9:32 am #181431
CI Guy,
you have a mix of 2 variation sources (parts that change value + measurement variation).
Like mentioned before this can not be separated in 1 study.
Also options were mentioned:
take 1 part; measure several times (repeat and/or reproduce) and look at the average value found. Declare this the true value of the product. Do this for 10 parts. Now you can use these true values to perform MSA.
replace the part-with-varying-value by a part-with-stable-value (PWSV) just like when destructive testing. The PWSV needs to be measured in ‘exactly’ the same way as the original part and needs to cover the same area of values.
But: what is really your question?
From your first post it looks like “is my gage r&R ok?” (reference: [quantify their measurement system variation]). Then you don’t need to know how the variation is split up as long as the result of the gage r&R is low enough.
If you perform a gage r&R study and ignore the part varying, this will be calculated by the software as extra repeatability (if the operator has no influence on how the part changes its value) because it assumes that the parts have constant value. The repeatability result will be higher than the repeatability in reality is (Ssq calculated repeatability = Ssq ‘real’ repeatability + Ssq part varying). But if the calculated value is low enough then the ‘real’ value is certainly low enough.
Hope this helps.
February 16, 2009 at 4:50 pm #181236
Curious,
because the square of 1 is also 1 (and there is only one other number that has this property)
Also “dividing by Sigma” is part of the “coding” strategy often done by mathematicians (coding data often means bringing all data into the [-1,+1] interval)
February 16, 2009 at 9:49 am #65236
Type in a 6 and then a Sigma.
What else?
February 12, 2009 at 2:17 pm #65234
Hai Wkuchinsky,
it depends how each of the courses is set up in detail.
Most Six Sigma courses I have seen have as starting point that the student knows how to do a project and they only show how that is translated into a Six Sigma project. But there are possibly Six Sigma courses that explain project management skills in more detail.
Of course it also depends whether you already are an experienced PL or new to this part of the job and whether you can fall back on experienced colleagues or not.
Good luck.
February 12, 2009 at 2:04 pm #62260
Tony,
I assume that with “the lowest point of expected variation” you actually mean “the lower bound of the confidence interval of the population Variation (sigma squared) based on the data”. If that is not true my answer does not answer your question.
If the data follows the normal distribution then from the data not only the S (st.dev of sample) can be calculated but also the Confidence Interval (CI) of the Sigma (st.dev of population). The lower value of that CI is the square root of the answer you are looking for. In Minitab: Use Stat > Basic Statistics > Graphical Summary.
The p-value of the AD test tells you if the sample follows a normal distribution. The lower bound and upper bound of the CI of St.Dev. are on the last line of output. Squaring those gives you your answer.
If the data is not normal distributed the calculated bounds are very probably wrong.
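For readers without Minitab, the same kind of interval can be sketched in Python. This is only an approximation under stated assumptions: it uses the standard chi-square based interval for Sigma of normal data, and it replaces the exact chi-square quantile with the Wilson-Hilferty approximation so that only the standard library is needed. The function names and the sample data are hypothetical, and the result will not match Minitab's output digit for digit.

```python
import math
from statistics import NormalDist, stdev


def chi2_quantile_approx(p, df):
    """Wilson-Hilferty approximation of the chi-square quantile (not exact)."""
    z = NormalDist().inv_cdf(p)
    a = 2.0 / (9.0 * df)
    return df * (1.0 - a + z * math.sqrt(a)) ** 3


def stdev_confidence_interval(data, confidence=0.95):
    """Return (s, lower, upper) for the population st.dev., assuming normal data."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 points for this approximation")
    df = n - 1
    s = stdev(data)
    alpha = 1.0 - confidence
    q_hi = chi2_quantile_approx(1.0 - alpha / 2.0, df)
    q_lo = chi2_quantile_approx(alpha / 2.0, df)
    if q_lo <= 0:
        raise ValueError("approximation breaks down for this sample size/confidence")
    lower = s * math.sqrt(df / q_hi)
    upper = s * math.sqrt(df / q_lo)
    return s, lower, upper


# Hypothetical sample, for illustration only.
data = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.4, 9.7, 10.0, 10.1]
s, lower, upper = stdev_confidence_interval(data)
variance_lower_bound = lower ** 2  # square the st.dev. bound to get the variance bound
print(s, lower, upper, variance_lower_bound)
```

The approximation is crude for very small samples, so treat the bounds as indicative there.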
Hope this helps.
February 11, 2009 at 3:13 pm #181035
Hai CCoup,
Indeed the same holds true for the individuals chart.
Sorry, but I can’t help you with a citation. I don’t know whether one exists, because it has seldom been looked at in your way.
But it follows directly from how the run rule is defined and applied by the software. Maybe you can get away with an “as one can easily verify… this results in…” remark? Or a reference to a software program that actually does this? (Maybe ask the help desk of the software package you use?)
Or just cite the original Shewhart book (or a derivative work) where the run rules are defined/explained?
Good luck
February 11, 2009 at 10:22 am #181023
Hai CCoup,
X chart?? I assume you mean Xbar chart.
Not only CAN but WILL (and the same holds for the other types of charts).
In short: only for rule 1 is the answer NO, because then only 1 point is considered.
Example: suppose you have a group of 15 consecutive Xbar points (call them 1001,… 1015) that are on the same side of the Center Line. Then run rule 2 will be violated first at point 1009, because this is the first time that a series of 9 points (1001,… 1009) is on the same side. But each of the points 1010 – 1015 will also be the end of a series of 9 consecutive points that violates run rule 2. And for each of these violations the point 1009 is used in the series of 9 points. So in this case point 1009 is used 7 times for run rule 2.
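The counting in this example can be checked with a small Python sketch. It assumes run rule 2 means "9 consecutive points on the same side of the Center Line", as in the example; the function name and the +1/-1 encoding of sides are hypothetical, and points exactly on the Center Line are not handled.

```python
from collections import Counter


def run_rule_2_usage(sides, run_length=9):
    """Find run-rule-2 violations and count how often each point is used.

    sides: sequence of +1 / -1, the side of the Center Line for each point.
    Returns (indices of violating end points, Counter of usage per index).
    """
    violations = []
    usage = Counter()
    for end in range(run_length - 1, len(sides)):
        start = end - run_length + 1
        window = sides[start:end + 1]
        if all(side == window[0] for side in window):
            violations.append(end)
            usage.update(range(start, end + 1))
    return violations, usage


# The 15 points 1001..1015 from the example, all on the same side.
labels = list(range(1001, 1016))
sides = [1] * 15
violations, usage = run_rule_2_usage(sides)
print([labels[i] for i in violations])  # end points of the violating series
print(usage[labels.index(1009)])        # how often point 1009 is used
```

Each violating end point marks a series of 9, and the counter shows how many of those series contain a given point.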
Hope this helps
February 9, 2009 at 2:12 pm #65230
Hai inquirer,
Answer in short: St.Dev. has the same physical dimension as the parameter itself.
Long explanation: Variance = Sigma squared; St.Dev. = Sigma.
From a mathematical viewpoint, Variance is a good measure of ‘how far away from the middle points can be’. So mathematicians looked for the ‘best’ formula to describe this and found the formula for Variance as it is used today (in the past many others were tried).
For non-mathematicians the Variance has one drawback: its physical dimension is the square of the dimension of the parameter you are measuring. Example: you measure Y = Length in [cm]. Then the Variance is in [cm^2] (squared centimeters). But many people see [cm^2] as a measure of Area.
To avoid confusion of this kind, non-mathematical people started to use St.Dev. instead of Variance (the St.Dev. of Y has dimension [cm] in the example).
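A small Python sketch makes the difference in dimension visible. The length values are hypothetical; the point is only that converting cm to mm scales the St.Dev. by 10 (like a length) but the Variance by 100 (like an area).

```python
from statistics import stdev, variance

# Hypothetical lengths in [cm], for illustration only.
lengths_cm = [10.0, 12.0, 11.0, 13.0, 9.0]
lengths_mm = [x * 10 for x in lengths_cm]

var_cm2 = variance(lengths_cm)  # dimension [cm^2]
sd_cm = stdev(lengths_cm)       # dimension [cm]
var_mm2 = variance(lengths_mm)  # dimension [mm^2]
sd_mm = stdev(lengths_mm)       # dimension [mm]

print(var_mm2 / var_cm2)  # scales with the square of the unit factor
print(sd_mm / sd_cm)      # scales with the unit factor itself
```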
Remi
February 9, 2009 at 1:12 pm #62251
Sreenu,
depending on the type of data and how the data collection was done:
2-proportion test or 2-sample t-test
Remi
February 6, 2009 at 4:58 pm #180775
Hai Karolos,
If you do Regression with several X’s, you generally start with all X’s that are expected to show significance. From the Regression results you will see that some are not significant (based on your data). Remove these from the model (perform the Regression without them) and the p-values of the others (including the X that holds your special interest) will also change.
Don’t forget to check the Residuals for ‘strange behaviour’ and the Rsq to see whether the X’s you have used cover the Y (a low Rsq could mean that you are missing significant X’s in your model).
Good luck
February 6, 2009 at 4:52 pm #180774
Hai Waseem,
Talk with the operators. Find out under which circumstances they make the mistakes (most of the time). Find out their improvement recommendations. Brainstorm for improvement ideas.
Some things I can think of:
Preventive: Stamp on a dummy first (cost of the dummy vs. profit from fewer faults; the dummy could be a piece of paper); automatically check with a vision system what the made stamp # looks like and show this to the operator/compare it with the wanted stamp # (expensive?); make the #’s that are expected to be used ‘today’ in the morning and let the operator use only prepared stamps in the order presented to him (a larger buffer of single-digit stamps is needed).
Corrective: Stamp softly first; check and then stamp final (loss of cycle time though).
Hope this helps