Simple Linear Regression Analysis
 This topic has 9 replies, 6 voices, and was last updated 18 years, 11 months ago by Robert Butler.


February 7, 2003 at 5:00 am #31409
I have a simple regression analysis where the predictor (x) is pressure and the response (y) is flowrate. The data pairs are (5, 5.55), (2.5, 2), (1, 0.202), (0.5, 0.04), (0.4, 0.004), (0.3, 0.004), (0.25, 0), (0.2, 0.006), (0.18, 0.006), (0.15, 0), (0.12, 0), and (0.08, 0). Questions:
(1) Each response is the mean of 5 data points, because the test was repeated 5 times. However, some of the response data at low pressure look like categorical data. For instance, for the pair (0.18, 0.006), the 0.006 is the average of 0, 0, 0, 0.03, and 0: (0+0+0+0.03+0)/5 = 0.006. Does it make sense to use 0.03 in the regression analysis, or shall I use 0.006? Any thoughts?
(2) At high pressure, x and y seem to correlate pretty well. However, x and y do not correlate well when the pressure is low. If I ignore the data pairs from high pressure and do a regression using only the low pressure pairs, the regression is pretty bad. In a case like this, how should I handle the regression? It worries me because I need a good model so that I can extrapolate it to really low pressure and get the corresponding flowrate value.
(3) If I were to extrapolate, what kind of statistical analysis can I do to ensure that I get a reliable flowrate value when the pressure is really, really low?
Any thoughts? Thanks.
February 7, 2003 at 12:48 pm #82790
Michael Schlueter (Participant, @MichaelSchlueter)
Woey,
Interesting question. So your measurement system consists of 2 parts:
1. the instrument (flowrate meter)
2. your algorithm (take 5 samples and average)
Your data seem to follow a power fit, y ~ x^2.328. You can find more details in the attached Excel sheet. I leave it up to you to justify that your meter should actually behave almost like x^2 ;)
What I did:
A) I checked your original data (sheet “posted data”) in lin-lin, lin-log, log-lin and log-log views. A linear, an exponential, a logarithmic or a power dependence, respectively, should plot as a straight line in those diagrams. Assuming a power function is the most reasonable thing I can do with your data.
B) To find the power fit I had to transform your data just a little bit (sheet “powerfit”). There are 2 steps: 1. map log(y) vs. log(x); 2. shift the curve towards the origin (orange area). The diagram shows you the results.
C) Those data should now follow a straight line through (0,0). When I go through a little bit of Taguchi’s calculations I find the slope beta = 2.328 and the residual error +0.335 in the log-log view.
D) The final fit is [(log(y) + 1.2859) = 2.328 * (log(M) + 0.1961) + 0.335], which you can boil down to the y ~ x^2.3 equation.
E) In the lower part of sheet “powerfit” you will see my comparison between your original data and the fit formula. It looks like you should review/exclude (0.4, 0.004) and (0.3, 0.004).
Best regards,
Michael Schlueter
Download: Simple Linear Regression Analysis [Excel file]
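The log-log step described above can be sketched in a few lines of Python. This is only an illustration, not a reconstruction of the Excel sheet: it assumes base-10 logs and an ordinary least-squares line, and it drops the four zero-flow pairs because log(0) is undefined. The function name `loglog_fit` is made up for this sketch. Under those assumptions, the means of log(x) and log(y) over the eight remaining pairs agree with the 0.1961 and 1.2859 offsets in step D, and the slope should land near the quoted exponent, though it need not match 2.328 exactly since the Taguchi calculation may differ in detail.

```python
import math

# Pressure/flowrate pairs from the original post (each flowrate is a mean of 5 readings).
pairs = [(5, 5.55), (2.5, 2), (1, 0.202), (0.5, 0.04), (0.4, 0.004), (0.3, 0.004),
         (0.25, 0), (0.2, 0.006), (0.18, 0.006), (0.15, 0), (0.12, 0), (0.08, 0)]


def loglog_fit(points):
    """Least-squares line of log10(y) on log10(x).

    Returns (slope, intercept, mean_log_x, mean_log_y).
    Pairs with y <= 0 are skipped because their logarithm is undefined.
    """
    logs = [(math.log10(x), math.log10(y)) for x, y in points if x > 0 and y > 0]
    n = len(logs)
    mean_lx = sum(lx for lx, _ in logs) / n
    mean_ly = sum(ly for _, ly in logs) / n
    sxx = sum((lx - mean_lx) ** 2 for lx, _ in logs)
    sxy = sum((lx - mean_lx) * (ly - mean_ly) for lx, ly in logs)
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_lx
    return slope, intercept, mean_lx, mean_ly


slope, intercept, mean_lx, mean_ly = loglog_fit(pairs)
print(f"exponent (slope in log-log view): {slope:.3f}")
print(f"mean log10(x) = {mean_lx:.4f}, mean log10(y) = {mean_ly:.4f}")
print(f"power law: y ~ {10 ** intercept:.4f} * x^{slope:.3f}")
```

Note that a third of the posted pairs cannot enter this fit at all, which is one more sign that the low-pressure readings sit below what the meter can resolve.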
February 7, 2003 at 1:42 pm #82791
You need a new flow meter that measures in the low range you’re interested in. Don’t fiddle with stats until you get one, especially if it’s really critical that your prediction is right.
February 7, 2003 at 2:55 pm #82793
Robert Butler (Participant, @rbutler)
Metering systems are usually built to respond over some part of a range of possible measurements. From your description it sounds like you have the situation that Opey has cautioned against. In addition to Opey’s comments, I would also recommend against building a regression model on the averages. Your flow measurements will be individual readings, and you will want to know the ability of your model to predict those individual readings.
If you run your regression against the individual measurements (we’re assuming the five measurements at each point are replicate and not duplicate measurements) and then plot the fitted line and its associated confidence intervals against the individual points, you will not only better highlight the regions of poor regression fit but also highlight those regions where the sensitivity of your metering system begins to fail.
February 7, 2003 at 2:55 pm #82794
Since you know the numbers that are averaged to get the y value in each of your ordered pairs, why not just use each of the five numbers as separate points? You have more data to fit a regression to in that case.
Your other concern about the low pressure readings might suggest that the data are not really linear; maybe a log or exponential transformation is necessary. The power fit spreadsheet in the earlier post looks reasonable. Minitab would have found that for you by doing a Box-Cox transformation.
February 7, 2003 at 3:07 pm #82795
Michael Schlueter (Participant, @MichaelSchlueter)
Woey,
The remarks from Opey, Robert and Zilgo are very valid; the limitations of your instrument are quite obvious to me, now.
Why do you need to use it? Why do you run it in its critical range? What do you want to achieve, finally?
February 7, 2003 at 6:16 pm #82805
Sounds like your “Simple Linear Regression” might be quadratic.
February 8, 2003 at 3:30 am #82832
What I want to achieve is to establish a specification for a medical device that my company is making. Given a pressure value that is acceptable to the average patient population, I need to find the best-fit flowrate value. From the flowrate, I can find the relationship between volume and time. Thanks everyone for the inputs.
February 10, 2003 at 3:06 am #82853
Thanks for all the work. I tried a polynomial regression and got a pretty good y and x relationship. The R^2 is 0.998. I would like to use the power fit; however, I have no idea how good the power fit is at relating y and x.
February 10, 2003 at 2:54 pm #82877
Robert Butler (Participant, @rbutler)
Woey, the tone of your 9 February email worries me. If I’ve misunderstood your post please accept my apologies in advance. You made the comment that you “tried to do a polynomial regression, and got a pretty good y and x relationship. The R^2 is 0.998.” You leave the impression that you are more enamoured of R^2 than of the propriety of the analysis. R^2, by itself, is of little value, and it is very easy to make it anything you want. Indeed, if you have no replicated measures of Y for a given X you can, with enough terms, get an R^2 of 1.
Since you are measuring individual patients you need to run your regression on individual readings. If you build a model on average results, the confidence limits around the regression will be for averages and they will be much tighter than the confidence limits for individual measurements. If you don’t take this into account you could wind up thinking that the precision of your instrument is much better than it is.
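For reference, the gap between limits for averages and limits for individual readings can be written out with the standard textbook intervals for a straight-line fit with normally distributed errors. The symbols are introduced here for illustration and are not taken from the thread: n is the number of fitted points, s the residual standard deviation, S_xx the sum of squared deviations of x about its mean, and m = 5 the number of readings averaged at each pressure.

```latex
% Confidence interval for the mean response at x_0
\hat{y}_0 \pm t_{n-2,\,1-\alpha/2}\; s\,\sqrt{\frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}}

% Prediction interval for a single new reading at x_0
\hat{y}_0 \pm t_{n-2,\,1-\alpha/2}\; s\,\sqrt{1 + \frac{1}{n} + \frac{(x_0-\bar{x})^2}{S_{xx}}}

% Scatter of a mean of m independent readings versus a single reading
\operatorname{sd}(\bar{y}) = \frac{\sigma}{\sqrt{m}}, \qquad m = 5 \;\Rightarrow\; \operatorname{sd}(\bar{y}) \approx \frac{\sigma}{2.24}
```

The extra 1 under the square root is what widens the interval for an individual reading, and the (x_0 - x̄)^2 term is why both intervals grow as x_0 moves away from the measured pressures. If the model is fitted to means of five readings, s itself reflects the scatter of means, which is smaller than the scatter of single readings by roughly the factor in the last line.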
If you are going to use polynomial regression you need to keep a constant check on the significance of your polynomial terms as you add them, the corresponding reduction in model error, the residual vs. predicted plots, and your lack of fit at each stage of the model building process. It’s a very easy matter to overfit. Finally, your initial missive suggested that you will need to extrapolate outside the ranges that you have measured. You will need to carefully check the behavior of the regression equation and the confidence limits in these regions. It is very easy to get an excellent polynomial fit that falls apart the minute you go beyond the range of actual data.
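The two warnings about R^2 and extrapolation can be tried directly on the twelve posted pairs. The sketch below is an extreme case chosen for illustration, not a suggested model: with twelve distinct pressures and no replicates, the degree-11 polynomial through all the points is also the least-squares fit of that degree, so its R^2 is 1 by construction. The helper names `lagrange_eval` and `r_squared` and the two extrapolation pressures (0.05 and 6.0) are arbitrary choices for this example.

```python
# Pressure/flowrate pairs from the original post (each flowrate is a mean of 5 readings).
pairs = [(5, 5.55), (2.5, 2), (1, 0.202), (0.5, 0.04), (0.4, 0.004), (0.3, 0.004),
         (0.25, 0), (0.2, 0.006), (0.18, 0.006), (0.15, 0), (0.12, 0), (0.08, 0)]
xs = [x for x, _ in pairs]
ys = [y for _, y in pairs]


def lagrange_eval(x, xs, ys):
    """Evaluate the unique degree-(n-1) polynomial through all n points at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(observed) / len(observed)
    ss_tot = sum((y - mean_y) ** 2 for y in observed)
    ss_res = sum((y - f) ** 2 for y, f in zip(observed, fitted))
    return 1.0 - ss_res / ss_tot


fitted = [lagrange_eval(x, xs, ys) for x in xs]
r2 = r_squared(ys, fitted)
print(f"R^2 with 12 terms on 12 points: {r2}")

# Physical sanity check outside the measured range of 0.08 to 5:
# flowrate should be non-negative and should not oscillate.
for x_new in (0.05, 6.0):
    print(f"polynomial 'prediction' at pressure {x_new}: {lagrange_eval(x_new, xs, ys)}")
```

A perfect R^2 here says nothing about the meter or the physics, only that the number of terms equals the number of points. Comparing the printed extrapolations against what a flowrate can physically be is the kind of check recommended above.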