iSixSigma

Minimum data points required for Regression


  • #53469

    JRM
    Participant

    Hello everyone,
    What is the minimum number of data points required to get a valid R-squared value on a regression line for a scatter plot?

    Thank you for your assistance!

    #190263

    Venugopal. G
    Member

    Hi,

    Please see the excerpt below.

    "Unless the relationship between X and Y is very strong, a small sample (<15) may not be large enough to detect it. Also, small samples do not provide a very precise estimate of the strength of the relationship, which is measured by adjusted R². If a precise estimate is needed, larger samples (typically 40 or more) should be used."

    Extracted from Minitab 16 – Assistant Regression.

    Venugopal. G
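To make the "small samples are imprecise" point in the excerpt concrete, here is a rough sketch that compares the approximate 95% confidence interval for a correlation of 0.5 at n = 15 and n = 40. It uses the Fisher z-transformation for the correlation coefficient, which is a stand-in chosen for illustration and not the adjusted R-squared calculation Minitab performs; the function name `correlation_ci` and the value r = 0.5 are assumptions.

```python
import math
from statistics import NormalDist


def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation r estimated
    from n points, using the Fisher z-transformation (requires n > 3)."""
    z = math.atanh(r)                      # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Same observed correlation, two different sample sizes (illustrative values).
for n in (15, 40):
    low, high = correlation_ci(0.5, n)
    print(f"n={n}: 95% CI for r is roughly ({low:.2f}, {high:.2f})")
```

Under these assumptions the interval at n = 15 is wide enough to include zero, while the interval at n = 40 is noticeably narrower and excludes it.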

    #190286

    Marty Y.
    Participant

    I am not a statistics guru, but I am pretty sure that the p-value takes sample size into account: the smaller the sample, the less confidence you have in your results and the higher the p-value. So if your p-value is less than 0.05, I think that is all the information you need, and your sample size is adequate. If your p-value is greater than 0.05, then either you need more samples or your data are not correlated.
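As a minimal sketch of how the p-value depends on sample size, the snippet below holds the observed correlation fixed at r = 0.5 and computes the usual t statistic for a simple regression slope, t = r * sqrt((n - 2) / (1 - r^2)), at several sample sizes. The critical values are two-sided 5% values from a standard Student's t table; the names `t_statistic`, `significant_at_05`, and the chosen sample sizes are assumptions made for illustration.

```python
import math

# Two-sided 5% critical values of Student's t from a standard table,
# keyed by degrees of freedom (n - 2 for simple linear regression).
T_CRIT_05 = {3: 3.182, 8: 2.306, 18: 2.101, 38: 2.024}


def t_statistic(r, n):
    """t statistic for testing zero correlation with n data points."""
    return r * math.sqrt((n - 2) / (1 - r * r))


def significant_at_05(r, n):
    """True if the correlation would give p < 0.05 (two-sided)."""
    return abs(t_statistic(r, n)) > T_CRIT_05[n - 2]


# Same r, different sample sizes (illustrative values only).
results = {n: significant_at_05(0.5, n) for n in (5, 10, 20, 40)}
for n, is_sig in results.items():
    print(f"n={n}: t={t_statistic(0.5, n):.2f}, p<0.05: {is_sig}")
```

With these numbers the same correlation is not significant at n = 5 or n = 10 but is at n = 20 and n = 40, which matches the idea that a small sample needs a stronger relationship to reach p < 0.05.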

    #190287

    sjg BB
    Member

    If you are talking about multiple linear regression, it will also have to do with how many independent variables you have.

    I’m no statistician, but I thought that you needed at least 2 more data points than there are Xs: 1 for each X, 1 for the intercept, and 1 more for the error term. Points beyond that get you a more reliable error term. (Again, I’m no stats person, but that is something I recall from my training.)
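The counting rule above can be written down as residual degrees of freedom: a model with k predictors plus an intercept estimates k + 1 coefficients, so n points leave n - (k + 1) degrees of freedom for the error term. The helper names below are hypothetical and the sketch only does the arithmetic; it says nothing about whether the resulting fit is useful.

```python
def residual_df(n_points, n_predictors):
    """Degrees of freedom left for the error term after fitting
    one coefficient per predictor plus an intercept."""
    return n_points - (n_predictors + 1)


def min_points(n_predictors):
    """Smallest n that leaves at least one residual degree of freedom."""
    return n_predictors + 2


# Illustrative cases: simple regression (1 X) and a model with 4 Xs.
for k in (1, 4):
    n = min_points(k)
    print(f"{k} X(s): need at least {n} points, leaving {residual_df(n, k)} df for error")
```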

    #190293

    Robert Butler
    Participant

    As asked, your question is meaningless. An R-squared or an adjusted R-squared is a computed statistic, and it is whatever it is; there is no such thing as a “valid” R-squared.

    If you could elaborate on what it is you are trying to do, perhaps I or someone else could offer some additional thoughts.
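One way to see that R-squared is simply a computed number: for a straight-line fit through any two points with different X and Y values, it comes out as 1 regardless of what the data mean. The sketch below computes R-squared for simple linear regression as the squared Pearson correlation; the function name `r_squared` and the data values are made up for illustration.

```python
def r_squared(xs, ys):
    """R-squared of a simple linear regression of ys on xs,
    computed as the squared Pearson correlation."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Any two distinct points lie exactly on a line (hypothetical data).
print(r_squared([1, 3], [2, 7]))
print(r_squared([10, 20], [5, -4]))
```

Both calls return 1 (up to floating-point rounding), so the statistic can always be calculated; whether it tells you anything depends on the sample size and the purpose of the analysis.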
