Backfilling data into a post designed experiment

Six Sigma – iSixSigma Forums Old Forums General Backfilling data into a post designed experiment



    I’ve started a new job and have tons of historical data on mixtures of thermoset molding compounds and the resultant mechanical properties. However, the experiments were done one at a time, comparing one combination to another; no experimental design was developed ahead of the experimentation. Any advice on how to use this data in a post-designed experiment? I want to know the effect of the various mixture components on the mechanical properties of the molded samples.


    Robert Butler

      You will need some way to examine the properties of the X matrix – that is, the collinear relationships among the X values of the data you have.  At a minimum you will have to have a way to compute VIFs, eigenvectors, and condition indices.
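      Those diagnostics can be sketched in plain NumPy. The formulas are standard (a VIF is 1/(1−R²) from regressing each column on the others; condition indices come from the singular values of the column-scaled X matrix), but the example data below are made up to show a near-collinear column being flagged.

```python
import numpy as np

def vifs(X):
    """Variance inflation factor for each column of X (X has no intercept column)."""
    X = np.asarray(X, dtype=float)
    out = []
    for j in range(X.shape[1]):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(len(y)), others])   # intercept + remaining X's
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        r2 = 1.0 - (resid @ resid) / (((y - y.mean()) ** 2).sum())
        out.append(1.0 / (1.0 - r2) if r2 < 1 else np.inf)
    return np.array(out)

def condition_indices(X):
    """Condition indices: largest singular value over each singular value
    of the matrix with columns scaled to unit length."""
    X = np.asarray(X, dtype=float)
    Xs = X / np.linalg.norm(X, axis=0)
    s = np.linalg.svd(Xs, compute_uv=False)
    return s[0] / s

# Made-up example: third column is nearly collinear with the first.
rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = rng.normal(size=50)
X = np.column_stack([a, b, a + 0.05 * rng.normal(size=50)])

print(np.round(vifs(X), 1))               # columns 1 and 3 show large VIFs
print(np.round(condition_indices(X), 1))  # large max index flags the near-dependency
```

      A common rule of thumb is that VIFs above 10 or condition indices above 30 signal collinearity serious enough that the corresponding terms cannot be separated in a model – exactly the situation one-factor-at-a-time data tends to create.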
      I’d recommend spending a lot of time identifying the temporal grouping of the data and also identifying the results of the “same” experiment (if the work was one-factor-at-a-time there should be a lot of these).
      Use graphical methods to look at the results of these “same” experiments.  Plot them over time and connect them to their respective blocks of data.  Examine the plot and identify any data points that seem to be “different” from the rest. Don’t waste time with statistical tests – just use your calibrated eyeball. For those that look different, set them and their respective blocks of data to one side.
      With the data that is left check for such things as similarity of scale – i.e. pop bottle experiments vs. small lab reactor vs. actual runs and group experiments so that they are “equivalent” with respect to parameters such as this.
      Once you have identified one or more groups of data that display some sense of homogeneity with respect to the above run the diagnostics on the X matrix to see what kinds of terms you could include in the model.  When you have this list, build the model and see what you can see.
      You are, of course, making the very large assumption that nothing other than the values of the X’s changed, and you are also assuming that the factors recorded as being varied during each of these one-factor-at-a-time runs really were the only things that were actually varied – Not! But there will be no way for you to know this.
      You can build a model for each block of data that meets your ad hoc requirements for homogeneity and you can examine the final models to see if they exhibit any agreement with respect to presence of terms and signs of the respective coefficients.
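      The per-block fits and the sign comparison can be illustrated with ordinary least squares; the block names and mixture variables below are hypothetical stand-ins, with the two blocks deliberately given different scales but the same underlying directions of effect.

```python
import numpy as np

def ols_coefs(X, y):
    """Least-squares coefficients for y ~ intercept + X (intercept dropped from output)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[1:]

rng = np.random.default_rng(2)
coefs = {}
for name, scale in (("lab", 1.0), ("pilot", 3.0)):
    # Hypothetical block: resin and filler fractions vs. a strength response.
    resin = rng.uniform(0.2, 0.5, 20)
    filler = rng.uniform(0.1, 0.4, 20)
    strength = scale * (40 * resin - 25 * filler) + rng.normal(0, 1, 20)
    coefs[name] = ols_coefs(np.column_stack([resin, filler]), strength)

signs = {name: np.sign(c) for name, c in coefs.items()}
agree = bool(np.all(signs["lab"] == signs["pilot"]))
print(signs, agree)   # same signs in both blocks -> agreement on direction of effect
```

      As in the post above, only the signs are compared across blocks; the magnitudes are left alone because blocks run at different scales of experimentation are not comparable on magnitude.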
     This is a lot of work, and if my experiences in doing this are in any way typical, what you will probably find is that, because the data was developed one-factor-at-a-time, those tons of historical data are going to be reduced to mere ounces of blocks of data you can actually analyze.
      My most recent experience in this regard happened a few years ago. We had data on over 3000 experiments run over a 10 year period. The first pass – looking for experiments where similar X’s had been run/recorded reduced this list to around 400.  After doing all of the other things listed above I had 35 data points I felt I could treat as though they had been run at one time. I ran a multivariate analysis on this block.  There were several other blocks of data ranging from 12-24 data points each that allowed me to run a univariate analysis for one X.  The single X was different in each case.  Since there was some overlap between the X’s in the multivariate case and those in the univariate case I was able to compare signs of the coefficients to see if there was an agreement with respect to direction of the effect. There was no point in looking at the magnitudes because the blocks of data represented different scales of experimentation.
      For the single X’s I did plot the data and coded it by experiment scale.  This allowed me to see if there were similar trends and if the offset between the blocks of data could be viewed as an issue of scale.
      At the end of the effort I reported what I had found – reminded everyone of all of the assumptions and caveats of this large fishing expedition – and made a recommendation concerning variables that should be included in a design.  They took my advice under consideration, made some changes and additions to the list, I built the design, we ran it, and the company went to market with a new product.
      Since I had made a serious effort to look at what had been done, everyone was left with the sense that their data had been thoroughly examined and that there really wasn’t any reason to spend additional time looking at it. Most importantly, the analysis provided guidance with respect to possible variables to include in a formal design.



    Thank you Robert for taking the time to share your experience and good advice. I think I will use the historical data to get an idea of direction, but start new experiments to gain confidence in the reliability of the data.

