# help required

• #36007

indresh
Participant

Dear all,
I am currently working on a project with three variables (time, load, and traffic) and one output, the rise in temperature.
Load is the number of hardware components installed, traffic is the number of people, and time is the time duration; the rise in temperature is mapped against these.
What is the best method of analysis when the output is continuous but the variables are both continuous and discrete?
Rgds,

#102604

chhabra
Participant

Hi
What exactly do you want to do in your analysis? I mean, what is the objective of your data collection?

#102605

indresh
Participant

To find a relation between the factors and how they affect the rise in temperature,
the factors being load, time, and traffic.
Rgds,

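| Load | Time | Mts | Rise in degrees Celsius | Temperature | Traffic |
|---|---|---|---|---|---|
| 6 | 14:30 | 35 | 5.00 | 28 | 198 |
| 6 | 14:55 | 60 | 4.00 | 32 | 212 |
| 6 | 15:30 | 90 | 1.00 | 33 | 235 |
| 6 | 16:00 | 120 | 2.60 | 35.6 | 195 |
| 6 | 16:30 | 150 | 1.00 | 36.6 | 206 |
| 6 | 17:00 | 180 | 1.00 | 37.6 | 219 |
| 12 | 15:00 | 30 | 11.70 | 32 | 206 |
| 12 | 15:30 | 60 | 6.00 | 38 | 185 |
| 12 | 16:00 | 90 | 4.00 | 42.1 | 166 |
| 12 | 16:30 | 120 | 4.00 | 46.5 | 214 |
| 12 | 17:00 | 150 | 2.00 | 48.4 | 229 |
| 12 | 17:30 | 180 | 2.00 | 50.1 | 237 |
| 10 | 13:00 | 60 | 11.00 | 34 | 540 |
| 10 | 14:00 | 120 | 4.00 | 38 | 645 |
| 10 | 14:30 | 150 | 2.00 | 40 | 597 |
| 6 | 15:00 | 30 | 4.50 | 26 | 480 |
| 6 | 15:30 | 60 | 4.00 | 29.6 | 456 |
| 6 | 16:00 | 90 | 3.00 | 33 | 465 |
| 6 | 16:30 | 120 | 5.00 | 38 | 427 |
| 6 | 17:00 | 150 | 2.00 | 40 | 565 |
| 6 | 17:30 | 180 | 2.00 | 42 | 611 |
| 5 | 17:30 | 60 | 4.00 | 25 | 222 |
| 5 | 18:00 | 90 | 2.20 | 27.2 | 278 |
| 5 | 18:30 | 120 | 2.80 | 30 | 252 |
| 5 | 19:00 | 150 | 2.60 | 32.6 | 275 |
| 5 | 19:30 | 180 | 1.80 | 34.8 | 274 |
| 15 | 13:20 | 60 | 5.00 | 31 | 374 |
| 15 | 14:20 | 120 | 7.00 | 37.9 | 319 |
| 15 | 15:20 | 180 | 7.00 | 45 | 370 |
| 15 | 15:50 | 220 | 3.40 | 48.4 | 380 |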

#102608

Ken Feldman
Participant

Looking for relationships between variables usually calls for some type of regression analysis.  If the Y is continuous and the Xs are both continuous and discrete, as you mentioned, then the discrete values need to be substituted with dummy or indicator variables.  Search the web or a stat book to learn the specifics of doing this.  It can easily be done in Minitab if you have access to that program.
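
The dummy (indicator) coding mentioned above can be sketched in plain Python. This is only an illustration: the function name `dummy_code` and the choice of the lowest level as the baseline are assumptions, and Minitab or other packages may code the columns differently. A factor with k levels becomes k - 1 columns of 0/1, and the baseline level is the row of all zeros.

```python
def dummy_code(values):
    """Turn a discrete factor into k - 1 indicator columns.

    The smallest level is used as the baseline (an assumption made
    for this sketch); it is represented by a row of all zeros.
    """
    levels = sorted(set(values))
    baseline, coded_levels = levels[0], levels[1:]
    rows = [[1 if v == level else 0 for level in coded_levels] for v in values]
    return baseline, coded_levels, rows


# Hypothetical example: treat a handful of load values as categories.
baseline, coded_levels, rows = dummy_code([6, 12, 10, 6, 5, 15])
print("baseline level:", baseline)
print("indicator columns for levels:", coded_levels)
for row in rows:
    print(row)
```

The indicator columns then go into the regression in place of the original discrete column.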

#102614

Gollapudi
Participant

Indresh, it looks like your data is not summarized properly. Can you give me a brief on your data collection procedure?

#102617

Tim F
Member

As Darth pointed out, it only takes a few minutes with Minitab. The fact that some of the data are discrete shouldn’t be a problem. Since you have five variables, you may as well fit all of them and see what happens. The only challenge is time, since it is coded as “16:30”, so these values were all converted to a numeric value, where 12:00 = 0.5, 18:00 = 0.75, etc.

The results from Minitab are then:

Regression Analysis: Rise versus Load, Time, Mts, Temp, Trafic

The regression equation is
Rise = 2.92 + 0.467 Load + 2.04 Time - 0.0257 Mts - 0.066 Temp + 0.00278 Trafic

| Predictor | Coef | SE Coef | T | P |
|---|---|---|---|---|
| Constant | 2.924 | 7.316 | 0.40 | 0.693 |
| Time | 2.041 | 9.403 | 0.22 | 0.830 |
| Mts | -0.02568 | 0.01334 | -1.92 | 0.066 |
| Temp | -0.0665 | 0.1039 | -0.64 | 0.528 |
| Trafic | 0.002785 | 0.002902 | 0.96 | 0.347 |

S = 1.87584 R-Sq = 56.7% R-Sq(adj) = 47.7%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 5 | 110.698 | 22.140 | 6.29 | 0.001 |
| Residual Error | 24 | 84.450 | 3.519 | | |
| Total | 29 | 195.148 | | | |

| Source | DF | Seq SS |
|---|---|---|
| Time | 1 | 32.218 |
| Mts | 1 | 33.109 |
| Temp | 1 | 1.724 |
| Trafic | 1 | 3.240 |

Unusual Observations

| Obs | Load | Rise | Fit | SE Fit | Residual | St Resid |
|---|---|---|---|---|---|---|
| 7 | 12.0 | 11.700 | 7.481 | 0.855 | 4.219 | 2.53R |
| 13 | 10.0 | 11.000 | 6.404 | 0.838 | 4.596 | 2.74R |
| 27 | 15.0 | 5.000 | 8.504 | 1.100 | -3.504 | -2.31R |
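
For anyone repeating the time conversion outside Minitab, here is a minimal Python sketch. It assumes the times are "HH:MM" strings on a 24-hour clock; the function name `time_to_day_fraction` is made up for illustration.

```python
def time_to_day_fraction(hhmm: str) -> float:
    """Convert an 'HH:MM' clock time to a fraction of a 24-hour day."""
    hours, minutes = hhmm.split(":")
    return (int(hours) * 60 + int(minutes)) / (24 * 60)


print(time_to_day_fraction("12:00"))  # 0.5
print(time_to_day_fraction("18:00"))  # 0.75
print(time_to_day_fraction("16:30"))  # 0.6875
```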

#102619

Ken Feldman
Participant

Nice effort Tim.  What strikes me first is the fact that the R square adj. is only about 48%.  I hope the original poster has more Xs since he is only explaining about 1/2 of the variation in Y with the ones he has.  Did you try any analysis of residuals or multi-collinearity?  Did the p-values for the individual Xs show significance?
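
As a quick check, the R-Sq figures can be recomputed from the sums of squares and degrees of freedom in the Analysis of Variance table posted above. The short Python sketch below uses only those posted numbers; the variable names are my own.

```python
# Values copied from the posted Analysis of Variance table.
ss_regression, df_regression = 110.698, 5
ss_residual, df_residual = 84.450, 24
ss_total, df_total = 195.148, 29

# R-Sq: share of the total variation explained by the model.
r_sq = ss_regression / ss_total

# Adjusted R-Sq: compares mean squares, so it penalises extra predictors.
r_sq_adj = 1 - (ss_residual / df_residual) / (ss_total / df_total)

# F statistic: regression mean square over residual mean square.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"R-Sq = {r_sq:.1%}")           # R-Sq = 56.7%
print(f"R-Sq(adj) = {r_sq_adj:.1%}")  # R-Sq(adj) = 47.7%
print(f"F = {f_stat:.2f}")            # F = 6.29
```

The gap between the two R-Sq values comes from the five degrees of freedom spent on predictors with only 30 observations.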

#102620

V. Laxmanan
Member

Dear Indresh:
I just saw the data that you have posted.  Imagine the following two experiments.
Experiment 1: Something is falling.
We can do experiments and find the rate at which it is falling.
Experiment 2: Something is alternately rising and falling.
Again we can do experiments and find the rate at which it is rising and falling.
Now, you want to understand what is going on.
Both these experiments that I have mentioned were done and changed our understanding of the universe.  The first experiment was done in the last part of the 16th century.  The second experiment was done in the first part of the 20th century.
Also, interestingly, just as in your problem, the author of the second experiment suspected that there are three types of forces acting on that “something” that is alternately rising and falling.  He thought about what his experiments were telling him deeply.
Then he developed a simple mathematical model to explain what he was observing.  He then tested the model thoroughly and did hundreds of experiments. He kept such good records – his lab notebooks are written so neatly it is unbelievable – that even today we can go back and analyze the data and see if we agree with his conclusions.
Interestingly, he never used any statistical arguments. After more than 8 years of experimentation he established something so important that he was awarded the Nobel Prize!
We can all learn from what this first American-born and American-educated scientist did. And what is more interesting is that he started his Nobel Prize-winning experiments when he was well into what we call middle age.
So, age is no barrier to the desire to learn and enquire deeply into our observations. We could all benefit from a deeper understanding of what this great scientist did.
Now, let me give you his name.  His name is Robert A. Millikan and he is one of the few scientists who has written his autobiography.  I have learned a lot from what he did, and how he used pure mathematical logic, to discover two fundamental constants of nature. This was done, let me repeat, without the help of any of the statistical tools that we use today.
Surprisingly, he does not even use least square regression, although he finds linear relations in his experiments!
Your data reminded me of Millikan’s experiments.  I don’t know what we can learn, but I just wanted to share this perspective.  Good luck with your project. Regards.
Laxman

#102624

Robert Butler
Participant

Your data looks like repeat measure data.  You appear to have set a load condition and then you let the process run (?) for a period of time which seems to be 30 minutes but can be longer. At the end of that initial time you take a measure of the temperature and the traffic. You then repeat this process at 30 minute intervals under the same load condition. The number of 30 minute intervals per load condition varies.
If you are interested in building a predictive equation connecting load, traffic, and time to either the absolute temperature or the delta temperature and if the data is in fact repeat measure data, you will have to find a package capable of properly handling repeat measure data.  Regular linear regression packages will treat the repeat measures as independent measures and the resultant model will be of little value.
1. Are the assumptions I listed above valid?
2. Is there a reason for the varying amount of minutes before first measurement?
3. Is there a reason for the varying number of 30 minute intervals?
4. Do you have any control over the variable “Traffic”?
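
To make the repeat-measure structure concrete, the Python sketch below splits the posted (load, minutes) pairs into separate runs. The rule for starting a new run (the load changes, or the minutes stop increasing) is my own assumption about how the rows were recorded, not something the original poster has confirmed.

```python
# (load, minutes) pairs in the order they were posted.
rows = [
    (6, 35), (6, 60), (6, 90), (6, 120), (6, 150), (6, 180),
    (12, 30), (12, 60), (12, 90), (12, 120), (12, 150), (12, 180),
    (10, 60), (10, 120), (10, 150),
    (6, 30), (6, 60), (6, 90), (6, 120), (6, 150), (6, 180),
    (5, 60), (5, 90), (5, 120), (5, 150), (5, 180),
    (15, 60), (15, 120), (15, 180), (15, 220),
]


def split_into_runs(rows):
    """Group consecutive rows into runs of repeated measurements.

    Assumed rule: a new run starts when the load changes or when the
    elapsed minutes do not increase relative to the previous row.
    """
    runs = []
    for load, mts in rows:
        starts_new_run = (
            not runs
            or runs[-1][-1][0] != load
            or mts <= runs[-1][-1][1]
        )
        if starts_new_run:
            runs.append([])
        runs[-1].append((load, mts))
    return runs


runs = split_into_runs(rows)
for i, run in enumerate(runs, start=1):
    print(f"run {i}: load={run[0][0]}, readings={len(run)}")
```

Under that rule the 30 rows fall into 6 runs, so there are far fewer independent units than rows, which is why an ordinary regression overstates the information in the data.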

#102659

indresh
Participant

Dear all,
Tim, Laxman, Darth, Robert,
The regression equation is Rise = 2.92 + 0.467 Load + 2.04 Time - 0.0257 Mts - 0.066 Temp + 0.00278 Trafic
I tried that too, though it shows no significance.
OK, here is the data collection procedure:
- This is data from every cell site (GSM). Load is the number of sectors that the site supports, and traffic is the number of calls reported during that time period.
- The aim is to schedule the battery before the diesel generator (DG), so that when the power goes (mains fail) the site first goes on battery, thereby reducing the run time of the DG. When the site is on battery the AC (air conditioner) does not work, so the temperature rises. The temperature rise depends on the current the equipment draws from the battery, which in turn depends on traffic (number of calls). At every site the technician was asked to switch off the mains supply, putting the site on battery, and to note the temperature rise at intervals (at some sites at 30, 60, 90, 120, 150, and 180 minutes, at others hourly; statistically I don’t think that should create a big bias in the analysis). Battery life and equipment life are unaffected up to a 40 degree C average for the entire year, beyond which they are reduced to half.
- We are looking for an optimised way to schedule how long the battery should run before the DG is switched on to run the AC again and bring the temperature down, without affecting the efficiency of the site.
On analysing the data, I found that in the first hour the temperature rise is high, but after the first hour it follows a normal pattern of rise. This can be attributed to the inside of the shelter (where the equipment is placed) matching the outside temperature; since it is insulated, it takes almost an hour to reach the outside temperature.
- If we remove the temperature rise for the first hour and then do the regression analysis, will it be of use? I still need to do that in Minitab, which I shall do today and post here.
I WOULD REQUEST A LAYMAN’S EXPLANATION ALONG WITH THE ANALYSIS, FOR BETTER UNDERSTANDING.
Thanks again for all efforts.
Rgds,

#102680

Robert Butler
Participant

Based on your latest post it would appear that the issue is not temperature rise. Instead, your critical measure would seem to be when and how your system approaches the region of 40C.  If you take the measured temperatures, difference them from 40, use this as your Y variable, and plot your data, you will find a very nice linear relation between this difference and minutes, as well as what looks to be a log relation between load and the difference.  A plot of traffic against this difference doesn’t reveal any trend.  If you take the variables of traffic, load, and minutes, normalize each of them to a -1 to 1 range, and then run a simple linear regression against the difference, you will get a model with significant minutes and load effects and no significant traffic effect.
I don’t like focusing on R2 as a lone assessment of a regression equation for the simple reason that R2 is only one measure of one aspect of a regression and it is easily manipulated. However, since you seem to want the value, the R2 for this simple model is .76 and the adjusted R2 is .73.  If you build a simple linear model with only load and minutes and run a proper regression analysis you will find the residual patterns to be acceptable.  The various statistics for residual normality are also within limits.  The residual pattern suggests you may get a better fit if you run the regression against normalized minutes and the log of the load. The mean square error for the model is 3.5.
If you build the equation in this fashion and confirm it with additional observations you will have an equation relating your system’s temperature difference from 40C to minutes on battery and load. Your data would also suggest that for the ranges of traffic recorded, traffic isn’t the main issue.  If the equation is confirmed you will have an equation that will permit you to make decisions concerning when to turn on the DG.
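
The variable preparation described above can be sketched in Python as follows. This is an illustration under assumptions: the sign of the difference (40 minus the measured temperature), the helper names, and the three sample rows picked from the posted data are my own choices, and the regression itself would still be run in a statistics package.

```python
import math


def scale_to_unit_range(values):
    """Linearly rescale values so the minimum maps to -1 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [2 * (v - lo) / (hi - lo) - 1 for v in values]


def difference_from_limit(temps, limit=40.0):
    """Y variable: how far each temperature is below the limit (negative if above)."""
    return [limit - t for t in temps]


# Three rows from the posted data: (load, minutes on battery, temperature in C).
sample = [(6, 35, 28.0), (12, 120, 46.5), (15, 220, 48.4)]
loads = [row[0] for row in sample]
minutes = [row[1] for row in sample]
temps = [row[2] for row in sample]

y = difference_from_limit(temps)
x_minutes = scale_to_unit_range(minutes)
x_log_load = scale_to_unit_range([math.log(load) for load in loads])

for row in zip(y, x_minutes, x_log_load):
    print(row)
```

Scaling every predictor to the same -1 to 1 range makes the fitted coefficients directly comparable in size.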

#105964

indresh
Participant

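| Site name | Configuration | Initial temperature inside shelter | Rise in temperature | Initial temperature outside shelter | Time elapsed in minutes | Time | Traffic |
|---|---|---|---|---|---|---|---|
| 1 | 18 | 30 | | 24 | 0 | 12:15 | 734 |
| | 18 | 31.5 | 1.5 | 34 | 30 | 12:45 | 799 |
| | 18 | 35.3 | 3.8 | 31 | 60 | 13:15 | 845 |
| | 18 | 39.3 | 3.9 | 29.4 | 90 | 13:45 | 836 |
| | 18 | 42.9 | 3.7 | 30 | 120 | 14:15 | 760 |
| | 18 | 45.8 | 2.9 | 33.3 | 150 | 14:45 | 671 |
| | 18 | 48.8 | 3.0 | 35 | 180 | 15:15 | 653 |
| | 18 | 30.0 | | 25 | 0 | 16:15 | 693 |
| | 18 | 37.6 | 7.6 | 29 | 30 | 16:45 | 705 |
| | 18 | 41.7 | 4.1 | 28 | 60 | 17:15 | 774 |
| | 18 | 45.6 | 3.9 | 28 | 90 | 17:45 | 941 |
| | 18 | 48.9 | 3.3 | 27 | 120 | 18:15 | 960 |
| | 18 | 52.7 | 3.7 | 27 | 150 | 18:45 | 892 |
| | 18 | 54.9 | 2.2 | 26 | 180 | 19:15 | 925 |
| | 18 | 25.0 | | 25 | 0 | 20:15 | 957 |
| | 18 | 39.5 | 14.5 | 25 | 30 | 20:45 | 957 |
| | 18 | 44.0 | 4.6 | 25 | 60 | 21:15 | 958 |
| | 18 | 46.3 | 2.3 | 24 | 90 | 21:45 | 898 |
| | 18 | 50.4 | 4.1 | 24 | 120 | 22:15 | 853 |
| | 18 | 52.2 | 1.8 | 24 | 150 | 22:45 | 824 |
| 2 | 6 | 21.3 | | 32 | 0 | 11:00 | 359 |
| | 6 | 29.2 | 7.9 | 32.5 | 30 | 11:30 | 382 |
| | 6 | 34.1 | 4.9 | 34.5 | 60 | 12:00 | |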