help required
 This topic has 11 replies, 7 voices, and was last updated 17 years, 9 months ago by indresh.


June 29, 2004 at 10:48 am #36007
indresh (Participant, @indresh)
Dear all,
I am currently working on a project with three variables, namely time, load, and traffic, and their effect on the rise in temperature.
Load is the number of hardware components installed, traffic is the number of people, and time is the time duration; the mapping is done against the rise in temperature.
What is the best method of analysis when the output is continuous but the variables are both continuous and discrete?
rgds,
June 29, 2004 at 12:12 pm #102604
Hi,
What exactly do you want to do in your analysis? I mean, what is the objective of your data collection?
June 29, 2004 at 12:23 pm #102605
indresh (Participant, @indresh)
To find a relation between the factors and how they affect the rise in temperature.
The factors are load, time, and traffic.
rgds,

Load  Time   Mts  Rise in degrees Celsius  Temperature  Traffic
6     14:30  35   5.00                     28           198
6     14:55  60   4.00                     32           212
6     15:30  90   1.00                     33           235
6     16:00  120  2.60                     35.6         195
6     16:30  150  1.00                     36.6         206
6     17:00  180  1.00                     37.6         219
12    15:00  30   11.70                    32           206
12    15:30  60   6.00                     38           185
12    16:00  90   4.00                     42.1         166
12    16:30  120  4.00                     46.5         214
12    17:00  150  2.00                     48.4         229
12    17:30  180  2.00                     50.1         237
10    13:00  60   11.00                    34           540
10    14:00  120  4.00                     38           645
10    14:30  150  2.00                     40           597
6     15:00  30   4.50                     26           480
6     15:30  60   4.00                     29.6         456
6     16:00  90   3.00                     33           465
6     16:30  120  5.00                     38           427
6     17:00  150  2.00                     40           565
6     17:30  180  2.00                     42           611
5     17:30  60   4.00                     25           222
5     18:00  90   2.20                     27.2         278
5     18:30  120  2.80                     30           252
5     19:00  150  2.60                     32.6         275
5     19:30  180  1.80                     34.8         274
15    13:20  60   5.00                     31           374
15    14:20  120  7.00                     37.9         319
15    15:20  180  7.00                     45           370
15    15:50  220  3.40                     48.4         380
June 29, 2004 at 1:07 pm #102608
Ken Feldman (Participant, @Darth)
Looking for relationships between variables usually calls for some type of regression analysis. If the Y is continuous and the Xs are both continuous and discrete, as you mentioned, then the discrete values need to be substituted with dummy or indicator variables. Search the web or a stat book to learn the specifics of doing this. It can easily be done in Minitab if you have access to that program.
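To make the dummy-variable idea concrete, here is a minimal sketch in plain Python. It assumes Load is the discrete factor to be coded (it could equally be treated as a numeric predictor), and the helper name `indicator_columns` is made up for this illustration; it is not a Minitab feature.

```python
def indicator_columns(values, baseline=None):
    """Encode a discrete variable as 0/1 indicator (dummy) columns.

    One level is dropped as the baseline so that the indicator columns
    are not collinear with the regression intercept.
    """
    levels = sorted(set(values))
    if baseline is None:
        baseline = levels[0]  # assumption: smallest level is the reference
    kept = [lv for lv in levels if lv != baseline]
    rows = [[1 if v == lv else 0 for lv in kept] for v in values]
    return kept, rows


# Load levels that appear in the posted data, one value per example row.
load = [6, 12, 10, 6, 5, 15]
kept, rows = indicator_columns(load)
print("indicator columns for load levels:", kept)
for value, row in zip(load, rows):
    print(value, row)
```

Each row of `rows` can then stand in for the single Load column in the regression; a row of all zeros marks the baseline level.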
June 29, 2004 at 2:04 pm #102614
Gollapudi (Participant, @Kalyan)
Indresh, it looks like your data is not summarized properly. Can you give me a brief description of your data collection procedure?
June 29, 2004 at 3:41 pm #102617
As Darth pointed out, it only takes a few minutes with Minitab. The fact that some of the data are discrete shouldn’t be a problem. Since you have five variables, you may as well fit all of them and see what happens. The only challenge is time, since it is coded as “16:30”, so these values were all converted to a numeric value (fraction of a day), where 12:00 = 0.5, 18:00 = 0.75, etc.
The results from Minitab are then:
---------------------------------------------
Regression Analysis: Rise versus Load, Time, Mts, Temp, Trafic

The regression equation is
Rise = 2.92 + 0.467 Load + 2.04 Time - 0.0257 Mts - 0.066 Temp + 0.00278 Trafic

Predictor      Coef   SE Coef      T      P
Constant      2.924     7.316   0.40  0.693
Load         0.4670    0.1676   2.79  0.010
Time          2.041     9.403   0.22  0.830
Mts        -0.02568   0.01334  -1.92  0.066
Temp        -0.0665    0.1039  -0.64  0.528
Trafic     0.002785  0.002902   0.96  0.347

S = 1.87584   R-Sq = 56.7%   R-Sq(adj) = 47.7%

Analysis of Variance
Source          DF       SS      MS     F      P
Regression       5  110.698  22.140  6.29  0.001
Residual Error  24   84.450   3.519
Total           29  195.148

Source  DF  Seq SS
Load     1  40.406
Time     1  32.218
Mts      1  33.109
Temp     1   1.724
Trafic   1   3.240

Unusual Observations
Obs  Load    Rise    Fit  SE Fit  Residual  St Resid
  7  12.0  11.700  7.481   0.855     4.219     2.53R
 13  10.0  11.000  6.404   0.838     4.596     2.74R
 27  15.0   5.000  8.504   1.100    -3.504    -2.31R
June 29, 2004 at 3:51 pm #102619
Ken Feldman (Participant, @Darth)
Nice effort, Tim. What strikes me first is that the adjusted R-squared is only about 48%. I hope the original poster has more Xs, since he is only explaining about half of the variation in Y with the ones he has. Did you try any analysis of residuals or multicollinearity? Did the p-values for the individual Xs show significance?
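For readers wondering where the 56.7% and roughly 48% figures come from, both can be recomputed from the sums of squares and degrees of freedom in the Analysis of Variance table above. This is only a sketch of the standard formulas; the variable names are illustrative.

```python
# Sums of squares and degrees of freedom copied from the Minitab ANOVA table.
ss_regression = 110.698
ss_residual = 84.450
ss_total = 195.148
df_residual = 24
df_total = 29

# R-squared: share of the total variation explained by the model.
r_sq = ss_regression / ss_total

# Adjusted R-squared: same idea, but penalised for the number of predictors
# by comparing mean squares instead of raw sums of squares.
r_sq_adj = 1 - (ss_residual / df_residual) / (ss_total / df_total)

print(f"R-Sq      = {r_sq:.1%}")
print(f"R-Sq(adj) = {r_sq_adj:.1%}")
```

The gap between the two values reflects the penalty for fitting five predictors to only 30 observations.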
June 29, 2004 at 4:03 pm #102620
V. Laxmanan (Member, @V.Laxmanan)
Dear Indresh:
I just saw the data that you have posted. Imagine the following two experiments.
Experiment 1: Something is falling.
We can do experiments and find the rate at which it is falling.
Experiment 2: Something is alternately rising and falling.
Again we can do experiments and find the rate at which it is rising and falling.
Now, you want to understand what is going on.
Both these experiments that I have mentioned were done and changed our understanding of the universe. The first experiment was done in the last part of the 16th century. The second experiment was done in the first part of the 20th century.
Also, interestingly, just as in your problem, the author of the second experiment suspected that there are three types of forces acting on that “something” that is alternately rising and falling. He thought deeply about what his experiments were telling him.
Then he developed a simple mathematical model to explain what he was observing. He then tested the model thoroughly and did hundreds of experiments. He kept such good records (his lab notebooks are written so neatly it is unbelievable) that even today we can go back and analyze the data and see if we agree with his conclusions.
Interestingly, he never used any statistical arguments. After more than 8 years of experimentation he established something so important that he was awarded the Nobel Prize!
We can all learn from what this first American-born and educated scientist did. And what is more interesting is that he started his Nobel Prize-winning experiments when he was well into what we call middle age.
So, age is no barrier to the desire to learn and enquire deeply into our observations. We could all benefit from a deeper understanding of what this great scientist did.
Now, let me give you his name. His name is Robert A. Millikan and he is one of the few scientists who has written his autobiography. I have learned a lot from what he did, and how he used pure mathematical logic, to discover two fundamental constants of nature. This was done, let me repeat, without the help of any of the statistical tools that we use today.
Surprisingly, he did not even use least-squares regression, although he found linear relations in his experiments!
Your data reminded me of Millikan’s experiments. I don’t know what we can learn, but I just wanted to share this perspective. Good luck with your project. Regards.
Laxman
June 29, 2004 at 5:33 pm #102624
Robert Butler (Participant, @rbutler)
Your data looks like repeat measure data. You appear to have set a load condition and then you let the process run (?) for a period of time which seems to be 30 minutes but can be longer. At the end of that initial time you take a measure of the temperature and the traffic. You then repeat this process at 30 minute intervals under the same load condition. The number of 30 minute intervals per load condition varies.
If you are interested in building a predictive equation connecting load, traffic, and time to either the absolute temperature or the delta temperature and if the data is in fact repeat measure data, you will have to find a package capable of properly handling repeat measure data. Regular linear regression packages will treat the repeat measures as independent measures and the resultant model will be of little value.
Providing answers to the following questions may allow others on this forum to offer additional advice.
1. Are the assumptions I listed above valid?
2. Is there a reason for the varying amount of minutes before first measurement?
3. Is there a reason for the varying number of 30 minute intervals?
4. Do you have any control over the variable “Traffic”?
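The repeat-measure structure described above can be made visible by splitting the posted rows into runs. The sketch below is plain Python and rests on two assumptions: the (Load, Mts) pairs are read off the posted table in order, and a new run starts whenever the load changes or the elapsed minutes stop increasing. The function name `split_into_runs` is invented for this example.

```python
# (load, minutes on battery) for the 30 posted observations, in posted order.
observations = [
    (6, 35), (6, 60), (6, 90), (6, 120), (6, 150), (6, 180),
    (12, 30), (12, 60), (12, 90), (12, 120), (12, 150), (12, 180),
    (10, 60), (10, 120), (10, 150),
    (6, 30), (6, 60), (6, 90), (6, 120), (6, 150), (6, 180),
    (5, 60), (5, 90), (5, 120), (5, 150), (5, 180),
    (15, 60), (15, 120), (15, 180), (15, 220),
]


def split_into_runs(rows):
    """Start a new run whenever the load changes or the clock restarts."""
    runs = []
    for load, minutes in rows:
        last = runs[-1][-1] if runs else None
        if last is None or load != last[0] or minutes <= last[1]:
            runs.append([])
        runs[-1].append((load, minutes))
    return runs


runs = split_into_runs(observations)
for run_id, run in enumerate(runs, start=1):
    print("run", run_id, "load", run[0][0], "readings", len(run))
```

Under these assumptions the 30 rows fall into six runs of unequal length, so a run identifier would be the grouping variable for any repeated-measures analysis.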
June 30, 2004 at 4:24 am #102659
indresh (Participant, @indresh)
Dear all,
Thanks for replying.
I will reply to all the questions asked.
TIM, LAXMAN, DARTH, ROBERT
The regression equation is Rise = 2.92 + 0.467 Load + 2.04 Time - 0.0257 Mts - 0.066 Temp + 0.00278 Trafic
I tried that too, but it shows no significance.
Here is the data collection procedure:
- This is data from individual GSM cell sites. Load is the number of sectors that the site supports, and traffic is the number of calls reported during that time period.
- The aim is to schedule the battery ahead of the diesel generator (DG), so that when the power goes (mains fail) the site first goes on battery, thereby reducing the run time of the DG. When the site is on battery, the AC (air conditioner) does not work, so the temperature rises. The temperature rise depends on the current the equipment draws from the battery, which in turn depends on traffic (number of calls). At every site the technician was asked to switch off the mains supply, putting the site on battery, and to note the temperature rise at intervals (at some sites at 30, 60, 90, 120, 150, and 180 minutes, at others on an hourly basis; statistically I don’t think that should create much bias in the analysis). Battery life and equipment life are unaffected up to an average of 40 degrees C over the entire year; beyond that they are reduced to half.
- We want to find an optimised schedule for how long the battery should run before the DG is switched on to run the AC again and bring the temperature down, without affecting the efficiency of the site.
On analysing the data, I found that the temperature rise is high in the first hour, but after the first hour it follows a normal pattern of rise. This can be attributed to the inside of the shelter (where the equipment is placed) catching up with the outside temperature; since the shelter is insulated, it takes almost an hour to reach the outside temperature.
- If we remove the temperature rise for the first hour and then do the regression analysis, will it be of use? I still need to do that in Minitab, which I shall do today and post here.
I WOULD REQUEST A LAYMAN’S EXPLANATION ALONG WITH THE ANALYSIS FOR BETTER UNDERSTANDING.
Thanks again for all your efforts.
rgds,
June 30, 2004 at 12:51 pm #102680
Robert Butler (Participant, @rbutler)
Based on your latest post it would appear that the issue is not temperature rise. Instead, your critical measure would seem to be when and how your system approaches the region of 40C. If you take the measured temperatures and difference them from 40 and use this as your Y variable and plot your data, you will find a very nice linear relation between this difference and minutes, as well as what looks to be a log relation between load and the difference. A plot of traffic against this difference doesn’t reveal any trend. If you take the variables of traffic, load, and minutes and normalize each of them to a -1 to 1 range and then run a simple linear regression against the difference, you will get a model with significant minutes and load effects and no significant traffic effect.
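A rough sketch of the two transformations described above, in plain Python. It assumes the intended range is -1 to +1, and that "difference from 40" means 40 minus the measured temperature (the opposite sign would work equally well as long as it is used consistently). The helper `scale_to_unit_range` and the handful of readings are for illustration only.

```python
def scale_to_unit_range(values):
    """Linearly map values so the minimum becomes -1 and the maximum +1."""
    lo, hi = min(values), max(values)
    mid = (hi + lo) / 2
    half = (hi - lo) / 2  # note: zero if all values are equal
    return [(v - mid) / half for v in values]


LIMIT_C = 40.0

# A few (minutes, load, temperature) readings taken from the posted data.
readings = [(30, 12, 32.0), (90, 12, 42.1), (180, 12, 50.1), (60, 5, 25.0)]

minutes = scale_to_unit_range([r[0] for r in readings])
load = scale_to_unit_range([r[1] for r in readings])
headroom = [LIMIT_C - r[2] for r in readings]  # assumed sign: 40 minus temperature

for m, l, y in zip(minutes, load, headroom):
    print(f"minutes={m:+.2f}  load={l:+.2f}  40-temp={y:+.1f}")
```

Putting every predictor on the same -1 to 1 scale makes the fitted coefficients directly comparable in size, which is one reason this normalization is done before the regression.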
I don’t like focusing on R2 as a lone assessment of a regression equation for the simple reason that R2 is only one measure of one aspect of a regression and it is easily manipulated. However, since you seem to want the value, the R2 for this simple model is .76 and the adjusted R2 is .73. If you build a simple linear model with only load and minutes and run a proper regression analysis you will find the residual patterns to be acceptable. The various statistics for residual normality are also within limits. The residual pattern suggests you may get a better fit if you run the regression against normalized minutes and the log of the load. The mean square error for the model is 3.5.
If you build the equation in this fashion and confirm it with additional observations you will have an equation relating your system’s temperature difference from 40C to minutes on battery and load. Your data would also suggest that for the ranges of traffic recorded, traffic isn’t the main issue. If the equation is confirmed you will have an equation that will permit you to make decisions concerning when to turn on the DG.
August 19, 2004 at 6:21 am #105964
indresh (Participant, @indresh)

Site name     Configuration  Initial temperature inside shelter  Rise in temperature  Initial temperature outside shelter  Time elapsed in minutes  Time   Traffic
1 Hinjewadi   18             30                                                       24                                   0                        12:15  734
              18             31.5                                1.5                  34                                   30                       12:45  799
              18             35.3                                3.8                  31                                   60                       13:15  845
              18             39.3                                3.9                  29.4                                 90                       13:45  836
              18             42.9                                3.7                  30                                   120                      14:15  760
              18             45.8                                2.9                  33.3                                 150                      14:45  671
              18             48.8                                3.0                  35                                   180                      15:15  653
              18             30.0                                                     25                                   0                        16:15  693
              18             37.6                                7.6                  29                                   30                       16:45  705
              18             41.7                                4.1                  28                                   60                       17:15  774
              18             45.6                                3.9                  28                                   90                       17:45  941
              18             48.9                                3.3                  27                                   120                      18:15  960
              18             52.7                                3.7                  27                                   150                      18:45  892
              18             54.9                                2.2                  26                                   180                      19:15  925
              18             25.0                                                     25                                   0                        20:15  957
              18             39.5                                14.5                 25                                   30                       20:45  957
              18             44.0                                4.6                  25                                   60                       21:15  958
              18             46.3                                2.3                  24                                   90                       21:45  898
              18             50.4                                4.1                  24                                   120                      22:15  853
              18             52.2                                1.8                  24                                   150                      22:45  824
2 Khadki      6              21.3                                                     32                                   0                        11:00  359
              6              29.2                                7.9                  32.5                                 30                       11:30  382
              6              34.1                                4.9                  34.5                                 60                       12:00