Don Turnblade

Forum Replies Created

Viewing 7 posts - 1 through 7 (of 7 total)

    Don Turnblade

41/2367 = 1.73%

    Exact Binomial computation:
97.98% cumulative confidence is reached at 54/2367, or 2.28%

Student’s T approximation of the binomial:
tinv(.05,2367-1)/(2*sqrt(2367)) = 2.02%

Curve-fit approximation of Student’s T for the binomial:
sqrt(5/(2367-3))/2 = 2.30%

By each of these measures, 41/2367 (1.73%) is not enough to be 95% confident that binomial success/failure “dice” effects could not cause this result at random.

Binomial statistics for “dice” odds:
1 = 1^N = (Odds_win + (1-Odds_win))^N = sum_{k=0..N} N!/(k!*(N-k)!) * Odds_win^k * (1-Odds_win)^(N-k)

Sum the terms from 0 wins upward until the cumulative probability reaches roughly 97.5% (the upper edge of a two-sided 95% interval):

Odds         Wins   Cumulative Odds
0.017808975  51     94.692966%
0.013981322  52     96.091098%
0.010764590  53     97.167558%
0.008130949  54     97.980652%

54/2367 = 2.28%: with 95% confidence, random combinations of wins and losses over 2367 tries could produce as many as 54 wins. So 41 wins out of 2367 tries is not enough to be 95% confident that the result could not be created by luck of the dice alone rather than a real effect being measured.
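The exact binomial search above can be sketched outside a spreadsheet; a minimal Python version (my own sketch, not the original spreadsheet) uses the pmf recurrence P(k+1) = P(k)*(N-k)/(k+1)*p/(1-p) so no factorials are ever formed:

```python
def binom_upper_cutoff(n, p, level=0.975):
    """Smallest win count whose cumulative binomial probability
    reaches `level`, built from the pmf recurrence (no factorials)."""
    pmf = (1 - p) ** n          # P(X = 0)
    cum = pmf
    k = 0
    while cum < level and k < n:
        pmf *= (n - k) / (k + 1) * p / (1 - p)   # P(X = k+1) from P(X = k)
        k += 1
        cum += pmf
    return k, cum

n = 2367
p = 41 / n                       # observed rate, about 1.73%
k, cum = binom_upper_cutoff(n, p)
print(k, round(100 * cum, 4))    # 54 wins, matching the table above
```

The early exit also keeps the running pmf in safe floating-point range for the cases of interest here.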


    Don Turnblade

I was trying to build you a more precise model to look at the confidence interval around your measurement, 41/2373. However, the closer I look at it, the less hope I have for this measurement. I believe you have enough confidence for Marketing, which only requires about a 66% confidence interval to make an inference. Business benefits from the 95% confidence interval, as that is the cutoff at which a firm can legally guarantee something; depending on the confidence-interval model used, this number meets or fails the 95% level. When it comes to Engineering modeling for reliability that protects human life, 99.9% confidence quickly appears, and this measurement would cleanly fail that.


    Don Turnblade

Curve fit of tinv(.025,N-1)*sqrt(Odds*(1-Odds)/N) = sqrt(Odds*(1-Odds)/N)*sqrt(5*N/(N-3))

This reduces to sqrt(Odds*(1-Odds))*sqrt(5/(N-3)). It is a reasonable approximation of tinv(.025,N-1)*sqrt(Odds*(1-Odds)/N), and of dice behavior, once N >= 4. Below about N = 20, one can instead compute the probability distribution for dice directly using binomial statistics.
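For intuition on the curve fit: Excel's two-tailed tinv(.025, N-1) approaches the normal quantile with 1.25% in each tail, about 2.2414, for large N, and sqrt(5) is about 2.2361, which is what sqrt(5/(N-3)) tracks. A quick sketch comparing the two multipliers, using the stdlib normal distribution as a large-df stand-in for tinv (an assumption, since Python's stdlib has no t quantile):

```python
from math import sqrt
from statistics import NormalDist

# Large-df stand-in for Excel's two-tailed tinv(.025, N-1): the normal
# quantile with 1.25% in each tail, about 2.2414 (vs sqrt(5) ~ 2.2361).
z = NormalDist().inv_cdf(1 - 0.025 / 2)

for n in (10, 30, 100, 2367):
    fit = sqrt(5 / (n - 3))   # the curve-fit multiplier
    ref = z / sqrt(n)         # the normal-approximation multiplier
    print(n, round(fit, 4), round(ref, 4))
```

At small N the curve fit runs higher, mimicking the fatter tails of Student's T; by N = 2367 the two agree to three decimal places.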

Odds(x successes) = N!/(x!*(N-x)!) * Odds_success^x * (1-Odds_success)^(N-x)

A decent approximation of N!/(x!*(N-x)!) for 0 < x < N:
= (12x/(12x+1)) * (12(N-x)/(12(N-x)+1)) * ((12N+1)/(12N)) * (N/(N-x))^N * ((N-x)/x)^x * sqrt(N/(2*PI*x*(N-x)))

Based on Stirling’s approximation n! = sqrt(2*PI*n)*(n/e)^n*(1+1/(12*n)).
This allows nearly direct computation of binomial odds even when a spreadsheet cannot handle numbers much larger than fact(139), i.e. 139!.
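As a sanity check, the approximation above can be coded directly and compared against an exact binomial coefficient (a sketch in Python rather than a spreadsheet; math.comb supplies the exact value for comparison):

```python
from math import comb, pi, sqrt

def comb_stirling(n, x):
    """Stirling-based approximation of n!/(x!*(n-x)!) for 0 < x < n,
    as written above. Note: the intermediate powers can overflow floats
    for mid-range x; a log-space version would be needed there."""
    return ((12*x / (12*x + 1))
            * (12*(n - x) / (12*(n - x) + 1))
            * ((12*n + 1) / (12*n))
            * (n / (n - x))**n
            * ((n - x) / x)**x
            * sqrt(n / (2 * pi * x * (n - x))))

n, x = 2367, 54
p = 41 / n
rel_err = comb_stirling(n, x) / comb(n, x) - 1   # relative error vs exact
pmf = comb_stirling(n, x) * p**x * (1 - p)**(n - x)
print(rel_err, round(pmf, 6))   # pmf lands near the 0.008131 table row
```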

OK, so your spreadsheet would have 2373 + 1 rows of such computations in it, but it could then directly estimate your 95% confidence interval.


    Don Turnblade


The best advice I have for statistical modeling is this: if a result fails a hypothesis test, you do not have to think much; if it passes, that is when it is time to start thinking.

Rule-of-thumb approximation for the 95% confidence interval: tinv(.025,N-1)*sqrt(Odds*(1-Odds)/N) = sqrt(Odds*(1-Odds))*sqrt(5/(N-3))

Your sample size N is 2373.
Thus, worst case, 1/2*sqrt(5/(2373-3)) = 2.30%.

Your average is 1.7%, so the worst-case interval 1.7% +/- 2.3% still includes zero: with this sample size you cannot be 95% sure that random dice sampling alone could not cause your average. If a real effect is there, more samples are needed to show it at the 95% level.


    Don Turnblade


I do a lot of likelihood modeling for InfoSec work, and I have some suggestions that may spark useful thoughts. My notes are over-focused on InfoSec/risk-management concerns, but they may give you ideas to assist with odds-based model building.

Ideally, one would use half your data to fit the model and half to test it. The data might be split on the present state, followed by projection testing on the next state; this is one area where random splits of the data may not help you test whether the models built have good predictive capability.

Not all odds are correlated. Function1(Odds1)*Function2(Odds2) can lead to several combinations. In my area, Poisson modeling of odds helps build these models: -ln(1-Odds_Failure) = L; Function1 = m1*L1 + b1; Function2 = m2*L2 + b2; Function1*Function2 = m1*m2*L1*L2 + m1*b2*L1 + m2*b1*L2 + b1*b2. Testing against these models is easy to do using a log transformation of the sample data: ln(Function1*Function2) = ln(Function1) + ln(Function2). Log transformations let you use linear correlation testing to identify whether multiplication is a good fit.
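A tiny numeric sketch of why the Poisson transform above makes multiplication testable by linear methods (the two odds values are made up for illustration): surviving two independent failure mechanisms multiplies the survival odds, which adds the transformed rates L.

```python
import math

# Hypothetical failure odds for two independent mechanisms (synthetic numbers).
odds1, odds2 = 0.10, 0.03

# Poisson transform: event rate L such that Odds_Failure = 1 - exp(-L).
L1 = -math.log(1 - odds1)
L2 = -math.log(1 - odds2)

# Combined failure odds: survive both mechanisms, so survival odds multiply,
# and the log transform turns that product into a sum of rates.
combined_fail = 1 - (1 - odds1) * (1 - odds2)
assert math.isclose(-math.log(1 - combined_fail), L1 + L2)
print(round(combined_fail, 6))
```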

    After data is roughly as linear in a graphical sense either naturally or due to simple transforms such as ln(x):

Short review of linear correlation testing using spreadsheet functions:

                                  Var1 (x)                            Var2 (y)
N, count                          count(x)                            count(y)
Average                           <x> = average(x)                    <y> = average(y)
Variance                          Vxx = sumproduct(x,x)/N - <x>^2     Vyy = sumproduct(y,y)/N - <y>^2
Covariance                        Vyx = sumproduct(y,x)/N - <x>*<y>
F0, F-dist hypothesis test        (N-2)*Vyx^2/(Vxx*Vyy - Vyx^2)
Test, 95% confidence (1 = correlated)   if(F0 >= tinv(.05,N-1)^2, 1, 0)

M, slope                          if(Test=1, Vyx/Vxx, 0)
B, intercept                      <y> - M*<x>
S, model sigma                    if(Test=1, sqrt(N/(N-2)*(Vyy - M*Vyx)), sqrt(N/(N-1)*Vyy))
CI95, 95% confidence interval     if(Test=1, tinv(.025,N-2), tinv(.025,N-1))*S

Failure of the Test means that, at 95% confidence, random noise rather than a linear model is the better explanation; no line should be used.
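The recipe above translates almost line-for-line into code. This is a sketch, not the original spreadsheet: Python's stdlib has no tinv, so the large-N normal value (about 3.84) stands in for tinv(.05,N-1)^2, an assumption that is only fair for reasonably large samples.

```python
from statistics import NormalDist

def linear_fit_with_test(xs, ys, crit=None):
    """Spreadsheet-style linear correlation test: moments, F0, then
    slope/intercept only if the correlation test passes."""
    n = len(xs)
    ax, ay = sum(xs) / n, sum(ys) / n
    vxx = sum(x * x for x in xs) / n - ax * ax
    vyy = sum(y * y for y in ys) / n - ay * ay
    vyx = sum(y * x for y, x in zip(ys, xs)) / n - ax * ay
    f0 = (n - 2) * vyx**2 / (vxx * vyy - vyx**2)
    if crit is None:
        # Large-N stand-in for tinv(.05, N-1)^2, about 1.96^2 = 3.84.
        crit = NormalDist().inv_cdf(0.975) ** 2
    test = f0 >= crit
    m = vyx / vxx if test else 0.0
    b = ay - m * ax
    return test, m, b, f0

# Synthetic data, roughly y = 2x with a little noise.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 12.1]
test, m, b, f0 = linear_fit_with_test(xs, ys)
print(test, round(m, 2), round(b, 2))
```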

Note: both full-factorial and partial-factorial test tables remove any correlation between variables caused by the test design rather than the data itself. Thus linear tests between any variable and the outcome can be statistically independent: each variable can be tested against the outcome and fit, the modeled change removed from the outcome, and testing for correlation with the other variables can then begin again.
Naturally, this is easier to do with a good statistical package, but it can be done by hand with sweat equity and a large spreadsheet, especially if the work is needed but only sweat-equity funding is available. ("I can make a better one if I get the right tools" is not a bad internal sales pitch.)

Testing lots of possible models:
Once linearized, partial factorial testing of models:
Partial-factorial arrays can build test cases for single-factor and multiplied-probability models very nicely, and can also comment on correlation with the sample data. Cases in such arrays naturally test Odds1 + Odds2 + Odds3 vs. Odds1*Odds2 + Odds1*Odds3 + Odds2*Odds3, and even Odds1*Odds2*Odds3; in effect, the test-case combinations help you select among models for the odds.
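A minimal sketch of generating the runs and the competing model terms for three factors (the factor names are illustrative, and only the full-factorial case is shown; a partial factorial would keep a balanced subset of the rows):

```python
from itertools import combinations, product

factors = ["Odds1", "Odds2", "Odds3"]

# Two-level full factorial: every low/high combination of the three factors.
runs = list(product((0, 1), repeat=len(factors)))

# Candidate model terms the runs help discriminate between:
mains = factors                                          # Odds1 + Odds2 + Odds3
pairs = ["*".join(c) for c in combinations(factors, 2)]  # pairwise products
triple = "*".join(factors)                               # Odds1*Odds2*Odds3

print(len(runs), pairs, triple)
```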

In my case, I face significant uncertainty in input data and wish to look at models with three-level factors. Sometimes I have credible averages and sigmas on inputs. Other times I have human expert interview data, which needs special treatment to avoid center bias, extreme-value distortion, self-reporting shame/honor bias, and simple human mismatch with statistical or odds-based thinking. Trinary (three-level) full factorial arrays help in that case.

    Forced choice grading of combinations proposed by a partial factorial array can also help considerably when looking for fundamental relationships between factors.

I had a group of InfoSec experts rate the risk of a system facing the internet on a forced-choice scale from 1 to 32 (only one choice allowed per combination), 1 being the best protection and 32 the worst, from their experience.

V1: Web server: IIS/Apache
V2: Database: MS SQL/MySQL
V3: Application server: Windows/Linux
V4: Framework: .Net/Python
V5: Topology: all on one system/split into separate systems

This addressed a vexing case of risk management that was not process- and policy-dependent. When combined with attacker-behavior models and business-continuity downtime estimates, it allowed IT Audit to use a model to estimate risk on developed platforms that included approximate, modeled feedback from InfoSec staff. The result was improved credibility in risk ranking and a self-teaching model for project managers.


    Don Turnblade

Information Security vulnerability patching queues are a rather hot topic. Consider the time a critical vulnerability sits unpatched in the queue versus the mean time to exploit that vulnerability and harm the firm. Then there is the cost of more staff to set up patching jobs and QA-test them before releasing them to production. Is a faster team more expensive? Should there be more parallel teams? Lot-splitting of jobs? Are some areas of the network more sensitive to external versus internal attack? What about patching plans that are 95% confident to complete before a compliance deadline? PCI DSS requires that critical vulnerabilities be patched within 30 days and high vulnerabilities within 90 days, and internet-facing systems need to patch even medium vulnerabilities. What about WhiteHat statistics suggesting that 80% of websites were vulnerable to one or more critical vulnerabilities throughout 2015? Coding standards to prevent, discover, and remediate both internet- and intranet-facing vulnerabilities involve many factors in creating reliable processes: cost-effective change, lean processes, and patching reliable enough to meet or exceed compliance objectives.
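As a toy illustration of the "95% confident before the deadline" idea, one could model a serial patch queue with a normal approximation. All figures here are hypothetical, and independence plus roughly normal per-job cycle times are assumptions, not facts from the thread:

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical figures: 12 patch jobs, each averaging 2.0 days of queue +
# QA + release time with sigma 0.5 days, run serially by one team.
jobs, mean_days, sigma_days = 12, 2.0, 0.5

total_mean = jobs * mean_days           # 24 days expected in total
total_sigma = sqrt(jobs) * sigma_days   # independent jobs: sigmas add in quadrature

# Odds the queue clears inside the 30-day PCI DSS critical-patch window.
p_on_time = NormalDist(total_mean, total_sigma).cdf(30)
print(round(p_on_time, 4))
```

The same calculation run in reverse (solve for the job count or team count that keeps p_on_time at 95%) is the planning question the paragraph raises.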


    Don Turnblade

Drive storage arrays such as RAID stacks have large numbers of disks running in parallel that can fail, and automated procedures exist that spin up hot-swappable drives to replace them. But predicting purchase rates when cheaper drives fail faster, versus purchase and failure rates when more expensive drives last longer, leads to a cost-of-quality problem in operation. Initially it might be treated as a time-dependent binomial or Poisson statistical problem. First-cut models that I used at the time searched for the peak probability of failure and a confidence interval for hot-swappable drive inventory management; drive-quality factors can improve on this.
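A first-cut sketch of the Poisson view (the array size, failure rate, and period are invented for illustration): stock the smallest spare count whose cumulative Poisson probability covers the period at 95%.

```python
from math import exp

def poisson_quantile(lam, level=0.95):
    """Smallest spare count k with P(failures <= k) >= level
    for a Poisson(lam) failure count."""
    pmf = exp(-lam)          # P(X = 0)
    cum = pmf
    k = 0
    while cum < level:
        pmf *= lam / (k + 1)  # P(X = k+1) from P(X = k)
        k += 1
        cum += pmf
    return k

# Hypothetical array: 200 drives at 2% annual failure odds per drive, so the
# quarterly failure count is roughly Poisson with lam = 200 * 0.02 / 4.
lam = 200 * 0.02 / 4
print(poisson_quantile(lam))  # spares to stock for 95% quarterly coverage
```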
