R version 2.10.1 (2009-12-14) Copyright (C) 2009 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. [R.app GUI 1.31 (5538) powerpc-apple-darwin8.11.1] ># Math 3080-1 (Treibergs) Jan. 25, 2010 ># ># Example of Chi-Squared test for probabilities ># depending on a parameter ># ># From J. Devore, "Probability and Statistics for ># Engineering and the Sciences" 4th ed p. 606 no. 14. ># ># A paper in "Annals of Mathematical Statistics" reports ># the following data on the number of borers in each of ># 120 groups of borers. Does the poisson pmf provide a ># plausible model for the distribution of the number of ># borers in a group? (Hint: add the frequencies of ># 7,8,...,12 to establish a single category ">= 7." ># ># x = No. of borers in group ># 0 1 2 3 4 5 6 7 8 9 10 11 12 ># ------------------------------------- ># 24 16 16 18 15 9 6 5 3 4 3 0 1 ># Freq.:No of groups with exactly x borers ># > fr <- c(24,16,16,18,15,9,6,5,3,4,3,0,1); fr [1] 24 16 16 18 15 9 6 5 3 4 3 0 1 > sum(fr) [1] 120 > nn <- 0:12;nn [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 > borers <- sum(nn*fr) > borers [1] 380 ># ># E(x)=lambda for Poisson. Full MLE estimator is ># elambda = Total no borers / total no of groups ># > elambda <- 380/120;elambda [1] 3.166667 ># ># pdf for Poisson ># > poi <- dpois(nn, lambda= elambda);poi [1] 4.214384e-02 1.334555e-01 2.113045e-01 [4] 2.230437e-01 1.765763e-01 1.118316e-01 [7] 5.902225e-02 2.670054e-02 1.056896e-02 [10] 3.718710e-03 1.177591e-03 3.390036e-04 [13] 8.945928e-05 ># ># expected no if Poisson ># > 120*poi [1] 5.05726122 16.01466053 25.35654584 26.76524284 [5] 21.18915058 13.41979537 7.08266978 3.20406490 [9] 1.26827569 0.44624515 0.14131096 0.04068043 [13] 0.01073511 ># ># categories with nn > 6 have fewer than 5 so ># so rule-of-thumb for applicability fails. ># Remedy: lump all x>6 into one category. ># > sum(fr[nn>6]) [1] 16 > fr2 <- c(fr[nn<7],16); fr2 [1] 24 16 16 18 15 9 6 16 ># ># Cum. dist fn is called ppois() ># > poi2 <-c(poi[nn<7],1-ppois(6,lambda=elambda));poi2 [1] 0.04214384 0.13345550 0.21130455 0.22304369 [5] 0.17657625 0.11183163 0.05902225 0.04262228 > 120*poi2 [1] 5.057261 16.014661 25.356546 26.765243 21.189151 [6] 13.419795 7.082670 5.114674 > sum(poi2) [1] 1 > chisq.test(fr2,p=poi2) Chi-squared test for given probabilities data: fr2 X-squared = 103.8717, df = 7, p-value < 2.2e-16 > qchisq(.05,df=6,lower.tail =T) [1] 1.635383 > qchisq(.05,df=7,lower.tail =T) [1] 2.16735 > ># ># There are k=8 categories after lumping. ># m=1 parameters are estimated. >$ The Full MLE is not the MLE for Binned data ># but can be used. The critical value c for level alpha ># falls between ># qchisq(.05,df=k-m-1,lower.tail = F) < c < qchisq(.05,df=k-1,lower.tail = F) ># Here, df = k - 1 - m = 8 - 1 - 1 = 6 or df= k-1=7 > qchisq(.001,df=7,lower.tail =F) [1] 24.32189 > qchisq(.001,df=6,lower.tail =F) [1] 22.45774 ># Our chi-sq far exceeds the upper estimate. ># ># It's strong evidence that distribution is not Poisson. >#