Math 3080-1, Treibergs
TERM PROJECT
Due Wednesday, April 28, 2010

Instructions

You are to work alone on this project, using only your texts and notes. You may ask either Carlos Gamez or me questions but no one else. In answering each question, you should provide a typewritten response with references to statistical output that you will attach to the back of your report. Be sure to number the pages and refer to the appropriate page number. The response to each question should conclude with a brief summary of your findings described in the context of the practical situation and written for an audience of non-statisticians.

Questions on the Colleges and Universities Data

This data is taken from David A. Levine, Patricia M. Ramsey & Robert K. Smidt, "Applied Statistics for Engineers and Scientists," Prentice Hall (2001) p. 668. The following variables were collected for 80 colleges and universities:

School                Name of institution
Type of Term          Academic calendar type (1 = semester, 0 = other)
Type of School        Institution is (1 = private, 0 = public)
Average Total SAT     School average for total score of Scholastic Aptitude Test
TOEFL Score           Test of English as a Foreign Language (1 = criterion at least 550, 0 = otherwise)
Room and Board        Room and board expenses (in thousands of dollars)
Annual Total Cost     Annual total cost (in thousands of dollars)
Average Indebtedness  Average indebtedness at graduation (in thousands of dollars)

1. Using the Colleges & Universities Data, develop a multiple linear regression model to predict the Average Indebtedness from all other variables. Be sure to consider transformations of the variables as well as powers of the variables and interactions between the independent variables. Are there any variables you would delete from the model? If so, explain why. Remove the variables you do not find useful and give the best model you would recommend. Give the reasons for your choice.
Perform a thorough residual analysis, discuss the usefulness of your model and discuss whether the model assumptions are reasonably satisfied.

Colleges and Universities Data

                             Type of  Type of  Average    TOEFL  Room and  Annual      Average
School                       Term     School   Total SAT  Score  Board     Total Cost  Indebtedness
ArizonaStateUniversity       1        0        1080       0      4.3       12.7        12.9
BallStateUniversity          1        0        985        1      4         12.5        8.21
Cal.StateUniv.-Fresno        1        0        955        0      5.4       13.1        8.76
ClemsonUniversity            1        0        1130       1      3.9       12.4        9.98
CollegeofWilliam&Mary        1        0        1295       1      4.5       19.4        13.42
FloridaInternationalUniv.    1        0        1135       0      2.7       10          4.14
FloridaStateUniversity       1        0        1180       1      4.5       11.5        16.5
GeorgeMasonUniversity        1        0        1055       1      5         17          13
GeorgiaStateUniversity       0        0        1115       0      7.4       15.4        8.08
MontclairStateUniversity     1        0        1025       0      5.3       10.2        4.5
NorthCarolinaStateUniv.      1        0        1145       1      4         14.3        14.99
OregonStateUniversity        0        0        1072       1      4.4       15.5        10.5
PurdueUniversity             1        0        1095       1      4.5       15.2        11.84
SanDiegoStateUniversity      1        0        945        1      6.2       13.6        6.75
SlipperyRockUniv.ofPenn.     1        0        955        0      3.6       12.9        17
SUNY-Binghamton              1        0        1039       1      4.6       13.4        6.25
TexasA&MUniversity           1        0        1150       1      3.9       12.7        14.1
Univ.ofGeorgia               0        0        1180       1      4         11.9        10.8
Univ.ofHawaii-Manoa          1        0        1075       0      4.7       12.6        3.62
Univ.ofHouston               1        0        1065       1      4.1       12.1        9.4
Univ.ofMaryland              1        0        1170       1      5.5       15.7        16.64
Univ.ofMass.-Amherst         1        0        1100       1      4.2       16.4        10.2
Univ.ofNevada-LasVegas       1        0        980        0      5.5       12.3        10
Univ.ofNewHampshire          1        0        1110       1      4.4       18.6        9.66
Univ.ofNorthCarolina-C.H.    1        0        1225       1      4.5       15.2        9.41
Univ.ofTexas-Austin          1        0        1215       1      3.9       12.9        10.2
Univ.ofVermont               1        0        1115       1      5.1       22.4        21.5
VirginiaCommonwealthUniv.    1        0        1005       1      4.3       16.3        14.73
VirginiaTech                 1        0        1265       1      3.5       14.9        10.33
WestVirginiaUniversity       1        0        1025       1      4.6       11.7        10.7
BabsonCollege                1        1        1165       1      7.6       26.4        18
BostonCollege                1        1        1285       1      7.5       26.8        15.86
BostonUniversity             1        1        1235       1      7         27.9        14.46
BowdoinCollege               1        1        1345       1      6         27.8        13.64
BryantCollege                0        1        1080       1      6.7       20.6        18
BucknellUniversity           1        1        1255       1      5         25.4        12.5
CanisiusCollege              1        1        1143       0      5.9       18.8        14.82
CarnegieMellonUniversity     1        1        1335       1      6.1       25.6        15.68
CaseWesternReserveUniv.      1        1        1330       1      5         22.2        26.03
ClarkUniversity              1        1        1121       1      4.4       24.4        17.5
ColbyCollege                 0        1        1275       1      5.7       27.9        11.63
ColgateUniversity            1        1        1300       1      5.9       27.6        9.24
CollegeofHolyCross           1        1        1275       1      6.7       26.8        12.63
EmoryUniversity              1        1        1310       1      6.5       26.6        15.31
FordhamUniversity            1        1        1150       1      7.4       23.4        8.59
Franklin&MarshallCollege     1        1        1260       1      4.5       26.4        11.5
GeorgeWashingtonUniversity   1        1        1235       1      6.9       26.7        14.37
GeorgetownUniversity         1        1        1330       1      7.5       27.5        14.01
GettysburgCollege            1        1        1200       1      4.8       26.4        11.75
HarvardUniversity            1        1        1465       1      7         28.9        11.65
IonaCollege                  1        1        955        1      7.3       19.8        18
LafayetteCollege             1        1        1185       1      6.3       26.7        11.5
LaSalleUniversity            1        1        1105       0      6.7       20.8        11.7
LehighUniversity             1        1        1225       1      6         26.8        13.84
ManhattanCollege             1        1        952        1      7.1       22          9.27
NewYorkUniversity            1        1        1260       1      7.8       28.6        17.32
NiagaraUniversity            1        1        1065       0      5.4       17.6        11.58
NortheasternUniversity       0        1        1055       1      8.2       23.4        25.6
NorthwesternUniversity       0        1        1350       1      6.1       24.2        11.98
ProvidenceCollege            1        1        1185       1      6.7       22.9        17.5
RiceUniversity               1        1        1395       1      6         18          2.32
RochesterInst.Technology     0        1        1185       0      6.1       21.8        17.5
SeattleUniversity            0        1        1100       1      5.3       19.5        12
SetonHallUniversity          1        1        1030       1      7.1       20.8        14.9
SienaCollege                 1        1        1095       0      5.4       17.6        18.25
SouthernMethodistUniversity  1        1        1150       1      5.3       21.3        12.11
St.BonaventureUniversity     1        1        1098       1      5.1       17.5        14
StanfordUniversity           0        1        1430       1      7.3       27.8        12.77
SyracuseUniversity           1        1        1180       1      7.2       24.3        14.5
TulaneUniversity             1        1        1270       1      6.3       27.5        13.85
Univ.ofChicago               0        1        1370       1      7.3       28.8        14.07
Univ.ofMiami                 1        1        1145       1      7.1       25.7        16.07
Univ.ofNotreDame             1        1        1320       1      4.8       23.8        16.57
Univ.ofPennsylvania          1        1        1355       1      7.5       28.6        17.62
Univ.ofPortland              1        1        1135       0      4.5       18.9        13.9
Univ.ofScranton              0        1        1115       0      6.6       21.6        13.5
VanderbiltUniversity         1        1        1295       1      7.1       27.3        14.5
VillanovaUniversity          1        1        1242       1      7         24.8        17.13
WakeForestUniversity         1        1        1280       1      5.2       23.7        18.7
YaleUniversity               1        1        1450       1      6.7       28.9        13.57

Questions on the Traffic Data

Data from Huber, Transportation Research Board, National Research Council, Washington D.C., 1957, as described by Sen and Srivastava, Springer, 1990. The more cars there are on a road, the slower the speed of the traffic. The transportation planner needs to know the dependence of speed on density in order to predict travel times for future highways. The data describes DENSITY = vehicles per mile and SPEED = miles per hour.

2. Develop a simple regression model to describe the density of traffic as a function of the speed. Test the assumptions of your model and summarize your findings. What does the model predict as the mean density when the traffic is flowing at 25 mph?

3. If y = density of traffic and x = speed in mph, examine at least two transformations of the (x,y) data that might improve your model and discuss your findings. Choose a final model that you think best describes the data and explain your choice. What does your model now predict as the mean density when the traffic is flowing at 25 mph?

Traffic Data

DENSITY  SPEED
20.4     38.8
27.4     31.5
106.2    10.6
80.4     16.1
141.3    7.7
130.9    8.3
121.7    8.5
106.5    11.1
130.5    8.6
101.1    11.1
123.9    9.8
144.2    7.8
29.5     31.8
30.8     31.6
26.5     34.0
35.7     28.9
30.0     28.8
106.2    10.5
97.0     12.3
90.1     13.2
106.7    11.4
99.3     11.2
107.2    10.3
109.1    11.4
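
The mechanics behind Questions 2 and 3 (fit a least-squares line of density on speed, then refit after a transformation and compare) can be sketched as below. This is only an illustrative sketch in Python, which the handout does not prescribe. The names TRAFFIC, fit_line, pred_linear and pred_log are hypothetical, and the log transformation of density is an assumed example of one candidate transformation, not a recommended final model. It does not replace the residual analysis the questions ask for.

```python
import math

# (density, speed) pairs copied from the Traffic Data table.
TRAFFIC = [
    (20.4, 38.8), (27.4, 31.5), (106.2, 10.6), (80.4, 16.1),
    (141.3, 7.7), (130.9, 8.3), (121.7, 8.5), (106.5, 11.1),
    (130.5, 8.6), (101.1, 11.1), (123.9, 9.8), (144.2, 7.8),
    (29.5, 31.8), (30.8, 31.6), (26.5, 34.0), (35.7, 28.9),
    (30.0, 28.8), (106.2, 10.5), (97.0, 12.3), (90.1, 13.2),
    (106.7, 11.4), (99.3, 11.2), (107.2, 10.3), (109.1, 11.4),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = b0 + b1*x.

    Returns (b0, b1, r_squared). Hypothetical helper for illustration.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                      # slope
    b0 = y_bar - b1 * x_bar             # intercept
    r_squared = sxy ** 2 / (sxx * syy)  # squared sample correlation
    return b0, b1, r_squared


density = [d for d, _ in TRAFFIC]
speed = [s for _, s in TRAFFIC]

# Untransformed model: density = b0 + b1 * speed.
b0, b1, r2 = fit_line(speed, density)
pred_linear = b0 + b1 * 25.0

# Assumed example transformation: ln(density) = a0 + a1 * speed.
log_density = [math.log(d) for d in density]
a0, a1, r2_log = fit_line(speed, log_density)
# Back-transforming with exp() is a simple approximation; it estimates
# the density on the original scale without any bias correction.
pred_log = math.exp(a0 + a1 * 25.0)

print(f"linear: b0={b0:.3f}, b1={b1:.3f}, R^2={r2:.3f}, "
      f"prediction at 25 mph={pred_linear:.1f}")
print(f"log(y): a0={a0:.3f}, a1={a1:.4f}, R^2={r2_log:.3f}, "
      f"prediction at 25 mph={pred_log:.1f}")
```

The two R^2 values are computed on different response scales (density versus log density), so they should be read alongside residual plots rather than compared directly.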