Math 6010, Spring 2004, Project 1
(Due: October 15, 2004)

1- INTRODUCTION:

For your first project have a look at the file "proj1.dat" in the
listed directory. This file contains grades, from a previous term, of a
certain section of a large statistics course offered by the Mathematics
Department of the University of Utah. In order to be able to analyse
this data set properly, you need to know a few things about the
aforementioned course: It is a rather elementary, first-year course on
statistics. Its mathematical content is quite small, and the students
are introduced to a number of deep ideas in a setting that is as
non-technical as possible. The students' backgrounds vary tremendously;
this is a "QB Course." This might explain why about 30% of the students
attend the main lectures, and 1-5% of the students invariably miss a
midterm exam.

2- ORGANIZATION OF THE COURSE:

The course in question was based on two weekly lectures delivered by
the instructor of the course. In addition, the students would meet with
a T.A. once a week to solve problems, seek help, etc. The students'
grades were based on weekly assignments (totaling nine), three
midterms, and a final examination.

  o The lowest two assignments were dropped. All other homework
    contributed equally to a total of 20% of the final grade.
  o The lowest midterm score was dropped. The other two midterms each
    contributed 30% of the final grade.
  o The final contributed 20% of the final grade.

3- ORGANIZATION OF THE FILE "proj1.dat":

  o THE ROWS: Each row corresponds to an individual student in the
    section that we are studying. There is one exception to this rule:
    THE LAST ROW CONTAINS THE MAXIMUM SCORE FOR EACH COLUMN.
  o THE COLUMNS: The columns respectively denote:

        H1 H2 H3 M1 H4 H5 H6 M2 H7 H8 M3 H9 F,

    where "Hk" refers to "Homework #k," "Mk" to "Midterm #k," and "F"
    to "Final."

4- YOUR PROJECT:

  o Load the file "proj1.dat" into the package of your choice. My
    advice would be "R."
  o Transform the raw data into a data set that has one column for
    "Homework," one for "Midterm," and one for "Final."
      - Compute "Homework" by first rescaling all homeworks to make
        them count the same. Then drop the lowest two homeworks.
        (HINT: In "R" try "sort".) Add the remaining homework grades
        and transform them to a percentage out of 20% (max. homework).
      - Perform similar tasks to obtain a column for "Midterm." It
        should contain percentages out of 60%.
      - Transform the final-exam scores to make them out of 20%. This
        is "Final."
  o Consider the regression problem,

        Final = a + b Homework + c Midterm + noise.

      - Estimate the parameters a, b, and c by least squares.
      - Assume the normal-error model, and test the hypothesis that
        "Homework is ineffective toward Final" (i.e., that b = 0). Do
        this at the 95% confidence level (i.e., at significance level
        5%).
      - Explore whether or not the normal-error model is valid.

5- YOUR REPORT:

  o Form teams of 2-3 people. Share the thinking. Divide the technical
    tasks. But:
      - Write up your own findings independently of your team.
      - Write the report to your boss who runs "your statistical
        consulting firm." "School term-paper" standards are generally
        not high enough.
      - Needless to say, your report MUST be typed. If you are in, or
        plan to be in, the MSTAT program, then you might as well write
        your report in LaTeX-2e. If you do not know LaTeX-2e, then
        learn it; you will soon have to anyway.
      - Your report MUST have a front page. This page contains a
        title, your name, the name(s) of your team-mate(s), and other
        relevant information.
      - Your report must have an introduction. You explain the
        background here.
      - Your report must also have at least one technical section.
        This is where you explain your analysis in detail (and in
        correct prose).
      - Your findings are reported in a separate section.
      - You may choose to write a "Conclusions" section, but this is
        entirely optional.
      - If you use/cite technical material, then cite the source
        carefully and add it to your bibliography.
        This remark applies to your textbook, anything that I have
        written, others' manuscripts and notes, etc.

6- BONNE CHANCE!
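
The transformation and least-squares fit described in Section 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not a prescribed solution: it is written in Python rather than "R," it uses only the standard library, the function names (drop_lowest_and_scale, transform, least_squares) are hypothetical, the data are assumed to be already loaded as a list of numeric rows whose last row holds the column maxima, and missed work is assumed to be recorded as 0. The real layout of "proj1.dat" may call for different handling.

```python
# Column order as listed in Section 3 of the handout.
COLUMNS = ["H1", "H2", "H3", "M1", "H4", "H5", "H6", "M2",
           "H7", "H8", "M3", "H9", "F"]


def drop_lowest_and_scale(fractions, n_drop, weight):
    """Drop the n_drop smallest fractions, average the rest, scale to weight."""
    kept = sorted(fractions)[n_drop:]
    return weight * sum(kept) / len(kept)


def transform(rows):
    """Turn raw rows into (homework, midterm, final) triples.

    Assumption: the LAST row of `rows` holds the column maxima, and
    missed work is recorded as 0.
    """
    *students, maxima = rows
    hw = [i for i, c in enumerate(COLUMNS) if c.startswith("H")]
    mt = [i for i, c in enumerate(COLUMNS) if c.startswith("M")]
    fin = COLUMNS.index("F")
    result = []
    for row in students:
        frac = [x / m for x, m in zip(row, maxima)]  # rescale each score to [0, 1]
        result.append((
            drop_lowest_and_scale([frac[i] for i in hw], 2, 20.0),  # out of 20%
            drop_lowest_and_scale([frac[i] for i in mt], 1, 60.0),  # out of 60%
            20.0 * frac[fin],                                       # out of 20%
        ))
    return result


def least_squares(data):
    """Fit final = a + b*homework + c*midterm via the normal equations.

    No guard against a singular design matrix; this is a sketch only.
    """
    X = [(1.0, h, m) for h, m, _ in data]
    y = [f for _, _, f in data]
    # Augmented 3x4 system [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(3)]
         + [sum(r[i] * v for r, v in zip(X, y))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(3):
        p = max(range(k, 3), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        A[k] = [v / A[k][k] for v in A[k]]
        for i in range(3):
            if i != k:
                A[i] = [vi - A[i][k] * vk for vi, vk in zip(A[i], A[k])]
    return tuple(A[i][3] for i in range(3))  # (a, b, c)


# Made-up toy rows (NOT taken from proj1.dat); the last row is the maxima.
toy = [
    [10, 10, 10, 50, 10, 10, 10, 50, 10, 10, 50, 10, 100],
    [5, 0, 10, 25, 10, 10, 0, 50, 10, 10, 40, 10, 50],
    [10, 10, 10, 50, 10, 10, 10, 50, 10, 10, 50, 10, 100],
]
scores = transform(toy)
```

With real data one would call least_squares(transform(rows)) to obtain estimates of a, b, and c. The hypothesis test on b and the check of the normal-error model are not covered by this sketch; a statistics package such as "R" provides them directly.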