
fastR

fastR is an R package that contains data and other utilities to support the book Foundations and Applications of Statistics: An Introduction Using R. This book is designed for an upper-level undergraduate "mathematical statistics" course (or 2-semester sequence), but it is different from other books for this audience.

Obtaining the book

Foundations and Applications of Statistics: An Introduction Using R is being published by the American Mathematical Society and is scheduled to appear in early 2011. If you are interested in using the book before it is published, contact the author (rpruim@calvin.edu).

Approach of the book

Features of this book that help distinguish it from other books available for such a course include

Brief Outline of the book

Table of Contents [pdf]

The first four chapters of this book introduce important ideas in statistics (distributions, variability, hypothesis testing, confidence intervals) while developing a mathematical and computational toolkit. I cover this material in a one-semester course. Since some of my students take only the first semester, I wanted to be sure that they get a sense for statistical practice and gain some useful statistical skills even if they do not continue. Interestingly, as a result of designing my course so that stopping half-way makes some sense, I am finding that more of my students are continuing on to the second semester. My sample size is still small, but I hope that the trend continues, and would like to think it is due in part to the fact that the students are enjoying the course and can see “where it is going.”

The last three chapters deal primarily with two important methods for handling more complex statistical models: maximum likelihood and linear models (including regression, ANOVA, and an introduction to generalized linear models). This is not a comprehensive treatment of these topics, of course, but I hope it both provides flexible, usable statistical skills and prepares students for further learning.

Chi-squared tests for goodness of fit and for two-way tables using both the Pearson and likelihood ratio test statistics are covered after first generating empirical p-values based on simulations. The use of simulations here reinforces the notion of a sampling distribution and allows for a discussion about what makes a good test statistic when multiple test statistics are available. I have also included a brief introduction to Bayesian inference, some examples that use simulations to investigate robustness, a few examples of permutation tests, and a discussion of Bradley-Terry models. The latter topic is one that I cover between Selection Sunday and the beginning of the NCAA Division I Basketball Tournament each year. An application of the method to the 2009–2010 season is included.
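To make the idea of an empirical p-value concrete, here is a minimal sketch of a simulation-based goodness-of-fit test using the Pearson statistic. It is not taken from the book or from the fastR package: it is written in plain Python (standard library only) rather than R, the function names are made up for illustration, and the die-roll counts are hypothetical data.

```python
import random


def pearson_statistic(observed, expected):
    """Pearson chi-squared statistic: sum of (O - E)^2 / E over the cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def empirical_p_value(observed, null_probs, n_sims=10_000, seed=0):
    """Estimate a goodness-of-fit p-value by simulating under the null.

    Returns the observed Pearson statistic and the proportion of simulated
    samples whose statistic is at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = sum(observed)
    expected = [n * p for p in null_probs]
    observed_stat = pearson_statistic(observed, expected)

    categories = range(len(null_probs))
    at_least_as_extreme = 0
    for _ in range(n_sims):
        # Draw one sample of size n from the null distribution and tabulate it.
        counts = [0] * len(null_probs)
        for c in rng.choices(categories, weights=null_probs, k=n):
            counts[c] += 1
        # Small tolerance guards against floating-point ties.
        if pearson_statistic(counts, expected) >= observed_stat - 1e-9:
            at_least_as_extreme += 1
    return observed_stat, at_least_as_extreme / n_sims


# Hypothetical data: 60 rolls of a die, tested against a fair-die null.
observed_counts = [8, 9, 19, 6, 8, 10]
stat, p = empirical_p_value(observed_counts, [1 / 6] * 6, n_sims=2000)
print(f"Pearson statistic = {stat:.2f}, empirical p-value = {p:.3f}")
```

The simulated statistics play the role of the sampling distribution under the null hypothesis. Replacing pearson_statistic with a different statistic (for example, a likelihood ratio statistic) leaves the rest of the procedure unchanged, which is what makes it easy to compare test statistics this way.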

Various R functions and methods are described as we go along, and Appendix A provides an introduction to R focusing on the way R is used in the rest of the book. I recommend that you work through Appendix A simultaneously with the first chapter, especially if you are unfamiliar with programming or with R.

Some of my students enter the course unfamiliar with the notation for things like sets, functions, and summation, so Appendix B contains a brief tour of the basic mathematical results and notation that are needed. The linear algebra required for parts of Chapter 4 and again in Chapters 6 and 7 is covered in Appendix C. These can be covered as needed or used as a quick reference. Appendix D is a review of the first four chapters in outline form. It is intended to prepare students for the remainder of the book after a semester break, but it could also be used as an end-of-term review.

fastR Package Project Summary

You can find the project summary page for the fastR package here.