Boston College is a great university. So why does its graduate-level multivariate statistics course, Sociology 703, use Statistics and Data Analysis for Nursing Research as “the” required text? Because sociologists are so akin to nurses? Why does what is perhaps the graduate-level statistics textbook for the social and behavioral sciences (Using Multivariate Statistics by Tabachnick & Fidell) fail to cover multivariate statistics? Why, in general, do graduate research and statistics courses across the sciences end up teaching how to use a statistical software package (e.g., SPSS), so that today’s researchers can run statistical tests without understanding the mathematics that underlies them?
To answer this, we can look to the 3rd edition of The Linear Algebra a Beginning Graduate Student Ought to Know, keeping in mind that virtually all statistical methods employed in research rely on linear/matrix algebra:
Linear algebra is a living, active branch of mathematical research which is central to almost all other areas of mathematics and which has important applications in all branches of the physical and social sciences and in engineering. However, in recent years the content of linear algebra courses required to complete an undergraduate degree in mathematics—and even more so in other areas—at all but the most dedicated universities, has been depleted to the extent that it falls far short of what is in fact needed for graduate study and research or for real-world application. This is true not only in the areas of theoretical work but also in the areas of computational matrix theory, which are becoming more and more important to the working researcher as personal computers become a common and powerful tool. Students are not only less able to formulate or even follow mathematical proofs, they are also less able to understand the underlying mathematics of the numerical algorithms they must use. The resulting knowledge gap has led to frustration and recrimination on the part of both students and faculty alike, with each silently—and sometimes not so silently—blaming the other for the resulting state of affairs. This book is written with the intention of bridging that gap.
Note that this book is designed for those going into graduate mathematics programs, i.e., those most likely to appreciate the nuances and essential concepts of multivariate mathematics, statistics, probability, etc. It is not designed or intended to address the far more severe deficiencies in the graduate mathematical and statistical education of students who took (one hopes) an introductory statistics course as undergraduates. More importantly, if mathematics majors are increasingly incapable of demonstrating a sufficiently in-depth grasp of the foundations of multivariate statistics, how can we expect more of researchers who lack anything close to this level of familiarity with mathematics?
We can’t. So we have two choices: 1) we can start ensuring that researchers actually understand the statistical methods they use, or 2) we can develop software packages that enable researchers to use methods they don’t understand. So far, we have opted for the latter. As a result, research is replete with terribly flawed uses of advanced statistical models.