It took me a while, but I finally have some time to write my review of Efron and Hastie’s new book Computer Age Statistical Inference: Algorithms, Evidence and Data Science.
Bradley Efron is probably best known for the bootstrap, his resampling technique. With Computer Age Statistical Inference, he and Hastie provide a rather short overview of statistics as a whole. The book covers topics from the early beginnings of statistics to the new era of machine learning. As one can imagine, covering such a huge amount of content is not an easy task, and the two authors did their best to focus on a number of interesting aspects.
The book is divided into three major parts: classic statistical inference, early computer-age methods, and twenty-first-century topics. I will review each part in turn. Although it covers a great number of topics, the book is definitely not meant for beginners. The authors assume a fair amount of background in algebra, probability theory, and statistics. Nevertheless, I found it a great way not only to refresh my knowledge, but also to delve deeper into various aspects of classical and modern statistics.
Classic Statistical Inference
Overall, I think this is the strongest part of the book. The authors did not go into extensive detail but covered interesting aspects of frequentist and Bayesian inference. In addition, Efron and Hastie put emphasis on Fisherian inference and maximum likelihood estimation, and demonstrated parallels between these different approaches as well as their historical connections. This really helped me to classify and interconnect all of these different methods. However, I found it a bit surprising how little space is dedicated to frequentist and Bayesian inference compared to Fisherian inference. On the one hand, I really appreciated reading more about Fisher’s ideas and methods, since they are insufficiently covered in most textbooks. On the other hand, I would have hoped for some new insight into Bayesian statistics.
Overall, I really enjoyed this part of the book. It helped me to get a deeper understanding of classical statistical methods.
Early Computer-Age Methods
This part of the book covers quite a variety of topics, from empirical Bayes through generalized linear models (GLMs) to cross-validation and the bootstrap. The bootstrap in particular is covered extensively and pops up in a number of chapters. While this is not particularly surprising given the background of the authors, it does feel like a bit too much. Furthermore, I find that GLMs are covered insufficiently (only 20 pages), considering the importance of linear models in all areas of statistics. However, given the extensive scope of this part of the book, the authors do a fairly good job of discussing each topic in detail without becoming too general.
I especially liked the notes at the end of each chapter, which provide additional historical and mathematical annotations. I often enjoyed these notes more than the actual chapter.
Twenty-First-Century Topics
This is probably the weakest part of the book. While topics such as the local false-discovery rate (FDR), sparse modeling, and the lasso are covered clearly and in detail, topics such as neural networks and random forests feel sparse and are, in my view, insufficiently discussed. The discussion of neural networks feels especially rudimentary. Again, this is not particularly surprising given that neither author is an expert in machine learning. However, the book would be good enough without venturing into machine learning topics. The additional space could have been used for more extensive discussions of FDR or GLMs.
Hence, if you are interested in learning more about machine learning, this book might not be ideal for you. However, that does not mean that individual chapters in this part are bad. Indeed, topics such as support vector machines (SVMs) and the lasso are very well discussed. Nevertheless, although I enjoyed refreshing my knowledge of these methods, I did not feel that I gained as deep an understanding as I did from the previous parts of the book.
Overall, I really enjoyed reading the book. It gave me a great view of current and past statistical applications. It was especially rewarding to discover and understand connections between different methods and ideas. Furthermore, the book is filled with nice examples (the data and code for each example are also available on the authors’ website).
If you want to refresh or update your knowledge of general statistics, Efron and Hastie’s Computer Age Statistical Inference is an excellent choice. You can download the free PDF from the authors’ website.