PCA

Prof. Speed writes columns for IMS Bulletin and the April 2008 issue has Terence’s Stuff: PCA (p.9). Here are quotes with minor paraphrasing:

Although a quintessentially statistical notion, my impression is that PCA has always been more popular with non-statisticians. Of course we love to prove its optimality properties in our courses, and at one time the distribution theory of sample covariance matrices was heavily studied.

…but who could not feel suspicious when observing the explosive growth in the use of PCA in the biological and physical sciences and engineering, not to mention economics?…it became the analysis tool of choice of the hordes of former physicists, chemists and mathematicians who unwittingly found themselves having to be statisticians in the computer age.

My initial theory for its popularity was simply that they were in love with the prefix eigen-, and felt that anything involving it acquired the cachet of quantum mechanics, where, you will recall, everything important has that prefix.

He gave the following eigen-’s: eigengenes, eigenarrays, eigenexpression, eigenproteins, eigenprofiles, eigenpathways, eigenSNPs, eigenimages, eigenfaces, eigenpatterns, eigenresult, and even eigenGoogle.

How many miracles must one witness before becoming a convert?…Well, I’ve seen my three miracles of exploratory data analysis, examples where I found I had a problem, and could do something about it using PCA, so now I’m a believer.

No need to mention that astronomers explore data with PCA and utilize eigen- values and vectors to transform raw data into more interpretable ones.

Leave a comment