Stein’s Method in High Dimensional Classification and Applications

Do-Hwan Park, Department of Mathematics and Statistics
Junyong Park, Department of Mathematics and Statistics

High-dimensional data often consist of only a few informative components. Standard statistical modeling and estimation in such a situation are prone to inaccuracies due to overfitting unless regularization methods are applied. In the context of classification, we propose a class of regularization methods based on shrinkage estimators. The shrinkage is based on variable selection coupled with SCAD shrinkage, using Stein’s unbiased estimator of the risk, and we derive an estimator for the optimal shrinkage method. We demonstrate and examine our method on simulated data and three real data sets, and compare it to the Independence Rule and the Feature Annealed Independence Rule.
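To make the idea of choosing a shrinkage level through Stein's unbiased risk estimate concrete, the following is a minimal sketch and not the estimator derived in this work. It assumes the components are independent standardized statistics with unit variance, z_i ~ N(theta_i, 1), applies the usual SCAD thresholding rule with the conventional default a = 3.7, and picks the threshold from a grid by minimizing SURE(lambda) = p + sum g(z_i)^2 + 2 sum g'(z_i), where g(z) = delta(z) - z. The function names `scad_threshold`, `scad_sure`, and `select_lambda`, the grid, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def scad_threshold(z, lam, a=3.7):
    """SCAD thresholding rule applied componentwise (assumes a > 2)."""
    z = np.asarray(z, dtype=float)
    absz = np.abs(z)
    # Soft thresholding for |z| <= 2*lam.
    soft = np.sign(z) * np.maximum(absz - lam, 0.0)
    # Linear interpolation region for 2*lam < |z| <= a*lam.
    mid = ((a - 1.0) * z - np.sign(z) * a * lam) / (a - 2.0)
    # No shrinkage for |z| > a*lam.
    return np.where(absz <= 2.0 * lam, soft, np.where(absz <= a * lam, mid, z))


def scad_sure(z, lam, a=3.7):
    """Stein's unbiased risk estimate for SCAD thresholding, unit variance assumed."""
    z = np.asarray(z, dtype=float)
    absz = np.abs(z)
    g = scad_threshold(z, lam, a) - z
    # Derivative of g(z) = delta(z) - z on each piece of the rule.
    dg = np.where(
        absz < lam, -1.0,
        np.where(absz <= 2.0 * lam, 0.0,
                 np.where(absz <= a * lam, 1.0 / (a - 2.0), 0.0)),
    )
    return z.size + np.sum(g ** 2) + 2.0 * np.sum(dg)


def select_lambda(z, grid, a=3.7):
    """Return the grid value minimizing SURE, together with all risk estimates."""
    grid = np.asarray(grid, dtype=float)
    risks = np.array([scad_sure(z, lam, a) for lam in grid])
    return grid[int(np.argmin(risks))], risks


# Illustrative usage on simulated sparse data (assumed setup, not from the paper).
rng = np.random.default_rng(0)
theta = np.zeros(1000)
theta[:20] = 4.0                       # only a few informative components
z = theta + rng.standard_normal(theta.size)
lam_hat, risks = select_lambda(z, np.linspace(0.0, 4.0, 41))
shrunk = scad_threshold(z, lam_hat)
print("selected lambda:", lam_hat)
print("nonzero components kept:", int(np.count_nonzero(shrunk)))
```

Two sanity checks follow directly from the formula: at lambda = 0 nothing is shrunk and the estimate equals p, the risk of the unshrunk statistics, and for a threshold larger than every |z_i| the estimate reduces to sum z_i^2 - p.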