Computationally tractable bootstrap for high-dimensional data
Holger Dette and Angelika Rohde
The bootstrap is the classical approach to approximating the distribution of estimators and test statistics when the asymptotic distribution contains unknown quantities or provides a poor approximation. For the analysis of massive data, however, the bootstrap is computationally intractable in its basic sampling-with-replacement version. Moreover, it is not even valid in some important high-dimensional applications. This project proposes a new, computationally tractable bootstrap algorithm designed for high-dimensional massive data sets, which combines subsampling of observations with a suitable selection of their coordinates. Its performance is studied for statistics of high-dimensional sample covariance matrices, namely linear spectral statistics and PCA-preprocessed statistics, for which the common sampling-with-replacement bootstrap fails.
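
To make the general idea concrete, the following is a minimal Python sketch of drawing a subsample of observations together with a subset of coordinates and evaluating a linear spectral statistic on the reduced data. It is an illustration under stated assumptions and not the algorithm proposed in the project: the function names, the uniform random selection of rows and columns, the test function log(1 + x), and the sizes m and q are all hypothetical choices, and the abstract does not specify how the subsample size or the coordinates should be selected.

```python
import numpy as np


def linear_spectral_statistic(x, f=np.log1p):
    """Sum of f over the eigenvalues of the sample covariance matrix of x.

    Rows of x are observations, columns are coordinates. The default
    f(t) = log(1 + t) is an illustrative choice only.
    """
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    eigvals = np.linalg.eigvalsh(cov)
    # Guard against tiny negative eigenvalues caused by rounding.
    eigvals = np.clip(eigvals, 0.0, None)
    return float(np.sum(f(eigvals)))


def subsample_replicates(x, m, q, n_rep, rng):
    """Hypothetical helper: recompute the statistic on n_rep random
    sub-blocks of x with m observations and q coordinates each.

    Rows and columns are drawn uniformly without replacement, which is
    an assumption made for this sketch.
    """
    n, p = x.shape
    replicates = np.empty(n_rep)
    for b in range(n_rep):
        rows = rng.choice(n, size=m, replace=False)
        cols = rng.choice(p, size=q, replace=False)
        replicates[b] = linear_spectral_statistic(x[np.ix_(rows, cols)])
    return replicates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.standard_normal((2000, 500))  # synthetic data, n = 2000, p = 500
    reps = subsample_replicates(data, m=200, q=50, n_rep=100, rng=rng)
    print(reps.mean(), reps.std())
```

Each replicate works with an m-by-q block rather than the full n-by-p data matrix, which is what keeps the cost per replicate small. How the replicates would need to be centred and rescaled to approximate the distribution of the full-sample statistic is part of the methodology and is not addressed by this sketch.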