Statistical inference for high dimensional data with low rank structure

Zhou, Fan
Koltchinskii, Vladimir
We study two major topics in statistical inference for high dimensional data with low rank structure, a setting that arises in many machine learning and statistics applications. The first topic is nonparametric estimation of a low rank matrix-valued function, with applications to building dynamic recommender systems and to recovering Euclidean distance matrices in molecular biology. We propose a novel nuclear norm penalized local polynomial estimator and establish an upper bound on its point-wise risk measured in the Frobenius norm. We then extend this estimator globally and prove an upper bound on its integrated risk measured in the $L_2$-norm. We also propose another estimator, based on bias-reducing kernels, for the case when the matrix-valued function is not necessarily low rank, and establish an upper bound on its risk measured in the $L_{\infty}$-norm. We show that all of the obtained rates are optimal, up to logarithmic factors, in the minimax sense. Finally, we propose an adaptive estimation procedure for practitioners based on Lepski's method and a penalized data splitting technique, which is computationally efficient and can be easily implemented and parallelized.

The second topic is a spectral perturbation analysis of the higher order singular value decomposition (HOSVD) of a tensor under Gaussian noise. Given a tensor contaminated with Gaussian noise, we establish sharp upper bounds on the perturbation of linear forms of the singular vectors of the HOSVD. In particular, sharp upper bounds are proved for the component-wise perturbation of the singular vectors. These results can be applied to sub-tensor localization and low rank tensor denoising.
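As a rough illustration of the first topic, nuclear norm penalties are typically handled through their proximal operator, soft-thresholding of singular values. The sketch below is not the thesis's local polynomial estimator itself; it is a simplified local-constant (kernel-weighted average) variant whose penalized minimizer has a closed form via singular value thresholding. The function names, the Gaussian kernel, and the bandwidth/penalty parameters are all illustrative assumptions.

```python
# Hedged sketch: nuclear-norm-penalized local estimation of a
# matrix-valued function A(t) from noisy observations X_i = A(t_i) + noise.
# A local-constant simplification of a local polynomial estimator.
import numpy as np

def svt(A, lam):
    """Singular value thresholding: prox of lam * (nuclear norm) at A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

def local_estimate(ts, Xs, t0, h, lam):
    """argmin_A  sum_i w_i ||X_i - A||_F^2 + lam ||A||_*,
    with Gaussian kernel weights w_i = K((t_i - t0) / h) normalized to
    sum to one.  The minimizer is the soft-thresholded weighted average."""
    w = np.exp(-0.5 * ((ts - t0) / h) ** 2)
    w = w / w.sum()
    Abar = np.tensordot(w, Xs, axes=1)  # kernel-weighted mean matrix
    return svt(Abar, lam / 2.0)
```

Kernel averaging controls the variance in time, while the thresholding step shrinks small (noise-driven) singular values toward zero, which is where the low rank structure enters.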
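For the second topic, the HOSVD of a noisy tensor can be sketched in a few lines: take the top singular vectors of each mode-$k$ unfolding, then contract to obtain the core tensor. The code below is a minimal NumPy version for a 3-way tensor, assuming the multilinear ranks are known; it is an illustration of the decomposition being perturbed, not of the thesis's perturbation bounds.

```python
# Hedged sketch of the higher order SVD (HOSVD) of a 3-way tensor,
# e.g. a low rank tensor observed under Gaussian noise.
import numpy as np

def unfold(T, mode):
    """Mode-k unfolding: move axis `mode` to the front, flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def multilinear(G, Us):
    """Multilinear product G x_1 U_1 x_2 U_2 x_3 U_3."""
    T = G
    for mode, U in enumerate(Us):
        T = np.moveaxis(np.tensordot(U, np.moveaxis(T, mode, 0), axes=1), 0, mode)
    return T

def hosvd(T, ranks):
    """HOSVD with given multilinear ranks: U_k holds the top singular
    vectors of the mode-k unfolding; G is the core, T ≈ multilinear(G, Us)."""
    Us = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        Us.append(U[:, :r])
    G = multilinear(T, [U.T for U in Us])
    return G, Us
```

Truncating each unfolding to its top singular vectors is exactly the step whose perturbation under Gaussian noise the thesis analyzes; projecting the noisy tensor back onto the estimated subspaces is the low rank denoising application mentioned above.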