No Unbiased Estimator of the Variance of K-Fold Cross-Validation
Yoshua Bengio, Yves Grandvalet; 5(Sep):1089--1105, 2004.
Abstract
Most machine learning researchers perform quantitative experiments
to estimate generalization error and compare the performance of
different algorithms (in particular, their proposed algorithm). In
order to be able to draw statistically convincing conclusions,
it is important to estimate the uncertainty of such estimates.
This paper studies the very commonly used K-fold cross-validation
estimator of generalization performance. The main theorem shows
that there exists no universal (valid under all distributions)
unbiased estimator of the variance of K-fold cross-validation.
The analysis that accompanies this result is based on the
eigen-decomposition of the covariance matrix of errors, which
has only three different eigenvalues corresponding to three
degrees of freedom of the matrix and three components of the
total variance. This analysis clarifies the nature of the problem
and shows how naive estimators (which do not take into account the
error correlations due to the overlap between training and test
sets) can grossly underestimate the variance. This is confirmed by
numerical experiments in which
the three components of the variance are compared when the
difficulty of the learning problem and the number of folds are varied.
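The underestimation described above can be illustrated numerically. The sketch below is a hypothetical simulation, not the paper's experiments: assuming numpy, it redraws many datasets for a trivial learning task (predicting a Gaussian target by the training-fold mean), runs K-fold cross-validation on each, and compares the naive variance estimator of the CV score, which treats the K fold errors as independent, with the true variance of the CV score measured across datasets. How far the naive estimate falls short depends on the learner's instability and on the error correlations the paper analyzes.

```python
import numpy as np

# Hypothetical illustration (not code from the paper): Monte Carlo comparison
# of the naive variance estimator of the K-fold CV score -- which treats the
# K fold errors as independent -- against the true variance of the CV score,
# measured by redrawing the dataset many times.
rng = np.random.default_rng(0)
n, K, reps = 100, 10, 2000          # dataset size, folds, Monte Carlo draws

cv_scores, naive_vars = [], []
for _ in range(reps):
    # Toy task: predict a standard-normal target by the training-fold mean.
    y = rng.normal(size=n)
    folds = np.array_split(rng.permutation(n), K)
    fold_errs = []
    for test_idx in folds:
        train_mask = np.ones(n, dtype=bool)
        train_mask[test_idx] = False
        pred = y[train_mask].mean()                      # fit on K-1 folds
        fold_errs.append(np.mean((y[test_idx] - pred) ** 2))
    fold_errs = np.asarray(fold_errs)
    cv_scores.append(fold_errs.mean())
    # Naive variance of the CV score: sample variance of the K fold errors
    # divided by K -- valid only if the fold errors were independent.
    naive_vars.append(fold_errs.var(ddof=1) / K)

true_var = np.var(cv_scores, ddof=1)  # Monte Carlo variance of the CV score
print(f"true variance of CV score : {true_var:.5f}")
print(f"mean naive estimate       : {np.mean(naive_vars):.5f}")
```

Varying the dataset size, the number of folds, or the instability of the toy learner changes the size of the gap, in line with the abstract's point that the relative weight of the three variance components depends on the difficulty of the learning problem and on K.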