## Time-Accuracy Tradeoffs in Kernel Prediction: Controlling Prediction Quality

** Samory Kpotufe, Nakul Verma**; 18(44):1−29, 2017.

### Abstract

Kernel regression or classification (also referred to as
*weighted* $\epsilon$-NN methods in Machine Learning) are
appealing for their simplicity and therefore ubiquitous in data
analysis. However, practical implementations of kernel
regression or classification consist of quantizing or sub-
sampling data for improving time efficiency, often at the cost
of prediction quality. While such tradeoffs are necessary in
practice, their statistical implications are generally not well
understood, hence practical implementations come with few
performance guarantees. In particular, it is unclear whether it
is possible to maintain the statistical accuracy of kernel
prediction---crucial in some applications---while improving
prediction time.

The present work provides guiding principles for combining kernel prediction with data- quantization so as to guarantee good tradeoffs between prediction time and accuracy, and in particular so as to approximately maintain the good accuracy of vanilla kernel prediction.

Furthermore, our tradeoff guarantees are
worked out explicitly in terms of a tuning parameter which acts
as a *knob* that favors either time or accuracy depending
on practical needs. On one end of the knob, prediction time is
of the same order as that of *single* -nearest-neighbor
prediction (which is statistically inconsistent) while
maintaining consistency; on the other end of the knob, the
prediction risk is nearly minimax-optimal (in terms of the
original data size) while still reducing time complexity. The
analysis thus reveals the interaction between the data-
quantization approach and the kernel prediction method, and most
importantly gives explicit control of the tradeoff to the
practitioner rather than fixing the tradeoff in advance or
leaving it opaque. The theoretical results are validated on data
from a range of real-world application domains; in particular we
demonstrate that the theoretical *knob* performs as
expected.