## The Statistical Performance of Collaborative Inference

*Gérard Biau, Kevin Bleakley, Benoît Cadre*; 17(62):1−29, 2016.

### Abstract

The statistical analysis of massive and complex data sets will
require the development of algorithms that depend on distributed
computing and collaborative inference. Inspired by this, we
propose a collaborative framework that aims to estimate the
unknown mean $\theta$ of a random variable $X$. In the model we
present, a certain number of calculation units, distributed
across a communication network represented by a graph,
participate in the estimation of $\theta$ by sequentially
receiving independent data from $X$ while exchanging messages
via a stochastic matrix $A$ defined over the graph. We give
precise conditions on the matrix $A$ under which the statistical
precision of the individual units is comparable to that of a
(gold standard) virtual centralized estimate, even though each
unit does not have access to all of the data. We show in
particular the fundamental role played by both the non-trivial
eigenvalues of $A$ and the Ramanujan class of expander graphs,
which provide remarkable performance for moderate algorithmic
cost.
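
To make the setting concrete, here is a minimal simulation sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not specify an update rule, so the sketch assumes the recursion $\hat\theta_{t+1} = \frac{t}{t+1} A \hat\theta_t + \frac{1}{t+1} X_{t+1}$, a ring graph with a doubly stochastic $A$, and Gaussian data. The names `ring_matrix` and `simulate` are hypothetical.

```python
import random


def ring_matrix(n):
    """Doubly stochastic matrix on a cycle (assumed topology):
    weight 1/2 on the unit itself, 1/4 on each of its two neighbours."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] += 0.5
        A[i][(i - 1) % n] += 0.25
        A[i][(i + 1) % n] += 0.25
    return A


def simulate(A, theta, steps, rng):
    """Run the assumed collaborative recursion for `steps` rounds.

    Each round, every unit draws one fresh observation with mean `theta`
    (Gaussian, unit variance, by assumption), mixes the previous
    estimates through A, and folds in its own new observation.
    Returns the per-unit estimates and the centralized sample mean
    computed from all observations of all units.
    """
    n = len(A)
    est = [0.0] * n
    total = 0.0
    for t in range(steps):
        x = [rng.gauss(theta, 1.0) for _ in range(n)]
        total += sum(x)
        # Message exchange: each unit averages estimates using its row of A.
        mixed = [sum(A[i][j] * est[j] for j in range(n)) for i in range(n)]
        # Running-mean style update with weights t/(t+1) and 1/(t+1).
        est = [(t * mixed[i] + x[i]) / (t + 1) for i in range(n)]
    centralized = total / (n * steps)
    return est, centralized


if __name__ == "__main__":
    A = ring_matrix(8)
    estimates, centralized = simulate(A, theta=2.0, steps=200, rng=random.Random(0))
    print("centralized:", centralized)
    print("per-unit   :", estimates)
```

Under this assumed recursion, a doubly stochastic $A$ preserves the network average at every step, so the mean of the per-unit estimates equals the centralized sample mean. How closely each individual unit tracks that mean depends on how fast powers of $A$ mix, which is where the non-trivial eigenvalues of $A$ enter.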
