Abstract
We propose a strategy for computing estimators in some non-standard M-estimation problems, where the data are distributed across different servers and the observations across servers, though independent, can come from heterogeneous sub-populations, thereby violating the identically distributed assumption. Our strategy fixes the super-efficiency phenomenon observed in prior work on distributed computing in (i) the isotonic regression framework, where averaging several isotonic estimates (each computed at a local server) on a central server produces super-efficient estimates that do not replicate the properties of the global isotonic estimator, i.e. the isotonic estimate that would be constructed by transferring all the data to a single server, and (ii) certain types of M-estimation problems involving optimization of discontinuous criterion functions where M-estimates converge at the cube-root rate. The new estimators proposed in this paper work by smoothing the data on each local server, communicating the smoothed summaries to the central server, and then solving a non-linear optimization problem at the central server. They are shown to replicate the asymptotic properties of the corresponding global estimators, and also overcome the super-efficiency phenomenon exhibited by existing estimators.
Citation
Moulinath Banerjee. Cécile Durot. "Circumventing superefficiency: An effective strategy for distributed computing in non-standard problems." Electron. J. Statist. 13 (1) 1926 - 1977, 2019. https://doi.org/10.1214/19-EJS1559
Information