DP-Fast MH: Private, Fast, and Accurate Metropolis-Hastings for Large-Scale Bayesian InferenceBayesian inference provides a principled framework for learning from complex
data and reasoning under uncertainty. It has been widely applied in machine
learning tasks such as medical diagnosis, drug design, and policymaking. In
these common applications, the data can be highly sensitive. Differential
privacy (DP) offers data analysis tools with powerful worst-case privacy
guarantees and has been developed as the leading approach in privacy-preserving
data analysis. In this paper, we study Metropolis-Hastings (MH), one of the
most fundamental MCMC methods, for large-scale Bayesian inference under
differential privacy. While most existing private MCMC algorithms sacrifice
accuracy and efficiency to obtain privacy, we provide the first exact and fast
DP MH algorithm, using only a minibatch of data in most iterations. We further
reveal, for the first time, a three-way trade-off among privacy, scalability
(i.e. the batch size), and efficiency (i.e. the convergence rate),
theoretically characterizing how privacy affects the utility and computational
cost in Bayesian inference. We empirically demonstrate the effectiveness and
efficiency of our algorithm in various experiments.
arxiv.org