This article is cited in 13 scientific papers.
Information Theory
Comparison of contraction coefficients for $f$-divergences
A. Makur, L. Zheng
Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA
Abstract:
Contraction coefficients are distribution-dependent constants that are used to
sharpen standard data processing inequalities for $f$-divergences (or relative $f$-entropies) and
produce so-called “strong” data processing inequalities. For any bivariate joint distribution,
i.e., any probability vector and stochastic matrix pair, it is known that contraction coefficients
for $f$-divergences are upper bounded by unity and lower bounded by the contraction coefficient for $\chi^2$-divergence. In this paper, we show that the upper bound is achieved when the
joint distribution is decomposable, and the lower bound can be achieved by driving the input
$f$-divergences of the contraction coefficients to zero. Then, we establish a linear upper bound
on the contraction coefficients of joint distributions for a certain class of $f$-divergences using
the contraction coefficient for $\chi^2$-divergence, and refine this upper bound for the salient special
case of Kullback–Leibler (KL) divergence. Furthermore, we present an alternative proof of the
fact that the contraction coefficients for KL and $\chi^2$-divergences are equal for bivariate Gaussian
distributions (where the former coefficient may impose a bounded second moment constraint).
Finally, we generalize the well-known result that the contraction coefficients of stochastic matrices
(after extremizing over all possible probability vectors) are equal for all nonlinear operator convex $f$-divergences. In particular, we prove that the so-called “less noisy” preorder over stochastic
matrices can be equivalently characterized by any nonlinear operator convex $f$-divergence. As
an application of this characterization, we also derive a generalization of Samorodnitsky's strong
data processing inequality.
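The contraction coefficient for $\chi^2$-divergence that lower bounds all the others is classically computable in closed form: it equals the squared maximal correlation, i.e., the squared second-largest singular value of the divergence transition matrix $B = \mathrm{diag}(p)^{1/2}\,W\,\mathrm{diag}(q)^{-1/2}$, where $q = pW$ is the output distribution. A minimal sketch (not from the paper itself; the function and variable names here are illustrative):

```python
import numpy as np

def chi2_contraction(p, W):
    """Contraction coefficient for chi^2-divergence of the pair (p, W):
    the squared second-largest singular value of the divergence transition
    matrix B = diag(p)^{1/2} W diag(q)^{-1/2}, where q = p W."""
    p = np.asarray(p, dtype=float)
    W = np.asarray(W, dtype=float)
    q = p @ W  # output marginal distribution
    B = np.sqrt(p)[:, None] * W / np.sqrt(q)[None, :]
    s = np.linalg.svd(B, compute_uv=False)  # singular values, descending
    return s[1] ** 2  # the largest singular value of B is always 1

# Example: binary symmetric channel with crossover 0.1 and uniform input;
# the known value is (1 - 2*0.1)^2 = 0.64.
p = np.array([0.5, 0.5])
W = np.array([[0.9, 0.1], [0.1, 0.9]])
print(chi2_contraction(p, W))
```

For a binary symmetric channel with crossover probability $\delta$ and uniform input, this recovers the familiar value $(1-2\delta)^2$, which is the quantity the paper's linear upper bounds are stated in terms of.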
Keywords:
contraction coefficient, $f$-divergence/relative $f$-entropy, strong data processing inequality, less noisy preorder, maximal correlation.
Received: 17.10.2019 Revised: 17.10.2019 Accepted: 09.03.2020
Citation:
A. Makur, L. Zheng, “Comparison of contraction coefficients for $f$-divergences”, Probl. Peredachi Inf., 56:2 (2020), 3–63; Problems Inform. Transmission, 56:2 (2020), 103–156
Linking options:
https://www.mathnet.ru/eng/ppi2315
https://www.mathnet.ru/eng/ppi/v56/i2/p3