Abstract:
The rate of convergence of the classical Thresholding Greedy Algorithm with respect to some bases is studied. We bound the error of approximation by the product of two norms, the norm of $f$ and the $A_1$-norm of $f$. We obtain some results for greedy bases, unconditional bases and quasi-greedy bases. In particular, we prove that our bounds for the trigonometric basis and Haar basis are optimal.
Bibliography: 16 titles.
Keywords:greedy algorithm, bases, rate of convergence.
This research was performed in Lomonosov Moscow State University and supported by the Russian Science Foundation under grant no. 23-71-30001, https://rscf.ru/en/project/23-71-30001/.
This paper is a follow-up to the very recent paper [14], where we found the rate of convergence of some standard greedy algorithms with respect to arbitrary dictionaries. The new ingredient of [14] was that we bounded the error of approximation by the product of both the norm of $f$ and the $A_1$-norm of $f$. Typically, only the $A_1$-norm of $f$ is used. In this paper we focus on a special class of dictionaries, namely, on various types of bases — greedy bases, unconditional bases, and quasi-greedy bases. We study the rate of convergence of the classical Thresholding Greedy Algorithm with respect to bases in the style of [14].
We consider here approximation in uniformly smooth real Banach spaces. For a Banach space $X$ we define the modulus of smoothness by
We consider here the case $X=L_p$. Let $\Omega$ be a compact subset of $\mathbb{R}^d$ with a probability measure $\mu$. By the $L_p$-norm, $1\leqslant p< \infty$, of a function defined on $\Omega$ we understand
We say that a set $\mathcal{D}$ of elements (functions) from $X$ is a dictionary if each $g\in \mathcal{D}$ has a norm bounded by one ($\|g\|\leqslant 1$) and $\operatorname{\overline{span}} \mathcal{D} =X$. We define the best $m$-term approximation with regard to $\mathcal{D}$ as follows:
where the infimum is taken over the coefficients $c_k$ and the sets of $m$ elements $g_1,\dots,g_m$, $g_i\in\mathcal{D}$, $i=1,\dots,m$.
Let $\mathcal{D}^\pm := \{\pm g\colon g\in \mathcal{D}\}$. We denote the closure of the convex hull of $\mathcal{D}^\pm$ by $A_1(\mathcal{D})$. With $f\in X$ we associate the following norm:
The following result is a corollary of known results on greedy approximation (see [14], Remark 3.2).
Proposition 1.1. Let $X$ be a uniformly smooth Banach space such that ${\rho(u,X) \leqslant \gamma u^q}$, $1<q\leqslant 2$. Then for any dictionary $\mathcal{D}$ and any $\alpha \in [0,1]$ we have
for all $f\in L_p(\Omega,\mu)$ such that $\|f\|_{A_1(\mathcal{D})} <\infty$, $f\neq 0$.
Let a Banach space $X$ with (Schauder) basis $\Psi=\{\psi_k\}_{k=1}^\infty$, be given. We assume that $0<c_0\leqslant \|\psi_k\|_X\leqslant 1$, $k=1,2,\dots$, and consider the following theoretical greedy algorithm. Given an element $f\in X$, we consider the expansion
where $\varrho(j)=k_j$, $j=1,2,\dots$, and we write $\varrho \in D(f)$. If the inequalities in (1.5) are strict, then $D(f)$ consists of only one permutation. We define the $m$th greedy approximant of $f$ with regard to the basis $\Psi$ corresponding to a permutation ${\varrho \in D(f)}$, by the formula
The above algorithm $G_m(\,\cdot\,,\Psi)$ is a simple algorithm, which describes the theoretical scheme for the $m$-term approximation of an element $f$. This algorithm is called the Thresholding Greedy Algorithm (TGA) or simply the Greedy Algorithm (GA). In order to understand the efficiency of this algorithm we compare its accuracy with the best-possible accuracy when an approximant is a linear combination of $m$ terms from $\Psi$. The best we can achieve with the algorithm $G_m$ is
for all elements $f\in X$, and with a constant $G=C(X,\Psi)$ independent of $f$ and $m$. It is clear that, when $X=H$ is a Hilbert space and $\mathcal B$ is an orthonormal basis, we have
It is known (see [6]) that for a greedy basis $\Psi$ inequality (1.7) holds for all permutations $\varrho \in D(f)$. Therefore, Definition 1.1 and Corollary 1.1 imply the following bounds for greedy bases.
Theorem 1.1. Let $1<p<\infty$. Then for any greedy basis $\Psi$ of $L_p(\Omega,\mu)$ and any $\alpha \in [0,1]$ we have
In this paper we extend Theorem 1.1 in different directions. In § 3 we consider the Haar basis, which is a greedy basis for the $L_p$, $1<p<\infty$ (see [9]). We find the sharp order of the $\gamma_m(\alpha,L_p([0,1],\mu),\Psi)$ in the case when $\Psi$ is the Haar basis $\mathcal{H}_p$ and $\mu=\lambda$ is the Lebesgue measure. These bounds are better than the corresponding bounds in Theorem 1.1 in the case $2<p<\infty$. The reader can find results on greedy approximation of smooth multivariate functions with respect to bases similar to the multivariate Haar basis in [10] and [11].
We consider bases which are not greedy. In § 2 we find the sharp order of the quantity $\gamma_m(\alpha,L_p(\mathbb{T}^d,\lambda),\Psi)$ in the case when $\Psi$ is the trigonometric system. The reader can find a survey of the results on the rate of the best $m$-term trigonometric approximations of smooth multivariate functions in [13], Ch. 9.
It is known (see [6]) that greedy bases are exactly those that are unconditional and democratic (see the definitions in § 3 below). In § 3 (see Theorems 3.1 and 3.2) we prove that Theorem 1.1 holds under the assumption that $\Psi$ is an unconditional basis.
In § 3.2 we discuss a wider class of bases than the class of unconditional bases, quasi-greedy bases. It turns out that Theorem 1.1 can be extended to the class of quasi-greedy bases (see Theorem 3.5 below). Finally, in § 3.2 we prove that some error bounds can be improved for a subclass of the quasi-greedy bases, namely, for uniformly bounded quasi-greedy bases.
We use $C$, $C(p,d)$ and so on, to denote various constants, with arguments indicating dependence on other parameters. We use the following symbols for brevity. For two nonnegative sequences $a=\{a_n\}_{n=1}^\infty$ and $b=\{b_n\}_{n=1}^\infty$, the relation, or order inequality, $a_n \ll b_n$ means that there is a number $C(a,b)$ such that, for all $n$, we have $ a_n\leqslant C(a,b)b_n$, and the relation $a_n\asymp b_n$ means that $a_n\ll b_n$ and $b_n\ll a_n$.
§ 2. Trigonometric system
In this section we consider the case $X=L_p({\mathbb T}^d,\lambda)$, $1\leqslant p \leqslant \infty$, $\lambda$ is the normalized Lebesgue measure on $\mathbb{T}^d$ and $\Psi ={\mathcal T}^d := \{e^{i(\mathbf{k},\mathbf{x})}\}_{\mathbf{k}\in {\mathbb Z}^d}$ is the trigonometric system. It is convenient for us to deal with the complex trigonometric system. Results of this section hold for the real trigonometric system as well. Moreover, note that we can use general results for real Banach spaces and real trigonometric system and derive from these results the corresponding results for the complex trigonometric system. We define
Proof. The upper bound follows from Theorem 2.1 and Corollary 1.1. Now we prove the lower bound. Clearly, it is sufficient to carry out the proof in the case $d=1$. We use the Rudin–Shapiro polynomials (see, for instance, [12], Appendix, § 7.4)
for an absolute constant $C_1$. For $s = \pm 1$ set
$$
\begin{equation*}
\Lambda_{s}:=\{ k \colon \epsilon_k=s \}.
\end{equation*}
\notag
$$
Let $s = \pm 1$ be such that $|\Lambda_s| > |\Lambda_{-s}|$. Then $|\Lambda_{-s}|\leqslant m$. Let $G_m(\mathcal{R}_m)$ be a realization of the TGA as applied to $f=\mathcal{R}_m$ with all frequencies in $\Lambda_{-s}$ and maybe with some frequencies in $\Lambda_s$. Set $f_m := \mathcal{R}_m - G_m(\mathcal{R}_m)$. Then all the frequencies of $f_m$ are in $\Lambda_s$, and therefore
This follows from Theorem 2.2 for $\alpha = 2h(p)$ and from the observation that the constant in the upper bound of Theorem 2.2 can depend only on $p$.
Proof. We begin with the upper bounds. In the case $1<p\leqslant 2$ we can follow the lines of the proof of Theorem 2.2 and apply Theorem 2.1 and Corollary 1.1. This gives
We now need the following well-known simple lemma from [7], p. 92 (see [2], § 7.4, for a historical discussion).
Lemma 2.1. Let $a_1\geqslant a_2\geqslant \dotsb \geqslant a_M\geqslant 0$ and $1\leqslant q\leqslant p\leqslant \infty$. Then for all $m\leqslant M$ we have
Writing $\|f_m\|_p = \|f_m\|_p^{1-\alpha}\|f_m\|_p^\alpha$ and using (2.6) and (2.7) we obtain the upper bound with an absolute constant as an extra factor.
We now prove the lower bounds. We use the de la Vallée Poussin kernels (see, for instance, [12], Appendix, § 7.4) for $m=1,2,\dots$:
We use the notations from the above proof of Theorem 2.2. Let $G_m(\mathcal{V}_m)$ be a realization of the TGA as applied to $f=\mathcal{V}_m$ with all frequencies in $\Lambda_{-s}$ and maybe with some frequencies in $\Lambda_s$. Set $f_m := \mathcal{V}_m - G_m(\mathcal{V}_m)$. Then a lower bound for $\|f_m\|_p$ follows from the relations
Note that it is easy to check that, actually, $\|\mathcal V_{m}\|_{A_1(\mathcal{T})} = 3m$. Relations (2.9)–(2.11) imply the required inequality in the case $1 \leqslant p \leqslant 2$. The proof of Theorem 2.3 is now complete.
Corollary 2.2. Let $1\leqslant p\leqslant 2$. Then for any $f\in L_p$ and each $m\in \mathbb{N}$ we have
We recall that we assume that $\Psi :=\{\psi_n\}_{n=1}^\infty$ is a (Schauder) basis of a Banach space $X$ which satisfies the condition $\|\psi_n\|_X \leqslant 1$, $n=1,2,\dots$ .
3.1. Unconditional bases
There are different equivalent definitions of the concept of unconditional basis (see, for instance, [12], § 1.4). It is convenient for us to use the following two equivalent definitions. We use the representation
Writing $\|f_m\|_p = \|f_m\|_p^{1-\alpha}\|f_m\|_p^\alpha$ and using (3.1) and (3.3) we obtain the inequality required in Theorem 3.1. The theorem is proved.
Theorem 3.2. Let $1< p \leqslant 2$, and let $\Psi$ be an unconditional basis of $L_p(\Omega,\mu)$. Then
Proof. This proof goes along the lines of the proof of Theorem 3.1. We only point out where we use another argument. Instead of (3.2), by property U2 we obtain
Writing $\|f_m\|_p = \|f_m\|_p^{1-\alpha}\|f_m\|_p^\alpha$ and using (3.1) and (3.5) we obtain the inequality required in Theorem 3.2. The theorem is proved.
Let $\mathcal{H} :=\{H_k\}_{k=1}^\infty$ denote the Haar basis on $[0,1)$ normalized in $L_2([0,1],\lambda)$ with respect to the Lebesgue measure $\lambda$: $H_1 =1$ on $[0,1)$ and for $k=2^n +l$, $n=0,1,\dots$ and $l=1,2,\dots,2^n$,
We denote by $\mathcal{H}_p :=\{H_{k,p}\}_{k=1}^\infty$ the Haar basis $\mathcal{H}$ renormalized in $L_p([0,1],\lambda)$. It is well known that the Haar basis $\mathcal{H}_p$ is an unconditional basis for the $L_p([0,1],\lambda)$, $1<p<\infty$ (see, for instance, [5]). Now we show that Theorem 3.1 can be improved for the Haar basis.
Theorem 3.3. Let $1< p < \infty$. Then for the space $L_p([0,1],\lambda)$ we have
Now we use Lemma 3.1 (see below), which holds for a quasi-greedy basis (this is a weaker condition than the unconditionality of a basis). By Lemma 3.1 we obtain
The combination of relations (3.10)–(3.12) proves the lower bounds in Theorem 3.3. The theorem is proved.
We say that $\Psi =\{\psi_k\}_{k=1}^\infty$ is $L_p$-equivalent to $\Phi =\{\phi_k\}_{k=1}^\infty$ if for any finite set $\Lambda$ and any coefficients $c_k$, $k\in \Lambda$, we have
for two positive constants $C_1(p,\Psi,\Phi)$ and $C_2(p,\Psi,\Phi)$ which can depend on $p$, $\Psi$ and $\Phi$. For sufficient conditions on $\Psi$ to be $L_p$-equivalent to $\mathcal{H}_p$, see [4] and [1].
Remark 3.1. Theorem 3.3 holds for any basis $\Psi$, which is $L_p$-equivalent to the Haar basis $\mathcal{H}_p$.
3.2. Quasi-greedy bases
As above, let $X$ be an infinite-dimensional separable Banach space with norm $\|\,{\cdot}\,\|:=\|\,{\cdot}\,\|_X$, and let $\Psi:=\{\psi_k\}_{k=1}^{\infty}$ be a semi-normalized basis of $X$ ($0<c_0\leqslant\|\psi_k\|\leqslant 1$, $k\in {\mathbb N}$). In this subsection we concentrate on a wider class of bases than unconditional bases, namely, quasi-greedy bases. The concept of quasi-greedy basis was introduced in [6]. The reader can find a detailed discussion of quasi-greedy bases in [12], Ch. 3.
Definition 3.1. A basis $\Psi$ is called a quasi-greedy basis of $X$ if there exists a constant $C=C(\Psi,X)$ such that
for the decreasing rearrangement of the coefficients of $f$.
The following theorem is borrowed from [15] (see also [12], p. 70, Theorem 3.2.10). We note that in the case $p=2$ Theorem 3.4 was proved in [16].
Theorem 3.4. Let $\Psi=\{\psi_k\}_{k=1}^\infty$ be a quasi-greedy basis of the space $L_p$, ${1\!<\!p\!<\!\infty}$. Then there exist positive constants $C_i=C_i(p,\Psi)$, $i=1,\dots,4$, such that for each $f\in L_p$ we have
Definition 3.2. We say that a basis $\Psi =\{\psi_k\}_{k=1}^\infty$ is a democratic basis of $X$ if there exists a constant $D:=D(X,\Psi)$ such that for any two finite sets of indices $P$ and $Q$ of the same cardinality $|P|=|Q|$ we have
We now discuss a special subclass of quasi-greedy bases, namely, the uniformly bounded quasi-greedy bases. The existence of uniformly bounded quasi-greedy bases in $L_p([0,1],\lambda)$, $1<p<\infty$, is known (see [12], p. 83, Theorem 3.3.5, and p. 76, Theorem 3.2.20).
Theorem 3.6. Let $1<p<\infty$. Then for any uniformly bounded quasi-greedy basis $\Psi$ of $L_p(\Omega,\mu)$ and any $\alpha \in [0,1]$ we have
Proof. The upper bounds can be proved in the same way as Theorem 3.5. Note that in the case $2\leqslant p<\infty$ the upper bounds in Theorem 3.6 follow directly from Theorem 3.5. In the case $1<p\leqslant 2$, instead of Theorem 3.4 we use the following proposition.
Proposition 3.1 (see [12], p. 77, Proposition 3.2.22). Let $\Psi$ be a uniformly bounded quasi-greedy basis $\Psi=\{\psi_j\}_{j=1}^\infty$ in $L_p$, $1<p<\infty$. Then for $f\in L_p$ we have
Now we prove the lower bounds. This proof is based on the following proposition.
Proposition 3.2 (see [12], p. 76, Proposition 3.2.21). Let $\Psi$ be a uniformly bounded quasi-greedy basis $\Psi=\{\psi_j\}_{j=1}^\infty$ in $L_p$, $1<p<\infty$. Then $\Psi$ is democratic with fundamental function $\varphi(m,\Psi,L_p)\asymp m^{1/2}$.
Proposition 3.2 means (see Definition 3.2 above) that there exist two positive constants $c'(p,\Psi)$ and $C'(p,\Psi)$ such that for any index subset $\Lambda$, $|\Lambda|=m$, we have
Combining the above relations we obtain the required lower bounds. Theorem 3.6 is proved.
Remark 3.2. In the case $2\leqslant p<\infty$ the lower bounds of Theorem 3.6 show that Theorem 3.5 is sharp in this case.
§ 4. A criterion for good rate of approximation
We prove here necessary and sufficient conditions on a basis $\Psi$ which provide a good rate of approximation for elements of $A_1(\Psi)$. We define the following property of a basis $\Psi$.
An unconditional bound with parameters $a$ and $K$
We say that a basis $\Psi$ of $X$ satisfies an unconditional bound with parameters $a$ and $K$ (($\mathrm{UB}(a,K)$)) if for each finite set $\Lambda\subset \mathbb{N}$ and each choice of signs $\epsilon_k=\pm 1$, $k\in\Lambda$, we have
It is clear that our condition $\|\psi_k\|_X\leqslant 1$, $k=1,2,\dots$, guarantees that $\Psi$ satisfies $\mathrm{UB}(1,1)$. We restrict ourselves to $a\in [0,1)$. Obviously, an orthonormal basis of a Hilbert space satisfies $\mathrm{UB}(1/2,1)$. There are nontrivial examples of $\mathrm{UB}(a,K)$-bases. For instance, it is known (see [9], [12], Ch. 2, and [13], Ch. 8) that for $1<p<\infty$ the univariate Haar basis normalized in $L_p$ satisfies $\mathrm{UB}(1/p,C(p))$. The reader can find a discussion of results on sparse approximation, including some historical comments, in [13], Ch. 9.
Theorem 4.1. (A) Let $b\in (0,1]$. Suppose that $\Psi$ is such that for any $f\in A_1(\Psi)$ there exists a permutation $\varrho \in D(f)$ with the property
Proof. First we prove part (A). We introduce some notation. For $m\in\mathbb{N}$ let $V(m)$ denote the vertex set of the unit cube $[-1,1]^m$. In other words $V(m)$ consists of all vectors $\mathbf{v} = (v_1,\dots,v_m)$ such that $v_j$ takes values $1$ or $-1$ for $j=1,\dots,m$. For $\Lambda =\{k_j\}_{j=1}^m \subset \mathbb{N}$ and $\mathbf{v} \in V(m)$ set
where $\delta>0$, $Q\subset \mathbb{N}$ is such that $|Q|=m$ and $\Lambda\cap Q =\varnothing$, and $\mathbf{u}$ is any vector in $V(m)$, for instance, $\mathbf{u}=(1,\dots,1)$. Then $((2+\delta)m)^{-1} f\in A_1(\Psi)$,
Proof. Without loss of generality assume that $f$ is normalized in such a way that $a_1(f)< 1$. Set $\mathcal{N}_s:=\{n\colon a_n(f)\geqslant 2^{-s}\}$, $s\in \mathbb{N}$, and $N_s:=|\mathcal{N}_s|$. Note that for small $s$ the sets $\mathcal{N}_s$ can be empty. Clearly, for nonempty $\mathcal{N}_s$ we have $\mathcal{N}_s=[1,N_s]$. We have
Our assumption that $\Psi$ satisfies $\mathrm{UB}(a,K)$ implies that for any finite $\Lambda\subset \mathbb{N}$ and any coefficients $b_k$, $k\in \Lambda$, we have
Then $\mathbf{w} \in [-1,1]^n$, and therefore $\mathbf{w}$ is a convex combination of elements of $V(n)$. This proves (4.5). Using (4.5), from (4.4) we derive
The author is grateful to the referee for helpful comments and suggestions.
Bibliography
1.
R. A. DeVore, S. V. Konyagin and V. N. Temlyakov, “Hyperbolic wavelet approximation”, Constr. Approx., 14:1 (1998), 1–26
2.
Dinh Dũng, V. Temlyakov and T. Ullrich, Hyperbolic cross approximation, Adv. Courses Math. CRM Barcelona, Birkhäuser/Springer, Cham, 2018, xi+218 pp.
3.
M. J. Donahue, L. Gurvits, C. Darken and E. Sontag, “Rates of convex approximation in non-Hilbert spaces”, Constr. Approx., 13:2 (1997), 187–220
4.
M. Frazier and B. Jawerth, “A discrete transform and decompositions of distribution spaces”, J. Funct. Anal., 93:1 (1990), 34–170
5.
B. S. Kashin and A. A. Saakyan, Orthogonal series, Transl. Math. Monogr., 75, Amer. Math. Soc., Providence, RI, 1989, xii+451 pp.
6.
S. V. Konyagin and V. N. Temlyakov, “A remark on greedy approximation in Banach spaces”, East. J. Approx., 5:3 (1999), 365–379
7.
V. N. Temlyakov, “Approximation of functions with a bounded mixed derivative”, Proc. Steklov Inst. Math., 178 (1989), 1–121
8.
V. N. Temlyakov, “Greedy algorithm and $m$-term trigonometric approximation”, Constr. Approx., 14:4 (1998), 569–587
9.
V. N. Temlyakov, “The best $m$-term approximation and greedy algorithms”, Adv. Comput. Math., 8:3 (1998), 249–265
10.
V. N. Temlyakov, “Greedy algorithms with regard to multivariate systems with special structure”, Constr. Approx., 16:3 (2000), 399–425
11.
V. N. Temlyakov, “Universal bases and greedy algorithms for anisotropic function classes”, Constr. Approx., 18:4 (2002), 529–550
12.
V. Temlyakov, Sparse approximation with bases, Adv. Courses Math. CRM Barcelona, Birkhäuser/Springer, Basel, 2015, xii+261 pp.
13.
V. Temlyakov, Multivariate approximation, Cambridge Monogr. Appl. Comput. Math., 32, Cambridge Univ. Press, Cambridge, 2018, xvi+534 pp.
14.
V. N. Temlyakov, On the rate of convergence of greedy algorithms, arXiv: 2304.06423v1
15.
V. N. Temlyakov, Mingrui Yang and Peixin Ye, “Greedy approximation with regard to non-greedy bases”, Adv. Comput. Math., 34:3 (2011), 319–337
16.
P. Wojtaszczyk, “Greedy algorithm for general biorthogonal systems”, J. Approx. Theory, 107:2 (2000), 293–314
Citation:
V. N. Temlyakov, “Rate of convergence of Thresholding Greedy Algorithms”, Sb. Math., 215:2 (2024), 275–289