Abstract:
Tikhonov's setting of the problem of solving systems of linear algebraic equations that are equivalent in accuracy is investigated. The problem is shown to be well posed in this setting.
Bibliography: 5 titles.
Keywords:
systems of linear algebraic equations, normal pseudosolution, matrix norms, correct problems.
This study was carried out at the Moscow Center of Fundamental and Applied Mathematics and supported by the Ministry of Science and Higher Education of the Russian Federation (agreement no. 075-15-2019-1624).
It is known that an arbitrary system of linear algebraic equations Ax=b has a uniquely defined normal pseudosolution. When the natural scalar product is used, it is obtained by the least square method as a vector of minimum length among all vectors minimizing the residual length. If the matrix A is of incomplete rank, then the problem of seeking a normal pseudosolution is ill posed, since there is no continuous dependence on the entries of A. It is quite a common situation when instead of the system Ax=b one proposes to solve an approximate system ˜A˜x=˜b, where
‖˜A−A‖⩽μand‖˜b−b‖⩽δ.
In this case one must understand that small values of the errors μ and δ do not guarantee any closeness of the normal pseudosolutions of the original and approximate systems.
Using Tikhonov’s regularization method (see [1]–[4]) one considers regularized systems
(˜A∗˜A+αI)xα=˜A∗˜b;
it claims that the parameter
α=α(μ,δ)>0
can be chosen so that the vector xα converges to the normal pseudosolution of the system Ax=b as μ,δ→0. Nevertheless, we cannot but think about the physical meaning attached to the search for an object that can vary arbitrarily strongly under small perturbations of the original data. Apparently, problems of this kind do not take account of some important additional information which could help one to restate the original problem so that it becomes well posed. Then solution methods could be sought for this new setting of the problem.
It can be noted that a necessary attribute of regularization methods is the involvement of some additional information. However, in a huge number of works devoted to the development and applications of regularization methods, the new setting using additional information is usually not explicitly stated. In this regard, the work [5] by Tikhonov is of exceptional interest. It is proposed there to include the accuracy parameters μ and δ mentioned above in the statement of the problem of solving systems of linear algebraic equations. It is proposed to treat a whole class of systems equivalent in accuracy as input data. A separate class
Σ=Σ(A0,b0,μ,δ)
is specified by a matrix A0, a right-hand side b0 and two accuracy parameters μ and δ. All systems of the form Ax=b are under consideration, where
‖A−A0‖⩽μand‖b−b0‖⩽δ.
As a normal solution for Σ it is proposed to take a vector of minimum length among all solutions of all consistent systems in this class. The main theorem in [5] claims that a normal solution for a class containing at least one consistent system exists and is unique, provided that the Frobenius (Euclidean) norms are used. A method for the calculation of this solution is also proposed, and some relationship with the idea of regularization is discussed. Nevertheless, [5] contains no theorem stating that the Tikhonov normal solution is well defined.
The aim of this paper is to study the well-posedness of Tikhonov’s setting. The main new result is the theorem on the continuous dependence of the Tikhonov normal solution on the original data A0, b0, μ and δ. In addition, the possibility of some generalizations is investigated, when norms different from the Frobenius ones are used.
§ 2. Uniqueness of Tikhonov’s solution
Assume that the spaces Cn and Cm are endowed with vector norms and that the norms of m×n matrices obey the inequality ‖Ax‖⩽‖A‖‖x‖ for any vector x and any matrix A. The spaces Cn and Cm are called the basic space and the image space, respectively.
The class of systems
Σ=Σ(A0,b0,μ,δ),A0∈Cm×n,b0∈Cm,μ,ν>0,
consists of all systems of the form Ax=b, where
‖A−A0‖⩽μand‖b−b0‖⩽δ.
Systems can be consistent and inconsistent alike. We do not assume that the individual system A0x=b0 is consistent, but at least one system in this class must be so. Among the vectors solving consistent systems in the class Σ we choose the ones of minimum norm. We call them Tikhonov solutions. We denote the set of Tikhonov solutions by
N0={argmin
It follows directly from compactness arguments that N_0 \ne \varnothing. Following [5] we also introduce the important sets
Proof. Let z \in N_0. Proving by contradiction, we assume that \|Az-b_0\| < \delta. We set z_\varepsilon=(1-\varepsilon) z. Then for all small \varepsilon > 0 we obtain the inequalities
which contradict the fact that z has the minimum norm among all solutions of consistent systems in \Sigma. Thus, \|Az-b_0\|=\delta.
Now we assume that \|A - A_0\| < \mu. This inequality remains valid for all sufficiently small perturbations F of the matrix A. We choose a perturbation so that
for any vector z \in N_0. Therefore, N_0=N_1. The lemma is proved.
Note that Lemma 2.1 does not use any restrictions on norms, and thus is extends slightly a similar result stated in [5] for the Frobenius (Euclidean) norms. However, to deduce the equality
\begin{equation*}
N_1=N_2,
\end{equation*}
\notag
we must introduce some restrictions.
A norm on the m \times n matrices is said to be supercompatible with the vector norms in the spaces of columns of sizes m and n if
\bullet the inequality \|Ax\| \leqslant \|A\|\,\|x\| holds for any matrix A of size m \times n and any vector x of size n and, in addition,
\bullet for any nonzero vectors x and y of size n and m, respectively, there exists a matrix A such that y=Ax and \|y\|=\|A\| \,\|x\|.
If the Hölder 2-norms are used as vector norms, then the Frobenius (Euclidean) norm and the spectral norm have the property of supercompatibility. In this case, given two vectors x and y, we can choose A in the form
Lemma 2.2. If the matrix norm is supercompatible with the vector norms in the basic space and the image space, then N_0=N_1=N_2.
Proof. Let z \in N_0 and set r_0=b_0 - A_0 z. Then we certainly have \|r_0\| - \mu \|z\| > 0. Proving by contradiction we assume that z \ne 0 and \| r_0 \| / \|z\| \leqslant \mu. We choose F from the equation
It remains to note that any vector z satisfying this equation is a solution of some consistent system in the class \Sigma. In fact, choosing F in the same way as above and setting A=A_0 + F, we infer that \|b_0 - Az\|=\delta=\|b_0-A_0z\| - \mu \|z\|. The lemma is proved.
Clearly, all Tikhonov solutions have the same (minimum) norm. We denote it by \nu. Then Lemma 2.2 implies that Tikhonov solutions have the smallest norm among the vectors x satisfying
If the norm \|x\| is strictly convex, then this vector is uniquely defined. So we have proved the following theorem.
Theorem 2.1. If the matrix norm is supercompatible with the vector norms and the vector norm in the basic space is strictly convex, then the Tikhonov solution is unique.
§ 3. Continuity of the Tikhonov solution
We let R_0 denote the minimum of the norm of the residual b_0-A_0x on the basic space. For \rho \geqslant R_0 we consider the function
Let f(\rho_k)=\|x_k\| and \|b_0-A_0x_k\|=\rho_k. The sequence of vectors x_k lies in the ball of radius f(\rho); therefore, it contains a convergent subsequence x_{k_l} \to x\Longrightarrow\|x_{k_l}\| \to \|x\|. If \rho_k \to \rho, then \|b_0 - A_0 x\|=\rho\Longrightarrow\lim_{k \to \infty} f(\rho_k)=\|x\|=f(\rho).
To prove continuity from the left, in view of monotonicity it suffices to construct a special sequence of points
such that \rho'_k \to \rho and f(\rho'_k) \to f(\rho) as k\to\infty. Assume that x and x_0 are vectors of minimum norm in the sets M_\rho and M_{\rho'}:
Hence we can choose a sequence of vectors x_k \to x, k\to\infty, such that the number sequence \rho'_k :=\|b_0-A_0x_k\| increases monotonically and converges to \rho. It is clear that f(\rho'_k) \leqslant \|x_k\| \to \|x\|=f(\rho). Let f(\rho'_k)=\|x'_k\| and \|b_0-A_0x'_k\|=\rho'_k. If we suppose that
then x is not a vector of minimum norm in M_\rho. Thus, we have obtained the required special sequence. The lemma is proved.
We let z(\rho) denote a function that identifies a vector of minimum norm in the set M_\rho.
Lemma 3.2. If there is a unique vector z(\rho) of minimum norm in each set M_\rho, then the function z(\rho) is continuous in \rho.
Proof. Consider an arbitrary sequence \rho_k \to \rho, k\to\infty. The corresponding sequence of points z_k=z(\rho_k) is bounded and thus contains convergent subsequences. Let z_{k_l} be any of these subsequences, and let z_{k_l} \to z' as l\to\infty. Clearly, \|b_0-A_0z'\|=\rho. As we have already shown that the function f(\rho)=\|z(\rho)\| is continuous, we have \|z_{k_l}\| \to \|z\| as l\to\infty. Therefore, \|z'\|=\|z\|; by uniqueness z'=z.
Thus, any convergent subsequence z_{k_l} has the same limit z. Hence any convergent subsequence of the bounded number sequence \|z_k-z\| converges to zero, so that \|z_k-z\| \to 0 as k\to\infty, and therefore z_k \to z as k\to\infty. The lemma is proved.
According to the definitions, the quantities R_0 and f(\rho) and the set M_\rho depend on the matrix A_0 and the vector b_0. Therefore, we write
In what follows we assume that for any A_0, b_0, and \rho \geqslant R_0(A_0,b_0) there exists a unique vector of minimum norm in the set M_\rho(A_0,b_0). We denote it by z(\rho,A_0,b_0).
Lemma 3.3. For fixed \rho > R_0(A_0, b_0) the function
is continuous with respect to A, b in a neighbourhood of (A_0, b_0).
Proof. It suffices to prove continuity at the point (A_0, b_0). We let z_0 denote the normal pseudosolution of the system A_0x=b_0. It is obvious that z_0=z(\rho_0,A_0,b_0), where
therefore, \|b-Az_0\| < \rho for all A and b that are sufficiently close to A_0 and b_0. Thus, for \|A-A_0\| \leqslant \varepsilon and \|b-b_0\| \leqslant \varepsilon, where \varepsilon>0 is sufficiently small, we see that
depends continuously on \rho, A_0 and b_0 and decreases strictly monotonically in \rho for any fixed A_0 and b_0 until it attains the value zero. Thus, the function F(\rho)=\mu f(\rho,A_0,b_0) + \delta - \rho is continuous and strictly monotonically decreasing. Therefore, there exists unique \rho_* such that F(\rho_*)=0.
We prove that the function \rho_*=\rho_*(\mu,\delta,A_0,b_0) depends continuously on its arguments. We take an arbitrary sufficiently small \varepsilon > 0. Then
remains on the interval [\rho_*-\varepsilon, \rho_*+\varepsilon]. By Lemma 3.2 the vector z(\rho) depends continuously on \rho and therefore on A_0 and b_0. The theorem is proved.
§ 4. Final observations
Proving that the Tikhonov solution is well defined is related directly to its calculation algorithm, which consists of two steps:
\bullet find the root \rho_* of the equation \rho=\delta + \mu f(\rho);
\bullet find a vector of minimum norm in the set \|b_0-A_0 x\| \leqslant \rho.
The problem at the second step is to find a point x such that f(\rho)=\|x\|. The corresponding value of \rho is specified at the first step. In the case of the Frobenius norms these are quadratic programming problems.
We emphasize that the individual system A_0x=b_0 is not assumed to have a solution. However, if this system is consistent, then the Tikhonov solution can be regarded as a two-parameter regularization algorithm. The following theorem generalizes the result in [5], stated there for the Frobenius norms.
Theorem 4.1. Let the assumptions of Theorem 2.1 hold. If the system A_0x=b_0 is consistent, then the Tikhonov solutions x(A_0,b_0,\mu,\delta) converge to its minimum norm solution as \delta, \mu \to 0.
Proof. Let \widehat{x} be a minimum norm solution of the system A_0x\,{=}\,b_0. Given A_0, b_0, \mu and \delta, the Tikhonov solution x is equal to z(\rho,A_0,b_0) for \rho \leqslant \delta + \mu \|\widehat{x}\|. Therefore, \rho \to R_0=0 as \mu,\delta \to 0. By Lemma 3.2, z(\rho,A_0,b_0) \to z(R_0,A_0,b_0)=\widehat{x}. The theorem is proved.
How restrictive is the requirement that the class \Sigma must contain at least one consistent system?
For m \leqslant n this is certainly fulfilled: it suffices to note that there is a nonsingular matrix in an arbitrarily small neighbourhood of a singular matrix. For m > n it can occur that for any small perturbations of the matrix and the right-hand side the resulting systems are inconsistent. However, we can switch from A_0x=b_0 to the extended system with square matrix
The normal pseudosolution of the original system obviously coincides with the normal pseudosolution of the extended system. Now we can seek a Tikhonov solution \widetilde{z} for the class \Sigma=\{ \widetilde{A}_0, b_0, \mu, \delta \} and take the vector formed by the first n components of \widetilde{z} as a generalized solution of the original system.
The fact that only compatible systems are considered in the definition of a Tikhonov solution is apparently essential. Even for a class containing consistent systems we can consider a vector of minimum norm among all normal pseudosolutions of systems equivalent in accuracy; however, such a vector can be distinct from the Tikhonov solution.
Introducing additional restrictions on matrices (for example, symmetry, sparseness, Toeplitz property and so on) when we define a Tikhonov solution requires a careful analysis, since perturbations due to the supercompatibility of the matrix norm with the vector norms can violate these restrictions.
Bibliography
1.
V. V. Vasin, Foundations of the theory of ill-posed problems, Publishing house of the Siberian Branch of the Russian Academy of Sciences, Novosibirsk, 2020, 312 pp. (Russian)
2.
S. I. Kabanikhin, Inverse and ill-posed problems, Sibirskoe Nauchnoe Izdatel'stvo, Novosibirsk, 2009, 457 pp. (Russian)
3.
A. N. Tikhonov and V. Y. Arsenin, Solutions of ill-posed problems, 2nd revised and augmented ed., Nauka, Moscow, 1979, 286 pp. ; English transl. of 1st ed., Scripta Series in Mathematics, V. H. Winston & Sons, Washington, DC; John Wiley & Sons, New York–Toronto, ON–London, 1977, xiii+258 pp.
4.
A. N. Tikhonov, A. S. Leonov and A. G. Yagola, Nonlinear ill-posed problems, 2nd revised and augmented ed., KURS, Moscow, 2017, 400 pp.; English transl. of 1st ed., v. 1, 2, Appl. Math. Math. Comput., 14, Chapman & Hall, London, 1998, xxviii+386 pp.
5.
A. N. Tikhonov, “Approximate systems of linear algebraic equations”, Zh. Vychisl. Mat. Mat. Fiz., 20:6 (1980), 1373–1383; English transl. in Comput. Math. Math. Phys., 20:6 (1980), 10–22
Citation:
E. E. Tyrtyshnikov, “A well-posed setting of the problem of solving systems of linear algebraic equations”, Sb. Math., 213:10 (2022), 1436–1443
\Bibitem{Tyr22}
\by E.~E.~Tyrtyshnikov
\paper A~well-posed setting of the problem of solving systems of linear algebraic equations
\jour Sb. Math.
\yr 2022
\vol 213
\issue 10
\pages 1436--1443
\mathnet{http://mi.mathnet.ru/eng/sm9706}
\crossref{https://doi.org/10.4213/sm9706e}
\mathscinet{http://mathscinet.ams.org/mathscinet-getitem?mr=4582598}
\zmath{https://zbmath.org/?q=an:07733503}
\adsnasa{https://adsabs.harvard.edu/cgi-bin/bib_query?2022SbMat.213.1436T}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=000992275100005}
\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-85159945999}
Linking options:
https://www.mathnet.ru/eng/sm9706
https://doi.org/10.4213/sm9706e
https://www.mathnet.ru/eng/sm/v213/i10/p130
This publication is cited in the following 2 articles:
V. Shaydurov, V. Petrakova, “Three-parameter Regularization Algorithm for Pseudo-solution of Non-compatible Systems”, Lobachevskii J Math, 45:1 (2024), 328
V. Shaydurov, S. Kabanikhin, V. Petrakova, “Two algorithms for refinement of approximate solutions in the regularization method”, Lobachevskii J. Math., 44:1 (2023), 437