Abstract:
We prove the Pontryagin maximum principle for a general optimal control problem.
The main ingredient of the proof is the abstract lemma on an inverse function,
which is proved via the Schauder fixed-point theorem.
Under this approach, the proof of the Pontryagin maximum principle
is quite short and transparent.
Keywords: the Pontryagin maximum principle, the Schauder fixed-point theorem, inverse function lemma.
The central result of optimal control theory is the Pontryagin maximum principle, and many studies are devoted to its proof. Let us mention some ideas underlying these studies. Historically, the first proof is based on the so-called needle variations of the optimal control (see [1]). The idea of variation dates back to J.-L. Lagrange, who suggested to L. Euler that necessary extremum conditions in variational problems can be proved by varying the extremal function. However, the essence of variation by “needles” is fundamentally different, because it is global with respect to the control. This suggests the concept of a strong minimum, which is local with respect to the phase trajectory and global with respect to the control. For this minimum one can obtain necessary conditions in which the control stationarity condition (as in the Euler–Lagrange equations) is replaced by the stronger condition that the Pontryagin function attain its maximum.
Under the second approach, the original problem is “submerged” in a wider class of problems and is considered as a generic problem in this class rather than as a separate object with its own internal properties. The idea of this approach dates back to A.-M. Legendre, C. G. J. Jacobi, and W. R. Hamilton in the context of searching for sufficient weak minimum conditions in the simplest problem of variational calculus. For optimal control problems, this approach, together with the relaxation idea, was first proposed by Gamkrelidze [2]. This method, which proved to be quite fruitful, is capable of delivering not only the classical Pontryagin maximum principle, but also its extensions and strengthenings (see [3]).
The third idea is to consider some abstract general extremal problem in a Banach space to which the optimal control problem can be reduced (see, for example, [4]–[6]). Under this approach, the main difficulties are related to the proof of necessary extremum conditions for this abstract problem, and the Pontryagin maximum principle itself is derived by deciphering these (in general, quite involved) necessary extremum conditions.
The fourth idea involves the application of various penalty methods to the proof of the Pontryagin maximum principle (this idea was apparently introduced in [7]; see also [8] and [9]). Under this approach, the constraints of the optimal control problem are carried over to a penalty function. After this, necessary extremum conditions are applied to the resulting “penalized” problem, and the Pontryagin maximum principle is then proved by several passages to the limit. Note that the method of proof of the maximum principle proposed in the present paper involves the concept of a strong minimum.
In the present paper, we prove an abstract lemma on an inverse function for close mappings, and then, using some ideas of Gamkrelidze, derive the Pontryagin maximum principle by deciphering the negation of the sufficient conditions for the existence of an inverse function in this lemma. The proof of this lemma depends substantially on the Schauder fixed-point theorem. In our opinion, this approach simplifies the proof of the Pontryagin maximum principle because, on the one hand, it does not involve (unlike the variational approach) the lemma on the packet of needles or an equation in variations, and, on the other hand, the machinery of both the inverse function lemma and its deciphering is simpler than that of the abstract approach or the penalty method.
The paper consists of three sections. In § 1, we formulate the main result (the Pontryagin maximum principle). In § 2, we prove the lemma on an inverse function and some auxiliary results. The proof of the Pontryagin maximum principle based on the inverse function lemma is given in § 3.
where $U$ is a non-empty subset of $\mathbb R^r$, $\varphi\colon \mathbb R\times\mathbb R^n\times\mathbb R^r\to\mathbb R^n$ is a mapping of $t$, $x$ and $u$, $f_0\colon\mathbb R^n\times\mathbb R^n\to \mathbb R$, $f\colon\mathbb R^n\times\mathbb R^n\to \mathbb R^{m_1}$ and $g\colon\mathbb R^n\times\mathbb R^n\to \mathbb R^{m_2}$ are mappings of $\zeta_i\in\mathbb R^n$, $i=0,1$. The inequality is understood coordinatewise.
The above problem is a standard optimal control problem in canonical form, also known as a problem in Mayer form. If the functional to be minimized and/or boundary conditions in the original problem contain functionals involving integrals, then we can easily reduce it to a problem of the form (1)–(3) by introducing extra variables.
In what follows, we always assume that both $\varphi$ and its partial derivative with respect to $x$ are continuous on $\mathbb R\times\mathbb R^n\times \mathbb R^r$; we also assume that $f_0$, $f$ and $g$ are differentiable on $\mathbb R^n\times\mathbb R^n$.
Let $C([t_0,t_1],\mathbb R^n)$, $\mathrm{AC}([t_0,t_1],\mathbb R^n)$ and $L_\infty([t_0,t_1],\mathbb R^r)$ be, respectively, the space of continuous vector functions on $[t_0,t_1]$ with values in $\mathbb R^n$, the space of absolutely continuous vector functions with values in $\mathbb R^n$, and the space of all essentially bounded vector functions with values in $\mathbb R^r$. The space $W_\infty^1([t_0,t_1],\mathbb R^n)$ is the set of all vector functions $x(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],\mathbb R^n)$ such that $\dot x(\,{\cdot}\,)\in L_\infty([t_0,t_1],\mathbb R^n)$. This space is equipped with the norm
A pair $(x(\,{\cdot}\,),u(\,{\cdot}\,)) \in C([t_0,t_1],\mathbb R^n)\times L_\infty([t_0,t_1],\mathbb R^r)$ is an admissible pair for problem (1)–(3) if it satisfies conditions (2) and (3).
Definition 1. An admissible pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ delivers a strong minimum in problem (1)–(3) (an optimal process) if there exists a neighbourhood $V\subset C([t_0,t_1],\mathbb R^n)$ of the function $\widehat{x}(\,{\cdot}\,)$ such that $f_0(x(t_0),x(t_1))\geqslant f_0(\widehat{x}(t_0),\widehat{x}(t_1))$ for any admissible pair $(x(\,{\cdot}\,),u(\,{\cdot}\,))$ with $x(\,{\cdot}\,)\in V$.
To formulate necessary conditions for a strong minimum in problem (1)–(3) (the Pontryagin maximum principle) we need some notation.
The Euclidean norm on $\mathbb R^n$ is denoted by $|\,{\cdot}\,|$. The value of a linear functional $\lambda=(\lambda_1,\dots,\lambda_n)\in(\mathbb R^n)^*$ evaluated on a vector $x=(x_1,\dots,x_n)^\top\in\mathbb R^n$ ($\top$ denotes transposition) is denoted by $\langle \lambda,x\rangle=\sum_{i=1}^n\lambda_ix_i$. Let $(\mathbb R^n)^*_+$ be the set of all linear functionals on $\mathbb R^n$ that assume non-negative values on non-negative vectors. The adjoint operator to a linear operator $\Lambda\colon \mathbb R^n\to\mathbb R^m$ is denoted by $\Lambda^*$.
Let $(\widehat{x}(\,{\cdot}\,), \widehat{u}(\,{\cdot}\,))$ be a fixed pair. For brevity, the derivatives of the mappings $f_0$, $f$ and $g$ and their partial derivatives with respect to $\zeta_0$ and $\zeta_1$ at the point $(\widehat{x}(t_0),\widehat{x}(t_1))$ will be denoted, respectively, by $\widehat f'_0$, $\widehat f'$, $\widehat g'$ and $\widehat f_{0\zeta_i}$, $\widehat f_{\zeta_i}$, $\widehat g_{\zeta_i}$, $i=0,1$. We also set $\widehat\varphi(\,{\cdot}\,)=\varphi(\,{\cdot}\,,\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ and $\widehat\varphi_x(\,{\cdot}\,)=\varphi_x(\,{\cdot}\,,\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$.
Theorem 1 (the Pontryagin maximum principle). If $(\widehat{x}(\,{\cdot}\,), \widehat{u}(\,{\cdot}\,))$ is a strong minimum pair in problem (1)–(3), then there exist a non-zero tuple $(\lambda_0,\lambda_f,\lambda_g)\in \mathbb R_+\times(\mathbb R^{m_1})^*_+\times(\mathbb R^{m_2})^*$ and a vector function $p(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],(\mathbb R^n)^*)$ such that the following conditions are fulfilled:
1) the stationarity condition with respect to $x(\,{\cdot}\,)$
$$
\max_{u\in U}\langle p(t),\varphi(t,\widehat{x}(t),u)\rangle=\langle p(t),\varphi(t,\widehat{x}(t),\widehat{u}(t))\rangle\quad\text{for almost all } \ t\in[t_0,t_1].
$$
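As a hypothetical numerical illustration of the maximum condition (not taken from the paper), consider the scalar problem of minimizing $f_0=-x(T)$ subject to $\dot x=u$, $u(t)\in U=[-1,1]$: here the adjoint equation gives $p(t)\equiv1$, the optimal control is $\widehat u\equiv1$, and the pointwise maximum of $\langle p(t),\varphi(t,\widehat x(t),u)\rangle$ over $U$ can be checked on a grid.

```python
import numpy as np

# Toy problem (illustrative assumption, not from the paper):
# minimize f0 = -x(T) subject to xdot = u, u(t) in U = [-1, 1].
# Adjoint: pdot = -p * phi_x = 0 with p(T) = 1, so p(t) = 1; u_hat(t) = 1.
T = 1.0
ts = np.linspace(0.0, T, 101)          # time grid
p = np.ones_like(ts)                    # adjoint trajectory p(t) = 1
u_hat = np.ones_like(ts)                # candidate optimal control
U_grid = np.linspace(-1.0, 1.0, 201)    # discretisation of the control set U

phi = lambda t, x, u: u                 # right-hand side phi(t, x, u) = u

# Maximum condition: max over U of <p(t), phi(t, x(t), u)> is attained at u_hat(t).
for t, pt, ut in zip(ts, p, u_hat):
    H_over_U = pt * phi(t, None, U_grid)
    assert np.isclose(H_over_U.max(), pt * phi(t, None, ut))
print("maximum condition holds on the grid")
```

This merely checks the pointwise inequality on a finite grid; it is a sanity check of the statement, not a proof.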
Let $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be an admissible pair for problem (1)–(3). Let $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be the set of all quadruples $(\lambda_0,\lambda_f,\lambda_g, p(\,{\cdot}\,))$ satisfying conditions 1)–4) of this theorem. It is clear that the zero quadruple always satisfies these conditions, and so, for a proof of the maximum principle we need to show that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne \{0\}$.
§ 2. Lemma on an inverse function and auxiliary results
Here, we will prove the lemma on an inverse function, which characterizes the condition $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$ in different terms, and formulate the lemma on approximation, which is a particular case of Lemma 4.3 from the authors' paper [3].
Let $Z$ be a normed linear space, $z_0\in Z$ and $\rho>0$. In what follows, $U_Z(z_0,\rho)$ and $B_Z(z_0,\rho)$ will denote, respectively, the open and closed balls in $Z$ of radius $\rho$ with centre at $z_0$. The interior of a set $A\subset Z$ will be denoted by $\operatorname{int}A$.
Let $\mathcal M$ be a topological space. By $C(\mathcal M, Z)$ we denote the space of continuous bounded mappings $F\colon \mathcal M\to Z$ with norm
$$
\|F\|_{C(\mathcal M, Z)}=\sup_{x\in \mathcal M}\|F(x)\|_Z.
$$
Lemma 1 (on an inverse function). Let $X$, $Y$, $Y_1$ be Banach spaces, $Y_1$ be continuously embedded in $Y$, let $K$ and $Q$ be closed convex subsets of $X$, let $V$ be a neighbourhood of a point $\widehat{x}\in K\cap Q$, and let $F\in C(V,Y)$. Assume that the following conditions are met:
1) $F$ is differentiable at $\widehat{x}$, and $F(\widehat{x})\in Y_1$,
3) the set $B_X(0,1)\cap(F'(\widehat{x}))^{-1}B_{Y_1}(0,1)$ is precompact in $X$,
4) $F-F'(\widehat{x})\in C(V,Y_1)$.
Then there exist constants $0<\delta_1\leqslant1$, $c>0$ and a neighbourhood $W\subset C(V\cap K\cap Q,Y)$ of the mapping $F$ such that, for each $\widetilde{F}\in W$ with $\widetilde F-F\in C(V\cap K\cap Q,Y_1)$ and, for all $(x,y)\in(V\cap K\cap Q)\times U_{Y_1}(F(\widehat{x}),\delta_1)$,
Proof. The hypotheses of this lemma include those of Lemma $1$ in the authors' paper [10], according to which there exist numbers $\gamma>0$, $a>0$, and a continuous mapping (a right inverse) $R\colon U_{Y}(0,\gamma)\to K-\widehat{x}$ such that, for all $z\in U_{Y}(0,\gamma)$,
Since $Y_1$ is continuously embedded in $Y$, there exists a constant $b>0$ such that $\|y\|_Y\leqslant b\|y\|_{Y_1}$ for all $y\in Y_1$. We choose $r>0$ and $0<\delta_1\leqslant1$ so as to have
We set $W=U_{C(V\cap K\cap Q,Y)}(F,r)$ and denote $V_1=U_{Y_1}(F(\widehat{x}),\delta_1)$.
We fix $y\in V_1$ and $\widetilde{F}\in W$ such that $\widetilde F-F\in C(V\cap K\cap Q,Y_1)$ (as in the lemma) and inclusion (4) holds. From (4) we have $B_X(0,1)\cap A^{-1}(z(x))\subset Q-\widehat{x}$ for all $x\in E=U_X(0,\delta)\cap(K-\widehat{x})\cap(Q-\widehat{x})$, where
The mapping $x\mapsto z(x)$ from $E$ to $Y$ is continuous, inasmuch as $\widetilde{F}\in W$. Hence $\widetilde F\in C(V\cap K\cap Q,Y)$, and, therefore, the mapping $x\mapsto \widetilde F(\widehat{x}+x)$ from $E$ to $Y$ is continuous.
This mapping is well defined, since, first, $2ad<2a(r+b\delta_1)\leqslant\delta$ by (10) and (8), and hence $M\subset E$, and, second, for each $x\in M$, by (9), (7), (10) and (8), we have
Next, $\Phi(x)\in K-\widehat{x}$ by the definition of $R$. From the equality in (6) it follows that $A\Phi(x)=A R(z(x))=z(x)$. Hence $\Phi(x)\in A^{-1}(z(x))$. Since $2ad<\delta\leqslant1$, we have $\Phi(x)\in B_X(0,1)$, and now $\Phi(x)\in B_X(0,1)\cap A^{-1}(z(x))\subset Q-\widehat{x}$ by (4).
Thus, $\Phi(M)\subset M$. We next claim that the set $\Phi(M)$ is precompact in $X$.
From the equality $A\Phi(x)=z(x)$, the above inclusion $z(x)\in B_{Y_1}(0,\kappa)$, and the inclusion $\Phi(x)\in B_X(0,1)$, which hold for each $x\in M$, it follows that $\Phi(M)\subset B_X(0,1)\cap A^{-1}B_{Y_1}(0,\kappa)$.
By condition 3) of the lemma, the set $B_X(0,1)\cap A^{-1}B_{Y_1}(0,1)$ is precompact, and hence so is the set $B_X(0,1)\cap A^{-1}B_{Y_1}(0,\kappa)$. Consequently, its subset $\Phi(M)$ is also precompact.
So, the continuous mapping $\Phi$ sends the closed convex set $M$ into itself, and its image is precompact in $M$. Let us use the classical Schauder fixed-point theorem. This theorem asserts that if $M$ is a convex subset of a normed linear space, $f\colon M\to M$ is a continuous mapping, and the range $f(M)$ is precompact (in $M$), then $f$ has a fixed point (see, for example, [11], p. 34). Consequently, there exists a point $\widetilde x=\widetilde x(y,\widetilde{F})$ such that $\Phi(\widetilde x)=\widetilde x$. Now, by the definition of $\Phi$, the equality in (6), and the definition of $z(x)$, we have
which implies that $\widetilde{F}(\widehat{x}+\widetilde x)=y$. We set $\psi_{\widetilde{F}}(y)=\widehat{x}+\widetilde x$. By the condition, $\widetilde x\in U_X(0,\delta)\cap(K- \widehat{x})\cap(Q-\widehat{x})\subset (V-\widehat{x})\cap(K-\widehat{x})\cap(Q-\widehat{x})$ and hence $\psi_{\widetilde{F}}(y)\in V\cap K\cap Q$. Therefore, the equality in (5) holds for each $y\in V_1$.
We have $\psi_{\widetilde{F}}(y)-\widehat{x}=\widetilde x\in B_X(0, 2ad)$, and now by (10) we get the inequality in (5) with $c=2a$. This completes the proof of Lemma 1.
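A one-dimensional toy illustration of the fixed-point argument just used (hypothetical, and trivially a case of the Brouwer theorem rather than full Schauder generality): $\Phi(x)=\cos x$ maps the compact convex set $M=[0,1]$ continuously into itself, so it has a fixed point, which simple iteration locates since $\Phi$ happens to be a contraction here.

```python
import math

# Phi(x) = cos(x) is a continuous self-map of the compact convex set [0, 1]
# (cos maps [0, 1] into [cos 1, 1] subset [0, 1]), so a fixed point exists.
def phi(x):
    return math.cos(x)

x = 0.5
for _ in range(100):          # plain iteration; converges since |phi'| < 1 on [0, 1]
    x = phi(x)

assert 0.0 <= x <= 1.0
assert abs(phi(x) - x) < 1e-9  # x is (numerically) a fixed point of Phi
print(f"fixed point ~ {x:.6f}")
```

The Schauder theorem itself is non-constructive; the iteration above works only because this particular $\Phi$ is contractive.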
We set
$$
\mathcal U=\{u(\,{\cdot}\,)\in L_\infty([t_0,t_1],\mathbb R^r)\colon u(t)\in U\text{ for almost all } t\in[t_0,t_1]\}.
$$
Let $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be an admissible pair in problem (1)–(3), let $N\in\mathbb N$, let $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_N(\,{\cdot}\,))\in\mathcal U^{N}$, and let
In what follows, for brevity, we will frequently write $\overline u$, $\overline v$, etc., in place of $\overline u(\,{\cdot}\,)$, $\overline v(\,{\cdot}\,)$, etc.
Let the mapping $A_N(\overline u)\colon X\to Y$ be defined by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha, \nu_0,\nu)\in X$, where $u_0(\,{\cdot}\,)=\widehat{u}(\,{\cdot}\,)$ and $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)$. We claim that this continuous operator is well defined.
and $\mathcal K=[t_0,t_1]\times B_{\mathbb R^r}(0,\kappa)$. On the compact set $\mathcal K$, the mappings $(t,u)\mapsto \varphi(t,\widehat{x}(t),u)$ and $(t,u)\mapsto \varphi_x(t,\widehat{x}(t),u)$ are continuous, and, therefore, bounded. Hence the mappings $t\mapsto \varphi(t,\widehat{x}(t),u_i(t))$, $i=0,1,\dots,N$, and $t\mapsto \widehat\varphi_x(t)$, which are measurable (by continuity of $\varphi$ and $\varphi_x$ with respect to $u$), are essentially bounded on $[t_0,t_1]$. Therefore, the first component of the image $A_N(\overline u)$ lies in the space $C([t_0,t_1],\mathbb R^n)$. That the remaining components of the image $A_N(\overline u)$ are contained in the space $\mathbb R\times\mathbb R^{m_1}\times \mathbb R^{m_2}$ is clear.
It is also easily checked that the mapping $A_N(\overline u)$ is linear. So, by the above, we have the estimate
2) there exist an $\widehat N\in\mathbb N$ and a tuple $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in\mathcal U^{\widehat N}$ such that
The proof of this lemma is based on the following two propositions.
Proposition 1. Let $X$, $Y_1$, $Y_2$ be Banach spaces, $A_i\colon X\to Y_i$, $i=1,2$, be linear continuous operators, the operator $A=(A_1,A_2)\colon X\to Y=Y_1\times Y_2$ be defined by $Ax=(A_1x,A_2x)$, $x\in X$, and $C$ be a closed convex subset of $X$.
Then $0\in\operatorname{int}AC$ if and only if $0\in\operatorname{int}A_1C$ and $0\in\operatorname{int}A_2(C\cap \operatorname{Ker}A_1)$.
Proof. Let $0\in\operatorname{int}AC$. Then there exists $r>0$ such that $U_{Y_1}(0,r)\times U_{Y_2}(0,r)\subset A(C)$. Hence $U_{Y_1}(0,r)\subset A_1C$, and, therefore, $0\in\operatorname{int}A_1C$.
Next, we have $\{0\}\times U_{Y_2}(0,r)\subset A(C)$, and hence, for each $y_2\in U_{Y_2}(0,r)$, there exists a point $x\in C\cap\operatorname{Ker}A_1$ such that $A_2x=y_2$. Consequently, we have $0\in\operatorname{int}A_2(C\cap \operatorname{Ker}A_1)$.
Conversely, let $0\in\operatorname{int}A_1C$ and $0\in\operatorname{int}A_2(C\cap \operatorname{Ker}A_1)$. From the first inclusion it follows that the hypotheses of Lemma 1 in [10] are satisfied. As mentioned at the beginning of the proof of Lemma 1, this entails that there exist numbers $\gamma>0$, $a>0$ and a mapping $R\colon U_{Y_1}(0,\gamma)\to C$ such that, for all $y_1\in U_{Y_1}(0,\gamma)$,
The second inclusion ensures that $U_{Y_2}(0,\rho)\subset A_2(C\cap \operatorname{Ker}A_1)$ for some $\rho>0$. Let $0<\delta\leqslant \min(\rho/4, \rho/(4a\|A_2\|),\gamma/2)$. We claim that $U_{Y}(0,\delta)\subset AC$.
If $y=(y_1,y_2)\in U_{Y}(0,\delta)$, then $2y_1\in U_{Y_1}(0,\gamma)$ by the choice of $\delta$ and hence, by (12), there exists $x_1=R(2y_1)\in C$ such that $A_1x_1=2y_1$.
Let us now show that $2y_2-A_2x_1\in U_{Y_2}(0,\rho)$. Indeed, by the choice of $\delta$ and from the inequality in (12), we have
Therefore, there exists $x_2\in C\cap \operatorname{Ker}A_1$ such that $A_2x_2=2y_2-A_2x_1$.
We set $x=(1/2)x_1+(1/2)x_2$. Since $x_i\in C$, $i=1,2$, and since $C$ is convex, we have $x\in C$. Next, $x_2\in \operatorname{Ker}A_1$, and hence $A_1x=(1/2)A_1x_1+(1/2)A_1x_2=(1/2)A_1x_1=y_1$. Further, by the definition of $x_2$, we have $A_2x=(1/2)A_2x_1+(1/2)A_2x_2=(1/2)A_2x_1+y_2-(1/2)A_2x_1=y_2$, that is, $Ax=(A_1x,A_2x)=(y_1,y_2)=y$.
So, $U_{Y}(0,\delta)\subset AC$, which proves Proposition 1.
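The averaging construction $x=\tfrac12 x_1+\tfrac12 x_2$ in the proof of Proposition 1 can be sanity-checked in a hypothetical finite-dimensional example: take $X=\mathbb R^2$, $A_1x=x_1$, $A_2x=x_2$, and $C$ the closed unit ball, so that $\operatorname{Ker}A_1$ is the second coordinate axis.

```python
import numpy as np

# Example data (assumed for illustration): X = R^2, A1 x = x[0], A2 x = x[1],
# C = closed unit ball. For small y = (y1, y2), pick x1 in C with A1 x1 = 2 y1,
# then x2 in C ∩ Ker A1 with A2 x2 = 2 y2 - A2 x1, and set x = (x1 + x2) / 2.
rng = np.random.default_rng(0)
for _ in range(100):
    y1, y2 = rng.uniform(-0.1, 0.1, size=2)      # y in a small ball around 0
    x1 = np.array([2.0 * y1, 0.0])               # A1 x1 = 2 y1, x1 in C
    x2 = np.array([0.0, 2.0 * y2 - x1[1]])       # x2 in Ker A1, A2 x2 = 2 y2 - A2 x1
    x = 0.5 * (x1 + x2)                          # convex combination stays in C
    assert np.linalg.norm(x) <= 1.0              # x in C by convexity
    assert np.allclose([x[0], x[1]], [y1, y2])   # A x = (A1 x, A2 x) = y
print("ok")
```

This mirrors the computation $A_1x=y_1$, $A_2x=y_2$ in the proof, with the right inverse $R$ replaced by an explicit choice of $x_1$.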
The operator $A_N(\overline u)=(A_{1N}(\overline u), A_{2N}(\overline u))$ was introduced before Lemma 2, where the linear operator $A_{1N}(\overline u)\colon X\to Y_1=C([t_0,t_1],\mathbb R^n)$ is defined by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha,\nu_0,\nu)\in X$, $t\in[t_0,t_1]$, and the linear operator $A_{2N}(\overline u)\colon X\to Y_2=\mathbb R\times\mathbb R^{m_1}\times \mathbb R^{m_2}$ is defined by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha,\nu_0,\nu)\in X$.
Proposition 2. The following assertions are equivalent:
a) $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$;
b) there exist an $\widehat N\in \mathbb N$ and a tuple $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in\mathcal U^{\widehat N}$ such that
where $\mathcal V$ is the set of all tuples $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_{N}(\,{\cdot}\,))\in \mathcal U^{N}$, $N\in \mathbb N$.
Assume on the contrary that inclusion (14) does not hold. Let us verify that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne\{0\}$.
We first show that the set on the right of (14) is convex. We denote this set by $M$. Consider $y_i\in M$, $\beta_i>0$, $i=1,2$, and $\beta_1+\beta_2=1$. We claim that $\beta_1y_1+\beta_2y_2\in M$.
Since $y_i\in M$, $i=1,2$, there exist $N_i$, $\overline u_i(\,{\cdot}\,)=(u_{i1}(\,{\cdot}\,),\dots,u_{iN_i}(\,{\cdot}\,))\in\mathcal U^{N_i}$, and $z_i=(h_i(\,{\cdot}\,),\xi_i,\overline\alpha_i,\nu_{0i},\nu_i+f(\widehat{x}(t_0),\widehat{x}(t_1)))\in K_{N_i}\cap \operatorname{Ker}A_{1N_i}(\overline u_i)$ such that $y_i=A_{2N_i}(\overline u_i)[z_i]$, $i=1,2$, where $\overline\alpha_i=(\alpha_{i0},\alpha_{i1},\dots,\alpha_{iN_i})$ and
Let $N=N_1+N_2$ and $\overline u(\,{\cdot}\,)=(\overline u_{1}(\,{\cdot}\,), \overline u_2(\,{\cdot}\,))$. We set $z=(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0, \nu+f(\widehat{x}(t_0),\widehat{x}(t_0)))$, where $h(\,{\cdot}\,)=\beta_1h_1(\,{\cdot}\,)+\beta_2h_2(\,{\cdot}\,)$, $\overline\alpha=(\beta_1\alpha_{10}+\beta_2\alpha_{20},\beta_1\alpha_{11},\dots, \beta_1\alpha_{1N_1}, \beta_2\alpha_{21},\dots,\beta_2\alpha_{2N_2})$, $\xi=\beta_1\xi_1+\beta_2\xi_2$, $\nu_0=\beta_1\nu_{10}+\beta_2\nu_{20}$ and $\nu=\beta_1\nu_1+\beta_2\nu_2$.
It is easily seen that $\overline\alpha\in \Sigma^{N+1}-\overline\alpha_N$. Hence $z\in K_N$.
From (15) it follows that the function $h(\,{\cdot}\,)$ satisfies the equation
that is, $\beta_1y_1+\beta_2y_2\in A_{2N}(\overline u)(K_{N}\cap\operatorname{Ker}A_{1N}(\overline u))$. This verifies that the set $M$ is convex.
Let us get back to inclusion (14). If this inclusion does not hold, then either $M$ has non-empty interior and does not contain 0, or the interior of $M$ is empty.
Since $M$ is convex, its interior is also convex; if this interior is non-empty and does not contain the origin, then, by the finite-dimensional separation theorem, there exists a non-zero vector $\lambda\in (\mathbb R\times\mathbb R^{m_1}\times\mathbb R^{m_2})^*$ such that
for each tuple $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_N(\,{\cdot}\,))\in\mathcal U^N$ and for all $(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu+f(\widehat{x}(t_0),\widehat{x}(t_1))) \in K_{N}\cap\operatorname{Ker}A_{1N}(\overline u)$. Here, $h(\,{\cdot}\,)=h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$ satisfies the differential equation
where $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)$.
If the interior of $M$ is empty, then it is well known (see, for example, [12]) that the set $M$ is contained in some hyperplane containing the origin (since $M\ni 0$), and, in this case, inequality (16) becomes an equality.
We set $\lambda=(\lambda_0,\lambda_f,\lambda_{g})\in \mathbb R\times(\mathbb R^{m_1})^*\times (\mathbb R^{m_2})^*$. In this case, by the definition of the operator $A_{2N}(\overline u)$, inequality (16) assumes the form
Using this relation, let us verify that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne\{0\}$.
We put $\xi=0$, $\overline\alpha=0$ (note that $h(\,{\cdot}\,)=0$ by uniqueness of the solution of the linear equation (17)), and define $\nu=-f(\widehat{x}(t_0),\widehat{x}(t_1))$. Now from (18) we have $\lambda_0\nu_0\geqslant0$ for each $\nu_0\geqslant0$. Therefore, $\lambda_0\geqslant0$.
If $\xi=0$, $\overline\alpha=0$, $\nu_0=0$ and $\nu=\nu'-f(\widehat{x}(t_0),\widehat{x}(t_1))$, where $\nu'\in \mathbb R^{m_1}_+$, then from (18) it follows that $\langle\lambda_f,\nu'\rangle\geqslant0$ for each $\nu'\in\mathbb R^{m_1}_+$, that is, $\lambda_f\in(\mathbb R^{m_1})^*_+$.
Let $\xi=0$, $\overline\alpha=0$, $\nu_0=0$ and $\nu=0$. From (18) we have $\langle\lambda_f,f(\widehat{x}(t_0),\widehat{x}(t_1))\rangle\geqslant 0$. However, $\lambda_f\in(\mathbb R^{m_1})^*_+$, $f(\widehat{x}(t_0),\widehat{x}(t_1))\leqslant0$, and so $\langle\lambda_f,f(\widehat{x}(t_0),\widehat{x}(t_1))\rangle\leqslant0$, that is, $\langle\lambda_f,f(\widehat{x}(t_0),\widehat{x}(t_1))\rangle=0$. This verifies the complementary slackness condition.
In (18), we put $\overline\alpha=0$, $\nu_0=0$, and $\nu=-f(\widehat{x}(t_0),\widehat{x}(t_1))$. Since $\xi\in\mathbb R^n$ is arbitrary and since $h(\,{\cdot}\,)=h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$ depends linearly on $\xi$, inequality (18) becomes the equality
Together with (20) this proves the stationarity condition with respect to $x(\,{\cdot}\,)$ and the transversality condition. It remains to verify the maximum condition.
In (18), we put $\nu_0=0$, $\nu=-f(\widehat{x}(t_0),\widehat{x}(t_1))$, and let $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)\in \Sigma^{N+1}-\overline\alpha_N$ be such that $\sum_{i=1}^N\alpha_i>0$. Recalling the expression for $p(t_0)$ and $p(t_1)$, from (18) we have
(recall that $h(t_0,\xi,\overline\alpha; \overline u)=\xi$). Let $u(\,{\cdot}\,)\in\mathcal U$. Setting $u_i(\,{\cdot}\,)=u(\,{\cdot}\,)$, $i=1,\dots, N$, we have from (23)
By $T_0$ we denote the set of all Lebesgue points on $(t_0,t_1)$ of the function under the integral on the right. Let $\tau\in T_0$ and $v\in U$. For each $h>0$ such that $[\tau-h,\tau+h]\subset (t_0,t_1)$, we set $u_h(t)=v$ if $t\in [\tau-h,\tau+h]$, and $u_h(t)=\widehat{u}(t)$ for $t\in [t_0,t_1]\setminus[\tau-h,\tau+h]$. It is clear that $u_h(\,{\cdot}\,)\in\mathcal U$, and from (24) we have
The function $t\mapsto \langle p(t),\varphi(t,\widehat{x}(t),v)\rangle$ is continuous, and hence $\tau$ is its Lebesgue point. Making $h\to0$ in the last inequality, we find that
Now the above inequality is equivalent to the maximum condition since $v\in U$ is arbitrary and since $T_0$ is a set of full measure.
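The Lebesgue-point passage to the limit used above can be illustrated numerically (with hypothetical data, not from the paper): for a continuous integrand $g(t)=\langle p(t),\varphi(t,\widehat x(t),v)\rangle$, every point is a Lebesgue point, and the average of $g$ over $[\tau-h,\tau+h]$ tends to $g(\tau)$ as $h\to0$.

```python
import numpy as np

# Assumed illustrative data: a scalar stand-in for <p(t), phi(t, xhat(t), v)>.
p = lambda t: np.exp(-t)            # an assumed adjoint trajectory
phi_v = lambda t: np.sin(3.0 * t)   # an assumed value of phi along xhat with control v
g = lambda t: p(t) * phi_v(t)

tau = 0.7                           # the "Lebesgue point" (here: any point, g is continuous)
errs = []
for h in [0.1, 0.01, 0.001]:
    ts = np.linspace(tau - h, tau + h, 10001)
    avg = g(ts).mean()              # approximates (1/2h) * integral over [tau-h, tau+h]
    errs.append(abs(avg - g(tau)))
    print(h, errs[-1])              # error shrinks (like h**2 for smooth g)
```

For a merely measurable integrand the same convergence holds only at Lebesgue points, which is exactly why the proof restricts $\tau$ to the full-measure set $T_0$.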
Thus, we have proved that if $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$, then inclusion (14) holds. Let us now show that this inclusion implies (13). This will complete the proof of implication a) $\Rightarrow$ b).
We set $Z=\mathbb R\times \mathbb R^{m_1}\times\mathbb R^{m_2}$ and $m=m_1+m_2+1$. In view of (14), there exists an $m$-dimensional simplex $S\subset M$ such that $0\in\operatorname{int}S$ (see [12]). Hence $U_{Z}(0,\rho)\subset S$ for some $\rho>0$.
Let $e_1,\dots,e_{m+1}$ be the vertices of the simplex $S$. Then there exist $N_i$, $\overline u_i(\,{\cdot}\,)=(u_{i1}(\,{\cdot}\,),\dots, u_{iN_i}(\,{\cdot}\,))\in\mathcal U^{N_i}$, and $z_i=(h_i(\,{\cdot}\,),\xi_i,\overline\alpha_i,\nu_{0i},\nu_i+f(\widehat{x}(t_0),\widehat{x}(t_1)))\in K_{N_i}\cap \operatorname{Ker}A_{1N_i}(\overline u_i)$, where $\overline\alpha_i=(\alpha_{i0},\alpha_{i1},\dots,\alpha_{iN_i})$, and
for all $t\in[t_0,t_1]$, such that $e_i=A_{2N_i}(\overline u_i)[z_i]$, $i=1,\dots,m+1$.
Let $y\in U_{Z}(0,\rho)$. Then $y=\sum_{i=1}^{m+1}\beta_ie_i$ for some $\beta_i>0$ with $\sum_{i=1}^{m+1}\beta_i=1$ (see [12]). Setting $\widehat N=\sum_{i=1}^{m+1}N_i$, $\overline v(\,{\cdot}\,)=(\overline u_1(\,{\cdot}\,),\dots,\overline u_{m+1}(\,{\cdot}\,))$ and proceeding as above in the proof of the convexity of $M$, we find that $y\in A_{2\widehat{N}}(\overline v)(K_{\widehat{N}}\cap\operatorname{Ker}A_{1\widehat{N}}(\overline v))$. Since this is true for each $y\in U_{Z}(0,\rho)$, this proves inclusion (13), and, therefore, implication a) $\Rightarrow$ b).
Let us prove implication b) $\Rightarrow$ a). Assume on the contrary that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne \{0\}$. Let us show that inclusion (13) is impossible for any $N\in\mathbb N$ and any tuple $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_{N}(\,{\cdot}\,))\in\mathcal U^N$.
We fix an $N\in\mathbb N$ and a tuple $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_{N}(\,{\cdot}\,))\in\mathcal U^N$. Let $\xi\in\mathbb R^n$, $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)\in \Sigma^{N+1}-\overline\alpha_N$, and let $h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$ be a solution of equation (17) with these data.
The inequality (24) for all $u(\,{\cdot}\,)\in\mathcal U$ is clear from the maximum condition in Theorem 1. Substituting the functions $u_1(\,{\cdot}\,),\dots, u_N(\,{\cdot}\,)$ for $u(\,{\cdot}\,)$ in this inequality, multiplying the resulting inequalities by $\alpha_1, \dots, \alpha_N$, respectively, and then adding these inequalities, we see that the expression on the right of (23) is non-negative. Hence
for all $\xi\in\mathbb R^n$ and $\overline\alpha\in \Sigma^{N+1}-\overline\alpha_N$.
Let $z=(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu+f(\widehat{x}(t_0),\widehat{x}(t_1)))\in K_{N}\cap\operatorname{Ker}A_{1N}(\overline u)$. The function $h(\,{\cdot}\,)$ should satisfy equation (17), and therefore, by uniqueness, $h(\,{\cdot}\,)= h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$. Next, inequality (18) for each of the above $z$ is secured by (25) since $\lambda_0\geqslant0$, $\lambda_f\in (\mathbb R^{m_1})^*_+$ and since the complementary slackness condition is fulfilled. The triple $(\lambda_0,\lambda_f,\lambda_g)$ is non-zero (otherwise the function $p(\,{\cdot}\,)$ would be zero), but this contradicts inclusion (13). This proves implication b) $\Rightarrow$ a), and, therefore, Proposition 2.
Proof of Lemma 2. Implication 1) $\Rightarrow$ 2). We have $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$, and hence by Proposition 2 inclusion (13) holds with some $\widehat N\in \mathbb N$ and a tuple $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in\mathcal U^{\widehat N}$. Let us also verify that $0\in\operatorname{int}A_{1\widehat N}(\overline v)K_{\widehat N}$.
Indeed, consider the operator $A_0\colon C([t_0,t_1],\mathbb R^n)\to C([t_0,t_1],\mathbb R^n)$ defined by
where $h(\,{\cdot}\,)\in C([t_0,t_1],\mathbb R^n)$. It is well known (see, for example, § 2.5.7 in [13]) and easily checked that this is a continuous bijective linear operator.
Let $y(\,{\cdot}\,)\in C([t_0,t_1],\mathbb R^n)$. There exists a function $h(\,{\cdot}\,)\in C([t_0,t_1],\mathbb R^n)$ such that $A_0[h(\,{\cdot}\,)](\,{\cdot}\,)=y(\,{\cdot}\,)$. This is equivalent to saying that $A_{1\widehat N}(\overline v)[h(\,{\cdot}\,),0,0,0,0](\,{\cdot}\,)=y(\,{\cdot}\,)$. However, $(h(\,{\cdot}\,),0,0,0,0)\in K_{\widehat N}$, and, therefore, $A_{1\widehat N}(\overline v)K_{\widehat N}=C([t_0,t_1],\mathbb R^n)$. Hence $0\in\operatorname{int}A_{1\widehat N}(\overline v)K_{\widehat N}$.
Now inclusion (11) is secured by Proposition 1 with $A_1=A_{1\widehat N}$, $A_2=A_{2\widehat N}$, $A=A_{\widehat N}=(A_{1\widehat N},A_{2\widehat N})$ and $C=K_{\widehat N}$.
Implication 2) $\Rightarrow$ 1). If (11) holds, then inclusion (13) follows from Proposition 1 (where, as above, $A_1 = A_{1\widehat N}$, $A_2 = A_{2\widehat N}$, $A = A_{\widehat N} =(A_{1\widehat N},A_{2\widehat N})$ and $C = K_{\widehat N}$). Hence $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,)) = \{0\}$ by Proposition 2. Lemma 2 is proved.
Let $L>0$. As above, we let $Q_L$ denote the set of all Lipschitz-continuous functions from $C([t_0,t_1],\mathbb R^n)$ with Lipschitz constant $L$. It is easily checked that $Q_L$ is a closed convex subset of $C([t_0,t_1],\mathbb R^n)$.
Recall that the sets $\mathcal U$ and $\Sigma^k$, $k\in\mathbb N$, were defined before Lemma 2.
Lemma 3 (on approximation). Let $M$ be a bounded set in $C([t_0,t_1],\mathbb R^n)$, let $\Omega$ be a bounded set in $\mathbb R^n$, $N\in\mathbb N$, $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots, u_{N}(\,{\cdot}\,))\in \mathcal U^N$, and let $L> 0$. Then, for each $\overline\alpha=(\alpha_1,\dots,\alpha_N)\in\Sigma^N$, there exists a sequence of controls $u_s(\overline\alpha;\overline u)(\,{\cdot}\,)\in \mathcal U^N$,
where $t\in[t_0,t_1]$, lie in $C((M\cap Q_L)\times\Omega\times \Sigma^N,\,C([t_0,t_1],\mathbb R^n))$ and converge in this space as $s\to\infty$ to the mapping $\Phi\colon (M\cap Q_L)\times\Omega\times \Sigma^N\to C([t_0,t_1],\mathbb R^n)$ defined by
As already mentioned, this lemma is a particular case of Lemma 4.3 from [3].
§ 3. Proof of the Pontryagin maximum principle
Assume on the contrary that the Pontryagin maximum principle does not hold for a pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ admissible in problem (1)–(3), that is, $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$. Let us show that this pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ is not a strong minimum in this problem. Our proof will follow from Lemma 1 on an inverse function, which we will apply in our setting.
Let $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be an admissible pair in problem (1)–(3), let $\widehat N\in\mathbb N$, and let $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in \mathcal U^{\widehat N}$ be as in Lemma 2.
The mappings $(t,u)\mapsto \varphi(t,\widehat{x}(t),u)$ and $(t,u)\mapsto \varphi_x(t,\widehat{x}(t),u)$ are continuous on this compact set, and hence are uniformly continuous. Therefore, there exists $0<\delta_0\leqslant\delta$ such that if $|x_1-x_2|<\delta_0$, then
where $Q_L$ is defined before Lemma 3, $\widehat{x}=(\widehat{x}(\,{\cdot}\,), \widehat{x}(t_0),\overline\alpha_{\widehat{N}},0,0)$, and $V=U_X(\widehat{x},\delta_0)$.
In order to distinguish between $\widehat{x}$ and $\widehat{x}(\,{\cdot}\,)$, we will write $\widehat z$ in place of $\widehat{x}$, that is, $\widehat{z}=(\widehat{x}(\,{\cdot}\,), \widehat{x}(t_0),\overline\alpha_{\widehat{N}},0,0)$.
It is clear that $Y_1$ is continuously embedded in $Y$, and $K$ and $Q$ are closed convex subsets of $X$.
The pair $(\widehat{x}(\,{\cdot}\,), \widehat{u}(\,{\cdot}\,))$ satisfies the differential equation in (2), and hence $\widehat{x}(\,{\cdot}\,)$ is a Lipschitz function with constant $C_0$. Therefore, $\widehat{x}(\,{\cdot}\,)\in Q_L$, and further, since $\overline\alpha_{\widehat{N}}=(1,0,\dots,0)\in\Sigma^{\widehat N+1}$, we have $\widehat{z}\in K\cap Q$.
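The Lipschitz bound just used can be traced in one line (a sketch, assuming, as the surrounding estimates suggest, that $C_0$ is an upper bound for $|\varphi(t,\widehat x(t),\widehat u(t))|$ on $[t_0,t_1]$):

```latex
|\widehat x(t)-\widehat x(t')|
  =\biggl|\int_{t'}^{t}\varphi(s,\widehat x(s),\widehat u(s))\,ds\biggr|
  \leqslant C_0\,|t-t'|,
  \qquad t_0\leqslant t'\leqslant t\leqslant t_1.
```

Since $C_0\leqslant L$, this gives $\widehat{x}(\,{\cdot}\,)\in Q_L$.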
where $v_0(\,{\cdot}\,)=\widehat{u}(\,{\cdot}\,)$ and $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_{\widehat N})$. The dependence of $F$ on a fixed tuple $(\widehat{u}(\,{\cdot}\,),v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))$ will not be indicated.
The first component of the image of the mapping $F$ lies in $C([t_0,t_1],\mathbb R^n)$ (this is verified as for the operator $A_N$ above), and since $V$ is a bounded set and the mappings $\varphi$, $f_0$, $f$, and $g$ are continuous, it is easily seen that $F\in C(V,Y)$.
Let us now check that the above spaces, sets, and the mapping satisfy the conditions of Lemma 1.
Since $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ is an admissible pair in problem (1)–(3) and $\xi=\widehat{x}(t_0)$, the first component of $F(\widehat{z})$ is zero. Hence $F(\widehat{z})=(0, f_0(\widehat{x}(t_0),\widehat{x}(t_1)),f(\widehat{x}(t_0), \widehat{x}(t_1)),0)\in Y_1$.
The mapping $F$ is differentiable (for differentiability with respect to $x(\,{\cdot}\,)$, see, for example, § 2.4.2 in [13]; with respect to the remaining variables, $F$ is linear). Its derivative at $\widehat{z}$ is given by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha,\nu_0,\nu)\in X$ and $t\in[t_0,t_1]$.
This verifies condition 1) of Lemma 1. Let us check condition 2).
It is clear that $F'(\widehat{z})=A_{\widehat{N}}(\overline v)$ and $K=K_{\widehat{N}}+\overline\alpha_{\widehat{N}}$. By the assumption, $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$, and now condition 2) of Lemma 1 is secured by Lemma 2.
Let us verify condition 3). If $(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)\in B_X(0,1)\cap(F'(\widehat{z}))^{-1}(B_{Y_1}(0,1))$, then $(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)\in B_X(0,1)$, and there exists a tuple $(y(\,{\cdot}\,),w_1,w_2,w_3)\in B_{Y_1}(0,1)$ such that
Since $y(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, we have $h(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, and, therefore, for almost all $t\in[t_0,t_1]$,
We have $\|h(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}\leqslant1$, $|\overline\alpha|\leqslant1$, and $\|y(\,{\cdot}\,)\|_{W_\infty^1([t_0,t_1],\mathbb R^n)}\leqslant1$, and hence $|\dot h(t)|\leqslant C_1+C_0\sqrt{\widehat N+1}+1$ for almost all $t\in[t_0,t_1]$ (see (28)).
So, the set of these functions $h(\,{\cdot}\,)$ is bounded in $C([t_0,t_1],\mathbb R^n)$; these functions are Lipschitz-continuous with the same Lipschitz constant $C_1+C_0\sqrt{\widehat N+1}+ 1$. Therefore, they are equicontinuous. Now an appeal to the Arzelá–Ascoli theorem shows that the set of these functions is precompact in $C([t_0,t_1],\mathbb R^n)$.
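The equicontinuity estimate underlying this appeal to the Arzelá–Ascoli theorem can be written out explicitly (a sketch, using the derivative bound just obtained and the absolute continuity of $h(\,{\cdot}\,)$):

```latex
|h(t)-h(t')|
  \leqslant\int_{t'}^{t}|\dot h(s)|\,ds
  \leqslant\bigl(C_1+C_0\sqrt{\widehat N+1}+1\bigr)|t-t'|,
  \qquad t_0\leqslant t'\leqslant t\leqslant t_1.
```

The modulus of continuity is thus uniform over the whole family of functions $h(\,{\cdot}\,)$.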
Next, $|\xi|+|\overline\alpha|+|\nu_0|+|\nu|\leqslant1$, and hence the set of such tuples $(\xi,\overline\alpha,\nu_0,\nu)$ is compact in $\mathbb R^n\times \mathbb R^{\widehat N+1}\times\mathbb R\times\mathbb R^{m_1}$. Hence the set $B_X(0,1)\cap(F'(\widehat{z}))^{-1}(B_{Y_1}(0,1))$ is precompact in $X$.
It remains to verify condition 4). For all $z=(x(\,{\cdot}\,), \xi, \overline\alpha,\nu_0,\nu)\in V$ and $t\in[t_0,t_1]$,
Now, by continuity of the mappings $\varphi$ and $\varphi_x$, the first component of the image of the difference $F[z](\,{\cdot}\,)-F'(\widehat{z})[z](\,{\cdot}\,)$ lies in $W_\infty^1([t_0,t_1],\mathbb R^n)$, and hence, the image of this difference belongs to $Y_1$. Since $V$ is bounded, and the mappings $f_0$, $f$ and $g$ are continuous, it follows that this difference lies in $C(V,Y_1)$, that is, condition 4) of Lemma 1 is met.
So, all the hypotheses of Lemma 1 are fulfilled. Let the constants $0<\delta_1\leqslant1$, $c>0$ and the neighbourhood $W$ be as in this lemma, and let $\mathcal{O}$ be an arbitrary neighbourhood of the function $\widehat{x}(\,{\cdot}\,)$ in $C([t_0,t_1],\mathbb R^n)$. There exist $r_0>0$ and $0<r<r_0/c$ such that $U_{C([t_0,t_1],\mathbb R^n)}(\widehat{x}(\,{\cdot}\,),r_0)\subset \mathcal{O}$ and $U_{C(V\cap K\cap Q,Y)}(F,r)\subset W$.
For each $s\in\mathbb N$, consider the mapping $F_s\colon V\cap K\to Y$ defined by
where the functions $u_s(\overline\alpha;\overline u)(\,{\cdot}\,)$ are from Lemma 3, $M$ is the projection of $V$ onto $C([t_0,t_1],\mathbb R^n)$, $\Omega$ is the projection of $V$ onto $\mathbb R^n$, $N=\widehat N+1$, and $\overline u(\,{\cdot}\,)=(\widehat{u}(\,{\cdot}\,),v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))$. The dependence of the mapping $F_s$ on a fixed tuple $\overline u(\,{\cdot}\,)$ is not indicated.
By Lemma 3, the controls $u_s(\overline\alpha;\overline u)(\,{\cdot}\,)$ are uniformly bounded with respect to $s$ for almost all $t\in [t_0,t_1]$, and hence (for the same reasons as for the mapping $F$), the mappings $F_s$, $s\in\mathbb N$, lie in $C(V\cap K\cap Q,Y)$; moreover, there exists $s_0$ such that
(where we set $\widetilde{F}=F_{s_0}$) for all $(x(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu) \in V\cap K\cap Q$ and $t\in[t_0,t_1]$. Hence $\widetilde{F} \in U_{C(V\cap K\cap Q,Y)}(F,r) \subset W$.
From the definitions of the mappings $\Phi_s$ and $\Phi$ it is clear that the difference $\widetilde{F}(x(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)(\,{\cdot}\,) -F(x(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)(\,{\cdot}\,)$ lies in $C(V\cap K\cap Q,Y_1)$.
Let us now verify (4). Let $(x(\,{\cdot}\,), \xi,\overline\alpha,\nu_0,\nu)\in V\cap K\cap Q$ and $(y(\,{\cdot}\,),w_1,w_2,w_3)\in U_{Y_1}(F(\widehat{z}),\delta_1)$. If $(x'(\,{\cdot}\,), \xi',\overline\alpha^{\,\prime},\nu_0',\nu')$ lies in the set on the left of this inclusion, then
(recall that $\overline\alpha^{\,\prime}=(\alpha_0',\alpha_1',\dots,\alpha_{\widehat N}'))$.
Let us now proceed as in the verification of condition 3) above. Since $y(\,{\cdot}\,) \in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, we have $x'(\,{\cdot}\,) \in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, and hence, for almost all $t \in [t_0,t_1]$,
We have $\|x'(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}\leqslant1$, $|\overline\alpha^{\,\prime}|\leqslant1$, $\overline\alpha\in\Sigma^{\widehat N+1}$, $\|y(\,{\cdot}\,)\|_{W_\infty^1([t_0,t_1],\mathbb R^n)}<\delta_1\leqslant1$, $\|x(\,{\cdot}\,)-\widehat{x}(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}<\delta_0\leqslant1$, and hence, for almost all $t\in[t_0,t_1]$, the sum of the terms (except the last one) on the right of (30) is estimated by $C_1+C_0\sqrt{\widehat N+1}+C_0+C_1+2C_0+1=(3+\sqrt{\widehat N+1})C_0+2C_1+1$.
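The arithmetic behind this constant is elementary; for the reader's convenience, grouping the terms by coefficient:

```latex
C_1+C_0\sqrt{\widehat N+1}+C_0+C_1+2C_0+1
  =(C_0+2C_0)+C_0\sqrt{\widehat N+1}+2C_1+1
  =\bigl(3+\sqrt{\widehat N+1}\,\bigr)C_0+2C_1+1.
```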
Subtracting and adding $\varphi(t,\widehat{x}(t),u_{s_0}(\overline\alpha^{\,\prime};\overline u)(t))$ to the last term on the right of (30), it follows by (27) that this term is majorized by $1+C_0$ for almost all $t\in[t_0,t_1]$. So, for almost all $t\in[t_0,t_1]$,
We next claim that $(x'(\,{\cdot}\,), \xi',\overline\alpha^{\,\prime},\nu_0',\nu')\in Q-\widehat{z}$. To verify this inclusion, it clearly suffices to check that the function $x'(\,{\cdot}\,)$ lies in $Q_L-\widehat{x}(\,{\cdot}\,)$. The Lipschitz constant of this function does not exceed $L-C_0$, and the Lipschitz constant of $\widehat{x}(\,{\cdot}\,)$ is $C_0$. Hence $x'(\,{\cdot}\,)+\widehat{x}(\,{\cdot}\,)\in Q_L$, that is, $x'(\,{\cdot}\,)\in Q_L-\widehat{x}(\,{\cdot}\,)$, which proves inclusion (4).
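The concluding step rests on the subadditivity of Lipschitz constants; in symbols (a sketch, with $\operatorname{Lip}$ denoting the least Lipschitz constant of a function on $[t_0,t_1]$):

```latex
\operatorname{Lip}\bigl(x'(\,{\cdot}\,)+\widehat x(\,{\cdot}\,)\bigr)
  \leqslant\operatorname{Lip}\bigl(x'(\,{\cdot}\,)\bigr)
          +\operatorname{Lip}\bigl(\widehat x(\,{\cdot}\,)\bigr)
  \leqslant(L-C_0)+C_0=L,
```

whence $x'(\,{\cdot}\,)+\widehat x(\,{\cdot}\,)\in Q_L$.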
So, all the assumptions of Lemma 1 on the mapping $\widetilde{F}$ are met, and, therefore, there exists a mapping
where $0<\varepsilon\leqslant(r_0/c)-r$ is so small that $y\in U_{Y_1}(F(\widehat z),\delta_1)$. We set $\psi_{\widetilde{F}}(y)=(\widetilde x(\,{\cdot}\,),\widetilde\xi,\widetilde {\overline\alpha},\widetilde\nu_0,\widetilde\nu)$, where $\widetilde\nu=\widetilde \nu_1+f(\widehat{x}(t_0),\widehat{x}(t_1))$ and $\widetilde\nu_1\geqslant0$. Now the equality in (5) assumes the form
The equality of the first components implies that the pair $(\widetilde x(\,{\cdot}\,), \widetilde u(\,{\cdot}\,))$, where $\widetilde u(\,{\cdot}\,)=u_{s_0}(\widetilde{\overline\alpha};\overline u)(\,{\cdot}\,)$, satisfies condition (2), and, in addition, $\widetilde \xi=\widetilde x(t_0)$. Hence, since the third and fourth components are equal, we have $f(\widetilde x(t_0),\widetilde x(t_1))\leqslant0$ and $g(\widetilde x(t_0),\widetilde x(t_1))=0$, that is, the pair $(\widetilde x(\,{\cdot}\,), \widetilde u(\,{\cdot}\,))$ is admissible in problem (1)–(3).
The equality of the second components implies that
that is, $\widetilde x(\,{\cdot}\,)\in \mathcal{O}$.
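The numerology behind this inclusion can be traced as follows (a sketch, assuming that the estimate in (5) of Lemma 1 bounds the distance from $\psi_{\widetilde F}(y)$ to $\widehat z$ by $c(r+\varepsilon)$, as the choice of the constants $r<r_0/c$ and $0<\varepsilon\leqslant(r_0/c)-r$ suggests):

```latex
\|\widetilde x(\,{\cdot}\,)-\widehat x(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}
  \leqslant c\,(r+\varepsilon)
  \leqslant c\Bigl(r+\frac{r_0}{c}-r\Bigr)
  = r_0,
```

so that $\widetilde x(\,{\cdot}\,)$ stays within distance $r_0$ of $\widehat x(\,{\cdot}\,)$, and $U_{C([t_0,t_1],\mathbb R^n)}(\widehat{x}(\,{\cdot}\,),r_0)\subset \mathcal{O}$.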
So, for any neighbourhood $\mathcal{O}$ of the point $\widehat{x}(\,{\cdot}\,)$, there exists a pair $(\widetilde x(\,{\cdot}\,),\widetilde u(\,{\cdot}\,))$ admissible in problem (1)–(3) such that $\widetilde x(\,{\cdot}\,)\in \mathcal{O}$ and the value of the functional $f_0$ at $\widetilde x(\,{\cdot}\,)$ is smaller than that at $\widehat{x}(\,{\cdot}\,)$. This, however, contradicts the fact that the pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ is a strong minimum in problem (1)–(3). Theorem 1 is proved.
Bibliography
1. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The mathematical theory of optimal processes, Intersci. Publ. John Wiley & Sons, Inc., New York–London, 1962.
2. R. V. Gamkrelidze, Principles of optimal control theory, Math. Concepts Methods Sci. Eng., 7, Rev. ed., Plenum Press, New York–London, 1978.
3. E. R. Avakov and G. G. Magaril-Il'yaev, “Local infimum and a family of maximum principles in optimal control”, Sb. Math., 211:6 (2020), 750–785.
4. A. A. Milyutin, “General schemes of necessary conditions for extrema and problems of optimal control”, Russian Math. Surveys, 25:5 (1970), 109–115.
5. A. D. Ioffe and V. M. Tihomirov, Theory of extremal problems, Stud. Math. Appl., 6, North-Holland Publishing Co., Amsterdam–New York, 1979.
6. E. R. Avakov, G. G. Magaril-Il'yaev, and V. M. Tikhomirov, “Lagrange's principle in extremum problems with constraints”, Russian Math. Surveys, 68:3 (2013), 401–433.
7. F. H. Clarke, “A new approach to Lagrange multipliers”, Math. Oper. Res., 1:2 (1976), 165–174.
8. A. V. Arutyunov, Optimality conditions. Abnormal and degenerate problems, Math. Appl., 526, Kluwer Acad. Publ., Dordrecht, 2000.
9. A. D. Ioffe, “On necessary conditions for a minimum”, J. Math. Sci. (N.Y.), 217:6 (2016), 751–772.
10. E. R. Avakov and G. G. Magaril-Il'yaev, “General implicit function theorem for close mappings”, Proc. Steklov Inst. Math., 315 (2021), 1–12.
11. A. Granas and J. Dugundji, Fixed point theory, Springer Monogr. Math., Springer-Verlag, New York, 2003.
12. G. G. Magaril-Il'yaev and V. M. Tikhomirov, Convex analysis: theory and applications, Transl. Math. Monogr., 222, Amer. Math. Soc., Providence, RI, 2003.
13. V. M. Alekseev, V. M. Tikhomirov, and S. V. Fomin, Optimal control, Contemp. Soviet Math., Consultants Bureau, New York, 1987.
Citation:
E. R. Avakov, G. G. Magaril-Il'yaev, “Schauder's fixed point theorem and Pontryagin maximum principle”, Izv. Math., 88:6 (2024), 1013–1031