Abstract:
We prove the Pontryagin maximum principle for a general optimal control problem.
The main ingredient of the proof is the abstract lemma on an inverse function,
which is proved via the Schauder fixed-point theorem.
Under this approach, the proof of the Pontryagin maximum principle
is quite short and transparent.
Keywords: the Pontryagin maximum principle, the Schauder fixed-point theorem, inverse function lemma.
The central result of optimal control theory is the Pontryagin maximum principle, and many studies are devoted to its proof. Let us mention some ideas underlying these studies. Historically, the first proof is based on the so-called needle variations of the optimal control (see [1]). The idea of variation dates back to J.-L. Lagrange, who suggested to L. Euler that necessary extremum conditions in variational problems can be proved by varying the extremal function. However, the essence of variation by “needles” is fundamentally different, because it is global with respect to the control. This suggests the concept of a strong minimum, which is local with respect to the phase trajectory and global with respect to the control. For this minimum one can obtain necessary conditions in which the control stationarity condition (as in the Euler–Lagrange equations) is replaced by the stronger condition that the Pontryagin function attain its maximum.
Under the second approach, the original problem is “submerged” in a wider class of problems and is considered as a generic problem in this class rather than as a separate object with its own internal properties. The idea of this approach dates back to A.-M. Legendre, C. G. J. Jacobi, and W. R. Hamilton in the context of searching for sufficient weak minimum conditions in the simplest problem of variational calculus. For optimal control problems, this approach, together with the relaxation idea, was first proposed by Gamkrelidze [2]. This method, which proved to be quite fruitful, is capable of delivering not only the classical Pontryagin maximum principle, but also its extensions and strengthenings (see [3]).
The third idea is to consider some abstract general extremal problem in a Banach space to which the optimal control problem can be reduced (see, for example, [4]–[6]). Under this approach, the main difficulties are related to the proof of necessary extremum conditions for this abstract problem, and the Pontryagin maximum principle itself is derived by deciphering these (in general, quite involved) necessary extremum conditions.
The fourth idea involves the application of various penalty methods to the proof of the Pontryagin maximum principle (this idea was apparently introduced in [7]; see also [8] and [9]). Under this approach, the constraints of the optimal control problem are carried over to a penalty function. After this, necessary extremum conditions are applied to the resulting “penalized” problem, and the Pontryagin maximum principle is then proved by several passages to the limit. Note that the method of proof of the maximum principle proposed in the present paper involves the concept of a strong minimum.
In the present paper, we prove an abstract lemma on an inverse function for close mappings, and then, using some ideas of Gamkrelidze, derive the Pontryagin maximum principle by deciphering the negation of the sufficient conditions for the existence of an inverse function in this lemma. The proof of this lemma depends substantially on the Schauder fixed-point theorem. In our opinion, this approach simplifies the proof of the Pontryagin maximum principle because, on the one hand, it does not involve (unlike the variational approach) the lemma on the packet of needles or an equation in variations, and, on the other hand, the machinery of both the inverse function lemma and its deciphering is simpler than that of the abstract approach or the penalty method.
The paper consists of three sections. In § 1, we formulate the main result (the Pontryagin maximum principle). In § 2, we prove the lemma on an inverse function and some auxiliary results. The proof of the Pontryagin maximum principle based on the inverse function lemma is given in § 3.
where $U$ is a non-empty subset of $\mathbb R^r$, $\varphi\colon \mathbb R\times\mathbb R^n\times\mathbb R^r\to\mathbb R^n$ is a mapping of $t$, $x$ and $u$, $f_0\colon\mathbb R^n\times\mathbb R^n\to \mathbb R$, $f\colon\mathbb R^n\times\mathbb R^n\to \mathbb R^{m_1}$ and $g\colon\mathbb R^n\times\mathbb R^n\to \mathbb R^{m_2}$ are mappings of $\zeta_i\in\mathbb R^n$, $i=0,1$. The inequality is understood coordinatewise.
The above problem is a standard optimal control problem in canonical form, also known as a problem in Mayer form. If the functional to be minimized and/or boundary conditions in the original problem contain functionals involving integrals, then we can easily reduce it to a problem of the form (1)–(3) by introducing extra variables.
In what follows, we always assume that both $\varphi$ and its partial derivative with respect to $x$ are continuous on $\mathbb R\times\mathbb R^n\times \mathbb R^r$; we also assume that $f_0$, $f$ and $g$ are differentiable on $\mathbb R^n\times\mathbb R^n$.
Let $C([t_0,t_1],\mathbb R^n)$, $\mathrm{AC}([t_0,t_1],\mathbb R^n)$ and $L_\infty([t_0,t_1],\mathbb R^r)$ be, respectively, the space of continuous vector functions on $[t_0,t_1]$ with values in $\mathbb R^n$, the space of absolutely continuous vector functions with values in $\mathbb R^n$, and the space of all essentially bounded vector functions with values in $\mathbb R^r$. The space $W_\infty^1([t_0,t_1],\mathbb R^n)$ is the set of all vector functions $x(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],\mathbb R^n)$ such that $\dot x(\,{\cdot}\,)\in L_\infty([t_0,t_1],\mathbb R^n)$. This space is equipped with the norm
A pair $(x(\,{\cdot}\,),u(\,{\cdot}\,)) \in C([t_0,t_1],\mathbb R^n)\times L_\infty([t_0,t_1],\mathbb R^r)$ is an admissible pair for problem (1)–(3) if it satisfies conditions (2) and (3).
Definition 1. An admissible pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ delivers a strong minimum in problem (1)–(3) (an optimal process) if there exists a neighbourhood $V\subset C([t_0,t_1],\mathbb R^n)$ of the function $\widehat{x}(\,{\cdot}\,)$ such that $f_0(x(t_0),x(t_1))\geqslant f_0(\widehat{x}(t_0),\widehat{x}(t_1))$ for any admissible pair $(x(\,{\cdot}\,),u(\,{\cdot}\,))$ with $x(\,{\cdot}\,)\in V$.
To formulate necessary conditions for a strong minimum in problem (1)–(3) (the Pontryagin maximum principle) we need some notation.
The Euclidean norm on $\mathbb R^n$ is denoted by $|\,{\cdot}\,|$. The value of a linear functional $\lambda=(\lambda_1,\dots,\lambda_n)\in(\mathbb R^n)^*$ evaluated on a vector $x=(x_1,\dots,x_n)^\top\in\mathbb R^n$ ($\top$ denotes transposition) is denoted by $\langle \lambda,x\rangle=\sum_{i=1}^n\lambda_ix_i$. Let $(\mathbb R^n)^*_+$ be the set of all linear functionals on $\mathbb R^n$ that assume non-negative values on non-negative vectors. The adjoint operator to a linear operator $\Lambda\colon \mathbb R^n\to\mathbb R^m$ is denoted by $\Lambda^*$.
Let $(\widehat{x}(\,{\cdot}\,), \widehat{u}(\,{\cdot}\,))$ be a fixed pair. For brevity, the derivatives of the mappings $f_0$, $f$ and $g$ and their partial derivatives with respect to $\zeta_0$ and $\zeta_1$ at the point $(\widehat{x}(t_0),\widehat{x}(t_1))$ will be denoted, respectively, by $\widehat f'_0$, $\widehat f'$, $\widehat g'$ and $\widehat f_{0\zeta_i}$, $\widehat f_{\zeta_i}$, $\widehat g_{\zeta_i}$, $i=0,1$. We also set $\widehat\varphi(\,{\cdot}\,)=\varphi(\,{\cdot}\,,\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ and $\widehat\varphi_x(\,{\cdot}\,)=\varphi_x(\,{\cdot}\,,\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$.
Theorem 1 (the Pontryagin maximum principle). If $(\widehat{x}(\,{\cdot}\,), \widehat{u}(\,{\cdot}\,))$ is a strong minimum pair in problem (1)–(3), then there exist a non-zero tuple $(\lambda_0,\lambda_f,\lambda_g)\in \mathbb R_+\times(\mathbb R^{m_1})^*_+\times(\mathbb R^{m_2})^*$ and a vector function $p(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],(\mathbb R^n)^*)$ such that the following conditions are fulfilled:
1) the stationarity condition with respect to $x(\,{\cdot}\,)$
$$
\max_{u\in U}\langle p(t),\varphi(t,\widehat{x}(t),u)\rangle=\langle p(t),\varphi(t,\widehat{x}(t),\widehat{u}(t))\rangle\quad\text{for almost all } \ t\in[t_0,t_1].
$$
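As a hypothetical numerical illustration of the maximum condition (not taken from the paper), consider the scalar problem of minimizing $f_0=-x(T)$ subject to $\dot x=u$, $u(t)\in U=[-1,1]$: here the adjoint equation gives $p(t)\equiv1$, the optimal control is $\widehat u\equiv1$, and the pointwise maximum of $\langle p(t),\varphi(t,\widehat x(t),u)\rangle$ over $U$ can be checked on a grid.

```python
import numpy as np

# Toy problem (illustrative assumption, not from the paper):
# minimize f0 = -x(T) subject to xdot = u, u(t) in U = [-1, 1].
# Adjoint: pdot = -p * phi_x = 0 with p(T) = 1, so p(t) = 1; u_hat(t) = 1.
T = 1.0
ts = np.linspace(0.0, T, 101)          # time grid
p = np.ones_like(ts)                    # adjoint trajectory p(t) = 1
u_hat = np.ones_like(ts)                # candidate optimal control
U_grid = np.linspace(-1.0, 1.0, 201)    # discretisation of the control set U

phi = lambda t, x, u: u                 # right-hand side phi(t, x, u) = u

# Maximum condition: max over U of <p(t), phi(t, x(t), u)> is attained at u_hat(t).
for t, pt, ut in zip(ts, p, u_hat):
    H_over_U = pt * phi(t, None, U_grid)
    assert np.isclose(H_over_U.max(), pt * phi(t, None, ut))
print("maximum condition holds on the grid")
```

This merely checks the pointwise inequality on a finite grid; it is a sanity check of the statement, not a proof.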
Let $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be an admissible pair for problem (1)–(3). Let $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be the set of all quadruples $(\lambda_0,\lambda_f,\lambda_g, p(\,{\cdot}\,))$ satisfying conditions 1)–4) of this theorem. It is clear that the zero quadruple always satisfies these conditions, and so, for a proof of the maximum principle we need to show that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne \{0\}$.
§ 2. Lemma on an inverse function and auxiliary results
Here, we will prove the lemma on an inverse function, which characterizes the condition $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$ in different terms, and formulate the lemma on approximation, which is a particular case of Lemma 4.3 from the authors' paper [3].
Let $Z$ be a normed linear space, $z_0\in Z$ and $\rho>0$. In what follows, $U_Z(z_0,\rho)$ and $B_Z(z_0,\rho)$ will denote, respectively, the open and closed balls in $Z$ of radius $\rho$ with centre at $z_0$. The interior of a set $A\subset Z$ will be denoted by $\operatorname{int}A$.
Let $\mathcal M$ be a topological space. By $C(\mathcal M, Z)$ we denote the space of continuous bounded mappings $F\colon \mathcal M\to Z$ with norm
$$
\|F\|_{C(\mathcal M, Z)}=\sup_{x\in \mathcal M}\|F(x)\|_Z.
$$
Lemma 1 (on an inverse function). Let $X$, $Y$, $Y_1$ be Banach spaces, $Y_1$ be continuously embedded in $Y$, let $K$ and $Q$ be closed convex subsets of $X$, let $V$ be a neighbourhood of a point $\widehat{x}\in K\cap Q$, and let $F\in C(V,Y)$. Assume that the following conditions are met:
1) $F$ is differentiable at $\widehat{x}$, and $F(\widehat{x})\in Y_1$,
3) the set $B_X(0,1)\cap(F'(\widehat{x}))^{-1}B_{Y_1}(0,1)$ is precompact in $X$,
4) $F-F'(\widehat{x})\in C(V,Y_1)$.
Then there exist constants $0<\delta_1\leqslant1$, $c>0$ and a neighbourhood $W\subset C(V\cap K\cap Q,Y)$ of the mapping $F$ such that, for each $\widetilde{F}\in W$ with $\widetilde F-F\in C(V\cap K\cap Q,Y_1)$ and, for all $(x,y)\in(V\cap K\cap Q)\times U_{Y_1}(F(\widehat{x}),\delta_1)$,
Proof. The hypotheses of this lemma include those of Lemma $1$ in the authors' paper [10], according to which there exist numbers $\gamma>0$, $a>0$, and a continuous mapping (a right inverse) $R\colon U_{Y}(0,\gamma)\to K-\widehat{x}$ such that, for all $z\in U_{Y}(0,\gamma)$,
Since $Y_1$ is continuously embedded in $Y$, there exists a constant $b>0$ such that $\|y\|_Y\leqslant b\|y\|_{Y_1}$ for all $y\in Y_1$. We choose $r>0$ and $0<\delta_1\leqslant1$ so as to have
We set $W=U_{C(V\cap K\cap Q,Y)}(F,r)$ and denote $V_1=U_{Y_1}(F(\widehat{x}),\delta_1)$.
We fix $y\in V_1$ and $\widetilde{F}\in W$ such that $\widetilde F-F\in C(V\cap K\cap Q,Y_1)$ (as in the lemma) and inclusion (4) holds. From (4) we have $B_X(0,1)\cap A^{-1}(z(x))\subset Q-\widehat{x}$ for all $x\in E=U_X(0,\delta)\cap(K-\widehat{x})\cap(Q-\widehat{x})$, where
The mapping $x\mapsto z(x)$ from $E$ to $Y$ is continuous, inasmuch as $\widetilde{F}\in W$. Hence $\widetilde F\in C(V\cap K\cap Q,Y)$, and, therefore, the mapping $x\mapsto \widetilde F(\widehat{x}+x)$ from $E$ to $Y$ is continuous.
This mapping is well defined, since, first, $2ad<2a(r+b\delta_1)\leqslant\delta$ by (10) and (8), and hence $M\subset E$, and, second, for each $x\in M$, by (9), (7), (10) and (8), we have
Next, $\Phi(x)\in K-\widehat{x}$ by the definition of $R$. From the equality in (6) it follows that $A\Phi(x)=A R(z(x))=z(x)$. Hence $\Phi(x)\in A^{-1}(z(x))$. Since $2ad<\delta\leqslant1$, we have $\Phi(x)\in B_X(0,1)$, and now $\Phi(x)\in B_X(0,1)\cap A^{-1}(z(x))\subset Q-\widehat{x}$ by (4).
Thus, $\Phi(M)\subset M$. We next claim that the set $\Phi(M)$ is precompact in $X$.
From the equality $A\Phi(x)=z(x)$, the above inclusion $z(x)\in B_{Y_1}(0,\kappa)$, and the inclusion $\Phi(x)\in B_X(0,1)$, which hold for each $x\in M$, it follows that $\Phi(M)\subset B_X(0,1)\cap A^{-1}B_{Y_1}(0,\kappa)$.
By condition 3) of the lemma, the set $B_X(0,1)\cap A^{-1}B_{Y_1}(0,1)$ is precompact, and hence so is the set $B_X(0,1)\cap A^{-1}B_{Y_1}(0,\kappa)$. Consequently, its subset $\Phi(M)$ is also precompact.
So, the continuous mapping $\Phi$ sends the closed convex set $M$ into itself, and its image is precompact in $M$. Let us use the classical Schauder fixed-point theorem. This theorem asserts that if $M$ is a convex subset of a normed linear space, $f\colon M\to M$ is a continuous mapping, and the range $f(M)$ is precompact (in $M$), then $f$ has a fixed point (see, for example, [11], p. 34). Consequently, there exists a point $\widetilde x=\widetilde x(y,\widetilde{F})$ such that $\Phi(\widetilde x)=\widetilde x$. Now, by the definition of $\Phi$, the equality in (6), and the definition of $z(x)$, we have
which implies that $\widetilde{F}(\widehat{x}+\widetilde x)=y$. We set $\psi_{\widetilde{F}}(y)=\widehat{x}+\widetilde x$. By the condition, $\widetilde x\in U_X(0,\delta)\cap(K- \widehat{x})\cap(Q-\widehat{x})\subset (V-\widehat{x})\cap(K-\widehat{x})\cap(Q-\widehat{x})$ and hence $\psi_{\widetilde{F}}(y)\in V\cap K\cap Q$. Therefore, the equality in (5) holds for each $y\in V_1$.
We have $\psi_{\widetilde{F}}(y)-\widehat{x}=\widetilde x\in B_X(0, 2ad)$, and now by (10) we get the inequality in (5) with $c=2a$. This completes the proof of Lemma 1.
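A one-dimensional toy illustration of the fixed-point argument just used (hypothetical, and trivially a case of the Brouwer theorem rather than full Schauder generality): $\Phi(x)=\cos x$ maps the compact convex set $M=[0,1]$ continuously into itself, so it has a fixed point, which simple iteration locates since $\Phi$ happens to be a contraction here.

```python
import math

# Phi(x) = cos(x) is a continuous self-map of the compact convex set [0, 1]
# (cos maps [0, 1] into [cos 1, 1] subset [0, 1]), so a fixed point exists.
def phi(x):
    return math.cos(x)

x = 0.5
for _ in range(100):          # plain iteration; converges since |phi'| < 1 on [0, 1]
    x = phi(x)

assert 0.0 <= x <= 1.0
assert abs(phi(x) - x) < 1e-9  # x is (numerically) a fixed point of Phi
print(f"fixed point ~ {x:.6f}")
```

The Schauder theorem itself is non-constructive; the iteration above works only because this particular $\Phi$ is contractive.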
We set
$$
\mathcal U=\{u(\,{\cdot}\,)\in L_\infty([t_0,t_1],\mathbb R^r)\colon u(t)\in U\text{ for almost all } t\in[t_0,t_1]\}.
$$
Let $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be an admissible pair in problem (1)–(3), let $N\in\mathbb N$, let $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_N(\,{\cdot}\,))\in\mathcal U^{N}$, and let
In what follows, for brevity, we will frequently write $\overline u$, $\overline v$, etc., in place of $\overline u(\,{\cdot}\,)$, $\overline v(\,{\cdot}\,)$, etc.
Let the mapping $A_N(\overline u)\colon X\to Y$ be defined by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha, \nu_0,\nu)\in X$, where $u_0(\,{\cdot}\,)=\widehat{u}(\,{\cdot}\,)$ and $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)$. We claim that this continuous operator is well defined.
and $\mathcal K=[t_0,t_1]\times B_{\mathbb R^r}(0,\kappa)$. On the compact set $\mathcal K$, the mappings $(t,u)\mapsto \varphi(t,\widehat{x}(t),u)$ and $(t,u)\mapsto \varphi_x(t,\widehat{x}(t),u)$ are continuous, and, therefore, bounded. Hence the mappings $t\mapsto \varphi(t,\widehat{x}(t),u_i(t))$, $i=0,1,\dots,N$, and $t\mapsto \widehat\varphi_x(t)$, which are measurable (by continuity of $\varphi$ and $\varphi_x$ with respect to $u$), are essentially bounded on $[t_0,t_1]$. Therefore, the first component of the image $A_N(\overline u)$ lies in the space $C([t_0,t_1],\mathbb R^n)$. That the remaining components of the image $A_N(\overline u)$ are contained in the space $\mathbb R\times\mathbb R^{m_1}\times \mathbb R^{m_2}$ is clear.
It is also easily checked that the mapping $A_N(\overline u)$ is linear. So, by the above, we have the estimate
2) there exist an $\widehat N\in\mathbb N$ and a tuple $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in\mathcal U^{\widehat N}$ such that
The proof of this lemma is based on the following two propositions.
Proposition 1. Let $X$, $Y_1$, $Y_2$ be Banach spaces, $A_i\colon X\to Y_i$, $i=1,2$, be linear continuous operators, the operator $A=(A_1,A_2)\colon X\to Y=Y_1\times Y_2$ be defined by $Ax=(A_1x,A_2x)$, $x\in X$, and $C$ be a closed convex subset of $X$.
Then $0\in\operatorname{int}AC$ if and only if $0\in\operatorname{int}A_1C$ and $0\in\operatorname{int}A_2(C\cap \operatorname{Ker}A_1)$.
Proof. Let $0\in\operatorname{int}AC$. Then there exists $r>0$ such that $U_{Y_1}(0,r)\times U_{Y_2}(0,r)\subset A(C)$. Hence $U_{Y_1}(0,r)\subset A_1C$, and, therefore, $0\in\operatorname{int}A_1C$.
Next, we have $\{0\}\times U_{Y_2}(0,r)\subset A(C)$, and hence, for each $y_2\in U_{Y_2}(0,r)$, there exists a point $x\in C\cap\operatorname{Ker}A_1$ such that $A_2x=y_2$. Consequently, we have $0\in\operatorname{int}A_2(C\cap \operatorname{Ker}A_1)$.
Conversely, let $0\in\operatorname{int}A_1C$ and $0\in\operatorname{int}A_2(C\cap \operatorname{Ker}A_1)$. From the first inclusion it follows that the hypotheses of Lemma 1 in [10] are satisfied. As mentioned at the beginning of the proof of Lemma 1, this entails that there exist numbers $\gamma>0$, $a>0$ and a mapping $R\colon U_{Y_1}(0,\gamma)\to C$ such that, for all $y_1\in U_{Y_1}(0,\gamma)$,
The second inclusion ensures that $U_{Y_2}(0,\rho)\subset A_2(C\cap \operatorname{Ker}A_1)$ for some $\rho>0$. Let $0<\delta\leqslant \min(\rho/4, \rho/(4a\|A_2\|),\gamma/2)$. We claim that $U_{Y}(0,\delta)\subset AC$.
If $y=(y_1,y_2)\in U_{Y}(0,\delta)$, then $2y_1\in U_{Y_1}(0,\gamma)$ by the choice of $\delta$ and hence, by (12), there exists $x_1=R(2y_1)\in C$ such that $A_1x_1=2y_1$.
Let us now show that $2y_2-A_2x_1\in U_{Y_2}(0,\rho)$. Indeed, by the choice of $\delta$ and from the inequality in (12), we have
Therefore, there exists $x_2\in C\cap \operatorname{Ker}A_1$ such that $A_2x_2=2y_2-A_2x_1$.
We set $x=(1/2)x_1+(1/2)x_2$. Since $x_i\in C$, $i=1,2$, and since $C$ is convex, we have $x\in C$. Next, $x_2\in \operatorname{Ker}A_1$, and hence $A_1x=(1/2)A_1x_1+(1/2)A_1x_2=(1/2)A_1x_1=y_1$. Further, by the definition of $x_2$, we have $A_2x=(1/2)A_2x_1+(1/2)A_2x_2=(1/2)A_2x_1+y_2-(1/2)A_2x_1=y_2$, that is, $Ax=(A_1x,A_2x)=(y_1,y_2)=y$.
So, $U_{Y}(0,\delta)\subset AC$, which proves Proposition 1.
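The averaging construction $x=\tfrac12 x_1+\tfrac12 x_2$ in the proof of Proposition 1 can be sanity-checked in a hypothetical finite-dimensional example: take $X=\mathbb R^2$, $A_1x=x_1$, $A_2x=x_2$, and $C$ the closed unit ball, so that $\operatorname{Ker}A_1$ is the second coordinate axis.

```python
import numpy as np

# Example data (assumed for illustration): X = R^2, A1 x = x[0], A2 x = x[1],
# C = closed unit ball. For small y = (y1, y2), pick x1 in C with A1 x1 = 2 y1,
# then x2 in C ∩ Ker A1 with A2 x2 = 2 y2 - A2 x1, and set x = (x1 + x2) / 2.
rng = np.random.default_rng(0)
for _ in range(100):
    y1, y2 = rng.uniform(-0.1, 0.1, size=2)      # y in a small ball around 0
    x1 = np.array([2.0 * y1, 0.0])               # A1 x1 = 2 y1, x1 in C
    x2 = np.array([0.0, 2.0 * y2 - x1[1]])       # x2 in Ker A1, A2 x2 = 2 y2 - A2 x1
    x = 0.5 * (x1 + x2)                          # convex combination stays in C
    assert np.linalg.norm(x) <= 1.0              # x in C by convexity
    assert np.allclose([x[0], x[1]], [y1, y2])   # A x = (A1 x, A2 x) = y
print("ok")
```

This mirrors the computation $A_1x=y_1$, $A_2x=y_2$ in the proof, with the right inverse $R$ replaced by an explicit choice of $x_1$.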
The operator $A_N(\overline u)=(A_{1N}(\overline u), A_{2N}(\overline u))$ was introduced before Lemma 2, where the linear operator $A_{1N}(\overline u)\colon X\to Y_1=C([t_0,t_1],\mathbb R^n)$ is defined by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha,\nu_0,\nu)\in X$, $t\in[t_0,t_1]$, and the linear operator $A_{2N}(\overline u)\colon X\to Y_2=\mathbb R\times\mathbb R^{m_1}\times \mathbb R^{m_2}$ is defined by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha,\nu_0,\nu)\in X$.
Proposition 2. The following assertions are equivalent:
a) $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$;
b) there exist an $\widehat N\in \mathbb N$ and a tuple $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in\mathcal U^{\widehat N}$ such that
where $\mathcal V$ is the set of all tuples $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_{N}(\,{\cdot}\,))\in \mathcal U^{N}$, $N\in \mathbb N$.
Assume on the contrary that inclusion (14) does not hold. Let us verify that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne\{0\}$.
We first show that the set on the right of (14) is convex. We denote this set by $M$. Consider $y_i\in M$, $\beta_i>0$, $i=1,2$, and $\beta_1+\beta_2=1$. We claim that $\beta_1y_1+\beta_2y_2\in M$.
Since $y_i\in M$, $i=1,2$, there exist $N_i$, $\overline u_i(\,{\cdot}\,)=(u_{i1}(\,{\cdot}\,),\dots,u_{iN_i}(\,{\cdot}\,))\in\mathcal U^{N_i}$, and $z_i=(h_i(\,{\cdot}\,),\xi_i,\overline\alpha_i,\nu_{0i},\nu_i+f(\widehat{x}(t_0),\widehat{x}(t_1)))\in K_{N_i}\cap \operatorname{Ker}A_{1N_i}(\overline u_i)$ such that $y_i=A_{2N_i}(\overline u_i)[z_i]$, $i=1,2$, where $\overline\alpha_i=(\alpha_{i0},\alpha_{i1},\dots,\alpha_{iN_i})$ and
Let $N=N_1+N_2$ and $\overline u(\,{\cdot}\,)=(\overline u_{1}(\,{\cdot}\,), \overline u_2(\,{\cdot}\,))$. We set $z=(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0, \nu+f(\widehat{x}(t_0),\widehat{x}(t_0)))$, where $h(\,{\cdot}\,)=\beta_1h_1(\,{\cdot}\,)+\beta_2h_2(\,{\cdot}\,)$, $\overline\alpha=(\beta_1\alpha_{10}+\beta_2\alpha_{20},\beta_1\alpha_{11},\dots, \beta_1\alpha_{1N_1}, \beta_2\alpha_{21},\dots,\beta_2\alpha_{2N_2})$, $\xi=\beta_1\xi_1+\beta_2\xi_2$, $\nu_0=\beta_1\nu_{10}+\beta_2\nu_{20}$ and $\nu=\beta_1\nu_1+\beta_2\nu_2$.
It is easily seen that $\overline\alpha\in \Sigma^{N+1}-\overline\alpha_N$. Hence $z\in K_N$.
From (15) it follows that the function $h(\,{\cdot}\,)$ satisfies the equation
that is, $\beta_1y_1+\beta_2y_2\in A_{2N}(\overline u)(K_{N}\cap\operatorname{Ker}A_{1N}(\overline u))$. This verifies that the set $M$ is convex.
Let us get back to inclusion (14). If this inclusion does not hold, then either $M$ has non-empty interior and does not contain 0, or the interior of $M$ is empty.
Since $M$ is convex, its interior is also convex; if this interior is non-empty and does not contain the origin, then, by the finite-dimensional separation theorem, there exists a non-zero vector $\lambda\in (\mathbb R\times\mathbb R^{m_1}\times\mathbb R^{m_2})^*$ such that
for each tuple $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_N(\,{\cdot}\,))\in\mathcal U^N$ and for all $(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu+f(\widehat{x}(t_0),\widehat{x}(t_1))) \in K_{N}\cap\operatorname{Ker}A_{1N}(\overline u)$. Here, $h(\,{\cdot}\,)=h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$ satisfies the differential equation
where $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)$.
If the interior of $M$ is empty, then it is well known (see, for example, [12]) that the set $M$ is contained in some hyperplane containing the origin (since $M\ni 0$), and, in this case, inequality (16) becomes an equality.
We set $\lambda=(\lambda_0,\lambda_f,\lambda_{g})\in \mathbb R\times(\mathbb R^{m_1})^*\times (\mathbb R^{m_2})^*$. In this case, by the definition of the operator $A_{2N}(\overline u)$, inequality (16) assumes the form
Using this relation, let us verify that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne\{0\}$.
We put $\xi=0$, $\overline\alpha=0$ (note that $h(\,{\cdot}\,)=0$ by uniqueness of the solution of the linear equation (17)), and define $\nu=-f(\widehat{x}(t_0),\widehat{x}(t_1))$. Now from (18) we have $\lambda_0\nu_0\geqslant0$ for each $\nu_0\geqslant0$. Therefore, $\lambda_0\geqslant0$.
If $\xi=0$, $\overline\alpha=0$, $\nu_0=0$ and $\nu=\nu'-f(\widehat{x}(t_0),\widehat{x}(t_1))$, where $\nu'\in \mathbb R^{m_1}_+$, then from (18) it follows that $\langle\lambda_f,\nu'\rangle\geqslant0$ for each $\nu'\in\mathbb R^{m_1}_+$, that is, $\lambda_f\in(\mathbb R^{m_1})^*_+$.
Let $\xi=0$, $\overline\alpha=0$, $\nu_0=0$ and $\nu=0$. From (18) we have $\langle\lambda_f,f(\widehat{x}(t_0),\widehat{x}(t_1))\rangle\geqslant 0$. However, $\lambda_f\in(\mathbb R^{m_1})^*_+$, $f(\widehat{x}(t_0),\widehat{x}(t_1))\leqslant0$, and so $\langle\lambda_f,f(\widehat{x}(t_0),\widehat{x}(t_1))\rangle\leqslant0$, that is, $\langle\lambda_f,f(\widehat{x}(t_0),\widehat{x}(t_1))\rangle=0$. This verifies the complementary slackness condition.
In (18), we put $\overline\alpha=0$, $\nu_0=0$, and $\nu=-f(\widehat{x}(t_0),\widehat{x}(t_1))$. Since $\xi\in\mathbb R^n$ is arbitrary and since $h(\,{\cdot}\,)=h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$ depends linearly on $\xi$, inequality (18) becomes the equality
Together with (20) this proves the stationarity condition with respect to $x(\,{\cdot}\,)$ and the transversality condition. It remains to verify the maximum condition.
In (18), we put $\nu_0=0$, $\nu=-f(\widehat{x}(t_0),\widehat{x}(t_1))$, and let $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)\in \Sigma^{N+1}-\overline\alpha_N$ be such that $\sum_{i=1}^N\alpha_i>0$. Recalling the expression for $p(t_0)$ and $p(t_1)$, from (18) we have
(recall that $h(t_0,\xi,\overline\alpha; \overline u)=\xi$). Let $u(\,{\cdot}\,)\in\mathcal U$. Setting $u_i(\,{\cdot}\,)=u(\,{\cdot}\,)$, $i=1,\dots, N$, we have from (23)
By $T_0$ we denote the set of all Lebesgue points on $(t_0,t_1)$ of the function under the integral on the right. Let $\tau\in T_0$ and $v\in U$. For each $h>0$ such that $[\tau-h,\tau+h]\subset (t_0,t_1)$, we set $u_h(t)=v$ if $t\in [\tau-h,\tau+h]$, and $u_h(t)=\widehat{u}(t)$ for $t\in [t_0,t_1]\setminus[\tau-h,\tau+h]$. It is clear that $u_h(\,{\cdot}\,)\in\mathcal U$, and from (24) we have
The function $t\mapsto \langle p(t),\varphi(t,\widehat{x}(t),v)\rangle$ is continuous, and hence $\tau$ is its Lebesgue point. Making $h\to0$ in the last inequality, we find that
Now the above inequality is equivalent to the maximum condition since $v\in U$ is arbitrary and since $T_0$ is a set of full measure.
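The Lebesgue-point passage to the limit used above can be illustrated numerically (with hypothetical data, not from the paper): for a continuous integrand $g(t)=\langle p(t),\varphi(t,\widehat x(t),v)\rangle$, every point is a Lebesgue point, and the average of $g$ over $[\tau-h,\tau+h]$ tends to $g(\tau)$ as $h\to0$.

```python
import numpy as np

# Assumed illustrative data: a scalar stand-in for <p(t), phi(t, xhat(t), v)>.
p = lambda t: np.exp(-t)            # an assumed adjoint trajectory
phi_v = lambda t: np.sin(3.0 * t)   # an assumed value of phi along xhat with control v
g = lambda t: p(t) * phi_v(t)

tau = 0.7                           # the "Lebesgue point" (here: any point, g is continuous)
errs = []
for h in [0.1, 0.01, 0.001]:
    ts = np.linspace(tau - h, tau + h, 10001)
    avg = g(ts).mean()              # approximates (1/2h) * integral over [tau-h, tau+h]
    errs.append(abs(avg - g(tau)))
    print(h, errs[-1])              # error shrinks (like h**2 for smooth g)
```

For a merely measurable integrand the same convergence holds only at Lebesgue points, which is exactly why the proof restricts $\tau$ to the full-measure set $T_0$.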
Thus, we have proved that if $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$, then inclusion (14) holds. Let us now show that this inclusion implies (13). This will complete the proof of implication a) $\Rightarrow$ b).
We set $Z=\mathbb R\times \mathbb R^{m_1}\times\mathbb R^{m_2}$ and $m=m_1+m_2+1$. In view of (14), there exists an $m$-dimensional simplex $S\subset M$ such that $0\in\operatorname{int}S$ (see [12]). Hence $U_{Z}(0,\rho)\subset S$ for some $\rho>0$.
Let $e_1,\dots,e_{m+1}$ be the vertices of the simplex $S$. Then there exist $N_i$, $\overline u_i(\,{\cdot}\,)=(u_{i1}(\,{\cdot}\,),\dots, u_{iN_i}(\,{\cdot}\,))\in\mathcal U^{N_i}$, and $z_i=(h_i(\,{\cdot}\,),\xi_i,\overline\alpha_i,\nu_{0i},\nu_i+f(\widehat{x}(t_0),\widehat{x}(t_1)))\in K_{N_i}\cap \operatorname{Ker}A_{1N_i}(\overline u_i)$, where $\overline\alpha_i=(\alpha_{i0},\alpha_{i1},\dots,\alpha_{iN_i})$, and
for all $t\in[t_0,t_1]$, such that $e_i=A_{2N_i}(\overline u_i)[z_i]$, $i=1,\dots,m+1$.
Let $y\in U_{Z}(0,\rho)$. Then $y=\sum_{i=1}^{m+1}\beta_ie_i$ for some $\beta_i>0$ with $\sum_{i=1}^{m+1}\beta_i=1$ (see [12]). Setting $\widehat N=\sum_{i=1}^{m+1}N_i$, $\overline v(\,{\cdot}\,)=(\overline u_1(\,{\cdot}\,),\dots,\overline u_{m+1}(\,{\cdot}\,))$ and proceeding as above in the proof of the convexity of $M$, we find that $y\in A_{2\widehat{N}}(\overline v)(K_{\widehat{N}}\cap\operatorname{Ker}A_{1\widehat{N}}(\overline v))$. Since this is true for each $y\in U_{Z}(0,\rho)$, this proves inclusion (13), and, therefore, implication a) $\Rightarrow$ b).
Let us prove implication b) $\Rightarrow$ a). Assume on the contrary that $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))\ne \{0\}$. Let us show that inclusion (13) is impossible for any $N\in\mathbb N$ and any tuple $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_{N}(\,{\cdot}\,))\in\mathcal U^N$.
We fix an $N\in\mathbb N$ and a tuple $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots,u_{N}(\,{\cdot}\,))\in\mathcal U^N$. Let $\xi\in\mathbb R^n$, $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_N)\in \Sigma^{N+1}-\overline\alpha_N$, and let $h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$ be a solution of equation (17) with these data.
The inequality (24) for all $u(\,{\cdot}\,)\in\mathcal U$ is clear from the maximum condition in Theorem 1. Substituting the functions $u_1(\,{\cdot}\,),\dots, u_N(\,{\cdot}\,)$ for $u(\,{\cdot}\,)$ in this inequality, multiplying the resulting inequalities by $\alpha_1, \dots, \alpha_N$, respectively, and then adding these inequalities, we see that the expression on the right of (23) is non-negative. Hence
for all $\xi\in\mathbb R^n$ and $\overline\alpha\in \Sigma^{N+1}-\overline\alpha_N$.
Let $z=(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu+f(\widehat{x}(t_0),\widehat{x}(t_1)))\in K_{N}\cap\operatorname{Ker}A_{1N}(\overline u)$. The function $h(\,{\cdot}\,)$ should satisfy equation (17), and therefore, by uniqueness, $h(\,{\cdot}\,)= h(\,{\cdot}\,,\xi,\overline\alpha;\overline u)$. Next, inequality (18) for each of the above $z$ is secured by (25) since $\lambda_0\geqslant0$, $\lambda_f\in (\mathbb R^{m_1})^*_+$ and since the complementary slackness condition is fulfilled. The triple $(\lambda_0,\lambda_f,\lambda_g)$ is non-zero (otherwise the function $p(\,{\cdot}\,)$ would be zero), but this contradicts inclusion (13). This proves implication b) $\Rightarrow$ a), and, therefore, Proposition 2.
Proof of Lemma 2. Implication 1) $\Rightarrow$ 2). We have $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$, and hence by Proposition 2 inclusion (13) holds with some $\widehat N\in \mathbb N$ and a tuple $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in\mathcal U^{\widehat N}$. Let us also verify that $0\in\operatorname{int}A_{1\widehat N}(\overline v)K_{\widehat N}$.
Indeed, consider the operator $A_0\colon C([t_0,t_1],\mathbb R^n)\to C([t_0,t_1],\mathbb R^n)$ defined by
where $h(\,{\cdot}\,)\in C([t_0,t_1],\mathbb R^n)$. It is well known (see, for example, § 2.5.7 in [13]) and easily checked that this is a continuous bijective linear operator.
Let $y(\,{\cdot}\,)\in C([t_0,t_1],\mathbb R^n)$. There exists a function $h(\,{\cdot}\,)\in C([t_0,t_1],\mathbb R^n)$ such that $A_0[h(\,{\cdot}\,)](\,{\cdot}\,)=y(\,{\cdot}\,)$. This is equivalent to saying that $A_{1\widehat N}(\overline v)[h(\,{\cdot}\,),0,0,0,0](\,{\cdot}\,)=y(\,{\cdot}\,)$. However, $(h(\,{\cdot}\,),0,0,0,0)\in K_{\widehat N}$, and, therefore, $A_{1\widehat N}(\overline v)K_{\widehat N}=C([t_0,t_1],\mathbb R^n)$. Hence $0\in\operatorname{int}A_{1\widehat N}(\overline v)K_{\widehat N}$.
Now inclusion (11) is secured by Proposition 1 with $A_1=A_{1\widehat N}$, $A_2=A_{2\widehat N}$, $A=A_{\widehat N}=(A_{1\widehat N},A_{2\widehat N})$ and $C=K_{\widehat N}$.
Implication 2) $\Rightarrow$ 1). If (11) holds, then inclusion (13) follows from Proposition 1 (where, as above, $A_1 = A_{1\widehat N}$, $A_2 = A_{2\widehat N}$, $A = A_{\widehat N} =(A_{1\widehat N},A_{2\widehat N})$ and $C = K_{\widehat N}$). Hence $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,)) = \{0\}$ by Proposition 2. Lemma 2 is proved.
Let $L>0$. As above, we let $Q_L$ denote the set of all Lipschitz-continuous functions from $C([t_0,t_1],\mathbb R^n)$ with Lipschitz constant $L$. It is easily checked that $Q_L$ is a closed convex subset of $C([t_0,t_1],\mathbb R^n)$.
Recall that the sets $\mathcal U$ and $\Sigma^k$, $k\in\mathbb N$, were defined before Lemma 2.
Lemma 3 (on approximation). Let $M$ be a bounded set in $C([t_0,t_1],\mathbb R^n)$, let $\Omega$ be a bounded set in $\mathbb R^n$, $N\in\mathbb N$, $\overline u(\,{\cdot}\,)=(u_1(\,{\cdot}\,),\dots, u_{N}(\,{\cdot}\,))\in \mathcal U^N$, and let $L> 0$. Then, for each $\overline\alpha=(\alpha_1,\dots,\alpha_N)\in\Sigma^N$, there exists a sequence of controls $u_s(\overline\alpha;\overline u)(\,{\cdot}\,)\in \mathcal U^N$,
where $t\in[t_0,t_1]$, lie in $C((M\cap Q_L)\times\Omega\times \Sigma^N,\,C([t_0,t_1],\mathbb R^n))$ and converge in this space as $s\to\infty$ to the mapping $\Phi\colon (M\cap Q_L)\times\Omega\times \Sigma^N\to C([t_0,t_1],\mathbb R^n)$ defined by
As already mentioned, this lemma is a particular case of Lemma 4.3 from [3].
§ 3. Proof of the Pontryagin maximum principle
Assume on the contrary that the Pontryagin maximum principle does not hold for a pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ admissible in problem (1)–(3), that is, $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$. Let us show that this pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ is not a strong minimum in this problem. Our proof will follow from Lemma 1 on an inverse function, which we will apply in our setting.
Let $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ be an admissible pair in problem (1)–(3), let $\widehat N\in\mathbb N$, and let $\overline v(\,{\cdot}\,)=(v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))\in \mathcal U^{\widehat N}$ be as in Lemma 2.
The mappings $(t,u)\mapsto \varphi(t,\widehat{x}(t),u)$ and $(t,u)\mapsto \varphi_x(t,\widehat{x}(t),u)$ are continuous on this compact set, and hence are uniformly continuous. Therefore, there exists $0<\delta_0\leqslant\delta$ such that if $|x_1-x_2|<\delta_0$, then
where $Q_L$ is defined before Lemma 3, $\widehat{x}=(\widehat{x}(\,{\cdot}\,), \widehat{x}(t_0),\overline\alpha_{\widehat{N}},0,0)$, and $V=U_X(\widehat{x},\delta_0)$.
In order to distinguish between $\widehat{x}$ and $\widehat{x}(\,{\cdot}\,)$, we will write $\widehat z$ in place of $\widehat{x}$, that is, $\widehat{z}=(\widehat{x}(\,{\cdot}\,), \widehat{x}(t_0),\overline\alpha_{\widehat{N}},0,0)$.
It is clear that $Y_1$ is continuously embedded in $Y$, and $K$ and $Q$ are closed convex subsets of $X$.
The pair $(\widehat{x}(\,{\cdot}\,), \widehat{u}(\,{\cdot}\,))$ satisfies the differential equation in (2), and hence $\widehat{x}(\,{\cdot}\,)$ is a Lipschitz function with constant $C_0$. Therefore, $\widehat{x}(\,{\cdot}\,)\in Q_L$, and further, since $\overline\alpha_{\widehat{N}}=(1,0,\dots,0)\in\Sigma^{\widehat N+1}$, we have $\widehat{z}\in K\cap Q$.
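The Lipschitz bound just used can be traced in one line (a sketch, assuming, as the surrounding estimates suggest, that $C_0$ is an upper bound for $|\varphi(t,\widehat x(t),\widehat u(t))|$ on $[t_0,t_1]$):

```latex
|\widehat x(t)-\widehat x(t')|
  =\biggl|\int_{t'}^{t}\varphi(s,\widehat x(s),\widehat u(s))\,ds\biggr|
  \leqslant C_0\,|t-t'|,
  \qquad t_0\leqslant t'\leqslant t\leqslant t_1.
```

Since $C_0\leqslant L$, this gives $\widehat{x}(\,{\cdot}\,)\in Q_L$.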
where $v_0(\,{\cdot}\,)=\widehat{u}(\,{\cdot}\,)$ and $\overline\alpha=(\alpha_0,\alpha_1,\dots,\alpha_{\widehat N})$. The dependence of $F$ on a fixed tuple $(\widehat{u}(\,{\cdot}\,),v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))$ will not be indicated.
The first component of the image of the mapping $F$ lies in $C([t_0,t_1],\mathbb R^n)$ (this is verified as for the operator $A_N$ above), and since $V$ is a bounded set and the mappings $\varphi$, $f_0$, $f$, and $g$ are continuous, it is easily seen that $F\in C(V,Y)$.
Let us now check that the above spaces, sets, and the mapping satisfy the conditions of Lemma 1.
Since $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ is an admissible pair in problem (1)–(3) and $\xi=\widehat{x}(t_0)$, the first component of $F(\widehat{z})$ is zero. Hence $F(\widehat{z})=(0, f_0(\widehat{x}(t_0),\widehat{x}(t_1)),f(\widehat{x}(t_0), \widehat{x}(t_1)),0)\in Y_1$.
The mapping $F$ is differentiable (for differentiability with respect to $x(\,{\cdot}\,)$, see, for example, § 2.4.2 in [13]; with respect to the remaining variables, $F$ is linear). Its derivative at $\widehat{z}$ is given by
for all $(h(\,{\cdot}\,),\xi,\overline \alpha,\nu_0,\nu)\in X$ and $t\in[t_0,t_1]$.
This verifies condition 1) of Lemma 1. Let us check condition 2).
It is clear that $F'(\widehat{z})=A_{\widehat{N}}(\overline v)$ and $K=K_{\widehat{N}}+\overline\alpha_{\widehat{N}}$. By the assumption, $\Lambda(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))=\{0\}$, and now condition 2) of Lemma 1 is secured by Lemma 2.
Let us verify condition 3). If $(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)\in B_X(0,1)\cap(F'(\widehat{z}))^{-1}(B_{Y_1}(0,1))$, then $(h(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)\in B_X(0,1)$, and there exists a tuple $(y(\,{\cdot}\,),w_1,w_2,w_3)\in B_{Y_1}(0,1)$ such that
Since $y(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, we have $h(\,{\cdot}\,)\in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, and, therefore, for almost all $t\in[t_0,t_1]$,
We have $\|h(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}\leqslant1$, $|\overline\alpha|\leqslant1$, and $\|y(\,{\cdot}\,)\|_{W_\infty^1([t_0,t_1],\mathbb R^n)}\leqslant1$, and hence $|\dot h(t)|\leqslant C_1+C_0\sqrt{\widehat N+1}+1$ for almost all $t\in[t_0,t_1]$ (see (28)).
So, the set of these functions $h(\,{\cdot}\,)$ is bounded in $C([t_0,t_1],\mathbb R^n)$; these functions are Lipschitz-continuous with the same Lipschitz constant $C_1+C_0\sqrt{\widehat N+1}+ 1$. Therefore, they are equicontinuous. Now an appeal to the Arzelá–Ascoli theorem shows that the set of these functions is precompact in $C([t_0,t_1],\mathbb R^n)$.
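The equicontinuity estimate underlying this appeal to the Arzelá–Ascoli theorem can be written out explicitly (a sketch, using the derivative bound just obtained and the absolute continuity of $h(\,{\cdot}\,)$):

```latex
|h(t)-h(t')|
  \leqslant\int_{t'}^{t}|\dot h(s)|\,ds
  \leqslant\bigl(C_1+C_0\sqrt{\widehat N+1}+1\bigr)|t-t'|,
  \qquad t_0\leqslant t'\leqslant t\leqslant t_1.
```

The modulus of continuity is thus uniform over the whole family of functions $h(\,{\cdot}\,)$.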
Next, $|\xi|+|\overline\alpha|+|\nu_0|+|\nu|\leqslant1$, and hence the set of such tuples $(\xi,\overline\alpha,\nu_0,\nu)$ is compact in $\mathbb R^n\times \mathbb R^{\widehat N+1}\times\mathbb R\times\mathbb R^{m_1}$. Hence the set $B_X(0,1)\cap(F'(\widehat{z}))^{-1}(B_{Y_1}(0,1))$ is precompact in $X$.
It remains to verify condition 4). For all $z=(x(\,{\cdot}\,), \xi, \overline\alpha,\nu_0,\nu)\in V$ and $t\in[t_0,t_1]$,
Now, by continuity of the mappings $\varphi$ and $\varphi_x$, the first component of the image of the difference $F[z](\,{\cdot}\,)-F'(\widehat{z})[z](\,{\cdot}\,)$ lies in $W_\infty^1([t_0,t_1],\mathbb R^n)$, and hence, the image of this difference belongs to $Y_1$. Since $V$ is bounded, and the mappings $f_0$, $f$ and $g$ are continuous, it follows that this difference lies in $C(V,Y_1)$, that is, condition 4) of Lemma 1 is met.
So, all the hypotheses of Lemma 1 are fulfilled. Let the constants $0<\delta_1\leqslant1$, $c>0$ and the neighbourhood $W$ be as in this lemma, and let $\mathcal{O}$ be an arbitrary neighbourhood of the function $\widehat{x}(\,{\cdot}\,)$ in $C([t_0,t_1],\mathbb R^n)$. There exist $r_0>0$ and $0<r<r_0/c$ such that $U_{C([t_0,t_1],\mathbb R^n)}(\widehat{x}(\,{\cdot}\,),r_0)\subset \mathcal{O}$ and $U_{C(V\cap K\cap Q,Y)}(F,r)\subset W$.
For each $s\in\mathbb N$, consider the mapping $F_s\colon V\cap K\to Y$ defined by
where the functions $u_s(\overline\alpha;\overline u)(\,{\cdot}\,)$ are from Lemma 3, $M$ is the projection of $V$ onto $C([t_0,t_1],\mathbb R^n)$, $\Omega$ is the projection of $V$ onto $\mathbb R^n$, $N=\widehat N+1$, and $\overline u(\,{\cdot}\,)=(\widehat{u}(\,{\cdot}\,),v_1(\,{\cdot}\,),\dots,v_{\widehat N}(\,{\cdot}\,))$. The dependence of the mapping $F_s$ on a fixed tuple $\overline u(\,{\cdot}\,)$ is not indicated.
By Lemma 3, the controls $u_s(\overline\alpha;\overline u)(\,{\cdot}\,)$ are uniformly bounded with respect to $s$ for almost all $t\in [t_0,t_1]$, and hence (for the same reasons as for the mapping $F$), the mappings $F_s$, $s\in\mathbb N$, lie in $C(V\cap K\cap Q,Y)$; moreover, there exists $s_0$ such that
(where we set $\widetilde{F}=F_{s_0}$) for all $(x(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu) \in V\cap K\cap Q$ and $t\in[t_0,t_1]$. Hence $\widetilde{F} \in U_{C(V\cap K\cap Q,Y)}(F,r) \subset W$.
From the definitions of the mappings $\Phi_s$ and $\Phi$ it is clear that the difference $\widetilde{F}(x(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)(\,{\cdot}\,) -F(x(\,{\cdot}\,),\xi,\overline\alpha,\nu_0,\nu)(\,{\cdot}\,)$ lies in $C(V\cap K\cap Q,Y_1)$.
Let us now verify (4). Let $(x(\,{\cdot}\,), \xi,\overline\alpha,\nu_0,\nu)\in V\cap K\cap Q$ and $(y(\,{\cdot}\,),w_1,w_2,w_3)\in U_{Y_1}(F(\widehat{z}),\delta_1)$. If $(x'(\,{\cdot}\,), \xi',\overline\alpha^{\,\prime},\nu_0',\nu')$ lies in the set on the left of this inclusion, then
(recall that $\overline\alpha^{\,\prime}=(\alpha_0',\alpha_1',\dots,\alpha_{\widehat N}'))$.
Let us now proceed as in the verification of condition 3) above. Since $y(\,{\cdot}\,) \in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, we have $x'(\,{\cdot}\,) \in \mathrm{AC}([t_0,t_1],\mathbb R^n)$, and hence, for almost all $t \in [t_0,t_1]$,
We have $\|x'(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}\leqslant1$, $|\overline\alpha^{\,\prime}|\leqslant1$, $\overline\alpha\in\Sigma^{\widehat N+1}$, $\|y(\,{\cdot}\,)\|_{W_\infty^1([t_0,t_1],\mathbb R^n)}<\delta_1\leqslant1$, $\|x(\,{\cdot}\,)-\widehat{x}(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}<\delta_0\leqslant1$, and hence, for almost all $t\in[t_0,t_1]$, the sum of the terms (except the last one) on the right of (30) is estimated by $C_1+C_0\sqrt{\widehat N+1}+C_0+C_1+2C_0+1=(3+\sqrt{\widehat N+1})C_0+2C_1+1$.
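The arithmetic behind this constant is elementary; for the reader's convenience, grouping the terms by coefficient:

```latex
C_1+C_0\sqrt{\widehat N+1}+C_0+C_1+2C_0+1
  =(C_0+2C_0)+C_0\sqrt{\widehat N+1}+2C_1+1
  =\bigl(3+\sqrt{\widehat N+1}\,\bigr)C_0+2C_1+1.
```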
Subtracting and adding $\varphi(t,\widehat{x}(t),u_{s_0}(\overline\alpha^{\,\prime};\overline u)(t))$ to the last term on the right of (30), it follows by (27) that this term is majorized by $1+C_0$ for almost all $t\in[t_0,t_1]$. So, for almost all $t\in[t_0,t_1]$,
We next claim that $(x'(\,{\cdot}\,), \xi',\overline\alpha^{\,\prime},\nu_0',\nu')\in Q-\widehat{z}$. To verify this inclusion, it clearly suffices to check that the function $x'(\,{\cdot}\,)$ lies in $Q_L-\widehat{x}(\,{\cdot}\,)$. The Lipschitz constant of this function does not exceed $L-C_0$, and the Lipschitz constant of $\widehat{x}(\,{\cdot}\,)$ is $C_0$. Hence $x'(\,{\cdot}\,)+\widehat{x}(\,{\cdot}\,)\in Q_L$, that is, $x'(\,{\cdot}\,)\in Q_L-\widehat{x}(\,{\cdot}\,)$, which proves inclusion (4).
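The concluding step rests on the subadditivity of Lipschitz constants; in symbols (a sketch, with $\operatorname{Lip}$ denoting the least Lipschitz constant of a function on $[t_0,t_1]$):

```latex
\operatorname{Lip}\bigl(x'(\,{\cdot}\,)+\widehat x(\,{\cdot}\,)\bigr)
  \leqslant\operatorname{Lip}\bigl(x'(\,{\cdot}\,)\bigr)
          +\operatorname{Lip}\bigl(\widehat x(\,{\cdot}\,)\bigr)
  \leqslant(L-C_0)+C_0=L,
```

whence $x'(\,{\cdot}\,)+\widehat x(\,{\cdot}\,)\in Q_L$.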
So, all the assumptions of Lemma 1 on the mapping $\widetilde{F}$ are met, and, therefore, there exists a mapping
where $0<\varepsilon\leqslant(r_0/c)-r$ is so small that $y\in U_{Y_1}(F(\widehat z),\delta_1)$. We set $\psi_{\widetilde{F}}(y)=(\widetilde x(\,{\cdot}\,),\widetilde\xi,\widetilde {\overline\alpha},\widetilde\nu_0,\widetilde\nu)$, where $\widetilde\nu=\widetilde \nu_1+f(\widehat{x}(t_0),\widehat{x}(t_1))$ and $\widetilde\nu_1\geqslant0$. Now the equality in (5) assumes the form
The equality of the first components implies that the pair $(\widetilde x(\,{\cdot}\,), \widetilde u(\,{\cdot}\,))$, where $\widetilde u(\,{\cdot}\,)=u_{s_0}(\widetilde{\overline\alpha};\overline u)(\,{\cdot}\,)$, satisfies condition (2), and, in addition, $\widetilde \xi=\widetilde x(t_0)$. Hence, since the third and fourth components are equal, we have $f(\widetilde x(t_0),\widetilde x(t_1))\leqslant0$ and $g(\widetilde x(t_0),\widetilde x(t_1))=0$, that is, the pair $(\widetilde x(\,{\cdot}\,), \widetilde u(\,{\cdot}\,))$ is admissible in problem (1)–(3).
The equality of the second components implies that
that is, $\widetilde x(\,{\cdot}\,)\in \mathcal{O}$.
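The numerology behind this inclusion can be traced as follows (a sketch, assuming that the estimate in (5) of Lemma 1 bounds the distance from $\psi_{\widetilde F}(y)$ to $\widehat z$ by $c(r+\varepsilon)$, as the choice of the constants $r<r_0/c$ and $0<\varepsilon\leqslant(r_0/c)-r$ suggests):

```latex
\|\widetilde x(\,{\cdot}\,)-\widehat x(\,{\cdot}\,)\|_{C([t_0,t_1],\mathbb R^n)}
  \leqslant c\,(r+\varepsilon)
  \leqslant c\Bigl(r+\frac{r_0}{c}-r\Bigr)
  = r_0,
```

so that $\widetilde x(\,{\cdot}\,)$ stays within distance $r_0$ of $\widehat x(\,{\cdot}\,)$, and $U_{C([t_0,t_1],\mathbb R^n)}(\widehat{x}(\,{\cdot}\,),r_0)\subset \mathcal{O}$.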
So, for any neighbourhood $\mathcal{O}$ of the point $\widehat{x}(\,{\cdot}\,)$, there exists a pair $(\widetilde x(\,{\cdot}\,),\widetilde u(\,{\cdot}\,))$ admissible in problem (1)–(3) such that $\widetilde x(\,{\cdot}\,)\in \mathcal{O}$ and the value of the functional $f_0$ at $\widetilde x(\,{\cdot}\,)$ is smaller than that at $\widehat{x}(\,{\cdot}\,)$. This, however, contradicts the fact that the pair $(\widehat{x}(\,{\cdot}\,),\widehat{u}(\,{\cdot}\,))$ is a strong minimum in problem (1)–(3). Theorem 1 is proved.
Bibliography
1. L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The mathematical theory of optimal processes, Intersci. Publ. John Wiley & Sons, Inc., New York–London, 1962.
2. R. V. Gamkrelidze, Principles of optimal control theory, Math. Concepts Methods Sci. Eng., 7, Rev. ed., Plenum Press, New York–London, 1978.
3. E. R. Avakov and G. G. Magaril-Il'yaev, “Local infimum and a family of maximum principles in optimal control”, Sb. Math., 211:6 (2020), 750–785.
4. A. A. Milyutin, “General schemes of necessary conditions for extrema and problems of optimal control”, Russian Math. Surveys, 25:5 (1970), 109–115.
5. A. D. Ioffe and V. M. Tihomirov, Theory of extremal problems, Stud. Math. Appl., 6, North-Holland Publishing Co., Amsterdam–New York, 1979.
6. E. R. Avakov, G. G. Magaril-Il'yaev, and V. M. Tikhomirov, “Lagrange's principle in extremum problems with constraints”, Russian Math. Surveys, 68:3 (2013), 401–433.
7. F. H. Clarke, “A new approach to Lagrange multipliers”, Math. Oper. Res., 1:2 (1976), 165–174.
8. A. V. Arutyunov, Optimality conditions. Abnormal and degenerate problems, Math. Appl., 526, Kluwer Acad. Publ., Dordrecht, 2000.
9. A. D. Ioffe, “On necessary conditions for a minimum”, J. Math. Sci. (N.Y.), 217:6 (2016), 751–772.
10. E. R. Avakov and G. G. Magaril-Il'yaev, “General implicit function theorem for close mappings”, Proc. Steklov Inst. Math., 315 (2021), 1–12.
11. A. Granas and J. Dugundji, Fixed point theory, Springer Monogr. Math., Springer-Verlag, New York, 2003.
12. G. G. Magaril-Il'yaev and V. M. Tikhomirov, Convex analysis: theory and applications, Transl. Math. Monogr., 222, Amer. Math. Soc., Providence, RI, 2003.
13. V. M. Alekseev, V. M. Tikhomirov, and S. V. Fomin, Optimal control, Contemp. Soviet Math., Consultants Bureau, New York, 1987.
Citation:
E. R. Avakov, G. G. Magaril-Il'yaev, “Schauder's fixed point theorem and Pontryagin maximum principle”, Izv. Math., 88:6 (2024), 1013–1031