Abstract:
Various aspects of Dirac's generalized Hamiltonian dynamics are considered. The starting point is a Hamiltonian system on a symplectic manifold on which, in addition, a distribution of multidimensional tangent planes is defined. The Hamiltonian vector field must be modified so that this distribution becomes invariant under the phase flow of the modified dynamical system. This problem can be solved in various ways. The simplest approach is to project the Hamiltonian vector field onto the planes of the distribution by using the symplectic structure, the closed non-degenerate 2-form on the symplectic manifold (which defines the symplectic geometry on tangent planes to the phase space). If the distribution in question is integrable, then this approach leads to the generalized Hamiltonian dynamics developed by Dirac (and other authors) to quantize systems whose Lagrangian is degenerate in velocities. In application to the mechanics of Lagrange's systems with non-integrable constraints, this approach yields classical non-holonomic systems. Another approach is based on the definition of the motion of Hamiltonian systems as an extremal of a variational problem with constraints. Then in the case of non-holonomic constraints we obtain dynamical systems of a quite different type. In application to Lagrange's systems with non-integrable constraints this approach yields the equations of motion of so-called vaconomic dynamics. As an example we consider geometric optics, which is based on Fermat's variational principle with a Lagrangian that is homogeneous of degree one in the velocities.
Bibliography: 34 titles.
This work was performed at the Steklov International Mathematical Center and supported by the Ministry of Science and Higher Education of the Russian Federation (agreement no. 075-15-2022-265).
When Hamilton developed his methods, of course he had no idea that they would acquire this importance. All he could have had as a guiding beacon in carrying out his work was the mathematical beauty of his equations. It shows his great genius that he was able to appreciate that his work was valuable in spite of its not having practical importance at the time.
P. Dirac, “Hamiltonian methods and quantum mechanics”
0. Introduction
0.1.
The dynamics of conservative systems is usually presented in the framework of either the Lagrangian or the Hamiltonian formalism. Lagrange’s and Hamilton’s equations are related by means of the Legendre transform, which has the property of duality. In mechanics the phase space of a Lagrangian system is the total space of the tangent bundle of the configuration space $M$, while a Hamiltonian system is defined on the total space of the cotangent bundle of $M$. However, the nature of Hamiltonian systems is more general: it is natural to consider such systems on arbitrary symplectic manifolds (not necessarily equal to $T^*M$).
In [1] and [2] Dirac developed a Hamiltonian approach in the case when the Lagrangian $L(\dot{q},q)$ is degenerate in velocities. This range of problems was also treated in [3]. An important example is the problem of a relativistic particle in a given electromagnetic field in four-dimensional space-time. Here the ‘kinetic’ energy of the particle is a homogeneous function of degree one in the generalized velocities. Problems in geometric optics also belong here. The importance of problems with degenerate Lagrangians is mainly connected with the question of quantization, which is based on the Hamiltonian form of the equations of motion.
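Dirac proposed to use the Legendre transform and introduce momenta by the standard rule:
$$
\begin{equation}
p_i=\frac{\partial L}{\partial\dot{q}_i}\,,\qquad 1 \leqslant i \leqslant n;
\end{equation}
\tag{0.1}
$$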
here $n$ is the number of degrees of freedom. If the Lagrangian is degenerate in the velocities (that is, $\det\|\partial^2 L/\partial\dot{q}^2\|=0$), then from (0.1) we can deduce several relations
$$
\begin{equation}
\varphi_k(p,q)=0,\qquad 1 \leqslant k \leqslant m.
\end{equation}
\tag{0.2}
$$
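In [3] these were called primary constraints. The Hamiltonian is introduced in the standard way:
$$
\begin{equation}
H=\sum_{i=1}^n p_i\dot{q}_i-L(\dot{q},q).
\end{equation}
\tag{0.3}
$$
Formally, $H$ depends on $p$, $\dot{q}$, and $q$. However, its partial derivative with respect to $\dot{q}_i$ equals $p_i-\partial L/\partial\dot{q}_i$, which is zero by (0.1). Hence we can treat $H$ as a function of $p$ and $q$ alone (in a certain ‘weak’ sense of Dirac). More precisely, $H$ is a function on some $2n$-submanifold of the $3n$-dimensional space of the variables $q$, $\dot{q}$, and $p$, which is defined by the $n$ independent equations (0.1) and (0.2).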
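After that Dirac ‘deduced’ the generalized Hamiltonian equations
$$
\begin{equation}
\dot{q}_i=\frac{\partial H}{\partial p_i}+\sum_{k=1}^m \lambda_k\frac{\partial\varphi_k}{\partial p_i}\,,\qquad \dot{p}_i=-\frac{\partial H}{\partial q_i}-\sum_{k=1}^m \lambda_k\frac{\partial\varphi_k}{\partial q_i}\,,\qquad 1 \leqslant i \leqslant n.
\end{equation}
\tag{0.4}
$$
Here $\lambda_1,\dots,\lambda_m$ are multipliers yet to be determined. To (0.4) we must add the $m$ constraint equations (0.2). In addition, the ‘consistency conditions’ must hold for the constraints: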
$$
\begin{equation}
\dot{\varphi}_i=\{\varphi_i,H\}+\sum \lambda_k\{\varphi_i,\varphi_k\}=0,\qquad 1 \leqslant i \leqslant m,
\end{equation}
\tag{0.5}
$$
when all the $\varphi_k$ vanish. Here $\{\,\cdot\,{,}\,\cdot\,\}$ is the standard Poisson bracket.
If the matrix of Poisson brackets $\|\{\varphi_i,\varphi_j\}\|$ is non-singular, then equations (0.5) determine the $\lambda_k$ uniquely as functions of $p$ and $q$. When this matrix is singular, secondary constraints arise in the form of algebraic conditions for the solvability of (0.5) with respect to $\lambda_1,\dots,\lambda_m$. These secondary constraints are added to the primary ones, and the consistency conditions are written out again. Eventually, we either arrive at a contradiction (for instance, when Lagrange’s equations with degenerate Lagrangian have no solutions at all), or the linear system of equations of the form (0.5) is consistent. In the latter case it is possible that the multipliers $\lambda$ can be found in more than one way.
The derivation of (0.4) is based on so-called strong and weak relations introduced by Dirac. Commentators on [1] and [2] noted that Dirac’s rule of transition from Lagrange’s degenerate equations to equations (0.4) should be substantiated more thoroughly (in particular, one should show that these rules are compatible). However, “… as usual, for some mysterious reasons questionable situations just do not occur in his arguments and constructions” [4].
0.2.
It is fair to say that the first author to consider Hamiltonian aspects of Lagrangian dynamical systems with degenerate Lagrangian was Hamilton himself. Initially, the object of his investigations was geometric optics, and only later did he turn to the dynamics of mechanical systems in a potential force field. The optical properties of a medium are determined by the refraction coefficient $n$, which is the reciprocal of the speed of light. In general, this coefficient is a function of the point $x$ and the direction of the velocity of particles of light $v=\dot{x}$:
The integral (0.6) is often called the Fermat action. By Fermat’s principle light propagates along a path that takes the least time. In other words, the path $t \mapsto x(t)$ is a stationary point of the functional (0.6) with fixed endpoints, so that it satisfies Lagrange’s equation with Lagrangian $L$.
Since $L$ is homogeneous in the velocity, we have (by Euler’s theorem)
Thus, we cannot use the classical Legendre transform in this case.
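The degeneracy can be verified directly. The following sketch assumes only that $L(x,v)$ is twice differentiable and positively homogeneous of degree one in $v$ for $v \ne 0$; the index notation $v_1,v_2,v_3$ is ours. Differentiating Euler’s identity with respect to $v_i$ gives
$$
\begin{equation*}
\sum_{j=1}^3 v_j\frac{\partial L}{\partial v_j}=L \quad\Longrightarrow\quad \sum_{j=1}^3 \frac{\partial^2 L}{\partial v_i\,\partial v_j}\,v_j=0,\qquad 1 \leqslant i \leqslant 3.
\end{equation*}
\notag
$$
Hence the Hessian matrix $\|\partial^2 L/\partial v^2\|$ annihilates the non-zero vector $v$, so its determinant vanishes and the relation $y=\partial L/\partial v$ cannot be solved for $v$.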
Hamilton coped with this difficulty as follows. Consider the closed set of points $v \in \mathbb{R}^3$ such that $L(x,v)=1$. This surface is called the indicatrix at the point $x$. Next we introduce the figuratrix, the set of momenta $y \in \mathbb{R}^3$ defined by the relations
When the indicatrix is a convex surface, the figuratrix is too. In this important case there exists a unique function $H(x,y)$ that is positive homogeneous in $y$ ($H(x,\lambda y)=\lambda H(x,y)$ for $\lambda>0$) and equal to 1 for all momenta lying on the figuratrix. The transformation
takes the figuratrix to the indicatrix. Thus, the functions $L$ and $H$ are dual (as also are the indicatrix and figuratrix). Hamilton showed that a path $t \mapsto x(t)$ is a light ray if and only if there exists a ‘conjugate’ function $t \mapsto y(t)$ such that, in combination, they satisfy the canonical equations
$$
\begin{equation}
\dot{x}=\frac{\partial H}{\partial y}\,,\qquad \dot{y}=-\frac{\partial H}{\partial x}\,.
\end{equation}
\tag{0.7}
$$
This result can be found in Hamilton’s memoir submitted to the Royal Irish Academy in 1824 (for a brief presentation, see [5], Chap. I, § 4).
How does all this look from the point of view of Dirac’s Hamiltonian formalism? It is easy to see that if the indicatrix is convex, then the equation $y=\partial L/\partial v$ produces the unique non-trivial primary constraint
Since the Lagrangian $L$ is a homogeneous function of degree one in the velocities, the Hamiltonian (0.3) vanishes identically. Hence Dirac’s equations (0.4) assume the form
It is obvious that the consistency conditions (0.5) hold for all values of $\lambda$. Equations (0.8) differ from the Hamiltonian equations (0.7) only by the factor $\lambda$, which can be an arbitrary positive function of time. (The fact that it is positive follows from the simple observation that particles of light move along rays with non-zero velocity and preserve the direction of their motion.) Thus, from the standpoint of the problem of finding light rays, equations (0.7) and (0.8) are equivalent.
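For orientation, here is how these objects look in the simplest case. This is our own illustration, not taken from the text: we assume an isotropic medium, so that the Lagrangian has the form $L(x,v)=n(x)|v|$ with a refraction coefficient $n(x)>0$ independent of the direction. Then
$$
\begin{equation*}
\begin{aligned}
&\text{indicatrix:} && |v|=\frac{1}{n(x)}\,, \\
&\text{momentum:} && y=\frac{\partial L}{\partial v}=n(x)\frac{v}{|v|}\,, \\
&\text{figuratrix:} && |y|=n(x), \\
&\text{dual function:} && H(x,y)=\frac{|y|}{n(x)}\,.
\end{aligned}
\end{equation*}
\notag
$$
Under this assumption the momentum satisfies $|y|=n(x)$ for every $v \ne 0$, that is, $H(x,y)=1$, which is a single relation between the momenta, while $(y,v)-L=n|v|-n|v|=0$. Moreover, $\partial H/\partial y=y/(n|y|)$ has length $1/n$, so it lies on the indicatrix.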
Hamilton’s method was refined in the calculus of variations (for instance, see [6]). It is based on the transformation of poles and polars with respect to the unit sphere, which is well known in geometry. By the way, Hamilton’s results on geometric optics were not mentioned in [1]–[3] (nor in other papers on Dirac’s dynamics: for instance, see [7]–[10]).
0.3.
As an example, we also consider another extreme case, when the Lagrangian is linear in the velocities:
Birkhoff called a dynamical system of this form a gyroscopic particle ([11], Chap. I, § 9). For example, a charged particle of negligible mass in a static magnetic field is of this type.
Assume that the skew-symmetric matrix $\|a_{ij}\|$ has a determinant distinct from zero (the general case was discussed in [12] and [9]). Then (0.10) defines a usual dynamical system in the configuration space:
$$
\begin{equation*}
\dot{q}_i=\sum a^{ij}\frac{\partial u}{\partial q_j}\,,\qquad 1 \leqslant i \leqslant n.
\end{equation*}
\notag
$$
Here $\|a^{ij}\|$ is the inverse matrix of $\|a_{ij}\|$.
In this case $n$ must be even and system (0.10) is Hamiltonian, although it is not expressed in terms of canonical variables. The symplectic structure on $M=\{q\}$ is defined by the closed non-degenerate 2-form
and $-u(q)$ is the Hamiltonian (in fact, the choice of the sign before the Hamiltonian is of no importance). In particular, the function $u\colon M \to \mathbb{R}$ is a first integral of system (0.10).
We apply Dirac’s Hamiltonian formalism to the degenerate Lagrangian (0.9). The relations for momenta
Since we assume that the matrix $\|a_{ij}\|$ is non-singular, from (0.14) we can find the multipliers $\lambda_1,\dots,\lambda_n$ in a unique way. Substituting these into the first equation in (0.13) yields Lagrange’s equations (0.10). The second equation in (0.13) produces nothing new in comparison to (0.11).
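A worked instance with $n=2$ may make this computation concrete. It is our own illustration, assuming the particular Lagrangian $L=\frac12(q_1\dot{q}_2-q_2\dot{q}_1)-u(q)$ and Dirac’s equations in the form $\dot{q}_i=\partial H/\partial p_i+\sum\lambda_k\,\partial\varphi_k/\partial p_i$, $\dot{p}_i=-\partial H/\partial q_i-\sum\lambda_k\,\partial\varphi_k/\partial q_i$:
$$
\begin{equation*}
\begin{aligned}
&\varphi_1=p_1+\tfrac12 q_2=0,\qquad \varphi_2=p_2-\tfrac12 q_1=0,\qquad H=u(q), \\
&\dot{q}_1=\lambda_1,\qquad \dot{q}_2=\lambda_2, \\
&\dot{\varphi}_1=-\frac{\partial u}{\partial q_1}+\lambda_2=0,\qquad \dot{\varphi}_2=-\frac{\partial u}{\partial q_2}-\lambda_1=0.
\end{aligned}
\end{equation*}
\notag
$$
Hence $\dot{q}_1=-\partial u/\partial q_2$ and $\dot{q}_2=\partial u/\partial q_1$, which agrees with Lagrange’s equations written out directly for this $L$; in particular, $\dot{u}=0$ along solutions.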
We see that Dirac’s formalism results in the same Hamiltonian system (0.10) with $n/2$ degrees of freedom. In [13] and [10] Lagrange’s system with Lagrangian (0.9) was considered from another point of view. It was proposed there that equations (0.10) (not involving second derivatives) could be treated as constraints imposed on a Lagrangian system. After that the well-known Hamiltonian formalism was used in the Lagrange variational problem with constraints. As a result, the equations assumed the form of canonical Hamiltonian equations in a $2n$-dimensional phase space, where the equations for the $q$-coordinates can be separated and, of course, are just (0.10). This approach was pursued in a number of papers (for instance, see [8] and [10]), where attempts to combine it with Dirac’s approach were made. In our opinion this only leads to unnecessary complications. Furthermore, the invariant meaning of such constructions is not quite clear.
0.4.
An effective method for substantiating Dirac’s generalized Hamiltonian dynamics was indicated in [14], Chap. I, § 6; it is related to considerations of mechanical systems with small masses. More precisely, we mean systems with $n+1$ degrees of freedom whose dynamics is described by Lagrange’s equations with Lagrangian
Solutions of Lagrange’s equations with Lagrangian (0.15) can be represented by series in powers of $\varepsilon$, where for $\varepsilon=0$ we obtain solutions of the relevant Dirac equations. These series can be divergent but (as shown in [15]) they are certainly asymptotic.
The theory of gyroscopic systems (for instance, see [16]–[18]) is linked with this trick. Consider the dynamics of a mechanical system with Lagrangian
The matrix $A$ is assumed to be positive definite for all $q \in M$; it determines the inertial properties of the system. The parameter $N$ is assumed to be large. We go over to the new time $\tau=t/N$ and denote differentiation with respect to $\tau$ by primes. Then Lagrange’s equations can be put into the form
$T=(Aq',q')/2$ is the kinetic energy of the system, and $\varepsilon=1/N^2$ is a small parameter. For $\varepsilon=0$ we obtain equations (0.10), which are Hamiltonian when the skew-symmetric matrix $\Lambda$ is non-singular. Equations (0.10) are called precession equations. An analysis of the transition from the full equations to simplified ones as $\varepsilon\to 0$ is an important problem in the theory of gyroscopic systems [16]–[18]. As mentioned in [16], for $\det\Lambda=0$ the precession equations are not necessarily solvable, or they can have infinitely many solutions (as in Dirac’s theory). This question was considered in [19], where the power expansions of solutions of the full system (0.16) in $\varepsilon$ were also investigated. The transition, as $\varepsilon\to 0$, from the full equations (0.16) to the precession equations was considered in [9] from the standpoint of Dirac’s Hamiltonian formalism. For applications of Dirac’s dynamics to concrete problems with small generalized velocities in mechanics the reader can consult [20] and [21].
0.5.
So far we have discussed just one (but not the only) important aspect of Dirac’s Hamiltonian formalism. It is related to an algorithm representing Lagrange’s equations with degenerate Lagrangian in a certain generalized Hamiltonian form. The core question in this range of problems is as follows: is it true that all solutions of Lagrange’s equations can be obtained by use of the Dirac representation described above? This is perhaps true, but to the best of our knowledge no full rigorous proof is available. Lagrange’s equations with degenerate Lagrangian need not have solutions at all. Here is a trivial example:
Of course, a solution of the equivalence problem must also cover such cases.
Equations (0.4) can also be treated otherwise, as Hamiltonian equations with constraints (with no link to Lagrangian mechanics whatsoever). In [1] Dirac also discussed this problem. The starting point here is the Hamiltonian and the constraints (0.2) imposed in the phase space.
However, this setup changes the situation drastically. In the initial approach (when we develop the Hamiltonian aspect of Lagrangian mechanics with degenerate Lagrangian) we have a definition of motion: this is a solution of Lagrange’s equations. In the framework of the second approach (when, given a Hamiltonian system, we impose constraints on it which are not invariant under its phase flow) we must first define a motion, and then derive differential equations governing this motion.
This is just the approach to the development of Dirac’s generalized Hamiltonian mechanics that was presented in [14], Chap. I, § 5. Dirac himself did not stress the need for a clear definition of a motion of a Hamiltonian system with constraints. Perhaps the reason lies in an implicit assumption that the Hamiltonian systems under consideration have been deduced from Lagrange’s equations in some natural way. On the other hand, the modern view (going back to È. Cartan [22]) is that the two aspects of mechanics, the Hamiltonian and Lagrangian ones, are clearly delineated. Hamiltonian systems ‘live’ on symplectic manifolds, which are not necessarily the total spaces of the cotangent bundles of the configuration spaces for mechanical systems.
An analogy with the theory of Lagrange’s systems with constraints is appropriate here. Let $L(\dot{q},q)$ be a Lagrangian and $(a(q),\dot{q})=0$ be a constraint. In non-holonomic mechanics it is postulated that motions $t \mapsto q(t)$ are solutions of the equations
Here $\lambda$ is a coefficient found as a function of the state ($\dot{q}$, $q$) of the system in the case when the Lagrangian is non-degenerate in velocities. In the vaconomic model of motion the equations of motion look different:
in the class of curves with fixed endpoints that satisfy the constraint $(a,\dot{q})=0$. If the constraint is integrable, then these two models are equivalent: see the discussion in [14], Chap. I.
0.6.
In this paper we put forward a generalization of Dirac’s Hamiltonian dynamics, when constraints linear in the velocities of canonical variables are imposed on a Hamiltonian system. When these constraints describe an integrable distribution on the phase space, we obtain Dirac’s dynamics.
First (in § 1) we present our ideas in the simple situation when differential constraints are imposed on a dynamical system on a Riemannian space. Next we consider a Hamiltonian system with constraints described by some distribution of tangent planes (§ 2). Taking these constraints into account is based on projecting the vector field onto the tangent planes of this distribution. We perform this projection using the symplectic geometry of tangent planes which is induced by the symplectic geometry of the phase space.
In § 3 we discuss the case when the distribution of tangent planes (describing the constraint) is integrable. Then the general construction in § 2 leads to Dirac’s Hamiltonian dynamics. In § 4 we discuss another approach to the dynamics of Hamiltonian systems with constraints. In contrast to § 2, by motions of such systems we mean extremals of a variational problem with prescribed constraints, when the Poincaré–Helmholtz action is taken as the functional. Again, in the case of integrable constraints we obtain Dirac’s Hamiltonian dynamics.
Special cases of the constructions in §§ 2 and 4 are the non-holonomic and vaconomic models of motion of mechanical systems, which we mentioned above. In § 5 we consider some other examples, in particular, ones from geometric optics. In § 6 J. Bernoulli’s optico-mechanical analogy is extended to strongly anisotropic optical media. In § 7 we discuss a comparison between Dirac’s generalized Hamiltonian dynamics and the principles of construction of Lagrangian systems with constraints known in analytical mechanics.
1. Dynamical systems with constraints on a Riemannian space
1.1.
Let $S^m=\{x_1,\dots,x_m\}$ be an $m$-dimensional Riemannian manifold with metric
where the coefficients $a_{ij}$ are smooth functions and the matrix $A(x)=\|a_{ij}(x)\|$ is positive definite for all $x \in S^m$. The value of the quadratic form (1.1) at a pair of vector fields $u_1(x)$, $u_2(x)$ on $S$ can be expressed as
where $(\,\cdot\,{,}\,\cdot\,)$ is the canonical pairing (the value of a covector at a vector at a point $x \in S$). In fact, $Au_1$ is indeed a covector (a vector in the cotangent space $T_x^* S$).
Let $v$ be a smooth vector field on $S$. It gives rise to the autonomous system of differential equations
$$
\begin{equation}
\dot{x}=v(x),\qquad x \in S
\end{equation}
\tag{1.2}
$$
(as usual, dots denote derivatives with respect to the independent variable, which we denote by $t$). Also consider a few constraints linear in the velocities:
Here $a_1,\dots,a_k$ ($k<m$) are covector fields on $S$ which are linearly independent at each point. The system of equations (1.3) defines a smooth $(m-k)$-dimensional distribution $\Pi$ on $S$. It is non-integrable in the general case.
Generally speaking, the distribution $\Pi$ is not invariant under the phase flow of the dynamical system (1.2). How can we achieve invariance? To do this we must replace $v$ by a vector field $w$ such that
for all $x \in S$. This can be done in various ways.
This type of problem is common in control theory for dynamical systems, in particular, in the description of so-called sliding modes (for instance, see [23] and [24]). In fact, authors mostly consider integrable constraints, when
Here $F_1,\dots,F_k\colon S \to \mathbb{R}$ are some independent smooth functions; in particular, $a_j=\partial F_j/\partial x$. We assume that the field $v(x)$ is defined away from the ‘singular’ set
on which it has discontinuities. We assume that phase trajectories of $v$ reach $N$ from ‘different’ sides. For a full description of the dynamics, $v$ must also be defined at the points in $N$. Trajectories of this ‘extended’ system that lie on $N$ correspond just to sliding modes.
We return to the definition of the vector field $w$ in question. One of the most natural ways to define it is to project the original field $v$ onto the tangent $(m-k)$-planes (1.3). However, the result depends on the choice of a metric. In our case $S$ is already endowed with a Riemannian metric, which we can use.
Remark. In the theory of sliding modes authors usually consider the case $k=1$. Then for a natural definition of the field $v$ on the hypersurface $N$ (for instance, Filippov’s definition) we do not need a metric in the ambient space at all. This is a characteristic feature of sliding modes [23], [24].
Now, using the obvious equalities $(a_j,w)=0$ and multiplying (1.5) by $a_1,\dots,a_k$ we obtain a linear system of algebraic equations for the multipliers:
$$
\begin{equation}
(a_s,v)+\sum_{j=1}^k \lambda_j (A^{-1}a_j,a_s)=0,\qquad 1 \leqslant s \leqslant k.
\end{equation}
\tag{1.6}
$$
Since $A^{-1}$ is a positive definite matrix and $a_1,\dots,a_k$ are linearly independent covectors, the Gram matrix
has a positive determinant. Hence we uniquely find $\lambda_1,\dots,\lambda_k$ from (1.6) as smooth functions on $S$.
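The construction can be sketched numerically. The snippet below is a minimal illustration in Python (standard library only), assuming the projected field has the form $w=v+\sum_j\lambda_j A^{-1}a_j$ with the multipliers found from (1.6). The function names `solve` and `project` and the sample data are hypothetical, chosen only for this example.

```python
from fractions import Fraction


def dot(a, b):
    """Pairing of a covector with a vector, both given by components."""
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, x):
    """Product of a matrix (list of rows) with a vector."""
    return [dot(row, x) for row in matrix]


def solve(G, b):
    """Solve G * lam = b exactly by Gauss-Jordan elimination.

    Raises StopIteration if G is singular (no non-zero pivot is found).
    """
    n = len(b)
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(G, b)]
    for c in range(n):
        pivot_row = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot_row] = M[pivot_row], M[c]
        pivot = M[c][c]
        M[c] = [x / pivot for x in M[c]]
        for r in range(n):
            if r != c:
                factor = M[r][c]
                M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    return [M[r][n] for r in range(n)]


def project(A_inv, covectors, v):
    """Return (w, lam) with w = v + sum_j lam_j A^{-1} a_j and (a_s, w) = 0."""
    raised = [matvec(A_inv, a) for a in covectors]  # the vectors A^{-1} a_j
    # gram[s][j] = (A^{-1} a_j, a_s), the matrix of the linear system (1.6)
    gram = [[dot(r, a_s) for r in raised] for a_s in covectors]
    rhs = [-dot(a_s, v) for a_s in covectors]
    lam = solve(gram, rhs)
    w = [v_i + sum(l * r[i] for l, r in zip(lam, raised))
         for i, v_i in enumerate(v)]
    return w, lam


# Sample data (hypothetical): metric A = diag(1, 2, 4) on R^3, two constraints.
A_inv = [[Fraction(1), 0, 0], [0, Fraction(1, 2), 0], [0, 0, Fraction(1, 4)]]
covectors = [[1, 1, 1], [0, 1, -1]]
v = [1, 0, 0]

w, lam = project(A_inv, covectors, v)
print(lam, w)
```

For these data the multipliers are $\lambda=(-3/5,\,1/5)$ and the projected vector is $w=(2/5,\,-1/5,\,-1/5)$, which is annihilated by both covectors. Exact rational arithmetic is used so that the constraint check is not blurred by rounding.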
1.2.
There can be different interpretations of our choice of $w$. We point out an analogue of Gauss’s principle in the dynamics of mechanical systems with constraints, which is closely connected with the least squares method (see [25]). By a conceivable vector field $u$ we mean any vector field satisfying $u(x) \in \Pi_x$ for all $x$. We say that the vector field $v$ is unconstrained while $w$ is an actual field. According to Gauss, conceivable motions are maps $t \mapsto x(t)$ that satisfy the constraints (1.3) for all values of $t$. The unconstrained motion has velocity $v(x)$, and the actual motion (regarded as one of the conceivable ones) has velocity $w(x)$. The general Gaussian least constraint principle can be stated as follows.
Theorem 1. Among all conceivable motions, the actual one has the least deviation from the unconstrained motion.
We can measure the deviation (constraint in the sense of Gauss) of a conceivable motion from the unconstrained one in terms of the quantity
where the vectors $u-w$ and $w-v$ are orthogonal in the Riemannian metric on $S$. In fact, $u(x)-w(x) \in \Pi_x$, while $w(x)-v(x)$ is orthogonal to $\Pi_x$ by (1.5). Hence by Pythagoras’s theorem
As the matrix $A$ is positive definite, we have $Z(u,v)\geqslant Z(v,w)$ and the ‘constraint’ $Z(u,v)$ takes its minimum value for $u=w$, as required.
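The computation behind this argument can be written out in full. This is a sketch under two assumptions: that the deviation is measured by half the squared Riemannian norm, $Z(u,v)=\frac12\bigl(A(u-v),u-v\bigr)$, and that $w-v=\sum_j\lambda_jA^{-1}a_j$. Since $A$ is symmetric and $u-w \in \Pi_x$,
$$
\begin{equation*}
\begin{aligned}
\bigl(A(w-v),u-w\bigr)&=\sum_{j=1}^k\lambda_j(a_j,u-w)=0, \\
Z(u,v)&=\tfrac12\bigl(A(u-w),u-w\bigr)+\bigl(A(w-v),u-w\bigr)+\tfrac12\bigl(A(w-v),w-v\bigr) \\
&=Z(u,w)+Z(w,v)\geqslant Z(w,v),
\end{aligned}
\end{equation*}
\notag
$$
with equality only for $u=w$, because $Z(u,w)>0$ whenever $u \ne w$.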
1.3.
For a more transparent comparison of these observations with Dirac’s generalized Hamiltonian dynamics (which we discuss in § 3), consider so-called gradient dynamical systems in Euclidean space $S=\mathbb{R}^m$ with standard metric. A gradient system is defined by a vector field with components
where $\varphi\colon \mathbb{R}^m \to \mathbb{R}$ is a smooth function. We look at the special case when the constraints (1.3) are integrable: we are concerned with motion along the $(m-k)$-dimensional manifold (1.4) such that the vector field $w$ assumes the following form at its points:
$$
\begin{equation}
\dot{x}=\frac{\partial\varphi}{\partial x}+ \sum_{j=1}^k\lambda_j\frac{\partial F_j}{\partial x}\,,\qquad x \in N.
\end{equation}
\tag{1.7}
$$
Equations (1.6) for the undetermined coefficients $\lambda_1,\dots,\lambda_k$ are as follows:
These are the conditions for the ‘consistency’ of the constraints involving the modified vector field: $\dot{F}_s= 0$ (the derivative with respect to system (1.7)) at all points, where $F_1=\dots=F_k=0$.
We can also attempt to use this approach in the more general situation when the manifold $S$ is endowed with a pseudo-Riemannian metric: the matrix of coefficients $A(x)=\|a_{ij}(x)\|$ is non-singular but not positive definite. Let (in the simplest case) the distribution $\Pi$ be defined by the single equality $(a(x),\dot{x})=0$. Then the orthogonal projection produces the vector field
Let $(a,v) \ne 0$ (so that the original field $v$ is not a ‘conceivable’ one), and let the covector $a$ be isotropic (that is, $(A^{-1}a,a)=0$). Then relation (1.9) is contradictory. Thus, in the case of a pseudo-Euclidean metric it is not always possible to project $v$ orthogonally onto planes of the distribution defining the constraint.
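A minimal instance of this obstruction, given as our own illustration: take the pseudo-Euclidean plane with $A=\operatorname{diag}(1,-1)$, the covector $a=(1,1)$, and the field $v=(1,0)$, and assume, as in the Riemannian case, that the projection is sought in the form $w=v+\lambda A^{-1}a$. Then
$$
\begin{equation*}
(A^{-1}a,a)=1-1=0,\qquad (a,v)=1,\qquad (a,w)=(a,v)+\lambda(A^{-1}a,a)=1
\end{equation*}
\notag
$$
for every $\lambda$, so no choice of the multiplier places $w$ in the line $(a,\dot{x})=0$.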
2. Hamiltonian systems with constraints. Symplectic projection
2.1.
Now let $(M^{2n},\Omega^2)$ be a symplectic manifold, $M^{2n}=\{x\}$ be an even-dimensional smooth manifold, and $\Omega^2$ be a closed non-degenerate 2-form (a symplectic structure on $M^{2n}$). In what follows $M^{2n}$ is the phase space of a Hamiltonian system with $n$ degrees of freedom.
The value of the 2-form at tangent vector fields $u_1(x)$ and $u_2(x)$ has the form
where $J$ is a non-degenerate skew-symmetric $2n\times 2n$ matrix depending smoothly on the point $x \in M^{2n}$. By Darboux’s theorem, in certain special local canonical variables $p_1,\dots,p_n$, $q_1,\dots,q_n$ on $M^{2n}$ the matrix $J$ can be reduced to the form
on $M^{2n}$. The covector fields $a_1,\dots,a_k$ are assumed to be linearly independent at each point $x \in M^{2n}$. Of course, the distribution (2.3) is not integrable in the general case. How can we now reconcile the Hamiltonian system (2.2) with the constraints (2.3), which, generally speaking, are not invariant under the phase flow of the Hamiltonian system?
We can proceed as in § 1, by projecting the Hamiltonian vector field $v$ onto the $(2n-k)$-dimensional tangent planes (2.3) and obtaining the dynamical system
$$
\begin{equation*}
\dot{x}=w(x),\qquad x \in M^{2n},
\end{equation*}
\notag
$$
whose flow on $M^{2n}$ now preserves the distribution $\Pi$. However, we do not have a natural structure of a Riemannian manifold on $M^{2n}$ at our disposal. On the other hand, we have a symplectic structure, which makes each $2n$-dimensional tangent space at points in $M^{2n}$ a symplectic space with its inherent symplectic geometry. This means that we can introduce the ‘orthogonal’ projections of vectors $v(x)$ onto the tangent planes $\Pi_x$ (as objects of the symplectic space $T_x M^{2n}$). As a result, we obtain
However, this is the end of the analogy with dynamical systems in a Riemannian space. In fact, the matrix $J^{-1}$ (just as $J$) is skew-symmetric, so that the $k\times k$ matrix
can turn out to be singular. Then, in particular, the vector $v(x)$ cannot be projected orthogonally onto the plane $\Pi_x$, or such a projection is not uniquely defined.
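Both outcomes are easy to observe numerically. The sketch below is our own illustration in Python (standard library only) for $n=2$ and $k=2$. It assumes coordinates $x=(p_1,p_2,q_1,q_2)$, one particular sign convention for the canonical matrix $J$, and a projection of the form $w=v+\sum_j\lambda_jJ^{-1}a_j$ with multipliers determined by $(a_s,v)+\sum_j\lambda_j(J^{-1}a_j,a_s)=0$, by analogy with § 1. The name `symplectic_project` is hypothetical.

```python
from fractions import Fraction

# Assumed canonical matrix on R^4 with x = (p1, p2, q1, q2).
# The sign convention is our choice for this illustration.
J = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]
J_INV = [[-x for x in row] for row in J]  # J^2 = -I, hence J^{-1} = -J


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, x):
    return [dot(row, x) for row in matrix]


def symplectic_project(a1, a2, v):
    """Return (w, (lam1, lam2)), or None if the 2x2 matrix is singular."""
    r1, r2 = matvec(J_INV, a1), matvec(J_INV, a2)  # J^{-1} a_1, J^{-1} a_2
    # g[s][j] = (J^{-1} a_j, a_s); skew-symmetric because J^{-1} is
    g = [[dot(r1, a1), dot(r2, a1)],
         [dot(r1, a2), dot(r2, a2)]]
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    if det == 0:
        return None
    b1, b2 = -dot(a1, v), -dot(a2, v)
    # Cramer's rule for the 2x2 linear system g * lam = b
    lam1 = Fraction(b1 * g[1][1] - g[0][1] * b2, det)
    lam2 = Fraction(g[0][0] * b2 - g[1][0] * b1, det)
    w = [v_i + lam1 * x + lam2 * y for v_i, x, y in zip(v, r1, r2)]
    return w, (lam1, lam2)


v = [1, 2, 3, 4]

# Constraints dp1 = 0 and dq1 = 0: the matrix is non-singular.
a1, a2 = [1, 0, 0, 0], [0, 0, 1, 0]
w, lam = symplectic_project(a1, a2, v)

# Constraints dq1 = 0 and dq2 = 0: the matrix vanishes identically.
c1, c2 = [0, 0, 1, 0], [0, 0, 0, 1]
singular = symplectic_project(c1, c2, v)

print(w, lam, singular)
```

In the first case the multipliers are $(-3,\,1)$ and $w=(0,2,0,4)$ satisfies both constraints. In the second case the function returns `None`, since every entry $(J^{-1}a_j,a_s)$ is zero. For a single constraint the $1\times 1$ matrix $(J^{-1}a,a)$ is always zero by skew-symmetry.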
Consider the tangent vectors
$$
\begin{equation*}
\alpha_s(x)=J^{-1}(x)a_s(x),\qquad 1 \leqslant s \leqslant k.
\end{equation*}
\notag
$$
They are linearly independent and orthogonal to the planes $\Pi_x$:
$$
\begin{equation}
(J\alpha_s,\dot{x})=(a_s,\dot{x})=0,\qquad 1 \leqslant s \leqslant k.
\end{equation}
\tag{2.6}
$$
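The matrix (2.5) is non-singular if and only if the $k$-dimensional plane $\Sigma_x \subset T_x M^{2n}$ spanned by the vectors $\alpha_1,\dots,\alpha_k$ is symplectic. In particular, $k$ must be even.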
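The parity claim follows from a standard determinant argument. A brief sketch, writing $G$ for the skew-symmetric $k\times k$ matrix (2.5) (the letter $G$ is our notation):
$$
\begin{equation*}
\det G=\det G^{T}=\det(-G)=(-1)^k\det G,
\end{equation*}
\notag
$$
so that $\det G=0$ whenever $k$ is odd.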
Thus, we can make the following conclusion.
Theorem 2. If the restriction of the symplectic structure to $\Sigma_x \subset T_x M^{2n}$ is a non-degenerate 2-form, then the vector field (2.4) is well defined and its phase flow preserves the distribution (2.3).
The assumptions of Theorem 2 mean that $\Sigma_x$ is a symplectic subspace of $T_x M^{2n}$.
Lemma. Let $\Sigma_x$ be a symplectic subspace of $T_x M^{2n}$. Then
Since $\Sigma_x$ is a symplectic subspace, the matrix (2.5) is non-singular, and it follows from this linear system that $\mu_1=\dots=\mu_k=0$.
Next we prove (b). Suppose that $\Pi_x$ is not a symplectic subspace. Then the restriction of $\Omega^2$ to $\Pi_x$ is a degenerate 2-form. Hence there exists a vector $\widehat{u} \in \Pi_x$ such that $\Omega^2(\widehat{u},u)=0$ for all $u \in \Pi_x$. By conclusion (a) each element of $T_x M^{2n}$ is a sum $\alpha+u$, where $\alpha \in \Sigma_x$. Since $\Omega^2(\widehat{u},\alpha)=0$ (by (2.6)), we have
However, then $\Omega^2$ is degenerate on $T_x M^{2n}$, which is impossible. $\Box$
It is obvious that we can interchange $\Sigma_x$ and $\Pi_x$ in the statement of the lemma.
Theorem 2'. If the restriction of the symplectic structure to the tangent planes in the distribution (2.3) is a non-degenerate 2-form, then the vector field (2.4) is well defined and its phase flow preserves the distribution (2.3).
This is an immediate consequence of the lemma and Theorem 2. Theorems 2 and 2' are dual.
2.2.
How does (2.4) look in the canonical variables $p$ and $q$? Let the constraints (2.3) have the form
$$
\begin{equation}
(b_j,dp)+(c_j,dq)=0,\qquad 1 \leqslant j\leqslant k.
\end{equation}
\tag{2.7}
$$
Here $b_j$ and $c_j$ are smooth ‘vector functions’ of $p$ and $q$. Taking (2.1) into account we obtain the differential equations
being non-singular is sufficient for an unambiguous consistent definition of the vector field $w$ allowed by the constraints. However, this condition is not necessary.
For example, the matrix (2.10) is certainly singular for $b_1=\dots=b_k=0$. We present an interesting example showing that we can find the required vector field $w$ for such a singular matrix (2.10).
2.3.
Let $Q^n=\{q\}$ be a smooth $n$-manifold (the configuration space of a mechanical system) and $M^{2n}=T^*Q$ be the total space of the cotangent bundle of $Q$. Let $p$ be an element of $T_q^*Q$. The manifold $M^{2n}$ is endowed with the natural symplectic structure
where $T=\bigl(K(q)p,p\bigr)/2$ is the kinetic energy and the smooth function $V\colon Q \to \mathbb{R}$ is the potential energy. For all $q \in Q$ the matrix $K(q)$ is positive definite.
Assume that the constraints imposed on the Hamiltonian system under consideration have the form
The covectors $c_1,\dots,c_k$ are assumed to be linearly independent everywhere on $Q$. Relations (2.11) are a special case of (2.7); here $b_1=\dots=b_k=0$ and $c_1,\dots,c_k$ depend only on the point in the configuration space $Q$.
The Hamiltonian equations with constraints (2.8) take the following form:
In mechanics such equations describe the dynamics of non-holonomic systems. It is well known that (as the matrix $K$ is positive definite) the multipliers $\lambda_1,\dots,\lambda_k$ are well defined as functions of the state, that is, of $q$ and $\dot{q}$ (for instance, see [25]).
Remark. Setting $b_1=\dots=b_k=0$ in (2.9) we obtain the relations
which can fail in general. However, in this case $\partial H/\partial p=\dot{q}$ (by (2.8)), so that (2.14) turns into the constraints (2.7), and there is no contradiction here.
Thus we obtain a non-trivial example of a Hamiltonian system with constraints (2.7) and singular matrix (2.10) in which the multipliers $\lambda_1,\dots,\lambda_k$ are nevertheless well defined as functions of the canonical variables $q$ and $p$ ( $=K^{-1}\dot{q}$). By the way, in the case of non-integrable constraints (2.11) the Hamiltonian equations with constraints (2.12) can no longer be reduced to Hamiltonian equations, not even for even $k$. They have different dynamical properties. In particular, as general equations of non-holonomic dynamics, equations (2.12) (or (2.13)) admit no integral invariants with smooth densities [26]. However, if the constraints (2.11) are integrable, then the Hamiltonian equations with constraints (2.12) are themselves Hamiltonian equations with $n-k$ degrees of freedom.
3. Dirac problem
3.1.
Now consider the special case when the constraints (2.7) are integrable. More precisely, fix a $(2n-k)$-dimensional smooth submanifold
The system of algebraic equations for the multipliers $\lambda_1,\dots,\lambda_k$ assumes the following form:
$$
\begin{equation}
\{H,F_s\}+\sum\lambda_j\{F_j,F_s\}=0,\qquad 1 \leqslant s \leqslant k.
\end{equation}
\tag{3.3}
$$
Here $\{\,\cdot\,{,}\,\cdot\,\}$ is the standard Poisson bracket. Therefore, a sufficient condition for the unique solvability of this system (and thus a condition for a well-defined projection of the original Hamiltonian vector field onto the $(2n-k)$-planes tangent to $N$) is that the skew-symmetric $k\times k$ matrix of Poisson brackets
Remark. Equations (3.1) and (3.3) are quite analogous to equations (1.7) and (1.8) for gradient dynamical systems with integrable constraints in Euclidean space.
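As an illustration of the solvability condition for system (3.3), consider the smallest non-trivial case $k=2$ with $\{F_1,F_2\} \ne 0$. The following worked computation is a sketch that uses only (3.3) and the skew-symmetry of the Poisson bracket; the symbol $C$ for the matrix $\|\{F_i,F_j\}\|$ is introduced here for convenience.
$$
\begin{equation*}
\begin{gathered}
s=1\colon\quad \{H,F_1\}+\lambda_2\{F_2,F_1\}=0 \quad\Longrightarrow\quad \lambda_2=\frac{\{H,F_1\}}{\{F_1,F_2\}},\\
s=2\colon\quad \{H,F_2\}+\lambda_1\{F_1,F_2\}=0 \quad\Longrightarrow\quad \lambda_1=-\frac{\{H,F_2\}}{\{F_1,F_2\}},\\
\det C=\det C^{T}=\det(-C)=(-1)^k\det C.
\end{gathered}
\end{equation*}
\notag
$$
The last line shows that for odd $k$ the skew-symmetric matrix $C$ is always singular, so the sufficient condition for unique solvability can only be satisfied for even $k$.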
Equations (3.1)–(3.2) (as well as the consistency conditions (3.3)) were originally derived by Dirac for quantization of systems with degenerate Lagrangians. An application of the Legendre transform to a degenerate Lagrangian produces several relations of the form (3.2), which are now usually called primary constraints. Among the infinite-dimensional systems with degenerate Lagrangian are, for example, Maxwell’s equations of the evolution of an electromagnetic field. (Footnote: another way to represent Maxwell’s equations, and other linear equations of mathematical physics with a quadratic invariant, can be found in [27].)
In contrast to Hamiltonian systems with differential constraints considered in § 2, the projection of Hamiltonian equations with finite constraints onto the corresponding tangent subspaces produces extremals of a certain variational problem. In his derivation of (3.1) Dirac relied on a variational problem (namely, Hamilton’s principle in Lagrangian mechanics), rather than on the simpler idea of symplectic projection. We reproduce Dirac’s idea in a more general and concise form; it
For brevity we denote a point ($p$, $q$) in $M^{2n}$ by $x$, the restriction of $H$ to $N$ by $\Phi$, and the restriction of the 2-form $\Omega^2$ to $N$ by $\omega^2$. Of course, $\omega^2$ is a closed form, but it can be degenerate (for example, if $\dim N$ is odd, that is, $k$ is odd).
We call a smooth path $x \colon \Delta \to M$, $x(t) \in N$ for all $t \in \Delta$, a motion of the Hamiltonian system with constraints $(M,\Omega^2,H,N)$ if
If $N$ coincides with $M$ (there are no constraints), then Dirac’s system coincides with a usual Hamiltonian system with $n$ degrees of freedom. More generally, if the 2-form $\omega^2$ is non-degenerate, then $(N,\omega^2)$ is a symplectic submanifold of $(M^{2n},\Omega^2)$ and motions of the Dirac system $(M,\Omega^2,H,N)$ coincide with solutions of the Hamiltonian system with Hamiltonian $\Phi$ on the symplectic manifold $(N,\omega^2)$. The condition that $\omega^2$ is non-degenerate is the same as the non-singularity of the skew-symmetric matrix (3.4). The Poisson bracket $\{\,\cdot\,{,}\,\cdot\,\}'$ on $N \subset M$ is defined by
where $\|c_{ij}\|$ is the inverse matrix of (3.4). It turns out that the restriction of $\{\Phi_1,\Phi_2\}'$ to $N$ depends only on the restrictions of $\Phi_1$ and $\Phi_2$ to $N$ (see [1] for details).
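The following sketch illustrates why the bracket on $N$ depends only on restrictions to $N$. It assumes the bracket in its standard form $\{\Phi_1,\Phi_2\}'=\{\Phi_1,\Phi_2\}-\sum_{i,j}\{\Phi_1,F_i\}c_{ij}\{F_j,\Phi_2\}$ and the sign convention $\{q_i,p_j\}=\delta_{ij}$; the concrete example ($M=\mathbb{R}^4$ with coordinates $q_1,q_2,p_1,p_2$ and constraints $F_1=q_2$, $F_2=p_2$) is hypothetical and chosen only for illustration.
$$
\begin{equation*}
\begin{gathered}
\{\Phi,F_s\}'=\{\Phi,F_s\}-\sum_{i,j}\{\Phi,F_i\}c_{ij}\{F_j,F_s\}=\{\Phi,F_s\}-\sum_{i}\{\Phi,F_i\}\delta_{is}=0,\\
\|\{F_i,F_j\}\|=\begin{pmatrix} 0 & 1\\ -1 & 0\end{pmatrix},\qquad
\|c_{ij}\|=\begin{pmatrix} 0 & -1\\ 1 & 0\end{pmatrix},\\
\{f,g\}'=\{f,g\}+\{f,q_2\}\{p_2,g\}-\{f,p_2\}\{q_2,g\}
=\frac{\partial f}{\partial q_1}\frac{\partial g}{\partial p_1}-\frac{\partial f}{\partial p_1}\frac{\partial g}{\partial q_1}.
\end{gathered}
\end{equation*}
\notag
$$
The first line shows that every constraint function has zero bracket $\{\,\cdot\,{,}\,\cdot\,\}'$ with any function, so adding a combination of the $F_s$ to $\Phi_1$ or $\Phi_2$ does not change the bracket on $N$. In the example the new bracket is just the canonical bracket in the remaining variables $q_1$, $p_1$.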
If some equations in (3.3) involve no multipliers $\lambda_1,\dots,\lambda_k$, then we obtain new constraints $\Psi_j=\{H,F_j\}=0$, which are called secondary constraints. In the general case secondary constraints are algebraic conditions for the solvability of systems of equations (3.3) with respect to the multipliers $\lambda_j$. The functions $\Psi_j$ must be added to the $\Phi_s$. If these functions make up an independent system, then we can examine conditions for consistency (of the primary and secondary constraints) again. As a result, we either arrive at a contradiction (and then the Dirac problem has no solutions) or system (3.4) turns out to be consistent for some choice of the multipliers $\lambda$. In the latter case $\lambda$ is not necessarily well defined, and then the initial conditions do not specify a unique solution of the Dirac problem.
3.3.
We have the following result.
Theorem 3. A smooth path $x\colon [t_1,t_2] \to N$ is a motion of the system with constraints $(M,\Omega^2,H,N)$ if and only if the action functional
and regard $\lambda_1,\dots,\lambda_k$ as additional generalized coordinates. Writing down Lagrange’s equations with Lagrangian $\mathcal{L}$ we obtain equations (3.1) and relations (3.2).
Theorem 3 reduces the Dirac problem to a variational Lagrange problem with degenerate Lagrangian $(p,\dot{q})-H$ and the integrable constraints given by (3.2). Thus, the equations in the Dirac problem are Hamiltonian equations projected onto the tangent planes to the constraint submanifold $N \subset M$. However, the role of Riemannian geometry is played here by the symplectic geometry of the phase space of the Hamiltonian system.
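The passage from the variational problem to the consistency conditions can be sketched as follows. The computation assumes that the extended Lagrangian has the form $\mathcal{L}=(p,\dot{q})-H-\sum_j\lambda_jF_j$ (the sign in front of the multipliers is a convention chosen here to agree with (3.3)) and that $\{f,g\}=\sum_i(\partial f/\partial q_i\,\partial g/\partial p_i-\partial f/\partial p_i\,\partial g/\partial q_i)$.
$$
\begin{equation*}
\begin{gathered}
\dot{q}=\frac{\partial H}{\partial p}+\sum_j\lambda_j\frac{\partial F_j}{\partial p},\qquad
\dot{p}=-\frac{\partial H}{\partial q}-\sum_j\lambda_j\frac{\partial F_j}{\partial q},\qquad
F_s=0,\\
0=\frac{dF_s}{dt}=\{F_s,H\}+\sum_j\lambda_j\{F_s,F_j\},\qquad 1\leqslant s\leqslant k.
\end{gathered}
\end{equation*}
\notag
$$
The first two groups of equations are Lagrange’s equations with respect to $p$ and $q$, and the third is Lagrange’s equation with respect to $\lambda_s$. By the skew-symmetry of the bracket, the second line is system (3.3) multiplied by $-1$.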
4. Hamiltonian systems with constraints. Variational approach
4.1.
As in § 2, we look at a Hamiltonian system with differential constraints
However, in contrast to § 2, by motions of such systems we mean extremals of an appropriate variational problem.
Again, let $(M^{2n},\Omega^2)$ be a symplectic manifold and $H \colon M \to \mathbb{R}$ be a Hamiltonian. The symplectic structure $\Omega^2$ is a non-degenerate closed 2-form. Locally, $\Omega^2=d\omega^1$, where $\omega^1=(a(x),\dot{x})$ is a smooth 1-form. In the phase space consider the action functional (in the sense of Poincaré and Helmholtz) of the form
on the space of curves $x\colon [t_1,t_2] \to M$ with fixed endpoints that satisfy relations (4.1).
We look for extremals in this variational problem. Dirac considered an important but special case when the differential constraints (4.1) are integrable. The above problem is a Lagrange variational problem with non-integrable constraints (4.1) and degenerate Lagrangian
These $2n$ equations must be considered in combination with the $k$ constraints (4.1). Equations (4.3) and (4.1) are used to find the $2n+k$ variables $x$ and $\lambda$ as functions of time.
Before we move further, we make a few observations.
A. We have introduced above a 1-form $\omega^1$ such that $d\omega^1=\Omega^2$. It is defined up to a closed 1-form (which is locally the differential of a smooth function). However, this closed form does not affect extremals of the functional (4.2) and does not participate in (4.3).
B. In Hamiltonian mechanics the skew-symmetric $2n\times 2n$ matrix
In the Dirac problem the situation is simpler: $\Lambda_1=\dots=\Lambda_k=0$, so the first equation in (4.4) does not contain $\lambda_1,\dots,\lambda_k$. Then we can set $\mu_s=\dot{\lambda}_s$ and regard $\mu_1,\dots,\mu_k$ as independent additional parameters.
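The structure of these equations can be sketched by writing out Lagrange’s equations explicitly. The computation below assumes that the extended Lagrangian is $\mathcal{L}=(a,\dot{x})-H-\sum_s\lambda_s(a_s,\dot{x})$; the sign of the multipliers and the symbol $\Gamma$ for the skew-symmetric matrix of the form $d\omega^1$ are conventions introduced here, so the result may differ from (4.4) by signs.
$$
\begin{equation*}
\begin{gathered}
\frac{d}{dt}\frac{\partial\mathcal{L}}{\partial\dot{x}}-\frac{\partial\mathcal{L}}{\partial x}
=\biggl(\Gamma-\sum_{s}\lambda_s\Lambda_s\biggr)\dot{x}-\sum_{s}\dot{\lambda}_s a_s+\frac{\partial H}{\partial x}=0,\qquad
(a_s,\dot{x})=0,\\
\Gamma=\frac{\partial a}{\partial x}-\biggl(\frac{\partial a}{\partial x}\biggr)^{T},\qquad
\Lambda_s=\frac{\partial a_s}{\partial x}-\biggl(\frac{\partial a_s}{\partial x}\biggr)^{T}.
\end{gathered}
\end{equation*}
\notag
$$
If $a_s=\partial F_s/\partial x$ for some function $F_s$, then $\partial a_s/\partial x$ is the symmetric matrix of second derivatives of $F_s$, so that $\Lambda_s=0$ and the multipliers enter the equations only through their derivatives $\dot{\lambda}_s$.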
Thus, a function $x \colon [t_1,t_2] \to M$ is a solution of the generalized Dirac problem if and only if there exists a function $\lambda \colon [t_1,t_2] \to \mathbb{R}^k$ such that, in combination, these two functions satisfy equations (4.4). These equations can be presented in an invariant form. To do this we set $\omega_s^1(\,\cdot\,)=(a_s,\,\cdot\,)$; these are differential 1-forms on $M^{2n}$. If $w$ is the required vector field on $M^{2n}$ ($w=\dot{x}$), then
Here (as usual) $i_w$ denotes the interior product of the vector field $w$ and a differential form. For $k=0$ we obtain the original Hamiltonian system
has a non-zero determinant, then the system of equations (4.4) has a unique solution $t \mapsto x(t)$, $t \mapsto \lambda(t)$ such that $x(0)=x^0$ and $\lambda(0)=\lambda^0$.
As usual, this solution certainly exists in a neighbourhood of $t=0$. We must stress that when the constraints are non-integrable, the motions $t \mapsto x(t)$ of the system for fixed $x(0)$ and different $\lambda(0)$ are distinct. If the constraints are integrable, then this is no longer the case.
Proof of Theorem 4. We represent (4.4) in the following equivalent form:
If the determinant of (4.5) is distinct from zero at the point $(x^0,\lambda^0) \in M^{2n}\times \mathbb{R}^k$, then (4.6) defines a dynamical system in a neighbourhood of this point. In particular, the values $x(0)=x^0$ and $\lambda(0)=\lambda^0$ determine uniquely the evolution of this system in the future, as required. $\Box$
Corollary. If the matrix (2.5) is non-singular for $x=x^0$, then the system (4.4) has a unique solution $x(t)$, $\lambda(t)$ satisfying the initial conditions $x(0)=x^0$ and $\lambda(0)=0$.
In fact, if the $k \times k$ matrix (2.5) is not singular, then for $\lambda_1=\dots=\lambda_k=0$ the determinant of (4.5) is distinct from zero.
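This determinant claim can be checked by a Schur complement computation. The sketch below assumes that for $\lambda=0$ the matrix (4.5) has the block form shown, where $\Gamma$ is the non-singular skew-symmetric $2n\times 2n$ matrix of the symplectic structure and $A$ is the $2n\times k$ matrix with columns $a_1,\dots,a_k$, and that (2.5) is, up to sign, the matrix with entries $(a_i,\Gamma^{-1}a_j)$. The block layout and the symbols $\Gamma$ and $A$ are notation introduced here.
$$
\begin{equation*}
\det\begin{pmatrix} \Gamma & -A\\ A^{T} & 0\end{pmatrix}
=\det\Gamma\cdot\det\bigl(0-A^{T}\Gamma^{-1}(-A)\bigr)
=\det\Gamma\cdot\det\bigl(A^{T}\Gamma^{-1}A\bigr).
\end{equation*}
\notag
$$
Under these assumptions the determinant on the left is distinct from zero precisely when the $k\times k$ matrix $A^{T}\Gamma^{-1}A$ is non-singular.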
Now we briefly discuss conditions for the solvability of (4.4) in the case when the matrix (4.5) is singular. The condition for the existence of velocities $\dot{x}$ and $\dot{\lambda}$ satisfying (4.6) reduces to the equality of the ranks of two matrices: the matrix (4.5) and the matrix
of size $(2n+k)\times(2n+k+1)$. Typically, the matrix (4.5) has a constant rank in some domain in the extended phase space $M^{2n} \times \mathbb{R}^k$. Let this rank be $r<2n+k$. Then all minors of the matrix (4.7) of order $r+1$ or higher must vanish. As a result, we obtain several independent relations
as a condition for the equality of the ranks of the matrices (4.5) and (4.7). By analogy with the Dirac dynamics they can be called secondary constraints. We must bear in mind that in Dirac’s theory secondary constraints do not depend on the multipliers $\lambda$.
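A toy example shows how the rank condition produces a constraint. The $2\times 2$ system below is hypothetical: $B$ plays the role of the singular matrix (4.5), the vector $(c_1,c_2)$ of smooth functions of $x$ and $\lambda$ plays the role of the right-hand side, and $(B\,|\,c)$ is the analogue of the augmented matrix (4.7).
$$
\begin{equation*}
\begin{gathered}
B=\begin{pmatrix} 1 & 0\\ 0 & 0\end{pmatrix},\qquad
(B\,|\,c)=\begin{pmatrix} 1 & 0 & c_1\\ 0 & 0 & c_2\end{pmatrix},\qquad r=\operatorname{rank}B=1,\\
\det\begin{pmatrix} 1 & 0\\ 0 & 0\end{pmatrix}=0,\qquad
\det\begin{pmatrix} 1 & c_1\\ 0 & c_2\end{pmatrix}=c_2,\qquad
\det\begin{pmatrix} 0 & c_1\\ 0 & c_2\end{pmatrix}=0.
\end{gathered}
\end{equation*}
\notag
$$
All minors of order $r+1=2$ vanish if and only if $c_2(x,\lambda)=0$. This single relation is the condition for solvability, and in general it involves the multipliers.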
Relations (4.8) are integrable constraints for the Lagrangian system with $n+k$ degrees of freedom and degenerate Lagrangian $\mathcal{L}(\dot{x},x,\lambda)$. The new system of equations has the form of Lagrange’s equations with additional multipliers $\nu_1,\dots,\nu_p$:
As a result, we arrive at equations that also have the form (4.4). They must also be examined for solvability with respect to $\dot{x}$, $\dot{\lambda}$, and $\nu$. Conditions for consistency can produce new constraints. This procedure is repeated as in Dirac’s dynamics.
4.3.
In conclusion, we present an example which is of interest from the standpoint of mechanics and in which the matrix (4.5) is singular. It is similar to the example in § 2, where a Hamiltonian field was projected onto the planes of a non-integrable distribution. However, now we start with a variational problem which is a generalization of the one considered by Dirac.
So assume that $M^{2n}=T^*Q^n$ is endowed with the standard symplectic structure (as in § 2), let
In combination with the constraints, these are the equations of motion of a mechanical system in so-called vaconomic mechanics [29]. They are different from the non-holonomic equations (2.13), although both models start from the same Lagrangian and the same constraints. The difference is in the principles underlying the definitions of motion. A comparison and a discussion of various models of motion of mechanical systems with constraints can be found, for instance, in [14].
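The difference between the two models can be seen in the simplest setting. The sketch below assumes a unit point mass in $\mathbb{R}^3=\{x\}$ with potential energy $V$ and a single constraint $(a(x),\dot{x})=0$; the vaconomic equation is obtained from the Lagrangian $(\dot{x},\dot{x})/2-V-\lambda(a,\dot{x})$, where the sign of $\lambda$ is a convention chosen here.
$$
\begin{equation*}
\begin{aligned}
\text{non-holonomic:}\quad &\ddot{x}=-\frac{\partial V}{\partial x}+\lambda a,\\
\text{vaconomic:}\quad &\ddot{x}=-\frac{\partial V}{\partial x}+\dot{\lambda}a+\lambda\,(\operatorname{curl}a)\times\dot{x}.
\end{aligned}
\end{equation*}
\notag
$$
In the vaconomic case $\lambda$ is an additional dynamical variable with its own initial value. If $a=\partial F/\partial x$ for some function $F$, then $\operatorname{curl}a=0$ and the second equation takes the form of the first one with multiplier $\dot{\lambda}$.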
In the absence of a potential (when $V \equiv 0$) the equations of vaconomic mechanics are the equations of geodesics in so-called sub-Riemannian geometry [30]–[33]. On the other hand, in sub-Riemannian geometry authors usually assume that the linear constraints (4.9) are completely non-integrable: we can attain any point on $Q^n$ from any other point by moving in accordance with the constraints (4.9). The condition for complete non-integrability is described by the celebrated Chow–Rashevskii theorem.
Hertz [34] already pointed out that the non-holonomic and vaconomic equations of geodesics in Riemannian geometry with non-integrable distributions are different. Nevertheless, some later texts discussing variational problems with constraints claim that ‘actual’ problems in mechanics (which are in fact described by non-holonomic equations) are solved in this way.
5. Some examples
In this section we apply Dirac’s approach to some problems in analytic mechanics and geometric optics.
5.1.
Consider the dynamics of a mechanical system with Lagrangian
where $L_j$ is a homogeneous form of degree $j$ in powers of $\dot{q}$ which depends smoothly on the $q$-coordinates. Lagrange’s equations with this Lagrangian have the energy integral
We set this constant equal to zero; this is always possible because $L_0(q)$ is defined up to an additive constant. By the least action principle trajectories of the motion $t \mapsto q(t)$ are geodesic curves of the Finsler metric
In mechanics $U$ corresponds to the power function. The matrix $A$ is usually positive definite, but this is not necessary. We assume that it is non-singular. In this case the functional (5.2) is defined not on all curves, but only on those lying in an appropriate ‘light cone’. This property is characteristic for the dynamics of relativistic particles.
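The reduction to a Finsler metric on the zero energy level can be sketched as follows. The computation assumes that $L_2 \geqslant 0$ and $L_0 \geqslant 0$ along the curves in question and that the metric has the usual Jacobi form $2\sqrt{L_0L_2}+L_1$; the explicit formula is therefore an assumption about the display that defines the metric.
$$
\begin{equation*}
\begin{gathered}
\biggl(\frac{\partial L}{\partial\dot{q}},\dot{q}\biggr)-L=2L_2+L_1-(L_2+L_1+L_0)=L_2-L_0,\\
L_2+L_1+L_0=\bigl(\sqrt{L_2}-\sqrt{L_0}\,\bigr)^2+2\sqrt{L_0L_2}+L_1\geqslant 2\sqrt{L_0L_2}+L_1.
\end{gathered}
\end{equation*}
\notag
$$
Equality holds exactly when $L_2=L_0$, that is, on the zero energy level. The right-hand side is homogeneous of degree one in $\dot{q}$, so its integral along a curve does not depend on the parametrization.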
where $\lambda$ is a non-trivial indefinite coefficient.
Thus, functions $t \mapsto q(t)$ satisfying equations (5.4), in combination with the ‘conjugate’ momenta $t \mapsto p(t)$, describe extremals of the Finsler metric (4.1). The function $F$ coincides with the Hamiltonian of the original Lagrange’s system with Lagrangian $L_2+L_1+L_0$ (as $L_2-L_0=0$, $F$ also vanishes on solutions of (5.4)). Since $H \equiv 0$, system (3.3) is always consistent, and there is no need to introduce secondary constraints.
Thus we have deduced the least action principle by using Dirac’s generalized dynamics. Equations (5.4) are transformed into usual Hamiltonian equations with respect to the new time
is the optical length of the path $x\colon [t_1,t_2] \to \mathbb{R}^3$ (the Fermat action), and $n(x)$ is the refraction coefficient of the isotropic optical medium. By Fermat’s principle stationary points of the functional (5.7) are light rays.
These are the equations of motion of a unit point mass in a potential field with power function $U$. Thus, rays of light in the optical medium $\mathbb{R}^3=\{x\}$ with refraction coefficient $n(x)$ coincide with trajectories of motion of a point mass in a potential field with power function $n^2/2$. This is the content of the optico-mechanical analogy established by J. Bernoulli as long ago as 1696. Usually, authors use other ways to derive Hamiltonian equations in geometric optics (see [5], Chap. I).
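The steps leading to this analogy can be sketched as follows, assuming that the Fermat Lagrangian is $L=n(x)|\dot{x}|$, that $y$ denotes the conjugate momentum, and that the primary constraint is normalized as $F=\bigl((y,y)-n^2\bigr)/2=0$. The multiplier $\lambda$ and the new time $\tau$ with $d\tau=\lambda\,dt$ are notation used only in this sketch.
$$
\begin{equation*}
\begin{gathered}
y=\frac{\partial L}{\partial\dot{x}}=n\frac{\dot{x}}{|\dot{x}|},\qquad (y,y)=n^2,\qquad H=(y,\dot{x})-L\equiv 0,\\
\dot{x}=\lambda\frac{\partial F}{\partial y}=\lambda y,\qquad
\dot{y}=-\lambda\frac{\partial F}{\partial x}=\lambda\frac{\partial U}{\partial x},\qquad U=\frac{n^2}{2},\\
\frac{d^2x}{d\tau^2}=\frac{\partial U}{\partial x}.
\end{gathered}
\end{equation*}
\notag
$$
The Hamiltonian vanishes identically because $L$ is homogeneous of degree one in $\dot{x}$, so the whole dynamics is carried by the constraint function $F$.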
6. Light rays in strongly anisotropic media
6.1.
In anisotropic optical media the refraction coefficient has the form
$$
\begin{equation*}
f\biggl(x,\frac{\dot{x}}{|\dot{x}|}\biggr),\qquad x \in \mathbb{R}^3.
\end{equation*}
\notag
$$
For an isotropic medium this function is independent of the direction of the motion of particles of light; it is usually denoted by $n(x)$. Light rays coincide with the trajectories of solutions of Lagrange’s equations with Lagrangian
Here $a(x)\ne 0$ is a smooth vector field in 3-dimensional Euclidean space $\mathbb{R}^3= \{x\}$ and $N$ is a parameter (which will tend to $+\infty$). For $N=0$ we obtain the Lagrangian (5.6) describing the propagation of light in an isotropic medium. In the limit as $N \to \infty$, the motion of particles of light satisfies the constraint
How can we verify this? In fact the refraction coefficient of an optical medium at a point $x$ in a direction $v/|v|$ is the reciprocal of the velocity $v$ of a particle of light at $x$. So $L(v,x)=1$ on ‘actual’ motions. This is the equation of a surface in $\mathbb{R}^3=\{v\}$ called the indicatrix at $x$. In an isotropic medium the indicatrix is the sphere of radius $1/n(x)$. For the Lagrangian (6.1) the indicatrix is an ellipsoid squeezed in the direction of the vector $a(x)$. So, in the limit as $N \to \infty$, the motion of particles of light satisfies the constraint (6.2).
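The shape of the indicatrix can be made explicit under an assumption on the form of the Lagrangian. The sketch below takes $L_N=\sqrt{n^2(\dot{x},\dot{x})+N(a,\dot{x})^2}$, which reduces to $n|\dot{x}|$ for $N=0$; this formula, and the decomposition $v=v_\parallel\alpha+v_\perp$ with $\alpha=a/|a|$ and $(v_\perp,\alpha)=0$, are assumptions made for illustration.
$$
\begin{equation*}
L_N^2(v,x)=n^2|v_\perp|^2+\bigl(n^2+N|a|^2\bigr)v_\parallel^2=1.
\end{equation*}
\notag
$$
Under this assumption the indicatrix is an ellipsoid of revolution with two semi-axes equal to $1/n$ and a third one, along $a$, equal to $1/\sqrt{n^2+N|a|^2}$. The third semi-axis tends to zero as $N \to \infty$, so the admissible velocities concentrate on the plane $(a,v)=0$.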
6.2.
We deduce the ‘limiting’ equations of light rays in two ways. The first is formal. We start from Fermat’s principle $\delta I=0$ and take the constraint (6.2) into account. To do this we use the method of Lagrange multipliers and then Dirac’s generalized Hamiltonian dynamics.
The second approach is more substantive. We represent the equations of light rays in an optical medium with Lagrangian $L_N$ as Hamilton’s equations, and then let the parameter $N$ tend to infinity. In both approaches light rays are described by the same Hamiltonian equations.
Thus, first (following Lagrange) we introduce the new Lagrangian
and treat $\lambda$ as a new variable (along with $x_1$, $x_2$, and $x_3$). The Lagrangian $\mathcal{L}$ is degenerate in the velocities $\dot{x}$ and $\dot{\lambda}$.
The coefficient $\mu$ is of no importance because the velocity of particles of light does not affect the shape of light rays we are interested in. Therefore, we can set $\mu=1$. Now, by (6.2) and (6.3)
We must remember that these equations are to be considered on the zero level of the Hamiltonian (for $F=0$).
6.3.
In the framework of the second approach we must transform Lagrange’s equations with Lagrangian $L_N$ to Hamiltonian ones and then let $N$ tend to infinity. Since $L_N$ is homogeneous of degree one in velocities, we cannot apply the Legendre transform directly. Usually, authors proceed differently, introducing the new Lagrangian $L_N^*=L^2_N/2$, which is homogeneous of degree 2 in $\dot{x}$. It is obvious that the Hessian matrix
Thus, for $N \to \infty$ the function $x(\,\cdot\,)$ describing a ray of light and the conjugate momentum $y(\,\cdot\,)$ satisfy the system of canonical Hamiltonian equations with Hamiltonian (6.8):
This theorem on limit transition, established in 1982 [29], is an example of the realization of the general idea of the method of penalty functions. Since $L_N^*=1/2$ on light rays, equations (6.9) must (in accordance with (6.7)) be considered at the level $H^*=1/2$.
Comparing the Hamiltonians (6.6) and (6.8) we conclude that the above two approaches lead indeed to the same result.
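The limit transition can be followed explicitly under an assumption on the Lagrangian. The sketch below takes $L_N^*=\bigl(n^2(\dot{x},\dot{x})+N(a,\dot{x})^2\bigr)/2$, so that the Hessian matrix is $B_N=n^2E+Naa^{T}$ with $E$ the identity matrix; this form, the symbol $B_N$, and the unit vector $\alpha=a/|a|$ are assumptions introduced for illustration.
$$
\begin{equation*}
\begin{gathered}
B_N^{-1}=\frac{1}{n^2}\biggl(E-\frac{Naa^{T}}{n^2+N|a|^2}\biggr),\\
H_N^*=\frac{1}{2}\bigl(B_N^{-1}y,y\bigr)=\frac{1}{2n^2}\biggl((y,y)-\frac{N(a,y)^2}{n^2+N|a|^2}\biggr)
\;\longrightarrow\;\frac{\bigl|y-(y,\alpha)\alpha\bigr|^2}{2n^2}\quad\text{as } N\to\infty.
\end{gathered}
\end{equation*}
\notag
$$
Under these assumptions the level $H^*=1/2$ of the limit Hamiltonian is the surface $|y-(y,\alpha)\alpha|^2/2-n^2/2=0$. Two Hamiltonians with a common level surface define the same trajectories on it up to a change of time.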
where the ‘kinetic’ energy $K$ is defined by (6.10) and the power function $U$ is $n^2/2$.
Consider the field $\alpha=a/\sqrt{(a,a)}$ of vectors with unit length. Then $y-(y,\alpha)\alpha=y_\alpha$ is the projection of the momentum onto the plane orthogonal to the vector $\alpha$. The second equation in (6.11) takes the form
This is the kinetic energy of a unit point mass moving in Euclidean space.
Theorem 5. In a strongly anisotropic optical medium light rays are trajectories of the motion of a vaconomic particle in a potential field with power function $n^2/2$.
The proof is based on the observation that the Hamiltonian equations (6.12) with kinetic energy (6.10) are in fact the vaconomic equations of motion, under the constraint $\bigl(\alpha(x),\dot{x}\bigr)=0$, of an unconstrained unit point mass in a potential field with power function $U=n^2/2$ (cf. [29]).
Theorem 5 extends Bernoulli’s optico-mechanical analogy to strongly anisotropic optical media. We stress that, after imposing the constraint, we obtain a vaconomic mechanical system, rather than a familiar non-holonomic system. The reason is that light rays are defined by Fermat’s variational principle, while classical non-holonomic equations admit no variational representation at all in the general case.
7. Final observations
7.1.
In fact, the main ideas of Dirac’s generalized Hamiltonian dynamics had been known in analytic mechanics much earlier. They were long used in other terms (and with less generality) in Lagrangian mechanics with integrable (holonomic) constraints. Recall the main principles (going back to Lagrange) of taking holonomic constraints into account.
Let $L(\dot{q},q)$ be a Lagrangian and $F(q)=0$ be an equation of integrable constraint. Assume that $dF \ne 0$ if $F=0$. The dynamics of this system is described by Lagrange’s equation with a multiplier
The term on the right $\lambda\,\partial F/\partial q$ is called the reaction of the constraint $\{F=0\}$ in mechanics. Can we find a coefficient $\lambda$ such that system (7.1) is consistent?
The standard way is as follows. Since $F\bigl(q(t)\bigr)=0$, it follows that
where $A$ is the matrix of second derivatives $\partial^2 L/\partial \dot{q}^2$, and the term $\Gamma$ is independent of the acceleration $\ddot{q}$. In mechanics $A$ is positive definite; in particular, it is non-singular. Then from (7.3) and (7.4) we obtain
We have assumed that $\partial F/\partial q \ne 0$ (in any case, at the points where $F(q)=0$). Hence from (7.5) we can uniquely find $\lambda$ as a function of the state ($\dot{q}$, $q$) of the system.
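A classical instance of this procedure is a unit point mass on the unit sphere. The sketch below assumes $L=(\dot{q},\dot{q})/2-V(q)$ and $F=\bigl((q,q)-1\bigr)/2$, so that $\partial F/\partial q=q$; this example is chosen only for illustration.
$$
\begin{equation*}
\begin{gathered}
\ddot{q}=-\frac{\partial V}{\partial q}+\lambda q,\qquad (q,\dot{q})=0,\qquad (q,\ddot{q})+(\dot{q},\dot{q})=0,\\
\lambda(q,q)=\biggl(q,\frac{\partial V}{\partial q}\biggr)-(\dot{q},\dot{q})
\quad\Longrightarrow\quad
\lambda=\biggl(q,\frac{\partial V}{\partial q}\biggr)-(\dot{q},\dot{q})\quad\text{for } (q,q)=1.
\end{gathered}
\end{equation*}
\notag
$$
For $V \equiv 0$ this gives $\ddot{q}=-(\dot{q},\dot{q})\,q$: the reaction of the constraint is the centripetal force that keeps the point on the sphere.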
The first equation (7.1) is a second-order differential equation with respect to $q$, and the function (7.2) is a first integral there. We are only interested in solutions of that equation on which the functions $F$ and $\Phi$ vanish.
7.2.
What does it all look like from the standpoint of Dirac’s generalized Hamiltonian formalism? If the matrix $A$ is non-singular (at least locally), then we can make the Legendre transform and introduce the canonical momenta and a Hamiltonian:
This is the same as equation (7.2), but presented in canonical variables.
Next (in accordance with Dirac), we must treat the relations $F=0$ and $\Phi=0$ as new ‘primary’ constraints, and in place of (7.6) we must write the new equations
are mutually inverse. If (as it occurs in mechanics) $A=\partial^2 L/\partial \dot{q}^2$ is positive definite, then so is $\partial^2 H/\partial p^2$. However, then it follows from (7.8) that $\{F,\Phi\} \ne 0$.
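The inequality $\{F,\Phi\} \ne 0$ can be verified directly. The sketch below assumes the sign convention $\{f,g\}=\sum_i(\partial f/\partial q_i\,\partial g/\partial p_i-\partial f/\partial p_i\,\partial g/\partial q_i)$ and uses the fact that $F$ depends only on $q$.
$$
\begin{equation*}
\begin{gathered}
\Phi=\{F,H\}=\biggl(\frac{\partial F}{\partial q},\frac{\partial H}{\partial p}\biggr),\qquad
\frac{\partial\Phi}{\partial p}=\frac{\partial^2H}{\partial p^2}\frac{\partial F}{\partial q},\\
\{F,\Phi\}=\biggl(\frac{\partial F}{\partial q},\frac{\partial\Phi}{\partial p}\biggr)
=\biggl(\frac{\partial^2H}{\partial p^2}\frac{\partial F}{\partial q},\frac{\partial F}{\partial q}\biggr)>0.
\end{gathered}
\end{equation*}
\notag
$$
The last inequality holds because $\partial^2H/\partial p^2$ is positive definite and $\partial F/\partial q \ne 0$ at the points where $F=0$.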
Since $\Phi=\{F,H\}=0$ (a secondary constraint), the first consistency condition in (7.7) yields $\mu=0$. It is easy to see that the second equation in (7.7) is just equation (7.5) in canonical variables.
7.3.
In § 2 we considered the example of a ‘natural’ Lagrangian system with non-integrable constraint
Lagrange’s equations can be presented in Hamiltonian variables. If we treat (7.9) as a distribution in the phase space of the Hamiltonian system, then the symplectic projection onto the tangent hyperplane (7.9) yields classical non-holonomic equations.
On the other hand, if $H(p,q)$ is a Hamiltonian, then $\dot{q}=\partial H/\partial p$ and equation (7.9) takes the form
This can be regarded as a primary constraint imposed on the system with Hamiltonian $H$. What equations in Lagrange’s representation does Dirac’s method produce? We could imagine that these would be the same non-holonomic equations, but this is not so.
Consider the simple example of the Hamiltonian system with Hamiltonian
To this equation we must add the constraint $(a(q),\dot{q})=0$. As a result, we obtain a vaconomic equation of motion of a particle in a potential field with non-integrable constraint.
Equation (7.11) is the equation of extremals in the Lagrange variational problem
It is clear why we obtain vaconomic equations in place of the more common non-holonomic ones: Dirac’s equations (3.1) have a variational nature, and this, of course, manifests itself in the Lagrangian representation for equations of motion with constraints.
Bibliography
1.
P. A. M. Dirac, “Generalized Hamiltonian dynamics”, Canad. J. Math., 2:2 (1950), 129–148
2.
P. A. M. Dirac, “Generalized Hamiltonian dynamics”, Proc. Roy. Soc. London Ser. A, 246 (1958), 326–332
3.
J. L. Anderson and P. G. Bergmann, “Constraints in covariant field theories”, Phys. Rev. (2), 83:5 (1951), 1018–1025
4.
B. V. Medvedev, “P. A. M. Dirac and the logical foundations of quantum theory. II”: P. A. M. Dirac, Collected works, v. III, Fizmatlit, Moscow, 2004, 649–650 (Russian)
5.
V. V. Kozlov, General theory of vortices, 2nd revised and augmented ed., Institute for Computer Studies, Moscow–Izhevsk, 2013, 324 pp.; English transl. of 1st ed. Dynamical systems X. General theory of vortices, Encyclopaedia Math. Sci., 67, Springer-Verlag, Berlin, 2003, viii+184 pp.
6.
L. C. Young, Lectures on the calculus of variations and optimal control theory, W. B. Saunders Co., Philadelphia–London–Toronto, ON, 1969, xi+331 pp.
7.
A. J. Hanson, T. Regge, and C. Teitelboim, Constrained Hamiltonian systems, Accad. Naz. Lincei, Rome, 1976, 135 pp.
8.
V. V. Nesterenko and A. M. Chervyakov, “Some properties of constraints in theories with degenerate Lagrangians”, Theoret. and Math. Phys., 64:1 (1985), 701–707
9.
L. Faddeev and R. Jackiw, “Hamiltonian reduction of unconstrained and constrained systems”, Phys. Rev. Lett., 60:17 (1988), 1692–1694
10.
B. M. Barbashov, “Hamiltonian formalism for Lagrangian systems with prescribed constraints”, Fiz. Elementarnykh Chastits i Atom. Yadra, 34:1 (2003), 5–42 (Russian)
11.
G. D. Birkhoff, Dynamical systems, Amer. Math. Soc. Colloq. Publ., 9, Amer. Math. Soc., New York, 1927, viii+295 pp.
12.
E. Newman and P. G. Bergmann, “Lagrangians linear in the “velocities””, Phys. Rev. (2), 99:2 (1955), 587–592
13.
F. A. Berezin, “Hamiltonian formalism in the general Lagrange problem”, Uspekhi Mat. Nauk, 29:3(177) (1974), 183–184 (Russian)
14.
V. I. Arnold, V. V. Kozlov, and A. I. Neĭshtadt, Mathematical aspects of classical and celestial mechanics, Encyclopaedia Math. Sci., 3, Dynamical systems. III, 3rd ed., Springer-Verlag, Berlin, 2006, xiv+518 pp.
15.
M. V. Deryabin, “The Dirac Hamiltonian formalism and the realization of constraints by small masses”, J. Appl. Math. Mech., 64:1 (2000), 35–39
16.
D. R. Merkin, Gyroscopic systems, 2nd ed., Nauka, Moscow, 1974, 344 pp. (Russian)
17.
V. V. Strygin and V. A. Sobolev, Separation of motions by the method of integral manifolds, Nauka, Moscow, 1988, 256 pp. (Russian)
18.
A. Yu. Ishlinskii, Mechanics of gyroscopic systems, 2nd ed., USSR Academy of Sciences publishing house, Moscow, 1963, 482 pp. (Russian)
19.
V. V. Kozlov, “The dynamics of systems with large gyroscopic forces and the realization of constraints”, J. Appl. Math. Mech., 78:3 (2014), 213–219
20.
A. V. Vlakhova, “Nonholonomic motions of gyroscopic and wheeled systems”, Moscow Univ. Mech. Bull., 68:5 (2013), 126–132
21.
A. V. Vlakhova, “The dynamics of systems with rolling and gyroscopic systems with small generalized velocities and the realization of constraints”, J. Appl. Math. Mech., 78:6 (2014), 568–579
22.
E. Cartan, Leçons sur les invariants intégraux, Hermann, Paris, 1922, x+210 pp.
23.
A. F. Filippov, “Differential equations with discontinuous right-hand side”, Amer. Math. Soc. Transl. Ser. 2, 42, Amer. Math. Soc., Providence, RI, 1964, 199–231
24.
V. I. Utkin, Sliding modes and their application in variable structure systems, Mir, Moscow, 1978, 257 pp.
25.
P. Appell, Traité de mécanique rationnelle, v. I, 5-e éd., Gauthier-Villars, Paris, 1926, 619 pp.; v. II, 6-e éd., 1953, 575 pp.
26.
V. V. Kozlov, “On the integration theory of equations of nonholonomic mechanics”, Regul. Chaotic Dyn., 7:2 (2002), 161–176
27.
V. V. Kozlov, “Quadratic conservation laws for equations of mathematical physics”, Russian Math. Surveys, 75:3 (2020), 445–494
28.
G. A. Bliss, Lectures on the calculus of variations, Univ. of Chicago Press, Chicago, IL, 1946, ix+296 pp.
29.
V. V. Kozlov, “Dynamics of systems with nonintegrable constraints. I”, Moscow Univ. Mech. Bull., 37:3-4 (1982), 27–34; II, 37:3-4 (1982), 74–80; III, 38:3 (1983), 40–51; IV. Integral principles, 42:5 (1987), 40–49; V. Freedom principle and ideal constraints condition, 43:6 (1988), 23–29
30.
R. S. Strichartz, “Sub-Riemannian geometry”, J. Differential Geom., 24:2 (1986), 221–263
31.
P. A. Griffiths, Exterior differential systems and the calculus of variations, Progr. Math., 25, Birkhäuser, Boston, MA, 1983, ix+335 pp.
32.
A. M. Vershik and V. Ya. Gershkovich, “Nonholonomic dynamical systems, geometry of distributions and variational problems”, Dynamical systems VII, Encyclopaedia Math. Sci., 16, Springer, Berlin, 1994, 1–81
33.
A. A. Agrachev, “Topics in sub-Riemannian geometry”, Russian Math. Surveys, 71:6 (2016), 989–1019
34.
H. Hertz, Die Principien der Mechanik in neuem Zusammenhange dargestellt, Gesammelte Werke, III, J. A. Barth, Leipzig, 1894, xxix+312 pp.
Citation:
V. V. Kozlov, “On Dirac's generalized Hamiltonian dynamics”, Russian Math. Surveys, 79:4 (2024), 649–681