V. V. Kozlov, “On Dirac's generalized Hamiltonian dynamics”, Russian Math. Surveys, 79:4 (2024), 649

Russian Mathematical Surveys, 2024, Volume 79, Issue 4, Pages 649–681
DOI: https://doi.org/10.4213/rm10183e (Mi rm10183)


On Dirac's generalized Hamiltonian dynamics

V. V. Kozlov

Steklov Mathematical Institute of Russian Academy of Sciences, Moscow


Abstract: Various aspects of Dirac's generalized Hamiltonian dynamics are considered. The starting point is a Hamiltonian system on a symplectic manifold on which, in addition, a distribution of multidimensional tangent planes is defined. The Hamiltonian vector field must be modified so that this distribution becomes invariant under the phase flow of the modified dynamical system. This problem can be solved in various ways. The simplest approach is to project the Hamiltonian vector field onto the planes of the distribution by using the symplectic structure, the closed nondegenerate 2-form on the symplectic manifold (which defines the symplectic geometry on tangent planes to the phase space). If the distribution in question is integrable, then this approach leads to the generalized Hamiltonian dynamics developed by Dirac (and other authors) to quantize systems whose Lagrangian is degenerate in the velocities. In application to the mechanics of Lagrangian systems with non-integrable constraints, this approach yields classical non-holonomic systems. Another approach is based on the definition of a motion of a Hamiltonian system as an extremal of a variational problem with constraints. Then in the case of non-holonomic constraints we obtain dynamical systems of a quite different type. In application to Lagrangian systems with non-integrable constraints this approach yields the equations of motion of so-called vaconomic dynamics. As an example we consider geometric optics, which is based on Fermat's variational principle with a Lagrangian that is homogeneous of degree one in the velocities.
Bibliography: 34 titles.

Keywords: Hamiltonian system, symplectic structure, symplectic geometry, Lagrangian, Hamiltonian, primary constraints, secondary constraints, distributions, Poincaré–Helmholtz action, non-holonomic systems, vaconomic systems, non-isotropic optical media, Fermat's principle.

Funding agency	Grant number
Ministry of Science and Higher Education of the Russian Federation	075-15-2022-265
This work was performed at the Steklov International Mathematical Center and supported by the Ministry of Science and Higher Education of the Russian Federation (agreement no. 075-15-2022-265).

Received: 13.05.2024

Published: 26.11.2024


Document Type: Article

UDC: 517.9+530.1

MSC: 37J60, 70H03, 70H05, 70H45

Language: English

Original paper language: Russian

When Hamilton developed his methods, of course he had no idea that they would acquire this importance. All he could have had as a guiding beacon in carrying out his work was the mathematical beauty of his equations. It shows his great genius that he was able to appreciate that his work was valuable in spite of its not having practical importance at the time.

P. Dirac, “Hamiltonian methods and quantum mechanics”

0. Introduction

0.1.

The dynamics of conservative systems is usually presented in the framework of either the Lagrangian or the Hamiltonian formalism. Lagrange’s and Hamilton’s equations are related by means of the Legendre transform, which has the property of duality. In mechanics the phase space of a Lagrangian system is the total space of the tangent bundle $TM$ of the configuration space $M$, while the phase space of a Hamiltonian system is the total space of the cotangent bundle $T^*M$. However, the nature of Hamiltonian systems is more general: it is natural to consider such systems on arbitrary symplectic manifolds (not necessarily equal to $T^*M$).

In [1] and [2] Dirac developed a Hamiltonian approach in the case when the Lagrangian $L(\dot{q},q)$ is degenerate in the velocities. This range of problems was also treated in [3]. An important example is the problem of a relativistic particle in a given electromagnetic field in four-dimensional space-time. Here the ‘kinetic’ energy of the particle is a homogeneous function of degree one in the generalized velocities. Problems in geometric optics also belong to this class. The importance of problems with degenerate Lagrangians is mainly connected with the question of quantization, which is based on the Hamiltonian form of the equations of motion.

Dirac proposed to use the Legendre transform and introduce momenta by the standard rule:

$$ \begin{equation} p_j=\frac{\partial L}{\partial\dot{q}_j}\,,\qquad 1 \leqslant j \leqslant n; \end{equation} \tag{0.1} $$

here $n$ is the number of degrees of freedom. If the Lagrangian is degenerate in the velocities (that is, $\det\|\partial^2 L/\partial\dot{q}^2\|=0$), then from (0.1) we can deduce several relations

$$ \begin{equation} \varphi_k(p,q)=0,\qquad 1 \leqslant k \leqslant m. \end{equation} \tag{0.2} $$

In [3] these were called primary constraints. The Hamiltonian is introduced in the standard way:

$$ \begin{equation} H=\sum p_j \dot{q}_j-L. \end{equation} \tag{0.3} $$

At this point this is a function of the $3n$ variables $q$, $\dot{q}$, and $p$. However,

$$ \begin{equation*} \frac{\partial H}{\partial\dot{q}_j}=p_j-\frac{\partial L}{\partial \dot{q}_j}\,, \end{equation*} \notag $$

which is zero by (0.1). Hence we can treat $H$ as a function of $p$ and $q$ alone (in a certain ‘weak’ sense of Dirac). More precisely, $H$ is a function on some $2n$-submanifold of the $3n$-dimensional space of the variables $q$, $\dot{q}$, and $p$, which is defined by the $n$ independent equations (0.1) and (0.2).

After that Dirac ‘deduced’ the generalized Hamiltonian equations

$$ \begin{equation} \dot{q}=\frac{\partial H}{\partial p}+ \sum\lambda_k\frac{\partial \varphi_k}{\partial p}\,,\qquad \dot{p}=-\frac{\partial H}{\partial q}- \sum\lambda_k\frac{\partial \varphi_k}{\partial q}\,. \end{equation} \tag{0.4} $$

Here $\lambda_1,\dots,\lambda_m$ are multipliers yet to be determined. To (0.4) we must add the $m$ constraint equations (0.2). In addition, the ‘consistency conditions’ must hold for the constraints:

$$ \begin{equation} \dot{\varphi}_i=\{\varphi_i,H\}+\sum \lambda_k\{\varphi_i,\varphi_k\}=0,\qquad 1 \leqslant i \leqslant m, \end{equation} \tag{0.5} $$

when all the $\varphi_k$ vanish. Here $\{\,\cdot\,{,}\,\cdot\,\}$ is the standard Poisson bracket.

If the matrix of Poisson brackets $\|\{\varphi_i,\varphi_j\}\|$ is non-singular, then equations (0.5) determine the $\lambda_k$ uniquely as functions of $p$ and $q$. When this matrix is singular, secondary constraints arise in the form of algebraic conditions for the solvability of (0.5) with respect to $\lambda_1,\dots,\lambda_m$. These secondary constraints are added to the primary ones, and the consistency conditions are written out again. Eventually, we either arrive at a contradiction (for instance, when Lagrange’s equations with degenerate Lagrangian have no solutions at all), or the linear system of equations of the form (0.5) is consistent. In the latter case it is possible that the multipliers $\lambda$ can be found in more than one way.

The derivation of (0.4) is based on the so-called strong and weak relations introduced by Dirac. Commentators on [1] and [2] noted that Dirac’s rules of transition from Lagrange’s degenerate equations to equations (0.4) should be substantiated more thoroughly (in particular, one should show that these rules are compatible). However, “… as usual, for some mysterious reasons questionable situations just do not occur in his arguments and constructions” [4].
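
For illustration, the non-singular case can be written out explicitly. The notation $C$ below is ours and does not appear in the cited works. Setting $C=\|\{\varphi_i,\varphi_k\}\|$, the conditions (0.5) become a linear system for the multipliers, and on the constraint surface one obtains

$$ \begin{equation*} \sum_{k=1}^m C_{ik}\lambda_k=-\{\varphi_i,H\} \quad\Longrightarrow\quad \lambda_k=-\sum_{i=1}^m (C^{-1})_{ki}\{\varphi_i,H\},\qquad 1 \leqslant k \leqslant m. \end{equation*} \notag $$

Since the Poisson bracket is skew-symmetric, so is $C$, and a skew-symmetric matrix of odd order is always singular. Hence the non-singular case can only occur for even $m$.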

0.2.

It is fair to say that the first author to consider Hamiltonian aspects of Lagrangian dynamical systems with degenerate Lagrangian was Hamilton himself. Initially, the object of his investigations was geometric optics, and only later did he turn to the dynamics of mechanical systems in a potential force field. The optical properties of a medium are determined by the refraction coefficient $n$, which is the reciprocal of the speed of light. In general, this coefficient is a function of the point $x$ and the direction of the velocity of particles of light $v=\dot{x}$:

$$ \begin{equation*} n=f\biggl(x,\frac{\dot{x}}{|\dot{x}|}\biggr). \end{equation*} \notag $$

The duration of the propagation of light along a ray (optical length of the path) is given by the integral

$$ \begin{equation} \int_{t_1}^{t_2}L(\dot{x},x)\,dt,\qquad L=|\dot{x}|f. \end{equation} \tag{0.6} $$

The integral (0.6) is often called the Fermat action. By Fermat’s principle light propagates along a path that takes the least time. In other words, the path $t \mapsto x(t)$ is a stationary point of the functional (0.6) with fixed endpoints, so that it satisfies Lagrange’s equation with Lagrangian $L$.

Since $L$ is homogeneous in the velocity, we have (by Euler’s theorem)

$$ \begin{equation*} \biggl(\frac{\partial L}{\partial v}\,,v\biggr)=L. \end{equation*} \notag $$

Applying Euler’s formula again, to $\partial L/\partial v$ at the point $v \ne 0$, we obtain

$$ \begin{equation*} \frac{\partial^2 L}{\partial v^2}\,v=0. \end{equation*} \notag $$

Therefore,

$$ \begin{equation*} \det\biggl\|\frac{\partial^2 L}{\partial v^2}\biggr\|=0. \end{equation*} \notag $$

Thus, we cannot use the classical Legendre transform in this case.
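
As a quick check of this degeneracy, consider an isotropic medium, where we assume (for illustration only) that $f$ depends on the point $x$ alone, so that $L=f(x)|v|$. A direct computation for $v \ne 0$ gives

$$ \begin{equation*} \frac{\partial L}{\partial v_i}=f(x)\,\frac{v_i}{|v|}\,,\qquad \frac{\partial^2 L}{\partial v_i\,\partial v_j}= f(x)\biggl(\frac{\delta_{ij}}{|v|}-\frac{v_i v_j}{|v|^3}\biggr),\qquad \sum_j\frac{\partial^2 L}{\partial v_i\,\partial v_j}\,v_j= f(x)\biggl(\frac{v_i}{|v|}-\frac{v_i|v|^2}{|v|^3}\biggr)=0. \end{equation*} \notag $$

In this example the Hessian is, up to the factor $f/|v|$, the orthogonal projection onto the plane perpendicular to $v$, so its kernel is spanned by $v$.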

Hamilton coped with this difficulty as follows. Consider the closed set of points $v \in \mathbb{R}^3$ such that $L(x,v)=1$. This surface is called the indicatrix at the point $x$. Next we introduce the figuratrix, the set of momenta $y \in \mathbb{R}^3$ defined by the relations

$$ \begin{equation*} y=\frac{\partial L}{\partial v}\quad\text{and}\quad L(x,v)=1. \end{equation*} \notag $$

When the indicatrix is a convex surface, the figuratrix is too. In this important case there exists a unique function $H(x,y)$ that is positive homogeneous in $y$ ($H(x,\lambda y)=\lambda H(x,y)$ for $\lambda>0$) and equal to 1 for all momenta lying on the figuratrix. The transformation

$$ \begin{equation*} v=\frac{\partial H}{\partial y}\,,\qquad H(x,y)=1 \end{equation*} \notag $$

takes the figuratrix to the indicatrix. Thus, the functions $L$ and $H$ are dual (as are the indicatrix and figuratrix). Hamilton showed that a path $t \mapsto x(t)$ is a light ray if and only if there exists a ‘conjugate’ function $t \mapsto y(t)$ such that, in combination, they satisfy the canonical equations

$$ \begin{equation} \dot{x}=\frac{\partial H}{\partial y}\,,\qquad \dot{y}=-\frac{\partial H}{\partial x}\,. \end{equation} \tag{0.7} $$

This result can be found in Hamilton’s memoir submitted to the Royal Irish Academy in 1824 (for a brief presentation, see [5], Chap. I, § 4).
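
To make Hamilton's construction concrete, here is a sketch for an isotropic medium, under our illustrative assumption that $L=f(x)|v|$ with $f>0$ depending on $x$ alone. The indicatrix is the sphere $|v|=1/f(x)$, the relation $y=\partial L/\partial v=f(x)\,v/|v|$ shows that the figuratrix is the sphere $|y|=f(x)$, and the function that is positive homogeneous of degree one in $y$ and equal to 1 on the figuratrix is

$$ \begin{equation*} H(x,y)=\frac{|y|}{f(x)}\,,\qquad v=\frac{\partial H}{\partial y}=\frac{y}{f(x)|y|}\,,\qquad |v|=\frac{1}{f(x)}\,. \end{equation*} \notag $$

Thus $\partial H/\partial y$ indeed takes the figuratrix to the indicatrix. On the level set $H=1$ (where $|y|=f$) the canonical equations (0.7) take the form

$$ \begin{equation*} \dot{x}=\frac{y}{f^2(x)}\,,\qquad \dot{y}=\frac{|y|}{f^2(x)}\,\frac{\partial f}{\partial x}= \frac{1}{f(x)}\,\frac{\partial f}{\partial x}\,. \end{equation*} \notag $$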

How does all this look from the point of view of Dirac’s Hamiltonian formalism? It is easy to see that if the indicatrix is convex, then the equation $y=\partial L/\partial v$ produces the unique non-trivial primary constraint

$$ \begin{equation*} \varphi=H(x,y)-1=0. \end{equation*} \notag $$

Since the Lagrangian $L$ is a homogeneous function of degree one in the velocities, the Hamiltonian (0.3) vanishes identically. Hence Dirac’s equations (0.4) assume the form

$$ \begin{equation} \dot{x}=\lambda\,\frac{\partial \varphi}{\partial y}= \lambda\,\frac{\partial H}{\partial y}\,,\qquad \dot{y}=-\lambda\,\frac{\partial \varphi}{\partial x}= -\lambda\,\frac{\partial H}{\partial x}\,. \end{equation} \tag{0.8} $$

It is obvious that the consistency conditions (0.5) hold for all values of $\lambda$. Equations (0.8) are only different from the Hamiltonian equations (0.7) by the coefficient $\lambda$, which can be an arbitrary positive function of time. (The fact that it is positive follows from the simple observation that particles of light move along rays with non-zero velocity and preserve the direction of their motion.) Thus, from the standpoint of the problem of finding light rays equations (0.7) and (0.8) coincide.

Hamilton’s method was refined in the calculus of variations (for instance, see [6]). It is based on the transformation of poles and polars with respect to the unit sphere, which is well known in geometry. Incidentally, Hamilton’s results on geometric optics were not mentioned in [1]–[3] (nor in other papers on Dirac’s dynamics: for instance, see [7]–[10]).

0.3.

As an example, we also consider another extreme case, when the Lagrangian is linear in the velocities:

$$ \begin{equation} L=(a,\dot{q})+u(q)=\sum a_i(q)\dot{q}_i+u(q_1,\dots,q_n). \end{equation} \tag{0.9} $$

It is clear that

$$ \begin{equation*} \biggl\|\frac{\partial^2 L}{\partial \dot{q}^2}\biggr\|=0. \end{equation*} \notag $$

Lagrange’s equations with Lagrangian (0.9) are the first-order equations

$$ \begin{equation} \biggl[\biggl(\frac{\partial a}{\partial q}\biggr)^{\!*}- \frac{\partial a}{\partial q}\biggr]\dot{q}=-\frac{\partial u}{\partial q} \end{equation} \tag{0.10} $$

(here $*$ denotes the transposed matrix) or, at greater length

$$ \begin{equation*} \sum_j a_{ij}\dot{q}_j=\frac{\partial u}{\partial q_i}\,,\qquad a_{ij}=\frac{\partial a_i}{\partial q_j}-\frac{\partial a_j}{\partial q_i}\,. \end{equation*} \notag $$

Birkhoff called a dynamical system of this form a gyroscopic particle ([11], Chap. I, § 9). For example, a charged particle of negligible mass in a static magnetic field is of this type.

Assume that the skew-symmetric matrix $\|a_{ij}\|$ has a determinant distinct from zero (the general case was discussed in [12] and [9]). Then (0.10) defines a usual dynamical system in the configuration space:

$$ \begin{equation*} \dot{q}_i=\sum a^{ij}\frac{\partial u}{\partial q_j}\,,\qquad 1 \leqslant i \leqslant n. \end{equation*} \notag $$

Here $\|a^{ij}\|$ is the inverse matrix of $\|a_{ij}\|$.

In this case $n$ must be even and system (0.10) is Hamiltonian, although it is not expressed in terms of canonical variables. The symplectic structure on $M=\{q\}$ is defined by the closed non-degenerate 2-form

$$ \begin{equation*} \sum a^{ij}(q)\,dq_i \wedge dq_j, \end{equation*} \notag $$

and $-u(q)$ is the Hamiltonian (in fact, the choice of the sign before the Hamiltonian is of no importance). In particular, the function $u\colon M \to \mathbb{R}$ is a first integral of system (0.10).

We apply Dirac’s Hamiltonian formalism to the degenerate Lagrangian (0.9). The relations for momenta

$$ \begin{equation} p=\frac{\partial L}{\partial\dot{q}}=a(q) \end{equation} \tag{0.11} $$

produce the $n$ independent primary constraints

$$ \begin{equation} \varphi_k(p,q)=p_k-a_k(q_1,\dots,q_n)=0,\qquad 1 \leqslant k \leqslant n. \end{equation} \tag{0.12} $$

The Hamiltonian (0.3) has the form

$$ \begin{equation*} H=(p,\dot{q})-L=-u(q). \end{equation*} \notag $$

The generalized Hamiltonian equations (in accordance with Dirac) have the explicit form

$$ \begin{equation} \begin{aligned} \, \dot{q}_i&=\sum \lambda_k\frac{\partial \varphi_k}{\partial p_i}=\lambda_i, \\ \dot{p}_i&=\frac{\partial u}{\partial q_i}- \sum \lambda_k\frac{\partial \varphi_k}{\partial q_i}= \frac{\partial u}{\partial q_i}+ \sum \lambda_k\frac{\partial a_k}{\partial q_i}\,. \end{aligned} \end{equation} \tag{0.13} $$

Now we write down the consistency conditions for (0.12):

$$ \begin{equation} \dot{\varphi}_k=\frac{\partial u}{\partial q_k}+\sum \biggl[\frac{\partial a_s}{\partial q_k}- \frac{\partial a_k}{\partial q_s}\biggr]\lambda_s=0. \end{equation} \tag{0.14} $$

Since we assume that the matrix $\|a_{ij}\|$ is non-singular, from (0.14) we can find the multipliers $\lambda_1,\dots,\lambda_n$ in a unique way. Substituting these into the first equation in (0.13) yields Lagrange’s equations (0.10). The second equation in (0.13) produces nothing new in comparison to (0.11).
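
The following planar example is a sketch of this procedure. The choice of $a$ and the constant $B \ne 0$ are our assumptions, made only for illustration. Let $n=2$ and $a=(-Bq_2/2,\,Bq_1/2)$, so that $\partial a_2/\partial q_1-\partial a_1/\partial q_2=B$. The consistency conditions (0.14) for $k=1$ and $k=2$ read

$$ \begin{equation*} \frac{\partial u}{\partial q_1}+B\lambda_2=0,\qquad \frac{\partial u}{\partial q_2}-B\lambda_1=0, \end{equation*} \notag $$

and since $\dot{q}_i=\lambda_i$ by (0.13), we obtain

$$ \begin{equation*} \dot{q}_1=\frac{1}{B}\,\frac{\partial u}{\partial q_2}\,,\qquad \dot{q}_2=-\frac{1}{B}\,\frac{\partial u}{\partial q_1}\,,\qquad \dot{u}=\frac{\partial u}{\partial q_1}\,\dot{q}_1+\frac{\partial u}{\partial q_2}\,\dot{q}_2=0. \end{equation*} \notag $$

These are exactly equations (0.10) for this choice of $a$, and the last identity confirms directly that $u$ is a first integral.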

We see that Dirac’s formalism results in the same Hamiltonian system (0.10) with $n/2$ degrees of freedom. In [13] and [10] Lagrange’s system with Lagrangian (0.9) was looked upon from another point of view. It was proposed there that equations (0.10) (not involving second derivatives) could be treated as constraints imposed on a Lagrangian system. After that the well-known Hamiltonian formalism was used in the Lagrange variational problem with constraints. As a result, the equations assumed the form of canonical Hamiltonian equations in a $2n$-dimensional phase space, where the equations for the $q$-coordinates can be separated and, of course, are just (0.10). This approach was pursued in a number of papers (for instance, see [8] and [10]), where attempts were made to combine it with Dirac’s approach. In our opinion this only leads to unnecessary complications. Furthermore, the invariant meaning of such constructions is not quite clear.

0.4.

An effective method for substantiating Dirac’s generalized Hamiltonian dynamics was indicated in [14], Chap. I, § 6; it is related to the consideration of mechanical systems with small masses. More precisely, we mean systems with $n+1$ degrees of freedom whose dynamics is described by Lagrange’s equations with Lagrangian

$$ \begin{equation} L_\varepsilon=L_0(\dot{q},q,Q)+\frac{\varepsilon\dot{Q}^2}{2}+ \varepsilon L_1(\dot{q},q,Q,\varepsilon);\qquad q \in \mathbb{R}^n,\quad Q \in \mathbb{R}, \end{equation} \tag{0.15} $$

where $\varepsilon$ is a small parameter. The function $L_0$ is assumed to be non-degenerate in $\dot{q}$.

For $\varepsilon=0$ we obtain a degenerate system. A primary constraint (in the sense of Dirac) is $P=0$, where

$$ \begin{equation*} P=\frac{\partial L_0}{\partial \dot{Q}}\,. \end{equation*} \notag $$

Solutions of Lagrange’s equations with Lagrangian (0.15) can be represented by series in powers of $\varepsilon$, where for $\varepsilon=0$ we obtain solutions of the relevant Dirac equations. These series can be divergent but (as shown in [15]) they are certainly asymptotic.

The theory of gyroscopic systems (for instance, see [16]–[18]) is linked with this trick. Consider the dynamics of a mechanical system with Lagrangian

$$ \begin{equation*} L=\frac{1}{2}\bigl(A(q)\dot{q},\dot{q}\bigr)+N\bigl(a(q),\dot{q}\bigr)+u(q). \end{equation*} \notag $$

The matrix $A$ is assumed to be positive definite for all $q \in M$; it determines the inertial properties of the system. The parameter $N$ is assumed to be large. We go over to the new time $\tau=t/N$ and denote differentiation with respect to $\tau$ by primes. Then Lagrange’s equations can be put into the form

$$ \begin{equation} \Lambda q'=\frac{\partial u}{\partial q}- \varepsilon\biggl(\frac{d}{d\tau}\,\frac{\partial T}{\partial q'}- \frac{\partial T}{\partial q}\biggr), \end{equation} \tag{0.16} $$

where

$$ \begin{equation*} \Lambda=\frac{\partial a}{\partial q}- \biggl(\frac{\partial a}{\partial q}\biggr)^{\!*}, \end{equation*} \notag $$

$T=(Aq',q')/2$ is the kinetic energy of the system, and $\varepsilon=1/N^2$ is a small parameter. For $\varepsilon=0$ we obtain equations (0.10), which are Hamiltonian when the skew-symmetric matrix $\Lambda$ is non-singular. Equations (0.10) are called precession equations. An analysis of the transition from the full equations to the simplified ones as $\varepsilon\to 0$ is an important problem in the theory of gyroscopic systems [16]–[18]. As mentioned in [16], for $\det\Lambda=0$ the precession equations are not necessarily solvable, or they can have infinitely many solutions (as in Dirac’s theory). This question was considered in [19], where the power expansions of solutions of the full system (0.16) in $\varepsilon$ were also investigated. The transition, as $\varepsilon\to 0$, from the full equations (0.16) to the precession equations was considered in [9] from the standpoint of Dirac’s Hamiltonian formalism. For applications of Dirac’s dynamics to concrete problems with small generalized velocities in mechanics the reader can consult [20] and [21].
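
The passage to (0.16) can be sketched as follows; the intermediate notation $T_t=(A\dot{q},\dot{q})/2$ for the kinetic energy in the original time is ours. Lagrange’s equations in the time $t$ are

$$ \begin{equation*} \frac{d}{dt}\,\frac{\partial T_t}{\partial \dot{q}}-\frac{\partial T_t}{\partial q}+ N\Lambda\dot{q}-\frac{\partial u}{\partial q}=0. \end{equation*} \notag $$

Since $\dot{q}=q'/N$ and $d/dt=N^{-1}\,d/d\tau$, we have $T_t=T/N^2$ and

$$ \begin{equation*} \frac{d}{dt}\,\frac{\partial T_t}{\partial \dot{q}}-\frac{\partial T_t}{\partial q}= \frac{1}{N^2}\biggl(\frac{d}{d\tau}\,\frac{\partial T}{\partial q'}- \frac{\partial T}{\partial q}\biggr),\qquad N\Lambda\dot{q}=\Lambda q', \end{equation*} \notag $$

which gives (0.16) with $\varepsilon=1/N^2$.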

0.5.

So far we have discussed just one (but not the only) important aspect of Dirac’s Hamiltonian formalism. It is related to an algorithm representing Lagrange’s equations with degenerate Lagrangian in a certain generalized Hamiltonian form. The core question in this range of problems is as follows: is it true that all solutions of Lagrange’s equations can be obtained by use of the Dirac representation described above? This is perhaps true, but to the best of our knowledge no full rigorous proof is available. Lagrange’s equations with degenerate Lagrangian need not have solutions at all. Here is a trivial example:

$$ \begin{equation*} L=\dot{q}+q,\qquad q \in \mathbb{R}. \end{equation*} \notag $$

Of course, a solution of the equivalence problem must also cover such cases.
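
For this example both sides of the comparison can be checked by hand (the computation below is our sketch). Lagrange’s equation is contradictory:

$$ \begin{equation*} \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}-\frac{\partial L}{\partial q}= \frac{d}{dt}\,(1)-1=-1 \ne 0. \end{equation*} \notag $$

On the Hamiltonian side, (0.1) yields the primary constraint $\varphi=p-1=0$, and (0.3) gives $H=(p-1)\dot{q}-q$, which equals $-q$ on the constraint. Equations (0.4) then read

$$ \begin{equation*} \dot{q}=\lambda,\qquad \dot{p}=1,\qquad\text{so that}\quad \dot{\varphi}=\dot{p}=1 \ne 0 \end{equation*} \notag $$

for every choice of $\lambda$. Thus the consistency condition (0.5) fails, and Dirac’s procedure also terminates in a contradiction, in agreement with the absence of solutions of Lagrange’s equation.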

Equations (0.4) can also be treated otherwise, as Hamiltonian equations with constraints (with no link to Lagrangian mechanics whatsoever). In [1] Dirac also discussed this problem. The starting point here is the Hamiltonian and the constraints (0.2) imposed in the phase space.

However, this setup changes the situation drastically. In the initial approach (when we develop the Hamiltonian aspect of Lagrangian mechanics with degenerate Lagrangian) we have a definition of motion: this is a solution of Lagrange’s equations. In the framework of the second approach (when, given a Hamiltonian system, we impose constraints on it which are not invariant under its phase flow) we must first define a motion, and then derive differential equations governing this motion.

This is just the approach to the development of Dirac’s generalized Hamiltonian mechanics that was presented in [14], Chap. I, § 5. Dirac himself did not stress the necessity of a clear definition of a motion of a Hamiltonian system with constraints. Perhaps the reason lies in an implicit assumption that the Hamiltonian systems under consideration have been deduced from Lagrange’s equations in some natural way. On the other hand, the modern view (going back to È. Cartan [22]) is that the two aspects of mechanics, the Hamiltonian and Lagrangian ones, are clearly delineated. Hamiltonian systems ‘live’ on symplectic manifolds, which are not necessarily the total spaces of the cotangent bundles of the configuration spaces of mechanical systems.

An analogy with the theory of Lagrange’s systems with constraints is appropriate here. Let $L(\dot{q},q)$ be a Lagrangian and $(a(q),\dot{q})=0$ be a constraint. In non-holonomic mechanics it is postulated that motions $t \mapsto q(t)$ are solutions of the equations

$$ \begin{equation*} \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}- \frac{\partial L}{\partial q}=\lambda a,\qquad (a,\dot{q})=0. \end{equation*} \notag $$

Here $\lambda$ is a coefficient found as a function of the state ($\dot{q}$, $q$) of the system in the case when the Lagrangian is non-degenerate in the velocities. In the vaconomic model of motion the equations of motion look different:

$$ \begin{equation*} \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}- \frac{\partial L}{\partial q}=\lambda\biggl[\frac{\partial a}{\partial q}- \biggl(\frac{\partial a}{\partial q}\biggr)^{\!*}\,\biggr]\dot{q}+ \dot{\lambda}a,\qquad (a,\dot{q})=0. \end{equation*} \notag $$

We can represent in this form the equations of extremals in the Lagrange variational problem

$$ \begin{equation*} \delta\int_{t_1}^{t_2}L\,dt=0 \end{equation*} \notag $$

in the class of curves with fixed endpoints that satisfy the constraint $(a,\dot{q})=0$. If the constraint is integrable, then these two models are equivalent: see the discussion in [14], Chap. I.
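
The vaconomic equations can be recovered by the Lagrange multiplier rule; the following is a sketch, in which the augmented Lagrangian $\mathcal{L}$ and the sign chosen for $\lambda$ are our conventions. Set $\mathcal{L}=L-\lambda\,(a(q),\dot{q})$, where $\lambda=\lambda(t)$ is treated as an additional unknown function. Then

$$ \begin{equation*} \frac{\partial \mathcal{L}}{\partial \dot{q}}=\frac{\partial L}{\partial \dot{q}}-\lambda a,\qquad \frac{\partial \mathcal{L}}{\partial q}=\frac{\partial L}{\partial q}- \lambda\biggl(\frac{\partial a}{\partial q}\biggr)^{\!*}\dot{q}, \end{equation*} \notag $$

and the Euler–Lagrange equation for $\mathcal{L}$ in the variable $q$ becomes

$$ \begin{equation*} \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}-\frac{\partial L}{\partial q}= \dot{\lambda}a+\lambda\,\frac{\partial a}{\partial q}\,\dot{q}- \lambda\biggl(\frac{\partial a}{\partial q}\biggr)^{\!*}\dot{q}, \end{equation*} \notag $$

while variation with respect to $\lambda$ returns the constraint $(a,\dot{q})=0$. In the special case $a=\partial F/\partial q$ the matrix $\partial a/\partial q$ is symmetric, the term in square brackets vanishes, and the right-hand side reduces to $\dot{\lambda}a$, which has the same form as in the non-holonomic equations with multiplier $\dot{\lambda}$.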

0.6.

In this paper we put forward a generalization of Dirac’s Hamiltonian dynamics, when constraints linear in the velocities of canonical variables are imposed on a Hamiltonian system. When these constraints describe an integrable distribution on the phase space, we obtain Dirac’s dynamics.

First (in § 1) we present our ideas in the simple situation when differential constraints are imposed on a dynamical system on a Riemannian space. Next we consider a Hamiltonian system with constraints described by some distribution of tangent planes (§ 2). Taking these constraints into account is based on projecting the vector field onto the tangent planes of this distribution. We perform this projection using the symplectic geometry of tangent planes which is induced by the symplectic geometry of the phase space.

In § 3 we discuss the case when the distribution of tangent planes (describing the constraint) is integrable. Then the general construction in § 2 leads to Dirac’s Hamiltonian dynamics. In § 4 we discuss another approach to the dynamics of Hamiltonian systems with constraints. In contrast to § 2, by motions of such systems we mean extremals of a variational problem with prescribed constraints, when the Poincaré–Helmholtz action is taken as the functional. Again, in the case of integrable constraints we obtain Dirac’s Hamiltonian dynamics.

Special cases of the constructions in §§ 2 and 4 are the non-holonomic and vaconomic models of motion of mechanical systems, which we mentioned above. In § 5 we consider some other examples, in particular, ones from geometric optics. In § 6 J. Bernoulli’s optico-mechanical analogy is extended to strongly anisotropic optical media. In § 7 we discuss a comparison between Dirac’s generalized Hamiltonian dynamics and the principles of construction of Lagrangian systems with constraints known in analytical mechanics.

1. Dynamical systems with constraints on a Riemannian space

1.1.

Let $S^m=\{x_1,\dots,x_m\}$ be an $m$-dimensional Riemannian manifold with metric

$$ \begin{equation} \sum a_{ij}(x)\,dx_i\,dx_j, \end{equation} \tag{1.1} $$

where the coefficients $a_{ij}$ are smooth functions and the matrix $A(x)=\|a_{ij}(x)\|$ is positive definite for all $x \in S^m$. The value of the quadratic form (1.1) at a pair of vector fields $u_1(x)$, $u_2(x)$ on $S$ can be expressed as

$$ \begin{equation*} (Au_1,u_2), \end{equation*} \notag $$

where $(\,\cdot\,{,}\,\cdot\,)$ is the canonical pairing (the value of a covector at a vector at a point $x \in S$). In fact, $Au_1$ is indeed a covector (a vector in the cotangent space $T_x^* S$).

Let $v$ be a smooth vector field on $S$. It gives rise to the autonomous system of differential equations

$$ \begin{equation} \dot{x}=v(x),\qquad x \in S \end{equation} \tag{1.2} $$

(as usual, dots denote derivatives with respect to the independent variable, which we denote by $t$). Also consider several constraints linear in the velocities:

$$ \begin{equation} f_1=\bigl(a_1(x),\dot{x}\bigr)=0,\quad\dots,\quad f_k=\bigl(a_k(x),\dot{x}\bigr)=0. \end{equation} \tag{1.3} $$

Here $a_1,\dots,a_k$ ($k<m$) are covector fields on $S$ which are linearly independent at each point. The system of equations (1.3) defines a smooth $(m-k)$-dimensional distribution $\Pi$ on $S$. It is non-integrable in the general case.

Generally speaking, the distribution $\Pi$ is not invariant under the phase flow of the dynamical system (1.2). How can we achieve invariance? To do this we must replace $v$ by a vector field $w$ such that

$$ \begin{equation*} w(x) \in\Pi_x \end{equation*} \notag $$

for all $x \in S$. This can be done in various ways.

This type of problem is common in control theory for dynamical systems, in particular, in the description of so-called sliding modes (for instance, see [23] and [24]). In fact, authors mostly consider integrable constraints, when

$$ \begin{equation*} f_j=\bigl(F_j(x)\bigr)^{\boldsymbol{\cdot}},\qquad 1 \leqslant j \leqslant k. \end{equation*} \notag $$

Here $F_1,\dots,F_k\colon S \to \mathbb{R}$ are some independent smooth functions; in particular, $a_j=\partial F_j/\partial x$. We assume that the field $v(x)$ is defined away from the ‘singular’ set

$$ \begin{equation} N=\{x \in S\colon F_1(x)=\dots=F_k(x)=0\}, \end{equation} \tag{1.4} $$

on which it has discontinuities. We assume that phase trajectories of $v$ reach $N$ from ‘different’ sides. For a full description of the dynamics, $v$ must also be defined at the points in $N$. Trajectories of this ‘extended’ system that lie on $N$ correspond just to sliding modes.

We return to the definition of the vector field $w$ in question. One of the most natural ways to define it is to project the original field $v$ onto tangent $(m- k)$-planes (1.3). However, the result depends on the choice of a metric. In our case $S$ is already endowed with a Riemannian metric, which we can use.

Remark. In the theory of sliding modes authors usually consider the case $k=1$. Then for a natural definition of the field $v$ on the hypersurface $N$ (for instance, Filippov’s definition) we do not need a metric in the ambient space at all. This is a characteristic feature of sliding modes [23], [24].

Set

$$ \begin{equation} w=v+\sum_{j=1}^k \lambda_j A^{-1}a_j, \end{equation} \tag{1.5} $$

where $\lambda_1,\dots,\lambda_k$ are multipliers to be specified. Since $a_1,\dots,a_k$ are covectors, it follows that the vectors

$$ \begin{equation*} \alpha_1= A^{-1}a_1,\quad\dots,\quad \alpha_k= A^{-1}a_k \end{equation*} \notag $$

are tangent to $S$. Moreover,

$$ \begin{equation*} (A\alpha_j,\dot{x})=0 \end{equation*} \notag $$

for all vectors $\dot{x}$ in the distribution $\Pi$. In fact,

$$ \begin{equation*} (A\alpha_j,\dot{x})=(a_j,\dot{x})=0,\qquad 1 \leqslant j \leqslant k, \end{equation*} \notag $$

by (1.3).

Now, using the obvious equalities $(a_j,w)=0$ and multiplying (1.5) by $a_1,\dots,a_k$ we obtain a linear system of algebraic equations for the multipliers:

$$ \begin{equation} (a_s,v)+\sum_{j=1}^k \lambda_j (A^{-1}a_j,a_s)=0,\qquad 1 \leqslant s \leqslant k. \end{equation} \tag{1.6} $$

Since $A^{-1}$ is a positive definite matrix and $a_1,\dots,a_k$ are linearly independent covectors, the Gram matrix

$$ \begin{equation*} \|(A^{-1}a_j,a_s)\|,\qquad 1 \leqslant s,j \leqslant k, \end{equation*} \notag $$

has a positive determinant. Hence we uniquely find $\lambda_1,\dots,\lambda_k$ from (1.6) as smooth functions on $S$.
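
For a single constraint ($k=1$, $a=a_1$) the system (1.6) can be solved in closed form, which gives a convenient check of the construction (the following lines are our illustration):

$$ \begin{equation*} \lambda=-\frac{(a,v)}{(A^{-1}a,a)}\,,\qquad w=v-\frac{(a,v)}{(A^{-1}a,a)}\,A^{-1}a,\qquad (a,w)=(a,v)-\frac{(a,v)}{(A^{-1}a,a)}\,(a,A^{-1}a)=0. \end{equation*} \notag $$

In the Euclidean case, when $A$ is the identity matrix, this reduces to the familiar formula $w=v-(a,v)\,a/|a|^2$ for the orthogonal projection of $v$ onto the hyperplane $(a,\dot{x})=0$.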

1.2.

There can be different interpretations of our choice of $w$. We point out an analogue of Gauss’s principle in the dynamics of mechanical systems with constraints, which is closely connected with the least squares method (see [25]). By a conceivable vector field $u$ we mean any vector field satisfying $u(x) \in \Pi_x$ for all $x$. We say that the vector field $v$ is unconstrained, while $w$ is the actual field. According to Gauss, conceivable motions are maps $t \mapsto x(t)$ that satisfy the constraints (1.3) for all values of $t$. The unconstrained motion has velocity $v(x)$, and the actual motion (regarded as one of the conceivable ones) has velocity $w(x)$. The general Gaussian least constraint principle can be stated as follows.

Theorem 1. Among all conceivable motions, the actual one has the least deviation from the unconstrained motion.

We can measure the deviation (constraint in the sense of Gauss) of a conceivable motion from the unconstrained one in terms of the quantity

$$ \begin{equation*} Z(u,v)=\bigl(A(u-v),u-v\bigr). \end{equation*} \notag $$

Gauss’s principle is easy to prove. Clearly,

$$ \begin{equation*} u-v=(u-w)+(w-v), \end{equation*} \notag $$

where the vectors $u-w$ and $w-v$ are orthogonal in the Riemannian metric on $S$. In fact, $u(x)-w(x) \in \Pi_x$, while $w(x)-v(x)$ is orthogonal to $\Pi_x$ by (1.5). Hence by Pythagoras’s theorem

$$ \begin{equation*} Z(u,v)=Z(u,w)+Z(v,w). \end{equation*} \notag $$

As the matrix $A$ is positive definite, we have $Z(u,v)\geqslant Z(v,w)$ and the ‘constraint’ $Z(u,v)$ takes its minimum value for $u=w$, as required.

1.3.

For a more transparent comparison of these observations with Dirac’s generalized Hamiltonian dynamics (which we discuss in § 3), consider so-called gradient dynamical systems in Euclidean space $S=\mathbb{R}^m$ with standard metric. A gradient system is defined by a vector field with components

$$ \begin{equation*} v_1=\frac{\partial\varphi}{\partial x_1}\,,\quad\dots,\quad v_m=\frac{\partial\varphi}{\partial x_m}\,, \end{equation*} \notag $$

where $\varphi\colon \mathbb{R}^m \to \mathbb{R}$ is a smooth function. We look at the special case when the constraints (1.3) are integrable: the motion takes place on the $(m-k)$-dimensional manifold (1.4), and at its points the vector field $w$ assumes the following form:

$$ \begin{equation} \dot{x}=\frac{\partial\varphi}{\partial x}+ \sum_{j=1}^k\lambda_j\frac{\partial F_j}{\partial x}\,,\qquad x \in N. \end{equation} \tag{1.7} $$

Equations (1.6) for the undetermined coefficients $\lambda_1,\dots,\lambda_k$ are as follows:

$$ \begin{equation} \biggl(\frac{\partial F_s}{\partial x}\,, \frac{\partial\varphi}{\partial x}\biggr)+\sum_{j=1}^k\lambda_j \biggl(\frac{\partial F_s}{\partial x}\,, \frac{\partial F_j}{\partial x}\biggr)=0,\qquad 1 \leqslant s \leqslant k. \end{equation} \tag{1.8} $$

These are the conditions for the ‘consistency’ of the constraints with the modified vector field: $\dot{F}_s= 0$ (the derivative along system (1.7)) at all points where $F_1=\dots=F_k=0$.
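As an illustration (an example chosen here, not taken from the text), let $k=1$ and let $N$ be the unit sphere given by $F=\bigl((x,x)-1\bigr)/2=0$, so that $\partial F/\partial x=x$. Then (1.8) is a single equation, and on $N$ it gives

$$ \begin{equation*} \lambda=-\frac{(x,\partial\varphi/\partial x)}{(x,x)}= -\biggl(x,\frac{\partial\varphi}{\partial x}\biggr), \end{equation*} \notag $$

so that system (1.7) becomes

$$ \begin{equation*} \dot{x}=\frac{\partial\varphi}{\partial x}- \biggl(x,\frac{\partial\varphi}{\partial x}\biggr)x,\qquad (x,x)=1. \end{equation*} \notag $$

The right-hand side is the component of the gradient of $\varphi$ tangent to the sphere, and indeed $(x,\dot{x})=0$ whenever $(x,x)=1$.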

We can also attempt to use this approach in the more general situation when the manifold $S$ is endowed with a pseudo-Riemannian metric: the matrix of coefficients $A(x)=\|a_{ij}(x)\|$ is non-singular but not positive definite. Let (in the simplest case) the distribution $\Pi$ be defined by the single equality $(a(x),\dot{x})=0$. Then the orthogonal projection produces the vector field

$$ \begin{equation*} w=v+\lambda A^{-1}a. \end{equation*} \notag $$

Hence

$$ \begin{equation} (a,v)+\lambda(A^{-1}a,a)=0. \end{equation} \tag{1.9} $$

Let $(a,v) \ne 0$ (so that the original field $v$ is not a ‘conceivable’ one), and let the covector $a$ be isotropic (that is, $(A^{-1}a,a)=0$). Then relation (1.9) is contradictory. Thus, in the case of a pseudo-Euclidean metric it is not always possible to project $v$ orthogonally onto planes of the distribution defining the constraint.
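A simple instance of this obstruction can be written down explicitly; the specific $A$, $a$, and $v$ below are chosen here for illustration only. Consider the plane $\mathbb{R}^2$ with the constant metric $A=\operatorname{diag}(1,-1)$, and take $a=(1,1)$ and $v=(1,0)$. Then $A^{-1}a=(1,-1)$ and

$$ \begin{equation*} (A^{-1}a,a)=1-1=0,\qquad (a,v)=1, \end{equation*} \notag $$

so that (1.9) reduces to the false equality $1=0$. Geometrically, the vector $A^{-1}a$ itself satisfies the constraint, $(a,A^{-1}a)=0$, so adding its multiples to $v$ cannot change the value $(a,w)=(a,v)=1$.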

2. Hamiltonian systems with constraints. Symplectic projection

2.1.

Now let $(M^{2n},\Omega^2)$ be a symplectic manifold, $M^{2n}=\{x\}$ be an even-dimensional smooth manifold, and $\Omega^2$ be a closed non-degenerate 2-form (a symplectic structure on $M^{2n}$). In what follows $M^{2n}$ is the phase space of a Hamiltonian system with $n$ degrees of freedom.

The value of the 2-form at tangent vector fields $u_1(x)$ and $u_2(x)$ has the form

$$ \begin{equation*} \Omega^2(u_1,u_2)=\bigl(J(x)u_1(x),u_2(x)\bigr), \end{equation*} \notag $$

where $J$ is a non-degenerate skew-symmetric $2n\times 2n$ matrix depending smoothly on the point $x \in M^{2n}$. By Darboux’s theorem, in certain special local canonical variables $p_1,\dots,p_n$, $q_1,\dots,q_n$ on $M^{2n}$ the matrix $J$ can be reduced to the form

$$ \begin{equation} \begin{bmatrix} 0 & E \\ -E & 0 \end{bmatrix}, \end{equation} \tag{2.1} $$

where $E$ is the identity matrix of order $n$.

Let $H \colon M^{2n} \to \mathbb{R}$ be a smooth function. This is a Hamiltonian producing a Hamiltonian tangent vector field $v$ on $M^{2n}$:

$$ \begin{equation*} \Omega^2(v,\,\cdot\,)=dH(\,\cdot\,). \end{equation*} \notag $$

In its turn $v$ gives rise to the Hamiltonian system of differential equations

$$ \begin{equation} \dot{x}=v(x),\qquad x \in M^{2n}. \end{equation} \tag{2.2} $$

In the variables $p$ and $q$ this system takes the canonical form

$$ \begin{equation*} \dot{p}=-\frac{\partial H}{\partial q}\,,\qquad \dot{q}=\frac{\partial H}{\partial p}\,. \end{equation*} \notag $$

Now consider a distribution $\Pi$ of tangent planes

$$ \begin{equation} \bigl(a_1(x),\dot{x}\bigr)=\dots=\bigl(a_k(x),\dot{x}\bigr)=0,\qquad k<2n, \end{equation} \tag{2.3} $$

on $M^{2n}$. The covector fields $a_1,\dots,a_k$ are assumed to be linearly independent at each point $x \in M^{2n}$. Of course, the distribution (2.3) is not integrable in the general case. How can we now reconcile the Hamiltonian system (2.2) with the constraints (2.3), which, generally speaking, are not invariant under the phase flow of the Hamiltonian system?

We can proceed as in § 1, by projecting the Hamiltonian vector field $v$ onto the $(2n-k)$-dimensional tangent planes (2.3) and obtaining the dynamical system

$$ \begin{equation*} \dot{x}=w(x),\qquad x \in M^{2n}, \end{equation*} \notag $$

whose flow on $M^{2n}$ now preserves the distribution $\Pi$. However, we do not have a natural structure of a Riemannian manifold on $M^{2n}$ at our disposal. On the other hand we have a symplectic structure, which turns each $2n$-dimensional tangent space at points in $M^{2n}$ into a symplectic space with its inherent symplectic geometry. This means that we can introduce the ‘orthogonal’ projections of vectors $v(x)$ onto the tangent planes $\Pi_x$ (as objects of the symplectic space $T_x M^{2n}$). As a result, we obtain

$$ \begin{equation} w(x)=v(x)+\sum_{j=1}^k \lambda_j J^{-1}(x)a_j(x), \end{equation} \tag{2.4} $$

which is quite similar to (1.5).

However, this is the end of the analogy with dynamical systems in a Riemannian space. In fact, the matrix $J^{-1}$ (just as $J$) is skew-symmetric, so that the $k\times k$ matrix

$$ \begin{equation} \|(J^{-1}a_j,a_s)\| \end{equation} \tag{2.5} $$

can turn out to be singular. Then, in particular, the vector $v(x)$ cannot be projected orthogonally onto the plane $\Pi_x$, or such a projection is not uniquely defined.

Consider the tangent vectors

$$ \begin{equation*} \alpha_s(x)=J^{-1}(x)a_s(x),\qquad 1 \leqslant s \leqslant k. \end{equation*} \notag $$

They are linearly independent and orthogonal to the planes $\Pi_x$:

$$ \begin{equation} (J\alpha_s,\dot{x})=(a_s,\dot{x})=0,\qquad 1 \leqslant s \leqslant k. \end{equation} \tag{2.6} $$

Clearly,

$$ \begin{equation*} \Omega^2(\alpha_j,\alpha_s)=(J\alpha_j,\alpha_s)= (a_j,J^{-1}a_s)=-(J^{-1}a_j,a_s). \end{equation*} \notag $$

Hence the non-singularity of the matrix (2.5) is equivalent to that of the $ k\times k $ matrix

$$ \begin{equation*} \|\Omega^2(\alpha_j,\alpha_s)\|. \end{equation*} \notag $$

Its non-singularity is equivalent to the condition that the $k$-dimensional linear space

$$ \begin{equation*} \Sigma_x=\{\mu_1\alpha_1+\dots+\mu_k\alpha_k;\,\mu_j \in \mathbb{R}\}, \end{equation*} \notag $$

spanned by the vectors $\alpha_1,\dots,\alpha_k$ is symplectic. In particular, $k$ must be even.
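For instance, for a single constraint ($k=1$) the projection is either impossible or not unique. The following short check is a sketch that uses only the skew-symmetry of $J^{-1}$: for any covector $a$,

$$ \begin{equation*} (J^{-1}a,a)=-(a,J^{-1}a)=-(J^{-1}a,a),\quad\text{so that}\quad (J^{-1}a,a)=0. \end{equation*} \notag $$

Hence for $w=v+\lambda J^{-1}a$ (formula (2.4) with $k=1$) we have

$$ \begin{equation*} (a,w)=(a,v)+\lambda(J^{-1}a,a)=(a,v), \end{equation*} \notag $$

and $w(x)$ lies in $\Pi_x$ only if $v(x)$ already does, in which case $\lambda$ is arbitrary.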

Thus, we can make the following conclusion.

Theorem 2. If the restriction of the symplectic structure to $\Sigma_x \subset T_x M^{2n}$ is a non-degenerate 2-form, then the vector field (2.4) is well defined and its phase flow preserves the distribution (2.3).

The assumptions of Theorem 2 mean that $\Sigma_x$ is a symplectic subspace of $T_x M^{2n}$.

Lemma. Let $\Sigma_x$ be a symplectic subspace of $T_x M^{2n}$. Then

(a) $T_x M^{2n}=\Sigma_x \oplus \Pi_x$,
(b) $\Pi_x$ is a symplectic subspace of $T_x M^{2n}$.

Since $\dim\Sigma_x+\dim \Pi_x=2n$, to prove (a) it is sufficient to show that

$$ \begin{equation*} \Sigma_x \cap \Pi_x=\{0\}. \end{equation*} \notag $$

In fact, let

$$ \begin{equation*} \sum_{j=1}^k \mu_j \alpha_j \in \Pi_x. \end{equation*} \notag $$

Then

$$ \begin{equation*} \Bigl(a_s,\sum \mu_j \alpha_j\Bigr)=\sum \mu_j(a_s,\alpha_j)= \sum \mu_j(a_s,J^{-1}a_j)=0. \end{equation*} \notag $$

Since $\Sigma_x$ is a symplectic subspace, the matrix (2.5) is non-singular, and it follows from this linear system that $\mu_1=\dots=\mu_k=0$.

Next we prove (b). Suppose that $\Pi_x$ is not a symplectic subspace. Then the restriction of $\Omega^2$ to $\Pi_x$ is a degenerate 2-form. Hence there exists a vector $\widehat{u} \in \Pi_x$ such that $\Omega^2(\widehat{u},u)=0$ for all $u \in \Pi_x$. By conclusion (a) each element of $T_x M^{2n}$ is a sum $\alpha+u$, where $\alpha \in \Sigma_x$. Since $\Omega^2(\widehat{u},\alpha)=0$ (by (2.6)), we have

$$ \begin{equation*} \Omega^2(\widehat{u},\alpha+u)=\Omega^2(\widehat{u},u)=0. \end{equation*} \notag $$

However, then $\Omega^2$ is degenerate on $T_x M^{2n}$, which is impossible. $\Box$

It is obvious that we can interchange $\Sigma_x$ and $\Pi_x$ in the statement of the lemma.

Theorem 2'. If the restriction of the symplectic structure to the tangent planes in the distribution (2.3) is a non-degenerate 2-form, then the vector field (2.4) is well defined and its phase flow preserves the distribution (2.3).

This is an immediate consequence of the lemma and Theorem 2. Theorems 2 and 2' are dual.

2.2.

How does (2.4) look in the canonical variables $p$ and $q$? Let the constraints (2.3) have the form

$$ \begin{equation} (b_j,dp)+(c_j,dq)=0,\qquad 1 \leqslant j\leqslant k. \end{equation} \tag{2.7} $$

Here $b_j$ and $c_j$ are smooth ‘vector functions’ of $p$ and $q$. Taking (2.1) into account we obtain the differential equations

$$ \begin{equation} \dot{p}=-\frac{\partial H}{\partial q}+\sum \lambda_s c_s\quad\text{and}\quad \dot{q}=\frac{\partial H}{\partial p}-\sum \lambda_s b_s. \end{equation} \tag{2.8} $$

The multipliers $\{\lambda_s\}$ are determined by the system of linear algebraic equations

$$ \begin{equation} \biggl(b_j,\frac{\partial H}{\partial q}\biggr)- \biggl(c_j,\frac{\partial H}{\partial p}\biggr)- \sum\lambda_s[(c_s,b_j)-(b_s,c_j)]=0,\qquad 1 \leqslant j\leqslant k. \end{equation} \tag{2.9} $$
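The system (2.9) is obtained by substituting (2.8) into the constraints (2.7), written for the velocities as $(b_j,\dot{p})+(c_j,\dot{q})=0$. The intermediate step, spelled out here as a short check, reads

$$ \begin{equation*} (b_j,\dot{p})+(c_j,\dot{q})= -\biggl(b_j,\frac{\partial H}{\partial q}\biggr)+ \biggl(c_j,\frac{\partial H}{\partial p}\biggr)+ \sum\lambda_s[(b_j,c_s)-(c_j,b_s)]=0. \end{equation*} \notag $$

Multiplying by $-1$ and using the symmetry of the pairing $(\,\cdot\,{,}\,\cdot\,)$ we arrive at (2.9). Note that the coefficient $(c_s,b_j)-(b_s,c_j)$ of $\lambda_s$ changes sign when $s$ and $j$ are interchanged.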

The skew-symmetric matrix

$$ \begin{equation} \|(c_s,b_j)-(b_s,c_j)\| \end{equation} \tag{2.10} $$

being non-singular is sufficient for an unambiguous consistent definition of the vector field $w$ allowed by the constraints. However, this condition is not necessary.

For example, the matrix (2.10) is certainly singular for $b_1=\dots=b_k=0$. We present an interesting example showing that we can find the required vector field $w$ for such a singular matrix (2.10).

2.3.

Let $Q^n=\{q\}$ be a smooth $n$-manifold (the configuration space of a mechanical system) and $M^{2n}=T^*Q$ be the total space of the cotangent bundle of $Q$. Let $p$ be an element of $T_q^*Q$. The manifold $M^{2n}$ is endowed with the natural symplectic structure

$$ \begin{equation*} \Omega^2=d(p,dq)=\sum dp_j \wedge dq_j. \end{equation*} \notag $$

Assume that the Hamiltonian has a form characteristic for so-called natural mechanical systems:

$$ \begin{equation*} H=T+V, \end{equation*} \notag $$

where $T=\bigl(K(q)p,p\bigr)/2$ is the kinetic energy and the smooth function $V\colon Q \to \mathbb{R}$ is the potential energy. For all $q \in Q$ the matrix $K(q)$ is positive definite.

Assume that the constraints imposed on the Hamiltonian system under consideration have the form

$$ \begin{equation} \bigl(c_1(q),dq\bigr)=\dots=\bigl(c_k(q),dq\bigr)=0. \end{equation} \tag{2.11} $$

The covectors $c_1,\dots,c_k$ are assumed to be linearly independent everywhere on $Q$. Relations (2.11) are a special case of (2.7); here $b_1=\dots=b_k=0$ and $c_1,\dots,c_k$ depend only on the point in the configuration space $Q$.

The Hamiltonian equations with constraints (2.8) take the following form:

$$ \begin{equation} \dot{p}=-\frac{\partial H}{\partial q}+\sum_{j=1}^k \lambda_j c_j,\qquad \dot{q}=\frac{\partial H}{\partial p}\,. \end{equation} \tag{2.12} $$

We can express $p$ from the second equation $\dot{q}=Kp$ and introduce the Lagrangian

$$ \begin{equation*} L(\dot{q},q)=\bigl[(p,\dot{q})-H(p,q)\bigr]_{p \to \dot{q}}. \end{equation*} \notag $$

In accordance with the Legendre transform, $L=T-V$, where

$$ \begin{equation*} T=\frac{1}{2}\bigl(K^{-1}(q)\dot{q},\dot{q}\bigr)\quad\text{and}\quad \frac{\partial H}{\partial q}=-\frac{\partial L}{\partial q}\,. \end{equation*} \notag $$

Now the first equation in (2.12) takes the form of Lagrange’s equation with constraints (2.11):

$$ \begin{equation} \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}- \frac{\partial L}{\partial q}=\sum \lambda_j c_j,\qquad (c_1,\dot{q})=\dots=(c_k,\dot{q})=0. \end{equation} \tag{2.13} $$

In mechanics such equations describe the dynamics of non-holonomic systems. It is well known that (as the matrix $K$ is positive definite) the multipliers $\lambda_1,\dots,\lambda_k$ are well defined as functions of the state, that is, of $q$ and $\dot{q}$ (for instance, see [25]).

Remark. Setting $b_1=\dots=b_k=0$ in (2.9) we obtain the relations

$$ \begin{equation} \biggl(c_j,\frac{\partial H}{\partial p}\biggr)=0,\qquad 1 \leqslant j \leqslant k, \end{equation} \tag{2.14} $$

which can fail in general. However, in this case $\partial H/\partial p=\dot{q}$ (by (2.8)), so that (2.14) is transformed into the constraints (2.7), and there is no contradiction here.

Thus we obtain a non-trivial example of a Hamiltonian system with constraints (2.7) and singular matrix (2.10) in which the multipliers $\lambda_1,\dots,\lambda_k$ are nevertheless well defined as functions of the canonical variables $q$ and $p$ ($=K^{-1}\dot{q}$). By the way, in the case of non-integrable constraints (2.11) the Hamiltonian equations with constraints (2.12) can no longer be reduced to Hamiltonian equations, not even for even $k$. They have different dynamical properties. In particular, as general equations of non-holonomic dynamics, equations (2.12) (or (2.13)) admit no integral invariants with smooth densities [26]. However, if the constraints (2.11) are integrable, then the Hamiltonian equations with constraints (2.12) are themselves Hamiltonian equations with $n-k$ degrees of freedom.

3. Dirac problem

3.1.

Now consider the special case when the constraints (2.7) are integrable. More precisely, fix a $(2n-k)$-dimensional smooth submanifold

$$ \begin{equation*} N=\{(p,q) \in M^{2n}\colon F_1(p,q)=\dots=F_k(p,q)=0\}, \end{equation*} \notag $$

where the functions $F_1,\dots,F_k\colon M^{2n} \to\mathbb{R}$ are independent at the points in $N$. By (2.7) we have

$$ \begin{equation*} b_j=\frac{\partial F_j}{\partial p} \quad\text{and}\quad c_j=\frac{\partial F_j}{\partial q}\,;\qquad 1 \leqslant j \leqslant k. \end{equation*} \notag $$

Then the Hamiltonian equations (2.8) with ‘finite’ constraints take the following form:

$$ \begin{equation} \dot{p}=-\frac{\partial H}{\partial q}- \sum_{j=1}^k\lambda_j\frac{\partial F_j}{\partial q}\,,\qquad \dot{q}=\frac{\partial H}{\partial p}+ \sum_{j=1}^k\lambda_j\frac{\partial F_j}{\partial p}\,. \end{equation} \tag{3.1} $$

We must add to them the constraints

$$ \begin{equation} F_1(p,q)=0,\quad\dots,\quad F_k(p,q)=0. \end{equation} \tag{3.2} $$

The system of algebraic equations for the multipliers $\lambda_1,\dots,\lambda_k$ assumes the following form:

$$ \begin{equation} \{H,F_s\}+\sum\lambda_j\{F_j,F_s\}=0,\qquad 1 \leqslant s \leqslant k. \end{equation} \tag{3.3} $$

Here $\{\,\cdot\,{,}\,\cdot\,\}$ is the standard Poisson bracket. Therefore, a sufficient condition for the unique solvability of this system (and thus a condition for a well-defined projection of the original Hamiltonian vector field onto the $(2n-k)$-planes tangent to $N$) is that the skew-symmetric $k\times k$ matrix of Poisson brackets

$$ \begin{equation} \|\{F_j,F_s\}\|,\qquad 1 \leqslant s,j \leqslant k, \end{equation} \tag{3.4} $$

is non-singular.
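A minimal illustration of (3.1)–(3.3), with constraints chosen here as an assumption rather than taken from the text: let $n \geqslant 2$, $k=2$, $F_1=q_1$, and $F_2=p_1$. The bracket $\{F_1,F_2\}$ is equal to $\pm 1$ (the sign depends on the convention adopted for the Poisson bracket), so the matrix (3.4) is non-singular. The only equations in (3.1) that involve the multipliers are

$$ \begin{equation*} \dot{p}_1=-\frac{\partial H}{\partial q_1}-\lambda_1,\qquad \dot{q}_1=\frac{\partial H}{\partial p_1}+\lambda_2, \end{equation*} \notag $$

and the consistency conditions $\dot{F}_1=\dot{q}_1=0$ and $\dot{F}_2=\dot{p}_1=0$ give

$$ \begin{equation*} \lambda_1=-\frac{\partial H}{\partial q_1}\,,\qquad \lambda_2=-\frac{\partial H}{\partial p_1}\,. \end{equation*} \notag $$

The remaining variables $p_i$, $q_i$, $i \geqslant 2$, satisfy the canonical equations with the Hamiltonian $H$ restricted to $p_1=q_1=0$, that is, a Hamiltonian system with $n-1$ degrees of freedom.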

Remark. Equations (3.1) and (3.3) are quite analogous to equations (1.7) and (1.8) for gradient dynamical systems with integrable constraints in Euclidean space.

Equations (3.1)–(3.2) (as well as the consistency conditions (3.3)) were originally derived by Dirac for quantization of systems with degenerate Lagrangians. An application of the Legendre transform to a degenerate Lagrangian produces several relations of the form (3.2), which are now usually called primary constraints. Among the infinite-dimensional systems with degenerate Lagrangian are, for example, Maxwell’s equations of the evolution of an electromagnetic field.¹

In contrast to Hamiltonian systems with differential constraints considered in § 2, the projection of Hamiltonian equations with finite constraints onto the corresponding tangent subspaces produces extremals of a certain variational problem. In his derivation of (3.1) Dirac relied precisely on a variational problem (namely, Hamilton’s principle in Lagrangian mechanics), rather than on the simpler idea of symplectic projection. We reproduce Dirac’s idea in a more general and concise form; it

(a) clears up the variational nature of equations (3.1), and
(b) is used in § 4 in a more general situation.

3.2.

For brevity we denote a point $(p,q)$ in $M^{2n}$ by $x$, the restriction of $H$ to $N$ by $\Phi$, and the restriction of the 2-form to $N$ by $\omega^2$. Of course, $\omega^2$ is a closed form, but it can be degenerate (for example, if $\dim N$ is odd, that is, $k$ is odd).

We call a smooth path $x \colon \Delta \to M$, $x(t) \in N$ for all $t \in \Delta$, a motion of the Hamiltonian system with constraints $(M,\Omega^2,H,N)$ if

$$ \begin{equation*} \omega^2(\dot{x}(t),\,\cdot\,)=d\Phi(x(t)) \end{equation*} \notag $$

for all values $t \in \Delta$.

If $N$ coincides with $M$ (there are no constraints), then Dirac’s system coincides with a usual Hamiltonian system with $n$ degrees of freedom. More generally, if the 2-form $\omega^2$ is non-degenerate, then $(N,\omega^2)$ is a symplectic submanifold of $(M^{2n},\Omega^2)$ and motions of the Dirac system $(M,\Omega^2,H,N)$ coincide with solutions of the Hamiltonian system with Hamiltonian $\Phi$ on the symplectic manifold $(N,\omega^2)$. The condition that $\omega^2$ is non-degenerate is the same as the non-singularity of the skew-symmetric matrix (3.4). The Poisson bracket $\{\,\cdot\,{,}\,\cdot\,\}'$ on $N \subset M$ is defined by

$$ \begin{equation*} \{\Phi_1,\Phi_2\}'=\{\Phi_1,\Phi_2\}+\sum_{i,j=1}^k\{F_i,\Phi_1\}c_{ij} \{F_j,\Phi_2\}, \end{equation*} \notag $$

where $\|c_{ij}\|$ is the inverse matrix of (3.4). It turns out that the restriction of $\{\Phi_1,\Phi_2\}'$ to $N$ depends only on the restrictions of $\Phi_1$ and $\Phi_2$ to $N$ (see [1] for details).

If some equations in (3.3) involve no multipliers $\lambda_1,\dots,\lambda_k$, then we obtain new constraints $\Psi_j=\{H,F_j\}=0$, which are called secondary constraints. In the general case secondary constraints are algebraic conditions for the solvability of the system of equations (3.3) with respect to the multipliers $\lambda_j$. The functions $\Psi_j$ must be added to the $F_s$. If these functions make up an independent system, then we can examine conditions for consistency (of the primary and secondary constraints) again. As a result, we either arrive at a contradiction (and then the Dirac problem has no solutions) or system (3.3) turns out to be consistent for some choice of the multipliers $\lambda$. In the latter case $\lambda$ is not necessarily well defined, and then the initial conditions do not specify a unique solution of the Dirac problem.

3.3.

We have the following result.

Theorem 3. A smooth path $x\colon [t_1,t_2] \to N$ is a motion of the system with constraints $(M,\Omega^2,H,N)$ if and only if the action functional

$$ \begin{equation*} \int_{t_1}^{t_2}[(p,\dot{q})-H]\,dt \end{equation*} \notag $$

on the space of smooth paths with fixed endpoints in $N$ takes a stationary value at this path.

In fact, to derive equations of extremals we must introduce the new Lagrangian

$$ \begin{equation*} \mathcal{L}=(p,\dot{q})-H(p,q)+\sum\lambda_j F_j(p,q) \end{equation*} \notag $$

and regard $\lambda_1,\dots,\lambda_k$ as additional generalized coordinates. Writing down Lagrange’s equations with Lagrangian $\mathcal{L}$ we obtain equations (3.1) and relations (3.2).
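For the reader's convenience we write these Lagrange equations out; this is a short sketch of the computation, with $(p,q,\lambda)$ treated as the generalized coordinates. Since $\mathcal{L}$ does not contain $\dot{p}$ and $\dot{\lambda}$, we obtain

$$ \begin{equation*} \begin{aligned} \biggl(\frac{\partial\mathcal{L}}{\partial\dot{q}}\biggr)^{\!\boldsymbol{\cdot}}= \frac{\partial\mathcal{L}}{\partial q}\colon&\quad \dot{p}=-\frac{\partial H}{\partial q}+\sum\lambda_j\frac{\partial F_j}{\partial q}\,, \\ 0=\frac{\partial\mathcal{L}}{\partial p}\colon&\quad \dot{q}=\frac{\partial H}{\partial p}-\sum\lambda_j\frac{\partial F_j}{\partial p}\,, \\ 0=\frac{\partial\mathcal{L}}{\partial \lambda_j}\colon&\quad F_j(p,q)=0. \end{aligned} \end{equation*} \notag $$

The first two groups of equations coincide with (3.1) after the replacement of $\lambda_j$ by $-\lambda_j$ (the sign of the multipliers is a matter of convention), and the third gives (3.2).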

Theorem 3 reduces the Dirac problem to a variational Lagrange problem with degenerate Lagrangian $(p,\dot{q})-H$ and the integrable constraints given by (3.2). Thus, the equations in the Dirac problem are Hamiltonian equations projected onto the tangent planes to the constraint submanifold $N \subset M$. However, the role of Riemannian geometry is played here by the symplectic geometry of the phase space of the Hamiltonian system.

4. Hamiltonian systems with constraints. Variational approach

4.1.

As in § 2, we look at a Hamiltonian system with differential constraints

$$ \begin{equation} \bigl(a_1(x),dx\bigr)=\dots=\bigl(a_k(x),dx\bigr)=0. \end{equation} \tag{4.1} $$

However, in contrast to § 2, by motions of such systems we mean extremals of an appropriate variational problem.

Again, let $(M^{2n},\Omega^2)$ be a symplectic manifold and $H \colon M \to \mathbb{R}$ be a Hamiltonian. The symplectic structure $\Omega^2$ is a non-degenerate closed 2-form. Locally, $\Omega^2=d\omega^1$, where $\omega^1=(a(x),dx)$ is a smooth 1-form. In the phase space consider the action functional (in the sense of Poincaré and Helmholtz) of the form

$$ \begin{equation} \int_{t_1}^{t_2}\bigl[\bigl(a(x),\dot{x}\bigr)-H(x)\bigr]\,dt \end{equation} \tag{4.2} $$

on the space of curves $x\colon [t_1,t_2] \to M$ with fixed endpoints that satisfy relations (4.1).

We look for extremals in this variational problem. Dirac considered an important but special case when the differential constraints (4.1) are integrable. The above problem is a Lagrange variational problem with non-integrable constraints (4.1) and degenerate Lagrangian

$$ \begin{equation*} L=\bigl(a(x),\dot{x}\bigr)-H(x). \end{equation*} \notag $$

This function is linear in the velocity $\dot{x}$ and contains no terms quadratic in $\dot{x}$ (which are quite common in classical mechanics).

In accordance with the Lagrange multiplier method (for instance, see [6] and [28]), we must introduce the new Lagrangian

$$ \begin{equation*} \mathcal{L}=L-\sum_{s=1}^k \lambda_s(a_s,\dot{x}) \end{equation*} \notag $$

and regard $\lambda_1,\dots,\lambda_k$ as additional coordinates. We write Lagrange’s equations explicitly:

$$ \begin{equation*} \biggl(\frac{\partial\mathcal{L}}{\partial\dot{x}}\biggr)^{\!\boldsymbol{\cdot}}= \frac{\partial\mathcal{L}}{\partial x}\,,\qquad \biggl(\frac{\partial\mathcal{L}}{\partial\dot{\lambda}}\biggr)^{\!\boldsymbol{\cdot}}= \frac{\partial\mathcal{L}}{\partial \lambda}\,. \end{equation*} \notag $$

The second group of equations gives us the constraints (4.1), while the first assumes the following form:

$$ \begin{equation*} \Bigl[a(x)-\sum\lambda_s a_s(x)\Bigr]^{\boldsymbol{\cdot}}= \biggl(\frac{\partial a}{\partial x}\biggr)^{\!*}\dot{x}- \frac{\partial H}{\partial x}-\sum \lambda_s \biggl(\frac{\partial a_s}{\partial x}\biggr)^{\!*}\dot{x}. \end{equation*} \notag $$

We can transform this equation:

$$ \begin{equation} \biggl[\frac{\partial a}{\partial x}- \biggl(\frac{\partial a}{\partial x}\biggr)^{\!*}\,\biggr]\dot{x}- \sum \lambda_s\biggl[\frac{\partial a_s}{\partial x}- \biggl(\frac{\partial a_s}{\partial x}\biggr)^{\!*}\,\biggr]\dot{x}- \sum \dot{\lambda}_s a_s=-\frac{\partial H}{\partial x}\,. \end{equation} \tag{4.3} $$

These $2n$ equations must be considered in combination with the $k$ constraints (4.1). Equations (4.3) and (4.1) are used to find the $2n+k$ variables $x$ and $\lambda$ as functions of time.

Before we move further, we make a few observations.

A. We have introduced above a 1-form $\omega^1$ such that $d\omega^1=\Omega^2$. It is defined up to a closed 1-form (which is locally the differential of a smooth function). However, this closed form does not affect extremals of the functional (4.2) and does not participate in (4.3).

B. In Hamiltonian mechanics the skew-symmetric $ 2n\times 2n $ matrix

$$ \begin{equation*} J(x)=\frac{\partial a}{\partial x}- \biggl(\frac{\partial a}{\partial x}\biggr)^{\!*} \end{equation*} \notag $$

is non-singular (because the symplectic structure $\Omega^2 $ is non-degenerate).

C. If the constraints (4.1) are integrable (as in the Dirac problem), then the second group of terms is absent from the left-hand side of (4.3).

D. If $x\colon \Delta \to M$, $\lambda\colon \Delta \to \mathbb{R}^k$ is an arbitrary solution of system (4.3), (4.1), then

$$ \begin{equation*} H\bigl(x(t)\bigr)=\text{const}. \end{equation*} \notag $$

This is the energy integral in our generalization of Dirac’s generalized Hamiltonian dynamics.
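Observation D can be verified in one line; the following check is a sketch that uses only the skew-symmetry of the bracketed matrices in (4.3). Take the inner product of both sides of (4.3) with $\dot{x}$. The first two groups of terms on the left vanish, because $(B\dot{x},\dot{x})=0$ for any skew-symmetric matrix $B$ (the symbol $B$ is just a placeholder here), and the third group vanishes by the constraints (4.1), since $(a_s,\dot{x})=0$. Hence

$$ \begin{equation*} \frac{d}{dt}\,H\bigl(x(t)\bigr)= \biggl(\frac{\partial H}{\partial x}\,,\dot{x}\biggr)=0. \end{equation*} \notag $$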

4.2.

Our immediate aim is to indicate conditions for the solvability of the system of equations (4.3), (4.1):

$$ \begin{equation} J\dot{x}-\sum\lambda_s \Lambda_s\dot{x}-\sum\dot{\lambda}_s a_s= -\frac{\partial H}{\partial x}\,,\qquad (a_1,\dot{x})=\dots=(a_k,\dot{x})=0. \end{equation} \tag{4.4} $$

Here

$$ \begin{equation*} \Lambda_s=\frac{\partial a_s}{\partial x}- \biggl(\frac{\partial a_s}{\partial x}\biggr)^{\!*},\qquad 1 \leqslant s \leqslant k, \end{equation*} \notag $$

is a set of skew-symmetric matrices.

In the Dirac problem the situation is simpler: $\Lambda_1=\dots=\Lambda_k=0$, so the first equation in (4.4) does not contain $\lambda_1,\dots,\lambda_k$. Then we can set $\mu_s=\dot{\lambda}_s$ and regard $\mu_1,\dots,\mu_k$ as independent additional parameters.

Thus, a function $x \colon [t_1,t_2] \to M$ is a solution of the generalized Dirac problem if and only if there exists a function $\lambda \colon [t_1,t_2] \to \mathbb{R}^k$ such that, in combination, these two functions satisfy equations (4.4). These equations can be presented in an invariant form. To do this we set $\omega_s^1(\,\cdot\,)=(a_s,\,\cdot\,)$; these are differential 1-forms on $M^{2n}$. If $w$ is the required vector field on $M^{2n}$ ($w=\dot{x}$), then

$$ \begin{equation*} i_w\Bigl[\Omega^2-\sum\lambda_s\,d\omega_s^1\Bigr]- \sum\dot{\lambda}_s\omega_s^1=-dH \end{equation*} \notag $$

and

$$ \begin{equation*} \omega_1^1(w)=\dots=\omega_k^1(w)=0. \end{equation*} \notag $$

Here (as usual) $i_w$ denotes the interior product of the vector field $w$ and a differential form. For $k=0$ we obtain the original Hamiltonian system

$$ \begin{equation*} i_v \Omega^2\,(=\Omega^2(v,\,\cdot\,))=-dH. \end{equation*} \notag $$

Theorem 4. If for $x=x^0$ and $\lambda=\lambda^0$ the square matrix of order $2n+k$

$$ \begin{equation} \begin{bmatrix} J-\displaystyle\sum\lambda_s\Lambda_s & a_1,\dots,a_k \\ a_1^*,\dots,a_k^* & 0 \end{bmatrix} \end{equation} \tag{4.5} $$

has a non-zero determinant, then the system of equations (4.4) has a unique solution $t \mapsto x(t)$, $t \mapsto \lambda(t)$ such that $x(0)=x^0$ and $\lambda(0)=\lambda^0$.

As usual, this solution certainly exists in a neighbourhood of $t=0$. We must stress that when the constraints are non-integrable, the motions $t \mapsto x(t)$ of the system for fixed $x(0)$ and different $\lambda(0)$ are distinct. If the constraints are integrable, then this is not the case.

Proof of Theorem 4. We represent (4.4) in the following equivalent form:

$$ \begin{equation} \begin{bmatrix} J-\displaystyle\sum\lambda_s\Lambda_s & a_1,\dots,a_k \\ a_1^*,\dots,a_k^* & 0 \end{bmatrix} \begin{bmatrix} \dot{x} \\ -\dot{\lambda}\end{bmatrix}= \begin{bmatrix} -\dfrac{\partial H}{\partial x} \\ 0\end{bmatrix}. \end{equation} \tag{4.6} $$

If the determinant of (4.5) is distinct from zero at the point $(x^0,\lambda^0) \in M^{2n}\times \mathbb{R}^k$, then (4.6) defines a dynamical system in a neighbourhood of this point. In particular, the values $x(0)=x^0$ and $\lambda(0)=\lambda^0$ determine uniquely the evolution of this system in the future, as required. $\Box$

Corollary. If the matrix (2.5) is non-singular for $x=x^0$, then the system (4.4) has a unique solution $x(t)$, $\lambda(t)$ satisfying the initial conditions $x(0)=x^0$ and $\lambda(0)=0$.

In fact, if the $k \times k$ matrix (2.5) is not singular, then for $\lambda_1=\dots=\lambda_k=0$ the determinant of (4.5) is distinct from zero.
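The determinant in question can be written out explicitly. The formula below is the standard Schur complement identity for block matrices, applied here as an illustration; $\mathbf{a}$ denotes the $2n\times k$ matrix with columns $a_1,\dots,a_k$ (a notation introduced only for this remark):

$$ \begin{equation*} \det\begin{bmatrix} J & \mathbf{a} \\ \mathbf{a}^* & 0 \end{bmatrix}= \det J\cdot\det\bigl(-\mathbf{a}^*J^{-1}\mathbf{a}\bigr)= (-1)^k\det J\cdot\det\|(J^{-1}a_j,a_s)\|. \end{equation*} \notag $$

Since $\det J \ne 0$, the left-hand side is distinct from zero if and only if the matrix (2.5) is non-singular.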

Now we briefly discuss conditions for the solvability of (4.4) in the case when the matrix (4.5) is singular. The condition for the existence of velocities $\dot{x}$ and $\dot{\lambda}$ satisfying (4.6) reduces to the equality of the ranks of two matrices: the matrix (4.5) and the matrix

$$ \begin{equation} \begin{bmatrix} J-\displaystyle\sum\lambda_s\Lambda_s & a_1,\dots,a_k & -\dfrac{\partial H}{\partial x} \\ a_1^*,\dots,a_k^* & 0 & 0\end{bmatrix} \end{equation} \tag{4.7} $$

of size $(2n+k)\times(2n+k+1)$. Typically, the matrix (4.5) has a constant rank in some domain in the extended phase space $M^{2n} \times \mathbb{R}^k$. Let this rank be $r<2n+k$. Then all minors of the matrix (4.7) of order $r+1$ or higher must vanish. As a result, we obtain several independent relations

$$ \begin{equation} \Phi_1(x,\lambda)=\dots=\Phi_p(x,\lambda)=0 \end{equation} \tag{4.8} $$

as a condition for the equality of the ranks of the matrices (4.5) and (4.7). By analogy with the Dirac dynamics they can be called secondary constraints. We must bear in mind that in Dirac’s theory secondary constraints do not depend on the multipliers $\lambda$.

Relations (4.8) are integrable constraints for the Lagrangian system with $2n+k$ degrees of freedom and degenerate Lagrangian $\mathcal{L}(\dot{x},x,\lambda)$. The new system of equations has the form of Lagrange’s equations with additional multipliers $\nu_1,\dots,\nu_p$:

$$ \begin{equation*} \biggl(\frac{\partial\mathcal{L}}{\partial\dot{x}}\biggr)^{\!\boldsymbol \cdot}- \frac{\partial\mathcal{L}}{\partial x}= \sum\nu_q\frac{\partial\Phi_q}{\partial x}\,,\qquad \biggl(\frac{\partial\mathcal{L}}{\partial\dot{\lambda}}\biggr)^{\!\boldsymbol \cdot}- \frac{\partial\mathcal{L}}{\partial \lambda}= \sum\nu_q\frac{\partial\Phi_q}{\partial \lambda}\,. \end{equation*} \notag $$

Of course, equations (4.8) must be added here.

As a result, we arrive at equations that also have the form (4.4). They must also be examined for solvability with respect to $\dot{x}$, $\dot{\lambda}$, and $\nu$. Conditions for consistency can produce new constraints. This procedure is repeated as in Dirac’s dynamics.

4.3.

In conclusion, we present an example which is of interest from the standpoint of mechanics and in which the matrix (4.5) is singular. It is similar to the example in § 2, where a Hamiltonian field was projected onto the planes of a non-integrable distribution. However, now we start with a variational problem which is a generalization of the one considered by Dirac.

So assume that $M^{2n}=T^*Q^n$ is endowed with the standard symplectic structure (as in § 2), let

$$ \begin{equation*} H=\frac{1}{2}\bigl(K(q)p,p\bigr)+V(q) \end{equation*} \notag $$

be a ‘natural’ Hamiltonian, and let the constraints have the same form as in § 2:

$$ \begin{equation} \bigl(c_1(q),dq\bigr)=\dots=\bigl(c_k(q),dq\bigr)=0,\qquad k<n. \end{equation} \tag{4.9} $$

Then equations (4.4) assume the following form:

$$ \begin{equation} \Bigl(p-\sum\lambda_j c_j\Bigr)^{\boldsymbol \cdot}=-\frac{\partial H}{\partial q}- \sum\lambda_j\biggl(\frac{\partial c_j}{\partial q}\biggr)^{\!*}\dot{q},\qquad \dot{q}=\frac{\partial H}{\partial p}\,. \end{equation} \tag{4.10} $$

These equations must be considered in combination with the equations of constraints (4.9).

As in § 2, it is useful to go over to Lagrange’s equations by means of the Legendre transform. Then

$$ \begin{equation*} L=\frac{1}{2}\bigl(K^{-1}(q)\dot{q},\dot{q}\bigr)-V(q),\qquad \frac{\partial H}{\partial q}=-\frac{\partial L}{\partial q}\,, \end{equation*} \notag $$

and the first equation in (4.10) looks as follows:

$$ \begin{equation*} \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}-\frac{\partial L}{\partial q}= \sum\lambda_j\biggl[\frac{\partial c_j}{\partial q}- \biggl(\frac{\partial c_j}{\partial q}\biggr)^{\!*}\,\biggr]\dot{q}+ \sum\dot{\lambda}_j c_j. \end{equation*} \notag $$

In combination with the constraints, these are the equations of motion of a mechanical system in so-called vaconomic mechanics [29]. They are different from the non-holonomic equations (2.13), although both models start from the same Lagrangian and the same constraints. The difference is in the principles underlying the definitions of motion. A comparison and a discussion of various models of motion of mechanical systems with constraints can be found, for instance, in [14].

In the absence of a potential (when $V \equiv 0$) the equations of vaconomic mechanics are the equations of geodesics in so-called sub-Riemannian geometry [30]–[33]. On the other hand, in sub-Riemannian geometry authors usually assume that the linear constraints (4.9) are completely non-integrable: we can attain any point on $Q^n$ from any other point by moving in accordance with the constraints (4.9). The condition for complete non-integrability is described by the celebrated Chow–Rashevskii theorem.
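
A standard illustration (not taken from the text above; the coordinates $(x,y,z)$ and the vector fields $X_1$ and $X_2$ are introduced only for this remark) is the constraint $y\,dx-dz=0$ in $\mathbb{R}^3$. The admissible velocities are spanned by the fields

$$ \begin{equation*} X_1=\frac{\partial}{\partial x}+y\frac{\partial}{\partial z}\,,\qquad X_2=\frac{\partial}{\partial y}\,,\qquad [X_1,X_2]=-\frac{\partial}{\partial z}\,. \end{equation*} \notag $$

The fields $X_1$, $X_2$, and their commutator span the tangent space at every point, which is the bracket-generating condition in the Chow–Rashevskii theorem. Hence this constraint is completely non-integrable.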

That non-holonomic and vaconomic equations of geodesics in Riemannian geometry with non-integrable distributions are different was pointed out already by Hertz [34]. On the other hand, it is written in some later texts discussing variational problems with constraints that ‘actual’ problems in mechanics (which are in fact described by non-holonomic equations) are solved in this way.

5. Some examples

In this section we apply Dirac’s approach to some problems in analytic mechanics and geometric optics.

5.1.

Consider the dynamics of a mechanical system with Lagrangian

$$ \begin{equation*} L_2+L_1+L_0, \end{equation*} \notag $$

where $L_j$ is a homogeneous form of degree $j$ in $\dot{q}$ which depends smoothly on the $q$-coordinates. Lagrange’s equations with this Lagrangian have the energy integral

$$ \begin{equation*} L_2-L_0=\text{const}. \end{equation*} \notag $$

We set this constant equal to zero; this is always possible because $L_0(q)$ is defined up to an additive constant. By the least action principle, the trajectories of the motion $t \mapsto q(t)$ are geodesic curves of the Finsler metric

$$ \begin{equation} L=2\sqrt{L_0L_2}+L_1. \end{equation} \tag{5.1} $$

In other words, they are extremals of the action functional

$$ \begin{equation} \int_{t_1}^{t_2}\bigl(2\sqrt{L_0L_2}+L_1\bigr)\,dt \end{equation} \tag{5.2} $$

in the class of curves with fixed endpoints. The Lagrangian (5.1) is parametric: the integral (5.2) is independent of the parametrization of curves.

For more transparent calculations set

$$ \begin{equation*} L_2=\frac{1}{2}\bigl(A(q)\dot{q},\dot{q}\bigr),\quad L_1=\bigl(a(q),\dot{q}\bigr),\quad\text{and}\quad L_0=U(q). \end{equation*} \notag $$

In mechanics $U$ corresponds to the power function. The matrix $A$ is usually positive definite, but this is not necessary. We assume only that it is non-singular. In this case the functional (5.2) is defined not on all curves, but only on those lying in an appropriate ‘light cone’. This property is characteristic of the dynamics of relativistic particles.

The Lagrangian (5.1) is degenerate in velocities:

$$ \begin{equation*} \biggl|\frac{\partial^2 L}{\partial\dot{q}^2}\biggr|=0. \end{equation*} \notag $$

The Hamiltonian $H$ dual to $L$ in the sense of Legendre is trivial:

$$ \begin{equation*} H=[(p,\dot{q})-L]_{\dot{q} \to p}= \biggl[\biggl(\frac{\partial L}{\partial\dot{q}}\,,\dot{q}\biggr)- L\biggr]_{\dot{q} \to p}=0 \end{equation*} \notag $$

by Euler’s theorem on homogeneous functions. Dirac’s Hamiltonian formalism was designed precisely for systems of this type.
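
For the reader’s convenience, here is a short sketch of why homogeneity of degree one forces both properties (the scaling parameter $s>0$ is introduced only for this remark). Differentiating the identity $L(s\dot{q},q)=sL(\dot{q},q)$ with respect to $s$ at $s=1$, and then differentiating the resulting relation with respect to $\dot{q}$, we get

$$ \begin{equation*} \biggl(\frac{\partial L}{\partial\dot{q}}\,,\dot{q}\biggr)=L,\qquad \frac{\partial^2 L}{\partial\dot{q}^2}\,\dot{q}=0. \end{equation*} \notag $$

The first relation gives $H=0$, and the second shows that the vector $\dot{q}\ne 0$ lies in the kernel of the Hessian matrix, so that its determinant vanishes.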

Setting

$$ \begin{equation*} p=\frac{\partial L}{\partial\dot{q}}=\sqrt{2U}\, \frac{A\dot{q}}{\sqrt{(A\dot{q},\dot{q})}}+a, \end{equation*} \notag $$

we obtain a primary constraint in the sense of Dirac:

$$ \begin{equation} F(p,q)=\frac{1}{2}\bigl(A^{-1}(p-a),p-a\bigr)-U(q)=0. \end{equation} \tag{5.3} $$
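
This can be checked directly from the expression for $p$ (a one-line verification, assuming that $A$ is symmetric and $(A\dot{q},\dot{q})\ne 0$):

$$ \begin{equation*} \bigl(A^{-1}(p-a),p-a\bigr)= 2U\,\frac{(A^{-1}A\dot{q},A\dot{q})}{(A\dot{q},\dot{q})}=2U. \end{equation*} \notag $$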

As a result, we arrive at equations (3.1):

$$ \begin{equation} \dot{p}=-\lambda\frac{\partial F}{\partial q}\,,\qquad \dot{q}=\lambda\frac{\partial F}{\partial p}\,, \end{equation} \tag{5.4} $$

where $\lambda$ is a non-zero undetermined multiplier.

Thus, functions $t \mapsto q(t)$ satisfying equations (5.4), in combination with the ‘conjugate’ momenta $t \mapsto p(t)$, describe extremals of the Finsler metric (5.1). The function $F$ coincides with the Hamiltonian of the original Lagrangian system with Lagrangian $L_2+L_1+L_0$ (since $L_2-L_0=0$, $F$ also vanishes on solutions of (5.4)). Since $H \equiv 0$, system (3.3) is always consistent, and there is no need to introduce secondary constraints.

Thus we have deduced the least action principle by using Dirac’s generalized dynamics. Equations (5.4) are transformed into usual Hamiltonian equations with respect to the new time

$$ \begin{equation} d\tau=\lambda\,dt. \end{equation} \tag{5.5} $$
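
Explicitly (a brief sketch, in which the prime denotes differentiation with respect to $\tau$ and $\lambda \ne 0$ is assumed), relation (5.5) gives $d/dt=\lambda\,d/d\tau$, and dividing (5.4) by $\lambda$ we obtain

$$ \begin{equation*} p'=-\frac{\partial F}{\partial q}\,,\qquad q'=\frac{\partial F}{\partial p}\,, \end{equation*} \notag $$

that is, Hamilton’s equations with Hamiltonian $F$, considered on the level $F=0$.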

5.2.

Consider the particular case when

$$ \begin{equation} L=\sqrt{\dot{x}_1^2+\dot{x}_2^2+\dot{x}_3^2}\,n(x),\qquad x \in \mathbb{R}^3. \end{equation} \tag{5.6} $$

The integral

$$ \begin{equation} \int_{t_1}^{t_2}L\,dt \end{equation} \tag{5.7} $$

is the optical length of the path $x\colon [t_1,t_2] \to \mathbb{R}^3$ (the Fermat action), and $n(x)$ is the refraction coefficient of the isotropic optical medium. By Fermat’s principle, the stationary points of the functional (5.7) are light rays.

Here we have $H=0$ again, and

$$ \begin{equation*} F=\frac{1}{2}\Bigl(\,\sum p_j^2-n^2(x)\Bigr)=0. \end{equation*} \notag $$

Hence equations (5.4) take the form

$$ \begin{equation*} \dot{p}=\lambda\,\frac{\partial}{\partial x}\,\frac{n^2}{2}\,,\qquad \dot{x}=\lambda p. \end{equation*} \notag $$

Going over to the new time (5.5) we arrive at the second-order equation

$$ \begin{equation*} x''=\frac{\partial U}{\partial x}\,,\qquad U=\frac{n^2}{2}\,. \end{equation*} \notag $$

These are the equations of motion of a unit point mass in a potential field with power function $U$. Thus, rays of light in the optical medium $\mathbb{R}^3=\{x\}$ with refraction coefficient $n(x)$ coincide with trajectories of motion of a point mass in a potential field with power function $n^2/2$. This is the content of the optico-mechanical analogy established by J. Bernoulli as long ago as 1696. Usually, authors use other ways to derive Hamiltonian equations in geometric optics (see [5], Chap. I).
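
In more detail (a sketch of the omitted step, with the prime denoting differentiation with respect to $\tau$): in the new time the equations read $x'=p$ and $p'=\partial U/\partial x$, so that $x''=p'=\partial U/\partial x$. The constraint $F=0$ takes the form

$$ \begin{equation*} \frac{1}{2}(x',x')-U(x)=0, \end{equation*} \notag $$

so that light rays correspond to the motions of the point mass with zero total energy.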

6. Light rays in strongly anisotropic media

6.1.

In anisotropic optical media the refraction coefficient has the form

$$ \begin{equation*} f\biggl(x,\frac{\dot{x}}{|\dot{x}|}\biggr),\qquad x \in \mathbb{R}^3. \end{equation*} \notag $$

For an isotropic medium this function is independent of the direction of the motion of particles of light; it is usually denoted by $n(x)$. Light rays coincide with the trajectories of solutions of Lagrange’s equations with Lagrangian

$$ \begin{equation*} L=|\dot{x}|f\biggl(x,\frac{\dot{x}}{|\dot{x}|}\biggr). \end{equation*} \notag $$

The integral

$$ \begin{equation*} I=\int_{t_1}^{t_2}L(\dot{x},x)\,dt \end{equation*} \notag $$

is the optical length of the path $x\colon[t_1,t_2]\to \mathbb{R}^3$ (the Fermat action). Rays of light are extremals of this integral.

Consider an optical medium with Lagrangian

$$ \begin{equation} L_N=\bigl[(\dot{x},\dot{x})n^2(x)+N\bigl(a(x),\dot{x}\bigr)^2\bigr]^{1/2}. \end{equation} \tag{6.1} $$

Here $a(x)\ne 0$ is a smooth vector field in 3-dimensional Euclidean space $\mathbb{R}^3= \{x\}$ and $N$ is a parameter (which will tend to $+\infty$). For $N=0$ we obtain the Lagrangian (5.6) describing the propagation of light in an isotropic medium. In the limit as $N \to \infty$, the motion of particles of light satisfies the constraint

$$ \begin{equation} \bigl(a(x),\dot{x}\bigr)=0, \end{equation} \tag{6.2} $$

which is generally non-integrable.

How can we verify this? The refraction coefficient of an optical medium at a point $x$ in a direction $v/|v|$ is the reciprocal of the speed $|v|$ of a particle of light at $x$. So $L(v,x)=1$ on ‘actual’ motions. This is the equation of a surface in $\mathbb{R}^3=\{v\}$ called the indicatrix at $x$. In an isotropic medium the indicatrix is the sphere of radius $1/n(x)$. For the Lagrangian (6.1) the indicatrix is an ellipsoid squeezed in the direction of the vector $a(x)$. So, in the limit as $N \to \infty$, the motion of particles of light satisfies the constraint (6.2).
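
The shape of the indicatrix can be made explicit as follows (a sketch; the decomposition $v=v_\parallel+v_\perp$ into components parallel and orthogonal to $a(x)$ is notation introduced only here). Since $(a,v)^2=(a,a)|v_\parallel|^2$, the equation $L_N(v,x)=1$ reads

$$ \begin{equation*} n^2|v_\perp|^2+\bigl(n^2+N(a,a)\bigr)|v_\parallel|^2=1. \end{equation*} \notag $$

This is an ellipsoid of revolution with semi-axes $1/n$ in the directions orthogonal to $a$ and $\bigl(n^2+N(a,a)\bigr)^{-1/2}$ along $a$; the latter tends to zero as $N \to \infty$.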

6.2.

We deduce the ‘limiting’ equations of light rays in two ways. The first is formal. We start from Fermat’s principle $\delta I=0$ and take the constraint (6.2) into account. To do this we use the method of Lagrange multipliers and then Dirac’s generalized Hamiltonian dynamics.

The second approach is more substantive. We represent the equations of light rays in an optical medium with Lagrangian $L_N$ as Hamilton’s equations, and then let the parameter $N$ tend to infinity. In both approaches light rays are described by the same Hamiltonian equations.

Thus, first (following Lagrange) we introduce the new Lagrangian

$$ \begin{equation*} \mathcal{L}=L+\lambda(a,\dot{x})=|\dot{x}|n(x)+\lambda(a,\dot{x}) \end{equation*} \notag $$

and treat $\lambda$ as a new variable (along with $x_1$, $x_2$, and $x_3$). The Lagrangian $\mathcal{L}$ is degenerate in the velocities $\dot{x}$ and $\dot{\lambda}$.

Consider the canonical momentum

$$ \begin{equation} y=\frac{\partial \mathcal{L}}{\partial \dot{x}}=\frac{\dot{x}}{|\dot{x}|}n(x)+ \lambda a(x) \end{equation} \tag{6.3} $$

(the momentum corresponding to $\lambda$ vanishes) and the Hamiltonian

$$ \begin{equation*} H=(y,\dot{x})-\mathcal{L}=(y-\lambda a,\dot{x})-|\dot{x}|n= \biggl(\frac{\partial L}{\partial \dot{x}}\,,\dot{x}\biggr)-|\dot{x}|n=0. \end{equation*} \notag $$

According to Dirac, the momenta satisfy the primary constraint

$$ \begin{equation} 2F=\frac{(y-\lambda a,y-\lambda a)}{n^2}-1=0. \end{equation} \tag{6.4} $$

Hence the Hamiltonian equations take the form (3.1):

$$ \begin{equation} \dot{x}=\mu\frac{\partial F}{\partial y}\,,\qquad \dot{y}=-\mu\frac{\partial F}{\partial x}\,. \end{equation} \tag{6.5} $$

The coefficient $\mu$ is of no importance because the velocity of particles of light does not affect the shape of light rays we are interested in. Therefore, we can set $\mu=1$. Now, by (6.2) and (6.3)

$$ \begin{equation*} (y-\lambda a,a)=0. \end{equation*} \notag $$

Hence

$$ \begin{equation*} \lambda=\frac{(y,a)}{(a,a)}\,. \end{equation*} \notag $$

As a result, the equations of light rays reduce to the Hamiltonian equations (6.5) (where we must set $\mu=1$), and

$$ \begin{equation} F=\frac{1}{2}\biggl(y-\frac{(y,a)}{(a,a)}a,y-\frac{(y,a)}{(a,a)}a\biggr) \frac{1}{n^2}-\frac{1}{2}\,. \end{equation} \tag{6.6} $$

We must remember that these equations are to be considered on the zero level of the Hamiltonian (for $F=0$).

6.3.

In the framework of the second approach we must transform Lagrange’s equations with Lagrangian $L_N$ to Hamiltonian ones and then let $N$ tend to infinity. Since $L_N$ is homogeneous of degree one in velocities, we cannot apply the Legendre transform directly. Usually, authors take another route and introduce the new Lagrangian $L_N^*=L^2_N/2$, which is homogeneous of degree 2 in $\dot{x}$. It is obvious that the Hessian matrix

$$ \begin{equation*} \biggl\|\frac{\partial^2 L_N^*}{\partial \dot{x}^2}\biggr\| \end{equation*} \notag $$

is positive definite for $N \geqslant 0$. Because

$$ \begin{equation*} \delta\int_{t_1}^{t_2}L_N^*\,dt= \int_{t_1}^{t_2}L_N\delta L_N\,dt\quad\text{and}\quad L_N=1 \end{equation*} \notag $$

the function $x(\,\cdot\,)$ is an extremal of the variational problem

$$ \begin{equation*} \delta\int_{t_1}^{t_2}L_N^*\,dt=0. \end{equation*} \notag $$

Hence it solves Lagrange’s equation with Lagrangian $L_N^*$.

Next we introduce the canonical momentum

$$ \begin{equation*} y=\frac{\partial L_N^*}{\partial \dot{x}}= \dot{x}n^2(x)+N\bigl(a(x),\dot{x}\bigr)a. \end{equation*} \notag $$

Hence (for large $N$)

$$ \begin{equation*} n^2\dot{x}=y-\frac{(y,a)}{(a,a)}a+O\biggl(\frac{1}{N}\biggr). \end{equation*} \notag $$

As $L_N^*$ is homogeneous of degree 2 in $\dot{x}$, it follows that

$$ \begin{equation} H_N^*(y,x)=\biggl(\frac{\partial L_N^*}{\partial \dot{x}}\,,\dot{x}\biggr)- L_N^*=L_N^*(\dot{x},x). \end{equation} \tag{6.7} $$

Therefore,

$$ \begin{equation} H^*(y,x)=\lim_{N \to \infty}H_N^*=\frac{1}{2n^2} \biggl(y-\frac{(y,a)}{(a,a)}a,y-\frac{(y,a)}{(a,a)}a\biggr). \end{equation} \tag{6.8} $$
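
The limit can be traced explicitly (a sketch; here $y_\parallel=(y,a)a/(a,a)$ and $y_\perp=y-y_\parallel$ are notation introduced only for this remark, and similarly for $\dot{x}$). The formula for $y$ gives $y_\perp=n^2\dot{x}_\perp$ and $y_\parallel=\bigl(n^2+N(a,a)\bigr)\dot{x}_\parallel$, so that by (6.7)

$$ \begin{equation*} H_N^*=\frac{1}{2}\Bigl[n^2|\dot{x}_\perp|^2+ \bigl(n^2+N(a,a)\bigr)|\dot{x}_\parallel|^2\Bigr]= \frac{|y_\perp|^2}{2n^2}+\frac{|y_\parallel|^2}{2\bigl(n^2+N(a,a)\bigr)}\,. \end{equation*} \notag $$

The second term is $O(1/N)$ for fixed $y$ and $x$, and the first term is precisely the right-hand side of (6.8).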

Thus, for $N \to \infty$ the function $x(\,\cdot\,)$ describing a ray of light and the conjugate momentum $y(\,\cdot\,)$ satisfy the system of canonical Hamiltonian equations with Hamiltonian (6.8):

$$ \begin{equation} \dot{x}=\frac{\partial H^*}{\partial y}\,,\qquad \dot{y}=-\frac{\partial H^*}{\partial x}\,. \end{equation} \tag{6.9} $$

This theorem on the passage to the limit, established in 1982 [29], is an example of the realization of the general idea of the method of penalty functions. Since $L_N^*=1/2$ on light rays, equations (6.9) must (in accordance with (6.7)) be considered at the level $H^*=1/2$.

Comparing the Hamiltonians (6.6) and (6.8) we conclude that the above two approaches lead indeed to the same result.

6.4.

Set

$$ \begin{equation} K=\frac{1}{2}\biggl(y-\frac{(y,a)}{(a,a)}a,y-\frac{(y,a)}{(a,a)}a\biggr). \end{equation} \tag{6.10} $$

Then $H^*=K/n^2$.

Now,

$$ \begin{equation} \begin{aligned} \, \dot{y}&=-\frac{\partial K/n^2}{\partial x}= -\frac{1}{n^2}\,\frac{\partial K}{\partial x}+ \frac{K}{n^4}\,\frac{\partial n^2}{\partial x}, \\ \dot{x}&=\frac{\partial K/n^2}{\partial y}= \biggl(y-\frac{(y,a)}{(a,a)}a\biggr)\frac{1}{n^2}\,. \end{aligned} \end{equation} \tag{6.11} $$

The second equation yields the constraint $(a,\dot{x})=0$.

Because we consider the system (6.9) at the level $H^*=1/2$, equations (6.11) can be transformed by use of the relation $K=n^2/2$:

$$ \begin{equation*} n^2\dot{y}=-\frac{\partial K}{\partial x}+ \frac{\partial n^2/2}{\partial x}\,,\qquad n^2\dot{x}=\frac{\partial K}{\partial y}\,. \end{equation*} \notag $$

Switching to the new time $d\tau=n^{-2}\,dt$, we arrive at the system of differential equations

$$ \begin{equation} x'=\frac{\partial \mathcal{H}}{\partial y}\,,\quad y'=-\frac{\partial \mathcal{H}}{\partial x}\,;\qquad \mathcal{H}=K-U, \end{equation} \tag{6.12} $$

where the ‘kinetic’ energy $K$ is defined by (6.10) and the power function $U$ is $n^2/2$.
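
The change of time can be spelled out as follows (a short check; the prime denotes differentiation with respect to $\tau$). Since $d/d\tau=n^2\,d/dt$, the preceding system reads

$$ \begin{equation*} x'=\frac{\partial K}{\partial y}\,,\qquad y'=-\frac{\partial K}{\partial x}+\frac{\partial U}{\partial x}= -\frac{\partial (K-U)}{\partial x}\,, \end{equation*} \notag $$

and $\partial U/\partial y=0$, which gives (6.12). The level $H^*=1/2$ corresponds to the level $\mathcal{H}=0$.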

Consider the field $\alpha=a/\sqrt{(a,a)}$ of vectors with unit length. Then $y-(y,\alpha)\alpha=y_\alpha$ is the projection of the momentum onto the plane orthogonal to the vector $\alpha$. In the new time the second equation in (6.11) takes the form

$$ \begin{equation*} x'=y_\alpha. \end{equation*} \notag $$

Hence by (6.10)

$$ \begin{equation*} K=\frac{1}{2}(x',x'). \end{equation*} \notag $$

This is the kinetic energy of a unit point mass moving in Euclidean space.

Theorem 5. In a strongly anisotropic optical medium light rays are trajectories of the motion of a vaconomic particle in a potential field with power function $n^2/2$.

The proof is based on the observation that the Hamiltonian equations (6.12) with kinetic energy (6.10) are in fact the vaconomic equations of motion, under the constraint $\bigl(\alpha(x),\dot{x}\bigr)=0$, of an unconstrained unit point mass in a potential field with power function $U=n^2/2$ (cf. [29]).

Theorem 5 extends Bernoulli’s optico-mechanical analogy to strongly anisotropic optical media. We stress that, after imposing the constraint, we obtain a vaconomic mechanical system, rather than a familiar non-holonomic system. The reason is that light rays are defined by Fermat’s variational principle, while classical non-holonomic equations admit no variational representation at all in the general case.

7. Final observations

7.1.

In fact, the main ideas of Dirac’s generalized Hamiltonian dynamics had already been known in analytic mechanics much earlier. They were long used in other terms (and with less generality) in Lagrangian mechanics with integrable (holonomic) constraints. Recall the main principles (going back to Lagrange) of accounting for holonomic constraints.

Let $L(\dot{q},q)$ be a Lagrangian and let $F(q)=0$ be the equation of an integrable constraint. Assume that $dF \ne 0$ if $F=0$. The dynamics of this system is described by Lagrange’s equation with a multiplier

$$ \begin{equation} \frac{d}{dt}\,\frac{\partial L}{\partial \dot{q}}- \frac{\partial L}{\partial q}=\lambda\frac{\partial F}{\partial q}\,,\qquad F(q)=0. \end{equation} \tag{7.1} $$

The term on the right $\lambda\,\partial F/\partial q$ is called the reaction of the constraint $\{F=0\}$ in mechanics. Can we find a coefficient $\lambda$ such that system (7.1) is consistent?

The standard way is as follows. Since $F\bigl(q(t)\bigr)=0$, it follows that

$$ \begin{equation} \Phi=\biggl(\frac{\partial F}{\partial q}\,,\dot{q}\biggr)=0. \end{equation} \tag{7.2} $$

Now, the total time derivative $\dot{\Phi}$ must also vanish:

$$ \begin{equation} \dot{\Phi}=\biggl(\frac{\partial F}{\partial q}\,,\ddot{q}\biggr)+ \biggl(\frac{\partial^2 F}{\partial q^2}\dot{q},\dot{q}\biggr)=0. \end{equation} \tag{7.3} $$

Equation (7.1) has the following explicit form:

$$ \begin{equation} A\ddot{q}+\Gamma(\dot{q},q)=\lambda\frac{\partial F}{\partial q}\,, \end{equation} \tag{7.4} $$

where $A$ is the matrix of second derivatives $\partial^2 L/\partial \dot{q}^2$, and the term $\Gamma$ is independent of the acceleration $\ddot{q}$. In mechanics $A$ is positive definite; in particular, it is non-singular. Then from (7.3) and (7.4) we obtain

$$ \begin{equation} \lambda\biggl(A^{-1}\frac{\partial F}{\partial q}\,, \frac{\partial F}{\partial q}\biggr)-\biggl(\frac{\partial F}{\partial q}\,, A^{-1}\Gamma\biggr)+\biggl(\frac{\partial^2 F}{\partial q^2}\, \dot{q},\dot{q}\biggr)=0. \end{equation} \tag{7.5} $$

We have assumed that $\partial F/\partial q \ne 0$ (in any case, at the points where $F(q)=0$). Hence from (7.5) we can uniquely find $\lambda$ as a function of the state ($\dot{q}$, $q$) of the system.
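
Solving (7.5) for the multiplier gives the explicit expression (the denominator is positive because $A^{-1}$ is positive definite and $\partial F/\partial q \ne 0$)

$$ \begin{equation*} \lambda=\frac{\bigl(\partial F/\partial q,A^{-1}\Gamma\bigr)- \bigl((\partial^2 F/\partial q^2)\dot{q},\dot{q}\bigr)} {\bigl(A^{-1}\,\partial F/\partial q,\partial F/\partial q\bigr)}\,. \end{equation*} \notag $$

As a simple illustration (not taken from the text above), consider a free particle on the unit sphere: $L=(\dot{q},\dot{q})/2$ and $F=\bigl((q,q)-1\bigr)/2$. Then $A=I$, $\Gamma=0$, $\partial F/\partial q=q$, and $\partial^2 F/\partial q^2=I$, so that $\lambda=-(\dot{q},\dot{q})$ and (7.1) becomes $\ddot{q}=-(\dot{q},\dot{q})q$.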

The first equation (7.1) is a second-order differential equation with respect to $q$, and the function (7.2) is a first integral there. We are only interested in solutions of that equation on which the functions $F$ and $\Phi$ vanish.

7.2.

What does all this look like from the standpoint of Dirac’s generalized Hamiltonian formalism? If the matrix $A$ is non-singular (at least locally), then we can make the Legendre transform and introduce the canonical momenta and a Hamiltonian:

$$ \begin{equation*} p=\frac{\partial L}{\partial \dot{q}}\,,\quad H(p,q)=\bigl[(p,\dot{q})-L\bigr]_{\dot{q} \to p}. \end{equation*} \notag $$

After that equation (7.1) assumes the Hamiltonian form

$$ \begin{equation} \dot{q}=\frac{\partial H}{\partial p}\,,\quad \dot{p}=-\frac{\partial H}{\partial q}- \lambda\frac{\partial F}{\partial q}\,;\qquad F(q)=0. \end{equation} \tag{7.6} $$

These are precisely Dirac’s equations (3.1) (as the constraint is independent of momenta). The function $F$ defines a primary constraint.

Now, to find $\lambda$ we must use the consistency condition (3.3):

$$ \begin{equation*} \dot{F}=\{F,H\}=0. \end{equation*} \notag $$

Thus we obtain here the secondary constraint

$$ \begin{equation*} \Phi=\{F,H\}=0. \end{equation*} \notag $$

This is the same as equation (7.2), but presented in canonical variables.

Next (in accordance with Dirac), we must treat the relations $F=0$ and $\Phi=0$ as new ‘primary’ constraints, and in place of (7.6) we must write the new equations

$$ \begin{equation*} \dot{q}=\frac{\partial H}{\partial p}+ \mu\frac{\partial \Phi}{\partial p} \quad\text{and}\quad \dot{p}=-\frac{\partial H}{\partial q}- \lambda\frac{\partial F}{\partial q}- \mu\frac{\partial \Phi}{\partial q}\,;\qquad F=\Phi=0. \end{equation*} \notag $$

Now the consistency conditions (3.3) look as follows:

$$ \begin{equation} \{F,H\}+\mu\{F,\Phi\}=0\quad\text{and}\quad \{\Phi,H\}+\lambda\{\Phi,F\}=0. \end{equation} \tag{7.7} $$

If $\{F,\Phi\} \ne 0$, then from these relations we find the coefficients $\lambda$ and $\mu$ in a unique way.

Since

$$ \begin{equation*} \Phi=\biggl(\frac{\partial F}{\partial q}\,, \frac{\partial H}{\partial p}\biggr), \end{equation*} \notag $$

it follows that

$$ \begin{equation*} \frac{\partial \Phi}{\partial p}=\frac{\partial^2 H}{\partial p^2}\, \frac{\partial F}{\partial q}\,. \end{equation*} \notag $$

Therefore,

$$ \begin{equation} \{F,\Phi\}=\biggl(\frac{\partial F}{\partial q}\,, \frac{\partial \Phi}{\partial p}\biggr)= \biggl(\frac{\partial^2 H}{\partial p^2}\,\frac{\partial F}{\partial q}\,, \frac{\partial F}{\partial q}\biggr). \end{equation} \tag{7.8} $$

It is well known from the theory of the Legendre transform that the matrices

$$ \begin{equation*} \frac{\partial^2 L}{\partial \dot{q}^2}\quad\text{and}\quad \frac{\partial^2 H}{\partial p^2} \end{equation*} \notag $$

are mutually inverse. If (as is the case in mechanics) $A=\partial^2 L/\partial \dot{q}^2$ is positive definite, then so is $\partial^2 H/\partial p^2$. Since $\partial F/\partial q \ne 0$, it then follows from (7.8) that $\{F,\Phi\} \ne 0$.

Since $\Phi=\{F,H\}=0$ (a secondary constraint), the first consistency condition in (7.7) yields $\mu=0$. It is easy to see that the second equation in (7.7) is just equation (7.5) in canonical variables.
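
Explicitly (a sketch that uses only the antisymmetry of the Poisson bracket), on the surface $F=\Phi=0$ relations (7.7) give

$$ \begin{equation*} \mu=-\frac{\{F,H\}}{\{F,\Phi\}}=0,\qquad \lambda=\frac{\{\Phi,H\}}{\{F,\Phi\}}\,. \end{equation*} \notag $$

For $\mu=0$ the modified equations coincide with (7.6), so that in the holonomic case Dirac’s procedure reproduces Lagrange’s method of multipliers.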

7.3.

In § 2 we considered the example of a ‘natural’ Lagrangian system with non-integrable constraint

$$ \begin{equation} \bigl(a(q),\dot{q}\bigr)=0. \end{equation} \tag{7.9} $$

Lagrange’s equations can be presented in Hamiltonian variables. If we treat (7.9) as a distribution in the phase space of the Hamiltonian system, then the symplectic projection onto the tangent hyperplane (7.9) yields classical non-holonomic equations.

On the other hand, if $H(p,q)$ is a Hamiltonian, then $\dot{q}=\partial H/\partial p$ and equation (7.9) takes the form

$$ \begin{equation*} F(p,q)=\biggl(a(q),\frac{\partial H}{\partial p}\biggr)=0. \end{equation*} \notag $$

This can be regarded as a primary constraint imposed on the system with Hamiltonian $H$. What equations in Lagrange’s representation does Dirac’s method produce? We could imagine that these would be the same non-holonomic equations, but this is not so.

Consider the simple example of the Hamiltonian system with Hamiltonian

$$ \begin{equation*} H=\frac{1}{2}(p,p)+V(q). \end{equation*} \notag $$

Let

$$ \begin{equation} F(p,q)=\bigl(a(q),p\bigr)=0 \end{equation} \tag{7.10} $$

be a primary constraint. As (by the Hamiltonian equations) $\dot{q}= p$, (7.10) coincides with (7.9).

Dirac’s equations (3.1) assume the form

$$ \begin{equation*} \dot{q}=p+\lambda a,\qquad \dot{p}=-\frac{\partial V}{\partial q}- \lambda\biggl(\frac{\partial a}{\partial q}\biggr)^{\!*}p. \end{equation*} \notag $$

However, then

$$ \begin{equation} \ddot{q}=-\frac{\partial V}{\partial q}+\dot{\lambda}a+ \lambda\biggl[\frac{\partial a}{\partial q}- \biggl(\frac{\partial a}{\partial q}\biggr)^{\!*}\,\biggr]\dot{q}. \end{equation} \tag{7.11} $$

To this equation we must add the constraint $(a(q),\dot{q})=0$. As a result, we obtain a vaconomic equation of motion of a particle in a potential field with non-integrable constraint.

Equation (7.11) is the equation of extremals in the Lagrange variational problem

$$ \begin{equation*} \delta\int_{t_1}^{t_2}\biggl[\frac{1}{2}(\dot{q},\dot{q})-V(q)\biggr]\,dt=0 \end{equation*} \notag $$

with constraint (7.9). It can be represented in the canonical Hamiltonian form encountered in (6.12) above, namely,

$$ \begin{equation*} \dot{x}=\frac{\partial \mathcal{H}}{\partial y}\,,\qquad \dot{y}=-\frac{\partial \mathcal{H}}{\partial x}\,, \end{equation*} \notag $$

if we set $x=q$, $y=\dot{q}+\lambda a(x)$, and

$$ \begin{equation*} \mathcal{H}=\frac{1}{2}\biggl|y-\frac{(y,a)}{(a,a)}a\biggr|^2+V(x). \end{equation*} \notag $$

It is clear why we obtain vaconomic equations in place of the more common non-holonomic ones: Dirac’s equations (3.1) are variational in nature, and this, of course, manifests itself in the Lagrangian representation of the equations of motion with constraints.



Bibliography

1.	P. A. M. Dirac, “Generalized Hamiltonian dynamics”, Canad. J. Math., 2:2 (1950), 129–148
2.	P. A. M. Dirac, “Generalized Hamiltonian dynamics”, Proc. Roy. Soc. London Ser. A, 246 (1958), 326–332
3.	J. L. Anderson and P. G. Bergmann, “Constraints in covariant field theories”, Phys. Rev. (2), 83:5 (1951), 1018–1025
4.	B. V. Medvedev, “P. A. M. Dirac and the logical foundations of quantum theory. II”: P. A. M. Dirac, Collected works, v. III, Fizmatlit, Moscow, 2004, 649–650 (Russian)
5.	V. V. Kozlov, General theory of vortices, 2nd revised and augmented ed., Institute for Computer Studies, Moscow–Izhevsk, 2013, 324 pp. (Russian); English transl. of 1st ed., Dynamical systems X. General theory of vortices, Encyclopaedia Math. Sci., 67, Springer-Verlag, Berlin, 2003, viii+184 pp.
6.	L. C. Young, Lectures on the calculus of variations and optimal control theory, W. B. Saunders Co., Philadelphia–London–Toronto, ON, 1969, xi+331 pp.
7.	A. J. Hanson, T. Regge, and C. Teitelboim, Constrained Hamiltonian systems, Accad. Naz. Lincei, Rome, 1976, 135 pp.
8.	V. V. Nesterenko and A. M. Chervyakov, “Some properties of constraints in theories with degenerate Lagrangians”, Theoret. and Math. Phys., 64:1 (1985), 701–707
9.	L. Faddeev and R. Jackiw, “Hamiltonian reduction of unconstrained and constrained systems”, Phys. Rev. Lett., 60:17 (1988), 1692–1694
10.	B. M. Barbashov, “Hamiltonian formalism for Lagrangian systems with prescribed constraints”, Fiz. Elementarnykh Chastits i Atom. Yadra, 34:1 (2003), 5–42 (Russian)
11.	G. D. Birkhoff, Dynamical systems, Amer. Math. Soc. Colloq. Publ., 9, Amer. Math. Soc., New York, 1927, viii+295 pp.
12.	E. Newman and P. G. Bergmann, “Lagrangians linear in the “velocities””, Phys. Rev. (2), 99:2 (1955), 587–592
13.	F. A. Berezin, “Hamiltonian formalism in the general Lagrange problem”, Uspekhi Mat. Nauk, 29:3(177) (1974), 183–184 (Russian)
14.	V. I. Arnold, V. V. Kozlov, and A. I. Neĭshtadt, Mathematical aspects of classical and celestial mechanics, Encyclopaedia Math. Sci., 3, Dynamical systems. III, 3rd ed., Springer-Verlag, Berlin, 2006, xiv+518 pp.
15.	M. V. Deryabin, “The Dirac Hamiltonian formalism and the realization of constraints by small masses”, J. Appl. Math. Mech., 64:1 (2000), 35–39
16.	D. R. Merkin, Gyroscopic systems, 2nd ed., Nauka, Moscow, 1974, 344 pp. (Russian)
17.	V. V. Strygin and V. A. Sobolev, Separation of motions by the method of integral manifolds, Nauka, Moscow, 1988, 256 pp. (Russian)
18.	A. Yu. Ishlinskii, Mechanics of gyroscopic systems, 2nd ed., USSR Academy of Sciences publishing house, Moscow, 1963, 482 pp. (Russian)
19.	V. V. Kozlov, “The dynamics of systems with large gyroscopic forces and the realization of constraints”, J. Appl. Math. Mech., 78:3 (2014), 213–219
20.	A. V. Vlakhova, “Nonholonomic motions of gyroscopic and wheeled systems”, Moscow Univ. Mech. Bull., 68:5 (2013), 126–132
21.	A. V. Vlakhova, “The dynamics of systems with rolling and gyroscopic systems with small generalized velocities and the realization of constraints”, J. Appl. Math. Mech., 78:6 (2014), 568–579
22.	E. Cartan, Leçons sur les invariants intégraux, Hermann, Paris, 1922, x+210 pp.
23.	A. F. Filippov, “Differential equations with discontinuous right-hand side”, Amer. Math. Soc. Transl. Ser. 2, 42, Amer. Math. Soc., Providence, RI, 1964, 199–231
24.	V. I. Utkin, Sliding modes and their application in variable structure systems, Mir, Moscow, 1978, 257 pp.
25.	P. Appell, Traité de mécanique rationnelle, v. I, 5-e éd., Gauthier-Villars, Paris, 1926, 619 pp. ; v. II, 6-e éd., 1953, 575 pp.
26.	V. V. Kozlov, “On the integration theory of equations of nonholonomic mechanics”, Regul. Chaotic. Dyn., 7:2 (2002), 161–176
27.	V. V. Kozlov, “Quadratic conservation laws for equations of mathematical physics”, Russian Math. Surveys, 75:3 (2020), 445–494
28.	G. A. Bliss, Lectures on the calculus of variations, Univ. of Chicago Press, Chicago, IL, 1946, ix+296 pp.
29.	V. V. Kozlov, “Dynamics of systems with nonintegrable constraints. I”, Moscow Univ. Mech. Bull., 37:3-4 (1982), 27–34 ; II, 37:3-4 (1982), 74–80 ; III, 38:3 (1983), 40–51 ; IV. Integral principles, 42:5 (1987), 40–49 ; V. Freedom principle and ideal constraints condition, 43:6 (1988), 23–29
30.	R. S. Strichartz, “Sub-Riemannian geometry”, J. Differential Geom., 24:2 (1986), 221–263
31.	P. A. Griffiths, Exterior differential systems and the calculus of variations, Progr. Math., 25, Birkhäuser, Boston, MA, 1983, ix+335 pp.
32.	A. M. Vershik and V. Ya. Gershkovich, “Nonholonomic dynamical systems, geometry of distributions and variational problems”, Dynamical systems VII, Encyclopaedia Math. Sci., 16, Springer, Berlin, 1994, 1–81
33.	A. A. Agrachev, “Topics in sub-Riemannian geometry”, Russian Math. Surveys, 71:6 (2016), 989–1019
34.	H. Hertz, Die Principien der Mechanik in neuem Zusammenhange dargestellt, Gesammelte Werke, III, J. A. Barth, Leipzig, 1894, xxix+312 pp.

Citation: V. V. Kozlov, “On Dirac's generalized Hamiltonian dynamics”, Russian Math. Surveys, 79:4 (2024), 649–681

Citation in format AMSBIB

\Bibitem{Koz24}

\by V.~V.~Kozlov

\paper On Dirac's generalized Hamiltonian dynamics

\jour Russian Math. Surveys

\yr 2024

\vol 79

\issue 4

\pages 649--681

\mathnet{http://mi.mathnet.ru/eng/rm10183}

\crossref{https://doi.org/10.4213/rm10183e}

\mathscinet{https://mathscinet.ams.org/mathscinet-getitem?mr=4831497}

\adsnasa{https://adsabs.harvard.edu/cgi-bin/bib_query?2024RuMaS..79..649K}

\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=001386665900003}

\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-85211317794}
