Russian Mathematical Surveys
Russian Mathematical Surveys, 2023, Volume 78, Issue 3, Pages 443–499
DOI: https://doi.org/10.4213/rm10103e
(Mi rm10103)
 

This article is cited in 3 scientific papers (total in 4 papers)

Dynamics of metrics in measure spaces and scaling entropy

A. M. Vershik$^{a,b,c}$, G. A. Veprev$^{b,d}$, P. B. Zatitskii$^{a,e}$

$^a$ St. Petersburg Department of Steklov Mathematical Institute of Russian Academy of Sciences
$^b$ Saint Petersburg State University
$^c$ Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute)
$^d$ University of Geneva, Geneva, Switzerland
$^e$ University of Cincinnati, Cincinnati, OH, USA
Abstract: This survey is dedicated to a new direction in the theory of dynamical systems, the dynamics of metrics in measure spaces and new (catalytic) invariants of transformations with invariant measure. A space equipped with a measure and a metric which are naturally consistent with each other (a metric triple, or an mm-space) defines automatically the notion of its entropy class, thus allowing one to construct a theory of scaling entropy for dynamical systems with invariant measure, which is different and more general in comparison to the Shannon–Kolmogorov theory. This possibility was hinted at by Shannon himself, but the hint went unnoticed. The classification of metric triples in terms of matrix distributions presented in this paper was proposed by Gromov and Vershik. We describe some corollaries obtained by applying this theory.
Bibliography: 88 titles.
Keywords: metric triple, mm-entropy, matrix distributions, catalytic invariants, scaling entropy of ergodic transformations.
Received: 08.03.2023
Document Type: Article
UDC: 519.387
Language: English
Original paper language: Russian

Chapter 1. The category of metric measure spaces. Historical overview and brief summary

1.1. Metric measure spaces

1.1.1. Measures and metrics: general considerations

The joint consideration of a measure and a metric in one space has a long tradition. However, it was, of course, preceded by a long period of formation of the concepts of metric (and topological) spaces, metrization (Hausdorff, Urysohn, and others), and the corresponding notions of measure spaces (Lebesgue, von Neumann, Kolmogorov, Rokhlin, and others; 1900–1940). In 1930–1950, both structures were already considered simultaneously (Oxtoby, Ulam, A. D. Alexandrov, and others), but still as quite separate entities. It is worth noting that some mathematicians, such as Bourbaki (see his Integration volume, 1960s), held the view that measure spaces as a separate structure do not exist at all, and there are only various procedures for integrating functions. Such a one-sided position led Bourbaki to ignore certain branches of mathematics, such as ergodic theory, the theory of $\sigma$-algebras, probability concepts, and so on. For example, the lack of understanding that the theory of integration is one and the same for all metric spaces hindered the development of measure theory in function spaces. On the other hand, many combinatorial constructions in measure theory and ergodic theory were not in demand and were not used in classical analysis because of the wall separating the two fields. Ignoring the structure itself and the category of measure spaces is the reason why the most substantial and useful part of measure theory, the geometry of various configurations of $\sigma$-subalgebras (measurable partitions), remains little known and insufficiently developed.

1.1.2. Metric triples and mm-spaces

A new period began with the works of Gromov summarized in his book [14]. This book presents, in particular (see Chap. $3\frac{1}{2}$), a systematic study of so-called mm-spaces, that is, spaces equipped with both a metric and a measure. An important viewpoint, expressed by Gromov and simultaneously by Vershik [59], [57], was as follows: in contrast to the classical approach, which deals with various Borel measures on a fixed complete metric space, they proposed to consider various metrics on a fixed measure space (the Lebesgue–Rokhlin space). In [14] this approach was called the ‘reversed definition of mm-spaces’. This point of view was consistently pursued in [48]–[51], [78], [80], and [82]–[87], and we present it in this survey. Namely, a theory of metric triples $(X,\mu,\rho)$ — a space, a measure, and a metric — was developed. Here the metric space $(X,\rho)$ was assumed to be complete and separable, while the measure space $(X,\mu)$ was assumed to be a Lebesgue–Rokhlin space (in the case of continuous measure, it is isomorphic mod 0 to the interval $[0,1]$ with the Lebesgue measure, or to a countable product of two-point spaces with the Haar measure), the structures of these spaces being naturally consistent with each other (for details, see § 2.1.2). Gromov proved a classification theorem for such triples with respect to the group of measure-preserving isometries, while Vershik provided a version of this theorem with an explicit description of invariants of metric triples, namely, so-called matrix distributions, which are measures on the space of distance matrices. For more details, see [14], [55], [56] and Chap. 2 of this survey.

One of the main advantages of this new focus was the ability to consider the dynamics of metrics with respect to the group of measure-preserving automorphisms, which opened up a new source of invariants related to this dynamics, for dynamical systems with invariant measures, such as scaling entropy. This is the primary topic of this survey; a brief summary of results is provided in subsequent sections of the first chapter.

Transferring the centre of gravity from metric to measure simplifies, on the one hand, the study of the measure-metric pair, since it is well known that, up to a measure-preserving isomorphism, there exists a unique complete separable space with continuous measure defined on the whole $\sigma$-algebra of sets: namely, this is the unit interval with the Lebesgue measure, or, equivalently, a countable product of two-point spaces with the Haar measure. Therefore, we can in fact regard the entire possible arsenal of mm-structures as a set of metrics on a universal measure space $(X,\mu)$. However, we must acknowledge first of all that the metric $\rho$ is merely a class of almost everywhere coinciding measurable functions of two variables satisfying the well-known axioms almost everywhere (that is, mod 0). Such an object is called an almost metric. On the other hand, there is a correction theorem (Theorem 2.2), which always allows us to find a set of full measure on which this almost metric can be corrected to become a true semimetric. Here the only compatibility requirement for a metric and a measure is that the metric is separable, meaning that the $\sigma$-algebra of sets generated by all balls of positive radius is dense in the $\sigma$-algebra of mod 0 classes of measurable sets (see Theorem 2.4). Triples $(X, \mu, \rho)$ (a space, a measure, and a metric) satisfying the separability property mod 0 are said to be admissible (or are just called metric triples); they are the object of study in the second chapter. In Theorems 2.17 and 2.18 we present numerous equivalent formulations of the admissibility property.
Among others, an important criterion for the admissibility of a triple $(X, \mu, \rho)$ is that the so-called $\varepsilon$-entropy of this triple is finite for every positive $\varepsilon$, where the $\varepsilon$-entropy is the logarithm of the number of balls of radius $\varepsilon$ that cover the entire space except for a set of measure at most $\varepsilon$.
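
The $\varepsilon$-entropy just described can be illustrated on a finite sample. The following is only a minimal sketch under stated assumptions: the measure is replaced by the uniform measure on a finite list of points, the balls are open, and the centres are chosen greedily, so the result is an upper estimate rather than the exact minimum. The function name `greedy_eps_entropy` is hypothetical and does not come from the survey.

```python
import math

def greedy_eps_entropy(points, dist, eps):
    """Upper estimate of the eps-entropy of the uniform measure on `points`.

    Open balls of radius eps centred at sample points are chosen greedily
    until the uncovered part has measure at most eps. Greedy selection is an
    assumption of this sketch: it bounds the minimal number of balls from above.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    n = len(points)
    uncovered = set(range(n))
    balls = 0
    # Stop once the uncovered part has empirical measure at most eps.
    while len(uncovered) > eps * n:
        def gain(c):
            # Number of still uncovered points inside the ball centred at points[c].
            return sum(1 for i in uncovered if dist(points[c], points[i]) < eps)
        best = max(range(n), key=gain)
        uncovered -= {i for i in uncovered if dist(points[best], points[i]) < eps}
        balls += 1
    # If no ball is needed at all, report log 1 = 0.
    return math.log(max(balls, 1))

# Two tight clusters far apart need two balls of radius 0.1.
clusters = [0.0] * 50 + [10.0] * 50
h = greedy_eps_entropy(clusters, lambda a, b: abs(a - b), 0.1)
```

On a sample drawn from a genuinely continuous measure the estimate depends on the sample size, so it should be read as an illustration of the definition and not as a way of computing the invariant.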

Identifying almost metrics coinciding almost everywhere, we can always assume that the metric space is complete (is a Polish space, using the commonly accepted terminology). Indeed, if the space is not complete, then we can consider its completion and extend the measure to this completion; the difference between the completion and the original space will be of measure zero.

For metric triples classical theorems, stated usually under very modest assumptions, can be generalized to very general statements. For instance, an elaboration of the well-known Luzin theorem on the continuity of a measurable function is the following theorem: any two admissible metrics are topologically equivalent (that is, homeomorphic to each other) on a set of measure arbitrarily close to $1$ (see Theorems 2.5 and 2.6).

1.1.3. Classification of metric triples (and mm-spaces) and matrix distributions

In § 2.3 we outline the proof of the classification theorem for metric triples, so here we provide only the precise formulation of the theorem about matrix distributions and their characterization. For the possibility of classifying metric spaces, see [34].

Consider the set of metric triples $(X,\mu,\rho)$, where $\mu$ is a non-degenerate continuous measure (that is, its support coincides with $X$). Recall that we assume that the metric space $(X,\rho)$ is complete and separable and $(X,\mu)$ is a Lebesgue space with continuous measure.

The classification of mm-spaces with respect to measure-preserving isometries was provided by Gromov [14] and Vershik [61]. In Gromov’s formulation, a complete invariant is a set of naturally consistent random matrices (for each positive integer $n$ we consider the distances between $n$ points chosen randomly and independently, in accordance with the given distribution). Gromov’s proof relies on the method of moments and Weierstrass’s approximation theorem. Vershik’s proof involves the notion of the matrix distribution of a metric (or, more generally, of a measurable function of several variables); see below.

Theorem 1.1 (matrix distribution as an invariant of metric triples). A complete system of invariants of a metric triple $(X,\mu,\rho)$ with non-degenerate measure with respect to the group of all measure-preserving almost isometries is provided by the matrix distribution $\mathfrak{D}_\infty = \mathfrak{D}_\infty(X,\mu,\rho)$, that is, the probability measure on the space of infinite distance matrices that is the image of the Bernoulli measure $(X^{\infty},\mu^{\infty})$ under the map

$$ \begin{equation*} \{x_i\}_i\mapsto \{\rho(x_i,x_j)\}_{i,j}. \end{equation*} \notag $$

The proof of Theorem 1.1 relies on the pointwise ergodic theorem and the properties of completions of metric spaces.
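
A finite fragment of the matrix distribution can be sampled directly from the definition: draw $n$ independent points from $\mu$ and record their pairwise distances. The sketch below assumes, purely for illustration, that $(X,\mu,\rho)$ is the unit circle $\mathbb{R}/\mathbb{Z}$ with the Lebesgue measure and the arc-length metric; the names `sample_distance_matrix` and `circle_dist` are hypothetical.

```python
import random

def circle_dist(x, y):
    """Arc-length metric on the circle R/Z (an illustrative choice of rho)."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def sample_distance_matrix(sample_point, dist, n, rng):
    """One random n x n distance matrix {rho(x_i, x_j)} for i.i.d. points x_i."""
    xs = [sample_point(rng) for _ in range(n)]
    return [[dist(xs[i], xs[j]) for j in range(n)] for i in range(n)]

rng = random.Random(0)
# Points are drawn from the Lebesgue measure on [0, 1).
M = sample_distance_matrix(lambda r: r.random(), circle_dist, 5, rng)
```

Repeating the call many times gives an empirical approximation to the distribution of the $n\times n$ upper-left corner of the infinite random matrix in Theorem 1.1.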

In connection with this proof, the following question arises: how can one describe matrix distributions as measures on the set of distance matrices? This question is discussed in detail in § 2.3.2. It turns out that these measures can be described using a special notion of simplicity of a measure. This notion, as well as related considerations, is of general nature and can be applied not only to the classification of metrics, but also to the classification of arbitrary measurable functions of several variables (see [40], [58], [70], and [71]). Matrix distributions give rise to a whole range of problems in measure theory and learning theory; in particular, in this paper we consider the problem of reconstructing a metric and a measure in a space from randomized tests, in which an important role is played by entropy and spectra.

1.1.4. Spectral equivalence of metric triples

Let us state an important problem with applications to learning theory and spectral graph theory, which arises simultaneously with the definition of a matrix distribution. With the matrix distribution of an admissible metric we associate the system of spectra of principal minors of random distance matrices. Recall that it is a system of interlacing sequences of real numbers

$$ \begin{equation*} \{\lambda_1^n(\omega) \geqslant \lambda_2^n(\omega) \geqslant\cdots \geqslant \lambda_n^n(\omega)\}, \qquad n=1,2,\dots\,. \end{equation*} \notag $$
This system can be regarded as a random infinite triangular matrix. Does this random matrix determine the original matrix distribution and, thus, the original metric? A similar question is to what extent the spectrum of an operator that arises naturally in considerations of a geometric object determines the object itself. Kac’s well-known question “Can one hear the shape of a drum?” is precisely of this kind: can a Riemannian manifold be uniquely recovered from the spectrum of the Laplace operator? In general, the answer to this question is in the negative. Another example, from the early history of ergodic theory, is the question of whether the spectrum of the Koopman operator constructed from a measure-preserving transformation determines the transformation itself. The most meaningful negative answer is the discovery of Shannon–Kolmogorov entropy, which is a non-spectral invariant of transformations. In a certain sense our question corresponds to these two examples, and the answer is likely to be in the negative as well. However, the situation in our case is entirely different. For example, the answer to a similar question about measures on infinite symmetric (or Hermitian) matrices invariant under the orthogonal (unitary) group instead of the symmetric group is affirmative, because in this case the spectrum is a complete invariant even in the finite-dimensional case. Numerical experiments could be of use here. We know only one experimental work [6], performed at the request of the first-named author, which showed a strong dependence of the set of random spectra on the dimension of the sphere $S^n$. It is of interest to study the nature of those metrics that are uniquely determined by the system of spectra; but in the general case the answer is likely to be in the negative. It seems fruitful to consider the random spectra of matrix distributions (see [74], as well as [28]).
On the other hand, there are many papers considering the spectra of the distance matrices of specific metric spaces (and the incidence matrices of graphs); see [16], for example.

1.1.5. The cone of metric triples and the Urysohn universal metric measure space

The set of summable admissible metrics on a space with fixed measure is a cone in the space $L^1(X^2,\mu^2)$ of functions of two variables, and we consider a natural norm for which it is a complete normalized cone (see § 2.1.4). It is natural to investigate particular classes of metrics as subsets of this cone and regard numerical characteristics of metric triples as functions on this cone. In particular, we regard $\varepsilon$-entropy as a function on this cone or on bundles over it.

The properties of the map $(X,\mu,\rho) \to \mathfrak{D}_\infty(X,\mu,\rho)$ from the cone of admissible metrics to the space of measures on distance matrices are described in detail in § 2.3. It is appropriate to mention how the problems under consideration are related to the Urysohn space. The Urysohn universal metric space $(\mathbb{U},\rho_{\rm U})$ has become a popular subject of research in recent years (after being forgotten for half a century). In [55] and [59] a somewhat amazing fact was proved: in the space of all possible metrics on a countable set equipped with the weak topology, a dense $G_\delta$ subset consists of metrics whose completion yields a space isomorphic to the Urysohn universal space. Apparently, this typicality property of Urysohn spaces also holds in the context of admissible triples: the family of all (semi)metric triples $(X,\mu,\rho)$ for which the (semi)metric space $(X,\rho)$ is isometric to the Urysohn space is typical in the weak topology of the cone of admissible metrics, that is, it is a dense $G_\delta$ subset of the space of all metric triples. It follows that the set of metric triples such that, additionally, the measure is continuous and non-degenerate and the metric is Urysohn is also typical in the space of all metric triples. Nothing is known about Borel probability measures on the Urysohn space; we do not even know characteristic examples of such measures; however, there is no doubt that they will be found in further research.

1.2. Metric invariants of dynamics in ergodic theory

Let us see how admissible metrics can be used in the theory of dynamical systems. Unfortunately, the impressive success of ergodic theory in the second half of the 20th century suffered from one drawback, which was truly felt only recently: for various reasons, ergodic constructions rarely use metrics in phase spaces. Moreover, efforts were made to exclude metrics from consideration even in cases where their usefulness was evident. Nowadays, it has become clear that using metrics often allows one to define new measure invariants of dynamical systems. Namely, the metric can be used in a non-trivial way in some construction, while the answer resulting from this construction is independent of the initial metric. The first example of such an invariant is scaling entropy, defined originally by Vershik and investigated further by the authors in subsequent papers. Below we explain briefly the meaning of this invariant. It is described in detail in Chap. 3, which contains a series of results obtained in recent years and constituting a new direction in ergodic theory.

1.2.1. The scaling entropy of dynamical systems

We explore the simplest possibility of using a metric. Namely, with an admissible metric we associate its usual $\varepsilon$-entropy, that is, the germ of a function of $\varepsilon$. By averaging the metric, as is customary in the theory of transformations with invariant measures, we obtain a sequence of such germs of $\varepsilon$-entropy. The main observation, stated as a conjecture in [62] and [63] and proved in [84] using properties of metric triples, is that the asymptotic behaviour of the entropy (if it exists), as an equivalence class of growing (with the number of averages) sequences of $\varepsilon$-entropies, does not depend on the choice of the initial metric, and thus this asymptotic behaviour becomes a new invariant of the dynamical system (see § 3.1.1). It is this invariant that was called the scaling entropy of an automorphism (see [62] and [63]). In this definition it was implicitly assumed that the equivalence class of sequences of $\varepsilon$-entropies does not depend on $\varepsilon$, provided that $\varepsilon$ is sufficiently small. This is indeed the case in many examples. For instance, the class $\{cn\}$ corresponds to a positive Kolmogorov entropy, and the class of constant (in $n$) sequences corresponds to a discrete spectrum. As proved by Ferenczi and Park [10] and Zatitskii [85], any equivalence class of growing sequences between these two monotone asymptotics can be realized for some automorphism (see § 3.2.2). However, only a few of them have been encountered so far. Computing the scaling entropy for specific automorphisms is a challenging task, which has been performed only for some examples.
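
The averaging step can be sketched in code. The formula used here, $\rho_n(x,y)=\frac{1}{n}\sum_{k=0}^{n-1}\rho(T^k x,T^k y)$, is the usual form of averaging a metric along an automorphism and is stated as an assumption of this sketch; the precise definition is the one given in § 3.1.1. The maps `rotation` and `doubling` and all function names are hypothetical illustrations on the circle $\mathbb{R}/\mathbb{Z}$.

```python
def circle_dist(x, y):
    """Arc-length metric on the circle R/Z (illustrative initial metric)."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def averaged_metric(dist, T, n):
    """Return rho_n(x, y) = (1/n) * sum_{k<n} dist(T^k x, T^k y)."""
    def rho_n(x, y):
        total = 0.0
        for _ in range(n):
            total += dist(x, y)
            x, y = T(x), T(y)  # move both points one step along the orbit
        return total / n
    return rho_n

def rotation(x):
    """Irrational rotation: an isometry, so averaging leaves the metric unchanged."""
    return (x + 0.6180339887) % 1.0

def doubling(x):
    """Doubling map: expands small distances, so the averages grow."""
    return (2.0 * x) % 1.0

rot_avg = averaged_metric(circle_dist, rotation, 50)(0.1, 0.3)
dbl_avg = averaged_metric(circle_dist, doubling, 3)(0.1, 0.2)
```

The two maps correspond to the two extreme classes mentioned in the text: for the rotation the averaged metrics coincide with the initial one, so their $\varepsilon$-entropies stay bounded in $n$, while for the doubling map nearby points separate quickly. Floating-point iteration of the doubling map loses precision after a few dozen steps, so only small $n$ are meaningful in this sketch.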

1.2.2. Comparison with classical entropy

On the one hand, scaling entropy is a significant generalization of the notion of entropy and the Shannon–Kolmogorov entropy theory to cases where the Kolmogorov entropy is zero. However, the important issue here is the very notion of entropy of mm-spaces, that is, of metric measure spaces. The presence of a group of automorphisms suggests the idea of averaging metrics with respect to the action of a (for example, amenable) group. The mention of Shannon’s name here is no coincidence. Apparently, over all these years, no one paid attention to or worked on deciphering Appendix 7 of Shannon’s famous paper [45] on the foundations of information theory and its applications. In this appendix Shannon suggested, in a very concrete and not immediately generalizable form, the same idea that was formulated 60 years later (!) in [62] and [63]: one should investigate the asymptotic behaviour of the usual entropy of a metric measure space (that is, the entropy of a metric triple) for successive averages of the metric under an automorphism or a group of automorphisms. It appears that neither Kolmogorov (see [27]) nor his numerous followers paid sufficient attention to the fact that the entropy of an automorphism can be computed as the asymptotic entropy of a metric measure space (and the result does not depend on the metric: see § 3.1.1). It is true that Shannon, like Kolmogorov after him, was only interested in processes that transmit information (K-processes). The fact that the definition remains meaningful also for arbitrary automorphisms remained unnoticed until very recently. In our terms, Shannon considered metric triples that are finite fragments of a stationary process equipped with the Hamming semimetric; this is not always convenient, but the model of averages is universal.
The subsequent development of the theory by Kolmogorov and his followers Rokhlin and Sinai slightly overshadowed the general character of Shannon’s idea, which was partly justified since at that time the focus was on systems with positive entropy (hyperbolic, chaotic, and so on). It is worth noting that papers on topological entropy, which appeared shortly after those on metric entropy, would look much more natural within the framework of the theory of metric triples.

1.2.3. Stability and instability

In [48] (also see § 3.3.1), an important step was taken in the study of scaling entropy: the realization that the answer to the question about the existence of a universal equivalence class for the growth of entropies for averaged metrics can be negative for specially constructed metric triples, that is, the asymptotic behaviour can significantly depend on $\varepsilon$. Therefore, the definition of an equivalence class must involve functions of two variables: $n$ (the number of the particular average we take) and $\varepsilon$. The final definition of scaling entropy involves such a coarser entropy equivalence class for a given automorphism (see § 3.1.1). Presumably, automorphisms for which the asymptotics of entropies of averages depends on $\varepsilon$ are typical. The definition given here appears to be the most general one among all possible definitions related to the growth of the entropies of automorphisms or groups of automorphisms (in the amenable case). It would be interesting to extend the theory of scaling entropy to non-amenable groups; see § 3.5. Note that similar generalizations of the classical entropy theory have been proposed earlier, for example, Kirillov–Kushnirenko entropy (sequential entropy, see [31]), Katok–Thouvenot slow entropy (see [24]), Ferenczi’s measure-theoretic complexity (see [9]). However, the theory of metric triples contains potentially also non-entropy invariants of dynamical systems; the key question is how to express them in terms of numerical invariants of metric triples, for instance, in terms of matrix distributions.

1.2.4. The further use of metrics

The introduction of scaling entropy is only the first step as concerns the use of metrics in ergodic theory. Moreover, this step could be achieved using a sequence of Hamming semimetrics (and their averages) defined on cylinders of growing length in the symbolic presentation of the automorphism. Scaling entropy uses only the coarsest invariant of the metric, the asymptotics of the number of $\varepsilon$-balls that almost cover the measure space. The next step should consist in using measures to study more complicated characteristics of sequences of metric spaces, and to single out those properties that are invariants of an automorphism or a group of automorphisms. Apparently, this problem has not been treated, and it is likely that behind such a study there is intricate and interesting combinatorics of how continuous dynamics is approximated by finite constructions. Anyway, at first glance this approach differs significantly from the usual approximation theories. The question about invariants of non-Bernoulli K-automorphisms, discovered by Ornstein, has remained unsolved for over 50 years. It is reasonable to assume that a geometric approach to the analysis of these invariants, which is based on using metrics, can help one to advance in this direction. This is suggested by the characteristic proposed by the first-named author, the secondary entropy of a filtration (see [67]).

1.2.5. Functions of several variables as a source of dynamic invariants

Another general idea is that the familiar technique of functional analysis — replacing the study of certain objects by the study of functions on these objects — has been used in the theory of dynamical systems in a very limited way (Koopman’s idea): with a dynamical system $\{T_g\colon g\in G\}$ one associates the group of operators

$$ \begin{equation*} \{U_g\colon f\mapsto U_g(f)(\,\cdot\,)=f(g^{-1}\,\cdot\,)\}, \end{equation*} \notag $$
where $f$ belongs to a certain space of functions of one variable. This is the essence of the spectral theory of dynamical systems, which provides spectral invariants for the system. However, the same technique can be used for functions of several variables, for example, for functions of two variables that are not arbitrary but are, for instance, metrics. Thus, we open up an entirely new way to construct invariants of dynamical systems. In greater detail, using the theory of admissible metric triples described in Chap. 2, given a dynamical system with invariant measure, we can associate with it a group of operators in the space of metric triples. Thus, in the discussion above we have actually embedded the group of measure-preserving automorphisms into the group of transformations of metric triples and verified that certain invariants of triples (the asymptotic behaviour of the entropies of averages) do not depend on the metric, and are therefore invariants of the original automorphism group of the measure space.

On the other hand, using metrics allows one to introduce new notions into the development of certain classical results. One of them is the notion of ‘virtual continuity’ of a measurable function of several variables, which is related to the problem of defining the restriction of such a function to elements of zero measure of a measurable partition. This problem has an automatic positive solution (mod 0) for any measurable function of one variable: this is what Rokhlin’s theorem (see [39]) actually means. However, functions of two or more variables cannot in general be restricted to elements of a measurable partition. A restriction exists for so-called virtually continuous functions, whose definition is purely measure-theoretic (see [80] and Definition 2.7 in § 2.1.3). In particular, any admissible metric is virtually continuous and therefore has such a restriction; thus, it turns almost every element of any measurable partition into an mm-space. Virtual continuity does not rely on any local properties of functions (such as smoothness, and so on) and provides a new understanding of extension theorems for functions of several variables and of Sobolev’s embedding theorems (see [79]).

1.2.6. The general setting of the metric isomorphism problem taking metric into account: catalytic invariants

Consider the isomorphism problem for measure-preserving automorphisms, that is, the conjugacy problem in the group of classes of measure-preserving automorphisms coinciding mod 0. First we consider it in a slightly extended setting, namely, under the assumption that the group of automorphisms acts in a Lebesgue space $X$ with continuous measure $\mu$, on which we will also consider various admissible metrics $\rho$. In this space we fix an ergodic automorphism $T$ and an admissible metric $\rho$. Consider a kind of partition function of metrics and, more precisely, a normalized series in powers of $z\in [0,1)$, and assume that it is convergent (literally or in a generalized sense):

$$ \begin{equation*} \Omega_T(\rho,z)\equiv \Omega_T(z)=(1-z)\sum_{n=0}^{\infty}z^n\rho(T^n x,T^n y),\qquad z\in [0,1). \end{equation*} \notag $$
Thus, we regard the function $\Omega_T(\,\cdot\,)$ of $z$ as a metric that is a deformation of the metric $\rho$ (for $z=0$) under the action of $T$.

Note that for fixed $z$ every term is the result of application of the operator $z\cdot U_T\otimes U_T$ acting on the cone of admissible metrics (on the space of functions of two variables). Correspondingly, the sum of the series is the operator

$$ \begin{equation*} (\operatorname{Id}-\,z\cdot U_T \otimes U_T)^{-1}. \end{equation*} \notag $$

Consider the function $\Omega_T(z)$ in a neighbourhood of $z=1$. It is convenient to set $z=1-\delta$; then

$$ \begin{equation*} \Omega_{T}(\rho,z)= \delta\sum_{n=0}^{\infty}(1-\delta)^n \rho(T^n x,T^n y),\qquad 0<\delta \leqslant 1. \end{equation*} \notag $$

We will be interested in the behaviour of the deformation $\Omega_T(\,\cdot\,)$ in a neighbourhood of $\delta=0$, that is, $z=1$.
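
The deformation $\Omega_T(\rho,z)$ can be approximated numerically by truncating the series. The sketch below is only an illustration: the truncation length `n_terms`, the circle metric, and the rotation used as $T$ are all assumptions made here, and the function name `omega` is hypothetical. For an isometry $T$ every term equals $\rho(x,y)$, so the normalized sum returns $\rho(x,y)$ up to the tail $z^{N}$, which gives a simple sanity check.

```python
def circle_dist(x, y):
    """Arc-length metric on the circle R/Z (illustrative initial metric)."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def omega(dist, T, x, y, z, n_terms=2000):
    """Truncated value of (1 - z) * sum_{n>=0} z^n * dist(T^n x, T^n y), 0 <= z < 1."""
    if not 0.0 <= z < 1.0:
        raise ValueError("z must lie in [0, 1)")
    total, weight = 0.0, 1.0
    for _ in range(n_terms):
        total += weight * dist(x, y)
        x, y = T(x), T(y)   # advance both points along the orbit
        weight *= z         # weight of the next term is z^(n+1)
    return (1.0 - z) * total

def rotation(x):
    """Circle rotation: an isometry for circle_dist."""
    return (x + 0.6180339887) % 1.0

at_zero = omega(circle_dist, rotation, 0.1, 0.3, 0.0)   # z = 0 recovers rho itself
near_one = omega(circle_dist, rotation, 0.1, 0.3, 0.9)  # isometry: still close to rho
```

For $z$ close to $1$ the truncation length has to grow like a multiple of $1/(1-z)$ for the neglected tail to be small.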

Assume that with every complete separable metric measure space $(X,\mu,\rho)$ we have associated an invariant $\Phi_{\rho}(\varepsilon)$ (with respect to isometries of the metric space) whose values are real functions of an argument common for all metric spaces, which we denote by $\varepsilon$ (an example of $\Phi_{\rho}(\varepsilon)$ is the function equal to the logarithm of the number of balls of radius $\varepsilon$ that cover almost the whole space). Now we assume that there is a deformation $\Omega_{T}(\rho,z)$ of metric spaces and consider functions $\Phi_{\rho}(z,\varepsilon)$ of two real variables $z\in [0,1)$ and $\varepsilon>0$. Finally, we introduce equivalence classes of functions of two variables depending on $\varepsilon$ and the deformation parameter $z$. Namely, first we form equivalence classes of functions of $z$ for fixed $\varepsilon$. If it turns out that such a class does not depend on $\varepsilon$, then it is taken as an invariant. If these classes are different for different $\varepsilon$, then we form coarser classes, combining all classes for different $\varepsilon$. Anyway, these classes are actually associated with the family of metrics $\Omega_{T}(\rho,z)$ determined by the partition function $\Omega$. We say that the equivalence class of the function of two variables $\Phi_{\rho}(z,\varepsilon)$ is a catalytic$^{1}$ invariant of the automorphism $T$ if it does not depend on the initial metric $\rho=\rho_0$ and is thus associated with the automorphism $T$ alone. For more details, see [69].

In cases where the equivalence class depends on the initial metric, we obtain a natural equivalence relation on the initial metrics (by considering classes of metrics with the same invariant) and can speak about a relative catalytic invariant for a fixed class of initial metrics. Relative invariants are also of interest for the classification problem, which is more detailed in this case than the usual measure-theoretic isomorphism problem.

The main question here is what invariants of the metric (with respect to the isometries) can be used as productively as the entropy of a metric measure space. This question remains open.

Another possibility for expanding the idea of catalytic invariants is to consider not only invariants of the metric space itself, but also flag-type invariants of metric spaces $X_1 \supset X_2\supset \cdots \supset X_n$ in the same sense as above. There are certainly prospects here, but they require a more detailed study of metric spaces themselves and their invariants with respect to the isometries of $X_1$.

Chapter 2. Metric triples

2.1. Metric triples and admissibility

2.1.1. Measurable semimetrics, almost metrics, and correction theorems

The classical approach to mm-spaces $(X, \mu, \rho)$, where $X$ is a space with measure $\mu$ and metric $\rho$, is typically based on investigating various Borel measures $\mu$ on a fixed metric space $(X,\rho)$. However, as mentioned in Chap. 1, we consider the topic of metric measure spaces from a relatively new perspective. This point of view appears to have been introduced for the first time in [59] and [57]. We start with a fixed measure space $(X,\mathcal{A},\mu)$ (a Lebesgue–Rokhlin space, a standard probability space with continuous measure, that is, a space isomorphic to the interval $[0,1]$ with the Lebesgue measure) and explore various metrics $\rho$ on this space. The condition that connects the topological and measurable structures is the measurability of the metric $\rho$ as a function of two variables; such metrics will be referred to as measurable. Furthermore, we have to consider natural generalizations of metrics, semimetrics.$^{2}$

Since we treat metrics (and semimetrics) as measurable functions on $(X^2, \mu^2) = (X \times X, \mu \times \mu)$, the concept of an almost metric arises naturally: this is a function for which the defining relations of a metric are satisfied almost everywhere (but not necessarily everywhere).

Definition 2.1. A measurable non-negative function $\rho$ on $(X^2,\mu^2)$ is called an almost metric (or a metric mod 0) on $(X,\mu)$ if

(a) $\rho(x,y) = \rho(y,x)$ for $\mu^2$-almost all pairs $(x,y) \in X^2$;

(b) $\rho(x,z) \leqslant \rho(x,y) + \rho(y,z)$ for $\mu^3$-almost all triples $(x,y,z) \in X^3$.

For instance, if a sequence of measurable metrics converges in measure or almost everywhere, then the limit can a priori turn out to be only an almost metric. We present the following theorem on correction of almost metrics (see [86]).

Theorem 2.2 (correction theorem). If $\rho$ is an almost metric on $(X,\mu)$, then there exists a semimetric $\widetilde \rho$ on $(X,\mu)$ such that the equality $\rho = \widetilde \rho$ holds $\mu^2$-almost everywhere.

The proof of the correction theorem is based on two steps: identify the space $(X,\mu)$ with the circle $\mathbb{S}$ equipped with the Lebesgue measure $m$ and apply Lebesgue’s differentiation theorem. For an almost metric $\rho$ on $(\mathbb{S},m)$, as its correction, we can take the function

$$ \begin{equation*} \widetilde{\rho}(x,y)=\varlimsup_{T \to 0^+}\frac{1}{T^2}\int_0^T\, \int_0^T \rho(x+t,y+s)\,dt\,ds, \qquad x,y \in \mathbb{S}, \end{equation*} \notag $$
which coincides with $\rho$ almost everywhere and is a semimetric.
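As a quick illustration (an example of ours, not taken from [86]), let $\rho_0$ be the arc-length metric on $\mathbb{S}$, and set $\rho=\rho_0$ off the diagonal and $\rho(x,x)=1$. The diagonal is $m^2$-null, so $\rho$ is an almost metric, although it is not a semimetric. Since the integral ignores null sets and $\rho_0$ is continuous, the formula above gives

$$ \begin{equation*} \widetilde{\rho}(x,y)=\lim_{T \to 0^+}\frac{1}{T^2}\int_0^T\, \int_0^T \rho_0(x+t,y+s)\,dt\,ds=\rho_0(x,y), \qquad x,y \in \mathbb{S}, \end{equation*} \notag $$
so in this case the correction simply restores the value $0$ on the diagonal.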

Theorem 2.2 allows us to restrict our considerations further to measurable semimetrics, rather than almost metrics. In particular, this theorem implies that the set of measurable semimetrics is closed with respect to convergence in measure and almost everywhere.

A development of correction ideas for functions of several variables can be found in [36].

2.1.2. Admissibility. The relationship between a measurable and a metric structure

In what follows we work mostly with so-called admissible semimetrics and metrics. One of the definitions of admissibility is separability on a subset of full measure.

Definition 2.3. A measurable (semi)metric $\rho$ on a measure space $(X,\mathcal{A}, \mu)$ is called admissible if there exists a subset $X_0\subset X$ such that $\mu(X_0)=1$ and the (semi)metric space $(X_0,\rho)$ is separable. The triple $(X,\mu,\rho)$ is also referred to as an admissible (semi)metric triple, or simply a metric triple.

The relation between the structures of a measurable and a metric space on $X$ is illustrated by the following statements. An admissible metric on $(X, \mathcal{A}, \mu)$ generates a Borel sigma-algebra $\mathcal{B}$, and the measure $\mu$ turns out to be Borel (defined on $\mathcal{B}$).

Theorem 2.4 (see [80]). If $\rho$ is an admissible metric on the space $(X,\mathcal{A},\mu)$, then $\mu$ is a Radon measure on the metric space $(X,\rho)$. The Borel sigma-algebra $\mathcal{B}$ generated by the metric $\rho$ on $X$ is a subalgebra of the original sigma-algebra $\mathcal{A}$ and is dense in it.

The inclusion $\mathcal{B}\subset \mathcal{A}$ follows from the fact that the open balls of radius $R$ are cross sections of the set $\{(x,y) \in X\times X\colon \rho(x,y)<R\}$, and thus they lie in $\mathcal{A}$. Due to separability (mod 0), any open set can be represented as a countable union of balls, and therefore it also lies in $\mathcal{A}$. The density of $\mathcal{B}$ in $\mathcal{A}$ arises from the maximal property of the Lebesgue sigma-algebra. For further details, we refer to [80].

A consequence of the Radon property of a measure is the following somewhat unexpected fact, which demonstrates, in a certain sense, the universality of an admissible metric triple.

Theorem 2.5 (see [80]). If $\rho_1$ and $\rho_2$ are two admissible metrics on a measure space $(X,\mu)$, then for any $\varepsilon>0$ there exists a subset $X_0$ of $ X$ such that $\mu(X_0) > 1-\varepsilon$ and the topologies induced by the metrics $\rho_1$ and $\rho_2$ on $X_0$ coincide.

From this theorem, we can deduce the following generalized Luzin theorem.

Theorem 2.6 (generalized Luzin theorem). Let $(X,\mu,\rho)$ be a metric triple, and $f$ be a measurable function on $(X,\mu)$. Then for any $\varepsilon>0$, there exists a subset $X_0 \subset X$ such that $\mu(X_0) > 1-\varepsilon$ and the function $f$ is continuous on $(X_0,\rho)$.

We mention that an alternative approach to these results was presented in [5], where Theorem 2.4 concerning the Radon property of a measure is derived from the generalized Luzin theorem.

We conclude the subsection with the following remark. With an admissible semimetric $\rho$ we associate a partition $\xi_\rho$ into sets of diameter zero: points $x,y \in X$ belong to the same element of the partition $\xi_\rho$ if and only if $\rho(x,y) = 0$. The admissibility of the semimetric $\rho$ guarantees that the partition $\xi_\rho$ is measurable. Thus, an admissible semimetric $\rho$ induces an admissible metric on the quotient space $X/\xi_\rho$.

2.1.3. Metrics and partitions

Let $\pi\colon(X,\mu)\to (Y,\nu)$ be a measurable mapping that transforms $\mu$ into $\nu$. Let $\xi$ be a partition into $\pi$-preimages of points. A classical result due to Rokhlin [39] claims the existence and uniqueness (mod 0) of conditional measures $\mu_y$ on the elements $\pi^{-1}(y)$ of this partition for $\nu$-almost all $y \in Y$, such that the measure $\mu$ is the integral of the measures $\mu_y$ with respect to $\nu$:

$$ \begin{equation*} \mu=\int_Y \mu_y\,d\nu(y). \end{equation*} \notag $$
With a measurable function $f$ on the space $(X,\mu)$ we can associate the system of restrictions $f_y=f\big|_{\pi^{-1}(y)}$ to the spaces $(\pi^{-1}(y),\mu_y)$, $y\in Y$. For $\nu$-almost every $y \in Y$ the restriction $f_y$ is a measurable function. When replacing the function $f$ by an equivalent function (coinciding with $f$ almost everywhere with respect to $\mu$), for $\nu$-almost all $y\in Y$ the traces of $f_y$ are replaced by $\mu_y$-equivalent traces. Thus, the restrictions of a measurable function to elements of a measurable partition are well defined. A similar question for multivariate functions was raised by Vershik: is it possible to consistently define the restrictions of a given almost everywhere defined multivariate function to elements of a measurable partition? The answer is negative in general. However, for so-called virtually continuous multivariate functions it is possible to provide a consistent definition of the restrictions to elements of a measurable partition. We present one possible definition of virtually continuous functions.

Definition 2.7. A measurable function $f$ on $(X^2,\mu^2)$ is called properly virtually continuous if there exists a subset $X'\subset X$ of full measure and an admissible metric $\rho$ on $(X,\mu)$ such that $f$ is continuous on $X'\times X'$ with respect to the metric $\rho\times \rho$. A measurable function $f$ on $(X^2,\mu^2)$ is called virtually continuous if it coincides with some properly virtually continuous function $\mu^2$-almost everywhere.

A simple example of a non-virtually continuous function is the indicator function of the set ‘above the diagonal’: the function $\chi_{\{x<y\}}$ on $X^2$, where $X = [0,1]$.

Theorem 2.8. If two properly virtually continuous functions $f$ and $g$ on $(X^2,\mu^2)$ coincide $\mu^2$-almost everywhere, then there exists a subset $X_0\subset X$ of full measure such that $f$ and $g$ coincide on the square $X_0^2$.

If $\xi$ is a partition into the preimages of points under a measurable mapping $\pi\colon (X,\mu) \to (Y,\nu)$, and $\{\mu_y\}_{y \in Y}$ is the corresponding system of conditional measures, then for $\nu$-almost every $y\in Y$ the restrictions of $f$ and $g$ to $\pi^{-1}(y)\times \pi^{-1}(y)$ coincide $\mu_y\times \mu_y$-almost everywhere.

The above theorem allows one to give a consistent definition of restrictions of a virtually continuous function as restrictions of an equivalent properly virtually continuous function.

Proposition 2.9. An admissible semimetric $\rho$ on a measure space $(X,\mu)$ is a properly virtually continuous function.

Indeed, an admissible semimetric $\rho$, considered as a function of two variables, is trivially continuous on $X\times X$ with respect to the semimetric $\rho\times \rho$. Therefore, the restrictions of $\rho$ to elements of a measurable partition $\xi$ are well defined. Moreover, for $\nu$-almost every $y\in Y$ the restriction of the semimetric $\rho$ to $\pi^{-1}(y)$ becomes an admissible semimetric on $(\pi^{-1}(y),\mu_y)$.
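The continuity claimed here comes down to applying the triangle inequality twice. Assuming that $\rho\times\rho$ denotes the sum semimetric $(\rho\times\rho)\bigl((x,y),(x',y')\bigr)=\rho(x,x')+\rho(y,y')$ on $X\times X$ (this is our reading of the notation; any equivalent product semimetric works as well), we have

$$ \begin{equation*} |\rho(x,y)-\rho(x',y')| \leqslant \rho(x,x')+\rho(y,y'), \qquad x,x',y,y' \in X, \end{equation*} \notag $$
so $\rho$ is $1$-Lipschitz, and in particular continuous, with respect to $\rho\times\rho$.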

For more details on virtually continuous functions, we refer to [79] and [80]. These papers discuss the relationships between the concept of virtual continuity and classical questions of analysis, such as theorems on traces of functions in Sobolev spaces and on nuclear operators and their kernels. Virtual continuity also appears in duality-type theorems like the Monge–Kantorovich duality theorems; see [80], [5], and [4].

2.1.4. The cone of summable admissible semimetrics and the m-norm

In what follows we focus on summable admissible semimetrics, that is, admissible semimetrics $\rho$ with finite integral

$$ \begin{equation*} \int_{X\times X} \rho(x,y)\,d\mu(x)\,d\mu(y) < \infty. \end{equation*} \notag $$
Given a measure space $(X,\mu)$, let $\mathcal{A}dm(X,\mu)$ denote the set of all summable admissible semimetrics on $(X,\mu)$. In § 2.2.1 we show that this set forms a convex cone in $L^1(X^2,\mu^2)$. The group $\operatorname{Aut}(X,\mu)$ of automorphisms of the space $(X,\mu)$ acts on $\mathcal{A}dm(X,\mu)$ by translations. The study of the dynamics of metrics in Chap. 3 is essentially the study of this action. Orbits of this action consist of pairwise isomorphic metrics. In what follows we discuss these orbits and metric averages over them. In particular, we investigate the behaviour of invariants of automorphisms that arise in investigations of this action. To each metric $\rho$ in $\mathcal{A}dm(X,\mu)$ there corresponds a stabilizer, which is a group of measure-preserving isometries mod 0 of the space $(X,\rho)$.

The cone $\mathcal{A}dm(X,\mu)$ is not closed in $L^1(X^2,\mu^2)$, and its closure is the set of all (not necessarily admissible) summable semimetrics. When working with admissible semimetrics, it is convenient to use another norm induced in $L^1$ by the cone of all summable semimetrics. This norm was introduced in [78].

Definition 2.10. For a function $f \in L^1(X^2,\mu^2)$ we define its m-norm (finite or infinite) as follows:

$$ \begin{equation*} \|f\|_{\mathrm{m}}=\inf \bigl\{\|\rho\|_{L^1(X^2,\mu^2)}\colon \rho \text{ is a semimetric on } (X,\mu); \ |f| \leqslant \rho \ \, \mu^2\text{-a.e.}\bigr\}. \end{equation*} \notag $$
We denote by $\mathbb{M}(X,\mu)$ the subspace of $L^1(X^2,\mu^2)$ that contains the functions with finite m-norm.

Clearly, the m-norm dominates the standard norm in $L^1(X^2,\mu^2)$, so that convergence in the m-norm implies convergence in $L^1(X^2,\mu^2)$.
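The domination can be read off from Definition 2.10 (a short check). If $\rho$ is a semimetric with $|f| \leqslant \rho$ $\mu^2$-almost everywhere, then $\|f\|_{L^1(X^2,\mu^2)} \leqslant \|\rho\|_{L^1(X^2,\mu^2)}$, and taking the infimum over all such $\rho$ yields

$$ \begin{equation*} \|f\|_{L^1(X^2,\mu^2)} \leqslant \|f\|_{\mathrm{m}}. \end{equation*} \notag $$
For a summable semimetric $\rho$ the two norms coincide: $\rho$ itself is admitted in the infimum, so $\|\rho\|_{\mathrm{m}} \leqslant \|\rho\|_{L^1(X^2,\mu^2)}$ as well.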

It is also evident that $\mathcal{A}dm(X,\mu) \subset \mathbb{M}(X,\mu)$. In § 2.2.2 we discuss the properties of the cone of admissible metrics and the m-norm. We show that the space $\mathbb{M}(X,\mu)$ and the cone $\mathcal{A}dm(X,\mu)$ are complete with respect to the m-norm.

Alongside the cone $\mathcal{A}dm(X,\mu)$, we also consider a subset of it, the cone $\mathcal{A}dm_+(X,\mu)$ of summable admissible metrics. It forms a dense subset of $\mathcal{A}dm(X,\mu)$.

2.2. Epsilon-entropy of a metric triple

2.2.1. Epsilon-entropy and a characterization of admissibility

One of the simplest functional characteristics describing a metric triple is the notion of $\varepsilon$-entropy, which goes back to Shannon (see Chap. 1).

Definition 2.11. Let $\rho$ be a measurable semimetric on $(X,\mu)$, and let $\varepsilon>0$. The $\varepsilon$-entropy $\mathbb{H}_\varepsilon(X,\mu,\rho)$ of the semimetric triple $(X,\mu,\rho)$ is defined as $\log k$, where $k$ is the minimum number (or infinity) such that the space $X$ can be represented as a union $X = X_0 \cup X_1\cup \dots \cup X_k$ of measurable sets such that $\mu(X_0)<\varepsilon$ and $\operatorname{diam}_{\rho}(X_j)<\varepsilon $ for all $j = 1,\dots, k$. For $\varepsilon\geqslant 1$ we set $\mathbb{H}_\varepsilon(X,\mu,\rho)=0$.
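As a sanity check of Definition 2.11 (a worked example of ours, not taken from the cited sources), let $X=[0,1]$ with the Lebesgue measure $\mu$ and $\rho(x,y)=|x-y|$, and let $0<\varepsilon<1$. A set of diameter less than $\varepsilon$ lies in an interval of length less than $\varepsilon$, so $\mu(X_j)<\varepsilon$ for $j \geqslant 1$. Hence, for any decomposition $X = X_0 \cup X_1\cup \dots \cup X_k$ as in the definition,

$$ \begin{equation*} 1-\varepsilon < 1-\mu(X_0) \leqslant \sum_{j=1}^k \mu(X_j) < k\varepsilon, \end{equation*} \notag $$
that is, $k>1/\varepsilon-1$. Conversely, for any integer $k>1/\varepsilon-1$ we can take $k$ disjoint intervals of common length $\ell$, where $(1-\varepsilon)/k<\ell<\min(\varepsilon,1/k)$, and let $X_0$ be the remainder. The smallest admissible $k$ is therefore $\lfloor 1/\varepsilon\rfloor$, and this computation gives

$$ \begin{equation*} \mathbb{H}_\varepsilon\bigl([0,1],\mu,|x-y|\bigr)=\log \lfloor 1/\varepsilon \rfloor, \qquad 0<\varepsilon<1. \end{equation*} \notag $$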

Sometimes it is important to consider a ‘non-diagonal’ variant of this notion with two parameters, $\varepsilon$ and $\delta$: in this case the condition $\mu(X_0)<\varepsilon$ is replaced by $\mu(X_0)<\delta$. In [72] such entropy is referred to as mm-entropy.

The property of a measurable semimetric to be admissible can easily be described in terms of its $\varepsilon$-entropy.

Lemma 2.12. A measurable semimetric is admissible if and only if its $\varepsilon$-entropy is finite for any $\varepsilon>0$.

Using this simple description, it is easy to understand that the sum of two admissible semimetrics is an admissible semimetric again. Indeed, the $\varepsilon$-entropy of the sum of semimetrics can be estimated as follows:

$$ \begin{equation*} \mathbb{H}_{2\varepsilon}(X,\mu,\rho_1+\rho_2) \leqslant \mathbb{H}_\varepsilon(X,\mu,\rho_1)+\mathbb{H}_\varepsilon(X,\mu,\rho_2). \end{equation*} \notag $$
Consequently, the set of all admissible semimetrics on the space $(X,\mu)$ and also the set $\mathcal{A}dm(X,\mu)$ are convex cones.
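The entropy estimate for the sum can be checked directly from Definition 2.11 (a sketch, with auxiliary notation $Y_j$, $Z_{ij}$ of ours). Let $X = X_0 \cup X_1\cup \dots \cup X_k$ and $X = Y_0 \cup Y_1\cup \dots \cup Y_l$ be decompositions realizing $\mathbb{H}_\varepsilon(X,\mu,\rho_1)=\log k$ and $\mathbb{H}_\varepsilon(X,\mu,\rho_2)=\log l$. Put $Z_0=X_0\cup Y_0$ and $Z_{ij}=X_i\cap Y_j$ for $1\leqslant i\leqslant k$ and $1\leqslant j\leqslant l$. Then

$$ \begin{equation*} \mu(Z_0)<2\varepsilon, \qquad \operatorname{diam}_{\rho_1+\rho_2}(Z_{ij}) \leqslant \operatorname{diam}_{\rho_1}(X_i)+\operatorname{diam}_{\rho_2}(Y_j) < 2\varepsilon, \end{equation*} \notag $$
and the number of sets $Z_{ij}$ is $kl$, so that $\mathbb{H}_{2\varepsilon}(X,\mu,\rho_1+\rho_2) \leqslant \log(kl)=\log k+\log l$.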

We can see from the definition that for a fixed metric triple $(X,\mu,\rho)$, the function $\varepsilon \mapsto \mathbb{H}_{\varepsilon}(X,\mu, \rho)$ is non-increasing, piecewise constant, and left continuous. Let

$$ \begin{equation*} \mathbb{H}_{\varepsilon+}(X,\mu, \rho)=\lim_{\delta \to 0^+} \mathbb{H}_{\varepsilon+\delta}(X,\mu, \rho). \end{equation*} \notag $$
Then the functions $\mathbb{H}_{\varepsilon+}(X,\mu, \rho)$ and $\mathbb{H}_{\varepsilon}(X,\mu,\rho)$ are lower and upper semicontinuous on the cone $\mathcal{A}dm(X,\mu)$ respectively. Moreover, the following estimate holds.

Lemma 2.13. Let $\rho_1, \rho_2 \in \mathcal{A}dm(X, \mu)$ be such that $\|\rho_1-\rho_2\|_\mathrm{m} < \delta^2/4$. Then for any $\varepsilon > 0$,

$$ \begin{equation*} \mathbb{H}_{\varepsilon+\delta}(X,\mu,\rho_1) \leqslant \mathbb{H}_{\varepsilon}(X,\mu,\rho_2). \end{equation*} \notag $$

Another way to define the entropy of a metric triple is by approximating its measure by discrete measures in the Kantorovich metric.

Definition 2.14. Let $(X,\mu,\rho)$ be a metric triple with finite first moment (so that $\rho$ is integrable on $(X^2,\mu^2)$), and let $\varepsilon>0$. Set

$$ \begin{equation*} \mathbb{H}_\varepsilon^{\rm K}(X,\mu,\rho)=\inf\{ H(\nu) \colon d_{\rm K}(\mu,\nu)<\varepsilon\}, \end{equation*} \notag $$
where the infimum is taken over all discrete measures $\nu$ on the metric space $(X,\rho)$, $H(\nu)$ is the Shannon entropy, and $d_{\rm K}$ is the Kantorovich distance between measures on $(X,\rho)$ (see [23], [60], and [66]).
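For orientation, here is an upper bound in the simplest case (our example; we assume the standard transport formulation of $d_{\rm K}$, as in the cited references). Let $X=[0,1]$ with the Lebesgue measure $\mu$ and $\rho(x,y)=|x-y|$, and let $\nu_k$ be the uniform measure on the midpoints of the intervals $[(j-1)/k,\,j/k]$, $j=1,\dots,k$. Moving the mass of each interval to its midpoint gives

$$ \begin{equation*} d_{\rm K}(\mu,\nu_k) \leqslant k\int_{-1/(2k)}^{1/(2k)} |t|\,dt=\frac{1}{4k}, \end{equation*} \notag $$
while $H(\nu_k)=\log k$. Consequently, $\mathbb{H}_\varepsilon^{\rm K}(X,\mu,\rho) \leqslant \log k$ for every integer $k>1/(4\varepsilon)$.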

The following statements provide two-sided estimates connecting the above definitions of $\varepsilon$-entropy.

Lemma 2.15. Let $(X,\mu,\rho)$ be a metric triple with finite first moment. Let $0< \delta < \varepsilon$ be such that for any subset $A \subset X$, if $\mu(A)<\delta$, then

$$ \begin{equation} \int_{A\times X}\rho\,d(\mu\times \mu) < \varepsilon-\delta. \end{equation} \tag{2.1} $$
Then
$$ \begin{equation*} \exp\bigl(\mathbb{H}_\varepsilon^{\rm K}(X,\mu,\rho)\bigr) \leqslant \exp\bigl(\mathbb{H}_\delta (X,\mu,\rho)\bigr)+1. \end{equation*} \notag $$

Lemma 2.16. Let $(X,\mu,\rho)$ be a metric triple with finite first moment. Then the following estimate holds for any $\varepsilon>0$:

$$ \begin{equation} \mathbb{H}_{2\varepsilon}(X,\mu,\rho) \leqslant \frac{1}{\varepsilon} \bigl(\mathbb{H}^{\rm K}_{\varepsilon^2}(X,\mu,\rho)+1\bigr). \end{equation} \tag{2.2} $$

We present the proofs of Lemmas 2.15 and 2.16 in Appendix A.1.

The following theorem from [78] provides several equivalent reformulations of admissibility.

Theorem 2.17 (equivalent conditions for admissibility). Let $\rho$ be a measurable semimetric on $(X,\mu)$. Then the following statements are equivalent:

(a) the semimetric $\rho$ is admissible;

(b) for any $\varepsilon>0$, the $\varepsilon$-entropy $\mathbb{H}_\varepsilon(X,\mu,\rho)$ is finite;

(c) the measure $\mu$ can be approximated by discrete measures in the Kantorovich metric $d_{\rm K}$; in other words, for any $\varepsilon>0$ the $\varepsilon$-entropy $\mathbb{H}^{\rm K}_\varepsilon(X,\mu,\rho)$ is finite;

(d) for $\mu$-almost every $x \in X$ and any $\varepsilon>0$ the ball with centre $x$ and radius $\varepsilon$ in the semimetric $\rho$ has positive measure;

(e) for any subset $A \subset X$ of positive measure the essential infimum of the function $\rho$ on $A\times A$ is zero.

The equivalence of statements (a)–(c) in the theorem has already been discussed. Statement (e) can conveniently be used as a criterion for the non-admissibility of a measurable semimetric.
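For example (an illustration of ours), let $\rho$ be the discrete metric on $X=[0,1]$ with the Lebesgue measure $\mu$, that is, $\rho(x,y)=1$ for $x\ne y$ and $\rho(x,x)=0$. This metric is measurable, and since the diagonal of $A\times A$ is $\mu^2$-null for every measurable set $A$, we have

$$ \begin{equation*} \operatorname*{ess\,inf}_{A\times A}\rho=1\ne 0 \quad \text{whenever } \mu(A)>0. \end{equation*} \notag $$
By (e), the discrete metric is not admissible. This agrees with (b): for $\varepsilon<1$ the sets of diameter less than $\varepsilon$ are single points, so no finite family of them covers a set of positive measure, and $\mathbb{H}_\varepsilon(X,\mu,\rho)=+\infty$.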

We present another characterization from [78] of the admissibility of a semimetric. It is given in terms of pairwise distances between a random sequence of points.

Theorem 2.18. Let $\rho$ be a measurable semimetric on $(X,\mu)$. Let $(x_n)_{n=1}^\infty$ be a random sequence of points chosen independently with respect to the measure $\mu$.

(i) If the semimetric $\rho$ is admissible, then for any positive constant $\varepsilon$ the probability of the following event tends to zero as $n$ goes to infinity:

$$ \begin{equation} \begin{aligned} \, &\textit{there exists an index set } I \subset \{1,2,\dots,n\} \textit{ of size} \\ &\textit{at least } \varepsilon n \textit{ such that } \rho(x_i, x_j) > \varepsilon \textit{ for any distinct } i,j \in I. \end{aligned} \end{equation} \tag{2.3} $$

(ii) If the semimetric $\rho$ is not admissible, then there exists a positive constant $\varepsilon$ such that the probability of the event (2.3) tends to one.

To conclude this subsection, we present another theorem, which allows one to estimate the $\varepsilon$-entropy of a metric triple in terms of the $\varepsilon$-entropies of its random finite subspaces.

Theorem 2.19. Let $(X, \mu, \rho)$ be a metric triple, and let $\{x_k\}_{k=1}^\infty$ be a sequence that is random with respect to the measure $\mu^\infty$. Let $(X_n, \mu_n, \rho_n)$ be a (random) finite metric triple, where $X_n = \{x_1, \dots, x_n\}$, $\mu_n$ is the uniform measure on $X_n$, and $\rho_n(x_i, x_j) = \rho(x_i, x_j)$. Then almost surely

(i) the $\varepsilon$-entropy of $\rho$ has the lower estimate

$$ \begin{equation*} \varlimsup_n\mathbb{H}_{\varepsilon}(X_n, \mu_n, \rho_n) \leqslant \mathbb{H}_{\varepsilon}(X, \mu, \rho); \end{equation*} \notag $$

(ii) the $\varepsilon$-entropy of $\rho$ has the upper estimate

$$ \begin{equation*} \varliminf_n\mathbb{H}_{\varepsilon}(X_n, \mu_n, \rho_n) \geqslant \mathbb{H}_{\varepsilon+}(X, \mu, \rho). \end{equation*} \notag $$

It should be noted that Theorem 2.19 essentially provides an estimate for the $\varepsilon$-entropy of a metric triple in terms of its matrix distribution (see § 2.3.1).

2.2.2. Convergence in the cone of admissible semimetrics

In this subsection we present a series of results from [78] which describe the properties of the space $\mathbb{M}(X,\mu)$ and the cone of admissible semimetrics $\mathcal{A}dm(X,\mu)$, equipped with the m-norm and the norm from $L^1(X^2,\mu^2)$.

Lemma 2.20. The space $\mathbb{M}(X,\mu)$ is complete in the m-norm.

Lemma 2.21. Let a sequence of summable semimetrics $\rho_n$ on $(X,\mu)$ converge to a function $\rho$ in the m-norm. If for each $\varepsilon>0$ the estimate $\mathbb{H}_\varepsilon(X,\mu,\rho_n)<+\infty$ holds for sufficiently large $n$, then $\rho$ is an admissible semimetric.

Corollary 2.22. The limit of a sequence of admissible semimetrics in the m-norm is an admissible semimetric. The cone $\mathcal{A}dm(X,\mu)$ of admissible semimetrics is closed and complete in the m-norm.

The following lemma states that the limit in $L^1(X^2,\mu^2)$ of a sequence of admissible semimetrics with uniformly bounded $\varepsilon$-entropies is an admissible semimetric.

Lemma 2.23. Let $M \subset \mathcal{A}dm(X,\mu)$ be such that for each $\varepsilon>0$ the set $\{\mathbb{H}_\varepsilon(X,\mu,\rho)\colon\rho \in M\}$ is bounded. Then the closure of the set $M$ with respect to the norm of the space $L^1(X^2,\mu^2)$ lies in the cone $\mathcal{A}dm(X,\mu)$.

Theorem 2.24. Let a sequence of summable semimetrics $\rho_n$ converge to an admissible semimetric $\rho$ in $L^1(X^2,\mu^2)$. Then $\rho_n$ converges to $\rho$ in the m-norm.

Corollary 2.25. On the cone $\mathcal{A}dm(X,\mu)$ the topology induced by the m-norm coincides with the topology induced by the standard norm from $L^1(X^2,\mu^2)$.

2.2.3. Compactness and precompactness in the cone of admissible semimetrics

According to Corollary 2.25, a set of summable admissible semimetrics is compact in the m-norm if and only if it is compact in $L^1(X^2,\mu^2)$. The theorems from [78] presented in this subsection provide a criterion for the precompactness of the family of admissible semimetrics in the m-norm.

Theorem 2.26. A set $M \subset \mathcal{A}dm(X,\mu)$ is precompact in the m-norm if and only if:

(a) uniform integrability: the set $M$ is uniformly integrable on $(X^2,\mu^2)$;

(b) uniform admissibility: for any $\varepsilon>0$ there exist $k \geqslant 0$ and a partition of $X$ into sets $X_j$, $j=1,\dots,k$, such that for each semimetric $\rho \in M$ there exists a set $A \subset X$ such that $\mu(A)<\varepsilon$ and $\operatorname{diam}_{\rho}(X_j\setminus A) < \varepsilon$ for all $j=1,\dots,k$.

It is worth noting that, according to the Dunford–Pettis theorem, uniform integrability is a criterion for precompactness in the weak topology in $L^1$.

Corollary 2.27. If $M\subset \mathcal{A}dm(X,\mu)$ is precompact in the m-norm, then its closure in $L^1(X^2,\mu^2)$ coincides with the closure in the m-norm and belongs to $\mathcal{A}dm(X,\mu)$. Moreover, for any $\varepsilon>0$ the supremum $\sup\{\mathbb{H}_\varepsilon(X,\mu,\rho)\colon \rho \in M\}$ is finite.

It turns out that for convex sets consisting of admissible semimetrics the last assertion of Corollary 2.27 is also a sufficient condition for precompactness in the m-norm.

Theorem 2.28. A convex set $M \subset \mathcal{A}dm(X,\mu)$ is precompact in the m-norm if and only if the supremum $\sup\{\mathbb{H}_\varepsilon(X,\mu,\rho)\colon \rho \in M\}$ is finite for any $\varepsilon>0$.

The criteria for precompactness mentioned above have an important application in dynamics. They are used to prove a criterion for the spectrum of a measure-preserving transformation to be discrete (see Theorem 3.15).

2.3. Metric triples: classification, matrix distribution, and distance between triples

So far, when speaking about a metric triple we have had in mind a metric on a fixed standard probability space $(X,\mu)$. In this section we temporarily shift away from this paradigm and discuss metric triples without fixing a specific standard probability space with continuous measure.

Two summable metric triples $(X_1,\mu_1,\rho_1)$ and $(X_2,\mu_2,\rho_2)$ are isomorphic if there exists an isomorphism of measure spaces $(X_1,\mu_1)$ and $(X_2,\mu_2)$ that maps the metric $\rho_1$ to the metric $\rho_2$ (mod 0). In this section we also discuss the classification of admissible metric triples up to isomorphisms. When we return to a fixed space $(X,\mu)$ and the cone $\mathcal{A}dm_+(X,\mu)$ of admissible metrics on it, we will actually speak about orbits under the action of the automorphism group $\operatorname{Aut}(X,\mu)$. As mentioned in the first chapter, it was proved independently by Gromov [14] and Vershik [55] that the classification problem for metric triples is ‘smooth’ in the following sense: there exists a complete system of invariants which characterizes an admissible metric triple up to isomorphisms, namely, the matrix distribution. This complete system of invariants has not just turned out to be simple and intuitive but has also become a significant tool for investigations of metric spaces.

In § 2.3.3 we present two natural ways to measure the distinction between non-isomorphic metric triples quantitatively: we introduce two distances and discuss their properties.

2.3.1. A classification of metric spaces with measure and matrix distribution

In this section we provide a detailed exposition of what we briefly mentioned in § 1.1.3.

For a fixed metric triple $(X,\mu,\rho)$ and a natural number $n$, let us choose randomly and independently $n$ points $x_1,\dots, x_n$ in the space $X$ in accordance with the distribution $\mu$. Consider the distance matrix between these points $(\rho(x_i,x_j))_{i,j=1}^n$, which is a random distance matrix on $n$ points. Let $\mathfrak{D}_n = \mathfrak{D}_n(X,\mu,\rho)$ denote the resulting distribution on the space of $n\times n$ matrices. The measure $\mathfrak{D}_n$ is the image of the measure $\mu^n$ (defined on $X^n$) under the mapping

$$ \begin{equation*} F_n \colon X^n \to M_n, \quad (x_1,\dots,x_n) \mapsto (\rho(x_i,x_j))_{i,j=1}^n. \end{equation*} \notag $$

Definition 2.29. The measure $\mathfrak{D}_n$ is called the matrix distribution of dimension $n$ of the metric triple $(X,\mu,\rho)$.
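To make the definition concrete (our example), take $X=[0,1]$ with the Lebesgue measure $\mu$ and $\rho(x,y)=|x-y|$. For $n=2$ the random matrix is determined by the single entry $d=|x_1-x_2|$, where $x_1$ and $x_2$ are independent and uniformly distributed, and

$$ \begin{equation*} \mathbb{P}\bigl(|x_1-x_2|\leqslant d\bigr)=1-(1-d)^2, \qquad 0\leqslant d\leqslant 1. \end{equation*} \notag $$
Thus, $\mathfrak{D}_2$ is the image of the measure with density $2(1-d)$ on $[0,1]$ under the mapping $d \mapsto \begin{pmatrix} 0 & d \\ d & 0 \end{pmatrix}$.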

The following properties of finite-dimensional matrix distributions follow directly from the definition.

Remark 2.30. For a fixed metric triple $(X,\mu,\rho)$ the distributions $\mathfrak{D}_n$ have the following properties:

(a) $\mathfrak{D}_n$ is concentrated on the space of distance matrices of size $n\times n$, denoted by $R_n$;

(b) $\mathfrak{D}_n$ is invariant under the action of the symmetric group $S_n$ by simultaneous permutations of rows and columns;

(c) $\mathfrak{D}_n$ is the projection of the distribution $\mathfrak{D}_{n+1}$ onto the $n\times n$ matrices formed by the first $n$ rows and columns.

The symbol $\mathfrak{D}_\infty = \mathfrak{D}_\infty(X,\mu,\rho)$ denotes the projective limit of the finite-dimensional distributions $\mathfrak{D}_n$, which is a distribution on the space $M_\infty$ of $\mathbb{N}\times\mathbb{N}$ matrices. This measure is concentrated on the space of infinite distance matrices (symmetric matrices with non-negative real coefficients satisfying the triangle inequality), denoted by $R_\infty$. Moreover, the measure $\mathfrak{D}_\infty$ is invariant under the action of the group $S^\infty$, which permutes columns and rows of matrices simultaneously (briefly, the infinite diagonal symmetric group). The measure $\mathfrak{D}_\infty$ is the image of the Bernoulli measure $\mu^\infty$ (defined on $X^\infty$) under the mapping

$$ \begin{equation} F_\infty \colon X^\infty \to M_\infty,\quad (x_j)_{j=1}^\infty \mapsto (\rho(x_i,x_j))_{i,j=1}^\infty. \end{equation} \tag{2.4} $$
A random matrix with respect to the distribution $\mathfrak{D}_\infty$ is the distance matrix $(\rho(x_i,x_j))_{i,j=1}^\infty$, where the sequence of points $(x_i)_{i=1}^\infty$ in $X$ is chosen randomly and independently in accordance with the distribution $\mu$.

Definition 2.31 (see [55]). The measure $\mathfrak{D}_\infty$ is called the matrix distribution of the metric triple $(X,\mu,\rho)$.

A theorem stating that the matrix distribution fully determines a metric triple up to isomorphism was established by Gromov and Vershik independently.

Gromov: The set of measures $\mathfrak{D}_n$, $n\in \mathbb{N}$, forms a complete system of invariants for the metric triple $(X,\mu,\rho)$. That is, a necessary and sufficient condition for the equivalence of two triples is the coincidence of the corresponding measures $\mathfrak{D}_n$ for all $n$.

Vershik: The measure $\mathfrak{D}_\infty$ on the space of infinite distance matrices is a complete invariant for the metric triple $(X,\mu,\rho)$. In other words, two admissible non-degenerate metrics are isomorphic if and only if their matrix distributions coincide.

The equivalence of the two conclusions is obvious. The first statement was proved by Gromov, and the proof is based on rather specialized analytic considerations. After the theorem was communicated to Vershik, the latter provided an entirely different proof based on a simple ergodic theorem (Borel’s strong law of large numbers for a sequence of independent random variables). An analysis of both proofs was performed in [14]. We present the second proof. Both proofs were obtained in the late 1990s and were published in [14] and [59], respectively.

Theorem 2.32 (Gromov and Vershik). Two metric triples $(X_1,\mu_1,\rho_1)$ and $(X_2,\mu_2,\rho_2)$ are isomorphic if and only if the matrix distributions $\mathfrak{D}_\infty(X_1,\mu_1,\rho_1) $ and $\mathfrak{D}_\infty(X_2,\mu_2,\rho_2) $ of these triples coincide.

Proof. The necessity of the condition is evident. To prove sufficiency we show that the metric triple $(X,\mu,\rho)$ is uniquely determined by its matrix distribution.

Recall that we can assume that the metric space $(X,\rho)$ is complete, and the measure $\mu$ is non-degenerate, that is, its support coincides with the whole space (there are no non-empty open sets of measure zero). From this and the ergodic theorem it follows that almost any sequence of points with respect to $\mu^{\infty}$ is dense in $(X,\rho)$. Consequently, the space that is the closure (more precisely, the completion) of almost any sequence is the whole of $X$. It remains to reconstruct the measure on the space. Using the same ergodic theorem, the measure of any ball, and even of the intersection of a finite set of balls centred at points in our sequence, can uniquely be recovered as the density of the points in this sequence lying in such an intersection. As is known, the algebra of sets spanned by the set of all (or almost all) balls of arbitrary radius in a separable metric space is dense in the algebra of all measurable sets. Therefore, the value of the measure on this algebra determines uniquely the measure on the whole space $X$. Thus, we have reconstructed the metric triple from its matrix distribution.
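The recovery step in the proof can be written out as follows (a sketch; $B(x,r)$ denotes the open ball $\{y\colon \rho(x,y)<r\}$). For fixed $i$ and $r>0$, conditionally on $x_i$ the indicators of the events $\rho(x_i,x_j)<r$, $j\ne i$, are independent and identically distributed with mean $\mu(B(x_i,r))$. By the strong law of large numbers, almost surely

$$ \begin{equation*} \mu\bigl(B(x_i,r)\bigr)=\lim_{n\to\infty}\frac{1}{n}\,\#\bigl\{j\leqslant n\colon \rho(x_i,x_j)<r\bigr\}. \end{equation*} \notag $$
The right-hand side depends only on the $i$th row of the distance matrix. Applying this to the countably many pairs $(i,r)$ with rational $r$, and in the same way to finite intersections of balls, expresses the measure in terms of the matrix alone.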

Remark 2.33. The conclusion of Theorem 2.32 fails if we replace admissible metrics by admissible semimetrics. A counterexample can be provided by an admissible semimetric that does not distinguish some pairs of points and a metric obtained from it by taking the quotient by the pairs of points with zero distance.

Also, the conclusion of Theorem 2.32 fails if we abandon the condition of metric admissibility. Metrics obtained from the aforementioned semimetrics by adding a constant have equal matrix distributions but are not isomorphic.
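One concrete instance of the first counterexample in Remark 2.33 is the following (the choice of spaces is ours). Let $X_1=[0,1]^2$ with the Lebesgue measure $\mu_1$ and the semimetric $\rho_1\bigl((a,b),(a',b')\bigr)=|a-a'|$, and let $X_2=[0,1]$ with the Lebesgue measure $\mu_2$ and the metric $\rho_2(a,a')=|a-a'|$. Both are admissible, and since the distances in $X_1$ involve only the first coordinates, which are independent and uniformly distributed on $[0,1]$, we have

$$ \begin{equation*} \mathfrak{D}_\infty(X_1,\mu_1,\rho_1)=\mathfrak{D}_\infty(X_2,\mu_2,\rho_2). \end{equation*} \notag $$
Nevertheless, the triples are not isomorphic: distinct points of $X_2$ have distinct distance functions $\rho_2(a,\cdot\,)$, whereas $\rho_1\bigl((a,b),\cdot\,\bigr)=\rho_1\bigl((a,b'),\cdot\,\bigr)$ for all $b$ and $b'$.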

2.3.2. A characterization of matrix distributions

The Gromov–Vershik theorem (Theorem 2.32) classifies metric triples exhaustively in terms of matrix distributions, that is, probability measures on the set $R_\infty$ of infinite distance matrices. However, a question that is essential for the final resolution of classification problems and remains open is to describe possible values of the invariants; in this case, it is about describing measures that can be the matrix distributions of mm-spaces. We noted before that the distributions must be invariant and ergodic with respect to the action of the diagonal symmetric group. But this condition is not sufficient. It should be mentioned that the set of all invariant measures on the set of infinite symmetric matrices (not necessarily distance matrices) with respect to the diagonal group was found by Aldous [2], and its intersection with the ergodic measures on $R_\infty$ is much wider than the set of matrix distributions. The problem we consider consists in the classification of metrics as measurable functions of two variables with respect to the simultaneous action on both variables by the group of transformations preserving the measure. Under certain conditions, this is precisely the property that distinguishes matrix distributions from a wider class of measures on infinite matrices.

Let us note the following obvious property of metrics $\rho(\,\cdot\,{,}\,\cdot\,)$ as measurable functions of two variables. A measurable symmetric function $f$ of two variables is called pure if the mapping $x \mapsto f(x, \cdot\,)$ is injective mod 0, that is, for almost all pairs $x \ne x'$ the corresponding functions of one variable $f(x,\cdot\,)$ and $f(x',\cdot\,)$ are not almost everywhere equal. It is obvious that a metric is a pure function (a semimetric is not necessarily so). It is not difficult to prove the following property.

Lemma 2.34. The following two properties of a metric triple $(X,\mu,\rho)$ are equivalent.

(a) The group of $\mu$-preserving isometries mod 0 on the space $(X,\rho)$ is trivial (consists of the identity transformation). In this case we say that the admissible metric is irreducible.

(b) The map $F_\infty$ from $(X^\infty,\mu^\infty)$ to $M_\infty$ is an isomorphism onto its image (see (2.4)).

Thus, the matrix distribution of an irreducible metric is an isomorphic image of the Bernoulli measure. Our task is therefore to describe the isomorphic images of Bernoulli measures under the mapping $F_\infty$. These images, as measures on distance matrices, will be referred to as simple measures. The internal description of simple measures is associated with a more detailed consideration of sigma-subalgebras on which the measures are defined, and we do not dwell on this here. In [71] such a description was given in a similar case, namely, for not necessarily symmetric functions of several variables. On the other hand, in [68], for measures on enumerations of partially ordered sets, a concept equivalent to the concept of measure dimension on matrices was introduced; simplicity corresponds to dimension $1$. In connection with Aldous’s theorem, it is relevant to mention that the absence of the notion of dimension (simplicity) makes it difficult to understand why invariant measures in this theorem split into two different classes; the distinction is in the sigma-subalgebras on which the measures are defined. A detailed overview of the relevant concepts and their applications will be presented elsewhere.

In conclusion of this subsection we present a theorem that provides an entropy-based description of matrix distributions of metric triples.

Let $\varepsilon>0$. For a finite distance matrix of size $n\times n$ we define its $\varepsilon$-entropy as the $\varepsilon$-entropy of an $n$-point space with the uniform measure and the metric defined by this matrix. We say that an infinite distance matrix $A=(a_{i,j})_{i,j=1}^\infty$ is entropy admissible if for each $\varepsilon>0$ the $\varepsilon$-entropies of its corner minors $n\times n$, that is, matrices $(a_{i,j})_{i,j=1}^n$, are uniformly bounded with respect to $n$. We say that the matrix $A$ is summable if there exists a finite limit

$$ \begin{equation*} \lim_{n\to \infty} \frac{1}{n^2}\sum_{i=1}^n\,\sum_{j=1}^n a_{i,j}. \end{equation*} \notag $$
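For small matrices both quantities in these definitions can be computed by brute force. The sketch below is only an illustration under stated assumptions: balls are open and centred at points of the finite space, the logarithm is taken to base 2, and the function names `epsilon_entropy` and `corner_average`, as well as the sample matrix `A`, are hypothetical. The exhaustive search is exponential in the matrix size and is meant for toy examples only.

```python
import itertools
import math


def epsilon_entropy(dist, eps):
    """log2 of the least number of open eps-balls (centred at points of the
    n-point space with uniform measure) covering all but measure < eps."""
    n = len(dist)
    balls = [frozenset(j for j in range(n) if dist[i][j] < eps) for i in range(n)]
    for k in range(1, n + 1):
        for centers in itertools.combinations(range(n), k):
            covered = set().union(*(balls[c] for c in centers))
            if (n - len(covered)) / n < eps:
                return math.log2(k)
    raise ValueError("eps must be positive and dist must have a zero diagonal")


def corner_average(dist, n):
    """Average of the entries of the n x n corner minor, as in the summability limit."""
    return sum(dist[i][j] for i in range(n) for j in range(n)) / n**2


# Hypothetical 4-point space made of two tight clusters at mutual distance 1.
A = [
    [0.0, 0.1, 1.0, 1.0],
    [0.1, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.1],
    [1.0, 1.0, 0.1, 0.0],
]
print(epsilon_entropy(A, 0.2), corner_average(A, 4))
```

Entropy admissibility then asks that, for each fixed $\varepsilon$, the values of `epsilon_entropy` on the corner minors stay bounded as their size grows.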

Theorem 2.35 (entropy characterisation of the matrix distributions of admissible metrics). The matrix distribution $\mathfrak{D}_\infty(X,\mu,\rho)$ of a summable metric triple $(X,\mu,\rho)$ is a measure on $R_\infty$ that is ergodic with respect to simultaneous permutations of rows and columns and is concentrated on the summable entropy admissible matrices. Conversely, any ergodic measure $D$ concentrated on summable entropy admissible matrices is the matrix distribution of some summable metric triple $(X,\mu,\rho)$.

We comment on the proof of this theorem. The summability and entropy admissibility of $\mathfrak{D}_\infty$-almost every matrix follows from the law of large numbers and Theorem 2.19. Conversely, let $A=(a_{i,j})_{i,j=1}^\infty$ be a summable entropy admissible distance matrix (as is almost every point in the support of measure $D$). For each $n$ let $(X_n,\mu_n,\rho_n)$ be an $n$-point semimetric space defined by the matrix $(a_{i,j})_{i,j=1}^n$ and equipped with a uniform measure. The summability and entropy admissibility assumptions on the matrix $A$ allow us to prove that the sequence of semimetric triples $(X_n,\mu_n,\rho_n)$ is precompact with respect to the special metric $\operatorname{Dist}_{\rm m}$ (see Definition 2.36 and Theorem 2.38 below). Let an admissible semimetric triple $(X,\mu,\rho)$ be a limit point of this sequence of triples. Then the matrix distribution $\mathfrak{D}_\infty(X,\mu,\rho)$ is the weak limit of a subsequence of matrix distributions $\mathfrak{D}_\infty(X_n,\mu_n,\rho_n)$, thus it turns out to be equal to the original measure $D$ because of its ergodicity.

2.3.3. Two metrics on metric triples

Let $(X_1,\mu_1,\rho_1)$ and $(X_2,\mu_2,\rho_2)$ be two semimetric triples. We consider two ways to measure the distance between them, both in the spirit of the Gromov–Hausdorff distance. The first is to embed both measure spaces into a space $(X,\mu)$ and minimize the distance between the semimetrics in the m-norm.

Definition 2.36. The distance $\operatorname{Dist}_{\rm m}$ between two semimetric triples $(X_1,\mu_1,\rho_1)$ and $(X_2,\mu_2,\rho_2)$ is defined as the infimum, over all possible couplings $(X,\mu)$ of the measure spaces $(X_1,\mu_1)$ and $(X_2,\mu_2)$ (with projections $\psi_1 \colon X \to X_1$ and $\psi_2 \colon X\to X_2$), of the m-distances between the semimetrics $\rho_1\circ \psi_1$ and $\rho_2\circ \psi_2$ on the space $(X,\mu)$:

$$ \begin{equation*} \begin{aligned} \, &\operatorname{Dist}_{\rm m}\bigl((X_1,\mu_1,\rho_1),(X_2,\mu_2,\rho_2)\bigr) \\ &\qquad=\inf\bigl\{\|\rho_1\circ \psi_1-\rho_2\circ \psi_2\|_\mathrm{m} \mid \psi_{1,2} \colon(X,\mu) \to (X_{1,2},\mu_{1,2})\bigr\}. \end{aligned} \end{equation*} \notag $$

The second natural way to measure the distance between semimetric triples is to embed isometrically both semimetric spaces into a semimetric space $(X,\rho)$ and minimize the distance between the measures, for instance, using the Kantorovich metric $d_{\rm K}$.

Definition 2.37. The distance $\operatorname{Dist}_{\rm K}$ between two semimetric triples $(X_1,\mu_1,\rho_1)$ and $(X_2,\mu_2,\rho_2)$ is defined as the infimum over all possible isometric embeddings $\phi_1\colon (X_1,\rho_1) \to (X,\rho)$ and $\phi_2\colon (X_2,\rho_2) \to (X,\rho)$ of the Kantorovich distances on the space $(X,\rho)$ between the measures $\phi_1(\mu_1)$ and $\phi_2(\mu_2)$:

$$ \begin{equation*} \begin{aligned} \, &\operatorname{Dist}_{\rm K}\bigl((X_1,\mu_1,\rho_1),(X_2,\mu_2,\rho_2)\bigr) \\ &\qquad=\inf\bigl\{d_{\rm K}(\phi_1(\mu_1),\phi_2(\mu_2))\mid \phi_{1,2}\colon(X_{1,2},\rho_{1,2}) \to (X,\rho)\bigr\}. \end{aligned} \end{equation*} \notag $$

The following theorem states that the two distances described above are equivalent.

Theorem 2.38. For any integrable semimetric triples $(X_1,\mu_1,\rho_1)$ and $(X_2,\mu_2,\rho_2)$, the following inequalities hold:

$$ \begin{equation*} \begin{aligned} \, \operatorname{Dist}_{\rm K}\bigl((X_1,\mu_1,\rho_1),(X_2,\mu_2,\rho_2)\bigr) &\leqslant \operatorname{Dist}_{\rm m} \bigl((X_1,\mu_1,\rho_1),(X_2,\mu_2,\rho_2)\bigr) \\ \textit{and} \qquad \operatorname{Dist}_{\rm m}\bigl((X_1,\mu_1,\rho_1),(X_2,\mu_2,\rho_2)\bigr) &\leqslant 2\operatorname{Dist}_{\rm K} \bigl((X_1,\mu_1,\rho_1),(X_2,\mu_2,\rho_2)\bigr). \end{aligned} \end{equation*} \notag $$

The proof of Theorem 2.38 (as well as of Theorem 2.40 below) is provided in Appendix A.2.

Remark 2.39. The functions $\operatorname{Dist}_{\rm K}$ and $\operatorname{Dist}_{\rm m}$ are metrics on the set of equivalence classes of admissible integrable metric triples. In particular, they are equal to zero if and only if two admissible metric triples are isomorphic. The convergence of a sequence of admissible metric triples in either of these metrics implies the weak convergence of their matrix distributions.

The set of all admissible semimetric triples is complete with respect to $\operatorname{Dist}_{\rm K}$ and $\operatorname{Dist}_{\rm m}$.

The functions $\operatorname{Dist}_{\rm K}$ and $\operatorname{Dist}_{\rm m}$ are semimetrics on the set of equivalence classes of admissible integrable semimetric triples. For instance, the distance between an admissible semimetric triple and a metric triple obtained from it by taking the quotient by the sets of zero diameters is zero.

The following theorem is an analogue of Theorem 2.26 and provides a criterion for the precompactness of a family of admissible semimetric triples in the $\operatorname{Dist}_{\rm m}$-metric (and $\operatorname{Dist}_{\rm K}$-metric).

Theorem 2.40. A set $M = \{(X_i,\mu_i,\rho_i)\colon i \in I\}$ of summable admissible semimetric triples is precompact in the $\operatorname{Dist}_{\rm m}$-metric (and $\operatorname{Dist}_{\rm K}$-metric) if and only if:

(a) the semimetrics $\rho_i$ are uniformly integrable:

$$ \begin{equation*} \lim_{R \to \infty}\,\sup_{i\in I} \int_{\rho_i > R} \rho_i \, d\mu_i^2=0; \end{equation*} \notag $$

(b) for any $\varepsilon>0$ the $\varepsilon$-entropies are uniformly bounded:

$$ \begin{equation*} \sup_{i \in I} \mathbb{H}_\varepsilon(X_i,\mu_i,\rho_i) <\infty. \end{equation*} \notag $$

A slightly different version of Theorem 2.40 can be found in [13], where various distances between metric triples and their properties are considered; also see [46].

In [11] the authors investigated the Lipschitz properties of mappings associating the finite-dimensional matrix distribution $\mathfrak{D}_n$, $n\geqslant 1$, with a metric triple. To do this, they introduced a specific distance on the space of square matrices $M_n$ of dimension $n$, and the distance between distributions was defined using the Prokhorov metric, which metrizes the weak topology. Let us formulate a result of similar kind for infinite-dimensional matrix distributions $\mathfrak{D}_\infty$.

It should be noted that if $\rho$ is a summable semimetric on the space $(X,\mu)$, then the random distance matrix $A$ is almost surely summable, meaning it has a finite mean:

$$ \begin{equation*} \operatorname{av}(A):=\lim_{n \to \infty}\frac{1}{n^2}\sum_{i=1}^n\, \sum_{j=1}^n A_{i,j}=\int_{X^2}\rho\, d\mu^2. \end{equation*} \notag $$
For two summable distance matrices $A$ and $B$ we define the distance between them as follows:
$$ \begin{equation*} \operatorname{mdist}(A,B)=\inf\{\operatorname{av}(D)\colon |A_{i,j}-B_{i,j}|\leqslant D_{i,j}\ \forall\,i,j \in \mathbb{N}\}, \end{equation*} \notag $$
where the infimum is taken over all possible summable distance matrices $D$ that element-wise majorize the difference between $A$ and $B$. We note a similarity between the definition of the semimetric $\operatorname{mdist}$ and the m-norm (see Definition 2.10).

Using the distance $\operatorname{mdist}$ on the space of summable distance matrices, we can use the corresponding Kantorovich distance between the matrix distributions.

Theorem 2.41. The Kantorovich distance between matrix distributions of two metric triples does not exceed the distance $\operatorname{Dist}_{\rm m}$ between these triples.

2.3.4. The Urysohn universal metric space

An outstanding discovery made by Urysohn (1898–1924) and published in his posthumous paper [47] of 1927 was the construction of a universal complete separable metric space. This space is now referred to as the Urysohn space. It is the unique (up to isometry) complete separable metric space $(\mathbb{U},\rho_{\rm U})$ possessing two properties:

(a) universality, meaning that any separable metric space can be isometrically embedded into $\mathbb{U}$;

(b) homogeneity, which means that for any two isometric compact subsets of $\mathbb{U}$ and any isometry between them there exists an extension of this isometry to an isometry of the whole space onto itself.

Largely forgotten for many years, this space has, since the 2000s, become an object of study for many mathematicians. We highlight one of the many results: a significant property of the Urysohn space is its ‘typicality’ in the following sense.

Theorem 2.42 (see [61]). Consider the set $\mathrm{M}$ of all metrics on a countable set N (for example, on the set of natural numbers) and equip it with the natural weak topology. Then an everywhere dense $G_{\delta}$ (that is, ‘typical’) subset of $\mathrm{M}$ consists of metrics for which the completion of the set N with respect to this metric is isometric to the Urysohn space $(\mathbb{U},\rho_{\rm U})$.

The proof of this fact refines the construction of the Urysohn space: this space is constructed using a recursive procedure defining a distance matrix (the Urysohn matrix); see [61]. Of course, by defining a probabilistic Borel non-degenerate continuous measure $\mu$ on the Urysohn space we obtain a metric triple $(\mathbb{U},\mu,\rho_{\rm U})$. This raises a question: how large is the part of the space of all metric triples $\mathcal{A}dm_+(X,\mu)$ on a Lebesgue space $(X,\mu)$ that consists of the spaces isomorphic to the Urysohn space?

Consider the weak topology on the space of metric triples. To do this, we identify each metric triple with its matrix distribution (a probability measure on the space $R_\infty$ of distance matrices). The topology on the matrix distributions is the weak topology on the space of measures on $R_\infty$, and it induces the weak topology on the triples. According to Theorem 2.42 mentioned above, the set of Urysohn matrices (defining a metric space whose completion is the Urysohn space) is a dense $G_\delta$ set in $R_\infty$. Hence the set of probability measures concentrated on the Urysohn matrices is a dense $G_\delta$ subset of the space of probability measures on $R_\infty$. It seems that the set of matrix distributions concentrated on Urysohn matrices is a dense $G_\delta$ subset of the set of matrix distributions of metric triples.

The question about the type of metric spaces that form a typical set in the space of triples with respect to the norm topology is open.

A natural and highly important question is how to define non-degenerate continuous measures on the Urysohn space. Are there any distinguished measures among them, like the Wiener measure in the space $C(0,1)$ of continuous functions on $[0,1]$ (notably, this space is universal but not homogeneous)? This question is open and related to another significant question, of whether there exists any distinguished structure in the space $\mathbb{U}$. Recall (see [7]) that a structure of an abelian continuous (not locally compact) group can be introduced in $\mathbb{U}$, although, unfortunately, not in a unique way. Each non-degenerate measure on $\mathbb{U}$ corresponds to a matrix distribution, which allows us to pose the following question: find matrix distributions concentrated on Urysohn matrices. In fact, this problem reduces to constructing an ergodic measure concentrated on the set of Urysohn distance matrices and invariant under the infinite symmetric group.

Chapter 3. Dynamics on admissible metrics

3.1. Scaling entropy

We assume the reader to be familiar with the basics of the classical entropy theory (see, for example, [26], [29], [35], and [41]). As we mentioned above, we study a new variant of entropy theory based on the dynamics of admissible metrics. In fact, the usage of a metric was mentioned in a paper by Shannon but has never been developed further and has been forgotten. The introduction of a metric helps one to distinguish new properties of an automorphism and allows one to extend Kolmogorov’s theory to zero-entropy automorphisms. Similar ideas were proposed in [9] and [24]; also see the survey [21]. Following a definition due to Vershik, we formulate the theory of scaling entropy, the beginnings of which were stated in [62]–[64].

3.1.1. The definition of scaling entropy. Asymptotic classes

Let $T$ be an automorphism of a standard probability space $(X,\mu)$. Recall that for a summable admissible semimetric $\rho \in \mathcal{A}dm(X,\mu)$ we denote by $T^{-1}\rho$ its translation by the automorphism $T$ and by $T_{\rm av}^n\rho$ the average of its first $n$ translations:

$$ \begin{equation*} T_{\rm av}^n\rho(x,y)=\frac{1}{n}\sum_{i=0}^{n-1}\rho(T^i x,T^i y). \end{equation*} \notag $$
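The averaging formula translates directly into code. The following minimal sketch assumes a toy system of our own choosing, namely the rotation of the circle $[0,1)$ by a fixed irrational angle; the names `averaged_semimetric`, `rotation`, `circle_metric` and `cut_semimetric` are hypothetical. For the rotation-invariant circle metric the average coincides with the metric itself, while for a cut semimetric it genuinely depends on $n$.

```python
ALPHA = 2**0.5 - 1  # rotation angle, an assumption made for this example


def rotation(x):
    """T(x) = x + ALPHA mod 1, a measure-preserving map of [0, 1)."""
    return (x + ALPHA) % 1.0


def circle_metric(x, y):
    """Rotation-invariant metric on the circle [0, 1)."""
    d = abs(x - y)
    return min(d, 1.0 - d)


def cut_semimetric(x, y):
    """Cut semimetric of the partition {[0, 1/2), [1/2, 1)}."""
    return float((x < 0.5) != (y < 0.5))


def averaged_semimetric(rho, T, n):
    """Return the function (x, y) -> (1/n) * sum_{i<n} rho(T^i x, T^i y)."""
    def rho_av(x, y):
        total = 0.0
        for _ in range(n):
            total += rho(x, y)
            x, y = T(x), T(y)
        return total / n
    return rho_av


invariant_average = averaged_semimetric(circle_metric, rotation, 50)(0.1, 0.3)
cut_average = averaged_semimetric(cut_semimetric, rotation, 50)(0.1, 0.3)
print(invariant_average, cut_average)
```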
In this chapter we study the asymptotic behaviour of the sequence of $\varepsilon$-entropies of semimetric triples $(X,\mu,T_{\rm av}^n\rho)$. Recall that the $\varepsilon$-entropy $\mathbb{H}_\varepsilon(X,\mu,\rho)$ of a semimetric triple is the logarithm of the minimum possible number of balls of radius $\varepsilon$ that cover the whole space $X$ up to a set of measure less than $\varepsilon$ (see Definition 2.11). Since we are only interested in the asymptotics of such functions, we consider them up to asymptotic equivalence in the following sense.

For two sequences $h = \{h_n\}$ and $h' = \{h'_n\}$ of non-negative numbers we write $h \preceq h'$ if $h_n=O(h'_n)$, and we write $h \asymp h'$ if both the relations $h\preceq h'$ and $h' \preceq h$ hold; in this case we say that the sequences $h$ and $h'$ are equivalent.

Definition 3.1. Given two functions $\Phi,\Psi \colon \mathbb{R}_+\times \mathbb{N} \to \mathbb{R}_+$, we write $\Phi \preceq \Psi$ if for any $\varepsilon>0$ there exists $\delta>0$ such that

$$ \begin{equation*} \Phi(\varepsilon,n) \preceq \Psi(\delta,n), \qquad n \to \infty. \end{equation*} \notag $$
In this case we say that $\Psi$ asymptotically dominates $\Phi$.

We call $\Phi$ and $\Psi$ equivalent and write $\Phi \asymp \Psi$ if $\Psi \preceq \Phi \preceq \Psi$. We denote by $[\Phi]$ the equivalence class of the given function $\Phi$ with respect to the relation $\asymp$, and we call it the asymptotic class of $\Phi$. The relation $\preceq$ extends naturally to equivalence classes and forms a partial order on the set of asymptotic classes.
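
As a small illustration of these definitions (a toy example of ours, not taken from the cited literature), constants depending on $\varepsilon$ are invisible to the relation $\asymp$. Consider, for instance,

$$ \begin{equation*} \Phi(\varepsilon,n)=\frac{n}{\varepsilon}, \qquad \Psi(\varepsilon,n)=n, \qquad \Psi'(\varepsilon,n)=n^2. \end{equation*} \notag $$
For every $\varepsilon>0$ and every $\delta>0$ we have $\Phi(\varepsilon,n)=\varepsilon^{-1}\Psi(\delta,n)$ and $\Psi(\varepsilon,n)=\delta\,\Phi(\delta,n)$, so that $\Phi \preceq \Psi$ and $\Psi \preceq \Phi$, that is, $[\Phi]=[\Psi]$. On the other hand, $\Psi \preceq \Psi'$ holds, while $\Psi' \preceq \Psi$ fails because $n^2 \ne O(n)$; hence $[\Psi]$ lies strictly below $[\Psi']$ in the partial order.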

Let us note that the equivalence class of a function $\Phi$ is in fact defined in two steps: first, for fixed $\varepsilon$ we consider the family of all sequences equivalent to $\Phi(\varepsilon, n)$ as $n$ tends to infinity, and then we identify all functions $\Phi$ with a given least upper bound of such families as $\varepsilon$ tends to zero (see § 3.3.2). Hence the class $[\Phi]$ can be viewed as a germ of the asymptotics of $\Phi$ at the point $(0, +\infty)$.

Given a semimetric $\rho \in \mathcal{A}dm(X,\mu)$, we consider the function $\Phi_\rho \colon \mathbb{R}_+\times \mathbb{N} \to \mathbb{R}_+$ defined by

$$ \begin{equation} \Phi_\rho(\varepsilon,n)=\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho). \end{equation} \tag{3.1} $$

The following theorem states that the asymptotic class of $\Phi_\rho$ does not depend on the choice of the admissible metric $\rho$ or, more generally, of the generating semimetric. A semimetric $\rho$ on $(X,\mu)$ is called generating for $T$ (or $T$-generating) if its translations by the action of $T$ separate points mod 0, that is, for a subset $X_0\subset X$ of full measure, for any $x,y \in X_0$ there exists $n$ such that $T^n\rho(x,y)>0$. Clearly, any metric is a generating semimetric. Note that the definition of a generating semimetric is a direct analogue of a generating partition in classical entropy theory (see [41]). The existence of countable and finite generating partitions was investigated in [41] and [30]. The following theorem, proved in [84], corresponds naturally to the Kolmogorov–Sinai theorem, which states that Kolmogorov entropy does not depend on a generating partition.

Theorem 3.2. Let $\rho_1, \rho_2 \in \mathcal{A}dm(X,\mu)$ be $T$-generating semimetrics. Then the classes $[\Phi_{\rho_1}]$ and $[\Phi_{\rho_2}]$ coincide.

The proof of Theorem 3.2 is based on the following lemma.

Lemma 3.3. Let $\rho_1,\rho_2 \in \mathcal{A}dm(X,\mu)$. If $\rho_1$ is $T$-generating then for any $\varepsilon>0$ there exists $\delta>0$ such that

$$ \begin{equation*} \mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_2) \preceq \mathbb{H}_\delta(X,\mu,T_{\rm av}^n\rho_1), \qquad n \to \infty. \end{equation*} \notag $$

An outline of the proof of Lemma 3.3 and a more general version of this lemma which makes it possible to refine the invariant under consideration are presented in § 3.7.1.

Theorem 3.2 allows us to give the following definition of the scaling entropy of a dynamical system $(X,\mu,T)$.

Definition 3.4. The scaling entropy $\mathcal{H}(X,\mu, T)$ of a system $(X,\mu,T)$ is the asymptotic class $[\Phi_\rho]$ for some (hence any) $T$-generating semimetric $\rho \in \mathcal{A}dm(X,\mu)$.

We emphasize that the class $\mathcal{H}(X,\mu,T)$ is preserved by isomorphisms and is a measure-theoretic invariant of dynamical systems. Let us also note that scaling entropy is a monotone function with respect to taking a quotient.

Remark 3.5. Assume that a system $(X_2,\mu_2,T_2)$ is a quotient of another system $(X_1,\mu_1,T_1)$. Then $\mathcal{H}(X_2,\mu_2, T_2) \preceq \mathcal{H}(X_1,\mu_1, T_1)$.

An example of computation of the scaling entropy: the Bernoulli shift

The invariant definition of scaling entropy in terms of measurable metrics makes it possible in many cases to establish connections between measure-theoretic and topological dynamics (see, for example, [44], [51], and [50]). However, for an explicit computation of the scaling entropy of particular dynamical systems it is often useful to choose a generating partition and the corresponding cut semimetric (which is generating too). In this case the computation of our invariant reduces to considering a sequence of finite-dimensional cubes endowed each with the Hamming distance and the projection of a certain stationary measure $\mu$ onto the first $n$ coordinates. As an example of such a computation, we consider the classical Bernoulli shift on the binary alphabet.

Theorem 3.6. The shift map on the space of all binary sequences $X = \{0,1\}^\mathbb{Z}$ with the Bernoulli measure $\mu$ with parameter $1/2$ has scaling entropy $\mathcal{H} = [n]$.

Proof. Since the scaling entropy does not depend on the choice of the metric, it is sufficient for our computation to consider only the cut semimetric $\rho$ corresponding to the partition into the preimages of the zeroth coordinate. That is, for two sequences $x,y \in \{0,1\}^\mathbb{Z}$ the distance between $x$ and $y$ with respect to the semimetric $\rho$ is $|x_0 - y_0|$.

The average $T_{\rm av}^n \rho$ is the Hamming distance corresponding to the first $n$ coordinates. Therefore, the semimetric triple $(\{0,1\}^\mathbb{Z}, \mu, T_{\rm av}^n \rho)$, up to taking the quotient by the sets of diameter $0$, is isomorphic to the binary cube of dimension $n$ with the uniform measure and Hamming distance.

Fix $\varepsilon \in (0,1/2)$. The measure of a ball of radius $\varepsilon$ does not exceed $2^{-c(\varepsilon)n }$ by the central limit theorem for the Bernoulli scheme. Hence

$$ \begin{equation*} c(\varepsilon) n \leqslant \mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho) \leqslant n, \end{equation*} \notag $$
where the upper bound is straightforward — the logarithm of the number of elements in the binary cube of dimension $n$. Therefore, the asymptotic class of $\Phi_\rho(\varepsilon, n) = \mathbb{H}_\varepsilon(X,\mu, T_{\rm av}^n\rho)$ coincides with the class of the function $(\varepsilon,n) \mapsto n$. Hence $\mathcal{H} = [n]$.
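
The two-sided estimate in this proof can be checked numerically on the binary cube. The sketch below is an illustration under assumptions of our own: balls are taken closed in the normalized Hamming distance, logarithms are to base 2, and the function names `hamming_ball_measure` and `entropy_lower_bound` are hypothetical. It uses only the counting argument that $k$ balls of measure at most $m$ cover a set of measure at most $km$, so that covering measure at least $1-\varepsilon$ requires $k \geqslant (1-\varepsilon)/m$.

```python
import math


def hamming_ball_measure(n, eps):
    """Uniform measure of a closed ball of radius eps in {0,1}^n with the
    normalized Hamming distance (at most eps * n differing coordinates)."""
    radius = math.floor(eps * n + 1e-9)  # small guard against rounding in eps * n
    return sum(math.comb(n, k) for k in range(radius + 1)) / 2**n


def entropy_lower_bound(n, eps):
    """log2 of (1 - eps) / mu(ball): a lower bound for the eps-entropy,
    since every ball in the cube has the same measure."""
    return math.log2((1 - eps) / hamming_ball_measure(n, eps))


for n in (100, 200, 400):
    lower = entropy_lower_bound(n, 0.25)
    # The upper bound n is the logarithm of the number of points of the cube.
    print(n, lower / n)
```

The printed ratios give, for each $n$, an explicit value that can play the role of the constant $c(\varepsilon)$ in the lower bound for that $n$.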

3.1.2. The definition of scaling entropy in terms of the Kantorovich distance

In the definition of scaling entropy, in place of the $\varepsilon$-entropy $\mathbb{H}_{\varepsilon}(X,\mu,\rho)$ we could use the quantity $\mathbb{H}^{\rm K}_{\varepsilon}(X,\mu,\rho)$ defined using an approximation in the Kantorovich distance of the measure $\mu$ by discrete measures (see Definition 2.14). However, by doing so we obtain the same invariant: Lemma 2.15 and Lemma 2.16 show that the asymptotic behaviour of $\mathbb{H}^{\rm K}_{\varepsilon}(X,\mu, T_{\rm av}^n\rho)$ and $\mathbb{H}_{\varepsilon}(X,\mu,T_{\rm av}^n\rho)$ is the same. Hence we have the following proposition.

Proposition 3.7. Let $T$ be a measure-preserving transformation of a measure space $(X,\mu)$ and $\rho \in \mathcal{A}dm(X,\mu)$ be a $T$-generating semimetric. Then

$$ \begin{equation*} \mathbb{H}^{\rm K}_{\varepsilon}(X,\mu,T_{\rm av}^n\rho) \in \mathcal{H}(X,\mu,T). \end{equation*} \notag $$

Proof. We find some $\varepsilon,\delta > 0$ satisfying the assumptions of Lemma 2.15. Notice that inequality (2.1) holds for these $\varepsilon$ and $\delta$, for all averages $T_{\rm av}^n\rho$ of the semimetric $\rho$. Applying Lemmas 2.15 and 2.16 to $T_{\rm av}^n\rho$ we obtain
$$ \begin{equation*} \mathbb{H}^{\rm K}_{\varepsilon}(X,\mu,T_{\rm av}^n\rho) \leqslant 2\mathbb{H}_{\delta}(X,\mu,T_{\rm av}^n\rho) \leqslant \frac{4}{\delta}\bigl(\mathbb{H}^{\rm K}_{\delta^2/4} (X,\mu,T_{\rm av}^n\rho)+1\bigr). \end{equation*} \notag $$
Therefore, $\mathbb{H}^{\rm K}_{\varepsilon}(X,\mu,T_{\rm av}^n\rho) \in [\Phi_\rho] = \mathcal{H}(X,\mu,T)$.

3.1.3. The definition of scaling entropy in terms of a partition function

Instead of the usual average of a metric over $n$ iterations of a transformation $T$, we could consider the partition function from § 1.2.6 (also see [69]), that is, a weighted average with exponentially decaying weights. Recall that

$$ \begin{equation*} \Omega_T(\rho,z)\equiv \Omega_T(z)=(1-z)\sum_{n=0}^{\infty} z^n \rho(T^n x, T^n y),\qquad z\in [0,1). \end{equation*} \notag $$
Then we can consider an entropy function defined similarly to (3.1):
$$ \begin{equation*} \widetilde\Phi_\rho(\varepsilon,z)= \mathbb{H}_\varepsilon(X,\mu,\Omega_T(\rho,z)). \end{equation*} \notag $$
Following Definition 3.1 we can consider the asymptotic class $[\widetilde\Phi_\rho]$. Then, following the proof of Theorem 3.2 it is not difficult to show that this class does not depend on the choice of a generating semimetric $\rho \in \mathcal{A}dm(X, \mu)$ and is a measure-theoretic invariant. However, this invariant provides nothing essentially new: the class $[\widetilde\Phi_\rho]$ is fully determined by the scaling entropy $\mathcal{H}(T)$ as follows.

Proposition 3.8. Let $\rho \in \mathcal{A}dm(X, \mu)$ be a generating semimetric. Then the function $\widetilde\Phi_\rho(\varepsilon,1-1/n)$ belongs to the class $\mathcal{H}(T)$.

The proof of Proposition 3.8 is given in Appendix A.3.
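
The weighted average $\Omega_T(\rho,z)$ can be approximated by truncating the series. The following is a minimal sketch under assumptions of our own: the system is a circle rotation by a fixed irrational angle, the series is cut after `terms` summands (the discarded tail is at most $z^{\text{terms}}\sup\rho$), and the names `partition_average`, `rotation` and `circle_metric` are hypothetical. For a $T$-invariant metric the weights sum to one, so the weighted average returns the metric itself.

```python
ALPHA = 2**0.5 - 1  # rotation angle, an assumption made for this example


def rotation(x):
    """T(x) = x + ALPHA mod 1, a measure-preserving map of [0, 1)."""
    return (x + ALPHA) % 1.0


def circle_metric(x, y):
    """Rotation-invariant metric on the circle [0, 1)."""
    d = abs(x - y)
    return min(d, 1.0 - d)


def partition_average(rho, T, z, x, y, terms=2000):
    """Truncated value of (1 - z) * sum_{k >= 0} z^k * rho(T^k x, T^k y)."""
    total, weight = 0.0, 1.0 - z
    for _ in range(terms):
        total += weight * rho(x, y)
        weight *= z  # the next weight is (1 - z) * z^(k + 1)
        x, y = T(x), T(y)
    return total


n = 50
value = partition_average(circle_metric, rotation, 1 - 1 / n, 0.1, 0.3)
print(value, circle_metric(0.1, 0.3))
```

The choice $z = 1 - 1/n$ mirrors the parametrization in Proposition 3.8, where the effective averaging window has length of order $n$.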

3.2. Scaling entropy sequence

Before we discuss the properties of scaling entropy in the general case, we focus on an important particular case where the asymptotic behaviour of the $\varepsilon$-entropies of averages does not essentially depend on $\varepsilon$. In this case our invariant can be simplified significantly, and it becomes a class of asymptotically equivalent sequences, which we call the scaling entropy sequence. We call an automorphism that has such a sequence stable. Bernoulli shifts, as well as all transformations with positive Kolmogorov entropy, transformations with pure point spectrum, and many others (see § 3.2.2), are stable. In fact, in the pioneering papers [62]–[64] only the case of a stable transformation was considered, and it was conjectured to be the general case. As we will see in § 3.3.1, there exist ergodic automorphisms that are not stable. Despite that, the scaling entropy sequence plays an important role in the theory of scaling entropy which we present here. The case of a stable transformation was studied in [62]–[64], [78], [84], [85], and [87].

3.2.1. The definition of a scaling sequence. Stable classes

Definition 3.9. We call an asymptotic class $\mathcal{H}$ stable if it contains a function $\Phi(\varepsilon, n)$ that does not depend on $\varepsilon$, that is, $\Phi(\varepsilon,n) = h_n$. A dynamical system $(X,\mu, T)$ is called stable if the class $\mathcal{H}(X,\mu, T)$ is stable.

Note that for two functions $\Phi(\varepsilon, n) = h_n$ and $\Phi'(\varepsilon, n) = h'_n$ the relations $\preceq$ and $\asymp$ are satisfied if and only if they hold for the corresponding sequences $h = \{h_n\}$ and $h' = \{h'_n\}$.
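
To see what can prevent a class from being stable, consider the following toy function (chosen by us purely for illustration; it is not claimed to be the entropy function of any particular system):

$$ \begin{equation*} \Phi(\varepsilon,n)=n^{1-\varepsilon}. \end{equation*} \notag $$
Suppose that $[\Phi]$ contained a function $\Phi'(\varepsilon,n)=h_n$ not depending on $\varepsilon$. Then $\Phi \preceq \Phi'$ would give $n^{1-\varepsilon}=O(h_n)$ for every $\varepsilon>0$, while $\Phi' \preceq \Phi$ would give $h_n=O(n^{1-\delta})$ for some $\delta>0$. Taking $\varepsilon=\delta/2$ we would obtain $n^{1-\delta/2}=O(n^{1-\delta})$, which is false. Hence the class $[\Phi]$ is not stable: the growth rate in $n$ changes essentially as $\varepsilon$ decreases.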

Definition 3.10. Let $\rho \in \mathcal{A}dm(X,\mu)$ be a summable generating admissible semimetric on $(X,\mu)$. A non-decreasing sequence $h$, $h = \{h_n\}_{n \in \mathbb{N}}$, of positive numbers is called scaling for $(X,\mu,T,\rho)$ if for any sufficiently small positive $\varepsilon$

$$ \begin{equation*} \mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho) \asymp h_n. \end{equation*} \notag $$
It makes sense to consider the whole class of equivalent sequences. Indeed, if a sequence $h$ is a scaling sequence, then any other sequence $h'$ is scaling if and only if $h' \asymp h$. We denote the class of scaling sequences by $\mathcal{H}_{\rm seq}(X,\mu,T,\rho)$.

The class $\mathcal{H}_{\rm seq}(X,\mu,T,\rho)$ is a section of the asymptotic class $[\Phi_\rho]$ by the set of all sequences. Therefore, the following theorem, which was proved in [84], is a consequence of Theorem 3.2 (the Invariance Theorem).

Theorem 3.11. Let $\rho_1, \rho_2 \in \mathcal{A}dm(X,\mu)$ be admissible $T$-generating semimetrics. Then

$$ \begin{equation*} \mathcal{H}_{\rm seq}(X,\mu,T,\rho_1)=\mathcal{H}_{\rm seq}(X,\mu,T,\rho_2). \end{equation*} \notag $$

Theorem 3.11 allows us to give the following definition of the scaling entropy sequence of a dynamical system $(X,\mu,T)$.

Definition 3.12. A sequence $h =\{h_n\}$ is called a scaling sequence of a system $(X,\mu,T)$ if $h \in \mathcal{H}_{\rm seq}(X,\mu,T,\rho)$ for some (hence any) $T$-generating semimetric $\rho \in \mathcal{A}dm(X,\mu)$. We denote the class of all scaling sequences of $(X,\mu,T)$ by $\mathcal{H}_{\rm seq}(X,\mu,T)$.

Let us emphasize that the class $\mathcal{H}_{\rm seq}(X,\mu,T)$ of scaling sequences is a measure-theoretic invariant of dynamical systems. However, unlike the class $\mathcal{H}$, the class of sequences $\mathcal{H}_{\rm seq}$ can be empty for some systems, as we will see in § 3.3.1.

3.2.2. Possible values of a scaling sequence

One of the first naturally arising questions is as follows: what values can the above invariant assume, that is, what sequences can be scaling sequences for some dynamical systems? The following two theorems, proved in [85] and [87], give a description of all possible values of the scaling sequence.

Theorem 3.13. If the class $\mathcal{H}_{\rm seq}(X,\mu,T)$ of all scaling sequences is non-empty, then it contains a non-decreasing subadditive sequence of positive numbers.

Theorem 3.14. Any subadditive non-decreasing sequence of positive numbers is a scaling sequence for some ergodic system $(X,\mu,T)$.
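
The hypotheses of Theorems 3.13 and 3.14 are easy to test on a finite prefix of a candidate sequence. The sketch below is an illustration only: the function name `is_nondecreasing_subadditive`, the cut-off `limit` and the tolerance `tol` are hypothetical, and a check on a finite prefix can refute the property but cannot prove it for the whole sequence.

```python
import math


def is_nondecreasing_subadditive(h, limit, tol=1e-12):
    """Check positivity, monotonicity and h(m + n) <= h(m) + h(n)
    for all indices m, n >= 1 with m + n <= limit."""
    values = [None] + [h(n) for n in range(1, limit + 1)]  # 1-based indexing
    if any(v <= 0 for v in values[1:]):
        return False
    if any(values[n] > values[n + 1] + tol for n in range(1, limit)):
        return False
    return all(
        values[m + n] <= values[m] + values[n] + tol
        for m in range(1, limit)
        for n in range(m, limit - m + 1)  # n >= m suffices by symmetry
    )


candidates = {
    "constant": lambda n: 1.0,
    "log(n + 1)": lambda n: math.log(n + 1),
    "sqrt(n)": lambda n: math.sqrt(n),
    "n": lambda n: float(n),
    "n^2": lambda n: float(n * n),
}
for name, h in candidates.items():
    print(name, is_nondecreasing_subadditive(h, 200))
```

Sequences such as $\log(n+1)$, $\sqrt{n}$, and $n$ are subadditive and non-decreasing, so by Theorem 3.14 each of them is a scaling sequence of some ergodic system, whereas $n^2$ is not subadditive.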

In [85], an explicit construction of an automorphism with prescribed scaling sequence was provided, which uses the adic transformation on the graph of ordered pairs and special central measures on this graph (also see [75]–[77] and § 3.4.3 in this survey). Theorem 3.14 provides a rich family of automorphisms with prescribed entropy properties and is a key result in the theory of scaling entropy, as well as a useful tool in applications.

Note that Theorem 3.14 shows a significant difference between the scaling entropy and complexity of a topological dynamical system: the complexity function $p(n)$ is either bounded or $p(n) \geqslant n$, while the scaling entropy can have an arbitrary prescribed asymptotic behaviour. However, the inequality $\mathcal{H} \preceq [\log p(n)]$ holds for any invariant measure.

In [10] and [24] possible values of similar invariants of slow entropy type were investigated. We discuss the relationship between these invariants and scaling entropy in § 3.7.2.

We denote by $\mathrm{Subadd}$ the set of equivalence classes of subadditive non-decreasing sequences with respect to equivalence relation $\asymp$. The relation $\preceq$ is naturally defined on such classes and forms a partial order on $\mathrm{Subadd}$, making it an upper semilattice.

The least element of $\mathrm{Subadd}$ is the equivalence class of the constant sequence $h_n=1$, and the greatest element is the equivalence class of a linear function $h_n=n$. The following two theorems, proved in [64], [78], and [84], describe the dynamical systems whose scaling sequence assumes these extreme values.

Theorem 3.15. Let $T$ be an automorphism of a measure space $(X,\mu)$. Then the following properties are equivalent:

(a) $T$ has a pure point spectrum;

(b) the sequence $h_n=1$ is a scaling sequence for $(X,\mu,T)$;

(c) there exists a $T$-invariant admissible metric $\rho \in \mathcal{A}dm(X,\mu)$.

Theorem 3.16. The sequence $h_n=n$ is a scaling sequence for $(X,\mu,T)$ if and only if Kolmogorov entropy is positive: $h(T)>0$.

In particular, systems with the maximal and minimal growth of scaling entropy are stable.

3.3. Properties of scaling entropy

3.3.1. Example of a non-stable system

For some time, the question of whether every ergodic automorphism $T$ of a standard probability space $(X, \mu)$ has a scaling sequence remained open. An ergodic automorphism $T$ and an admissible metric $\rho$ on $(X, \mu)$ for which a scaling sequence in the sense of Definition 3.10 does not exist were constructed in [48]. The reason for this phenomenon is that for different $\varepsilon>0$, the growth rate of $\mathbb{H}_\varepsilon(X, \mu, T_{\rm av}^n\rho)$ with respect to $n$ can be significantly different: for each $\varepsilon>0$ there exists $\delta>0$ such that

$$ \begin{equation*} \lim_{n \to \infty} \frac{\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho)} {\mathbb{H}_\delta(X,\mu,T_{\rm av}^n\rho)}=0. \end{equation*} \notag $$

However, the construction of such an automorphism requires the existence of stable systems that satisfy Theorem 3.14. Let us select a family $h^{(k)} = \{h_n^{(k)}\}_{n \in \mathbb{N}}$, $k \in \mathbb{N}$, of increasing subadditive sequences in such a way that the following holds for each $k$:

$$ \begin{equation*} h_n^{(k)}=o(h_n^{(k+1)}), \qquad n \to \infty. \end{equation*} \notag $$
For each $k\in \mathbb{N}$ find an ergodic automorphism $T_k$ of a probability space $(X_k, \mu_k)$ such that $h^{(k)} \in \mathcal{H}_{\rm seq}(X_k,\mu_k,T_k)$. Such an automorphism exists by Theorem 3.14. The following theorem, proved in [48], guarantees the existence of unstable automorphisms.

Theorem 3.17. Let $(X,\mu,T)$ be an ergodic joining of a family of systems $(X_k,\mu_k,T_k)$, $k \in \mathbb{N}$. Then, the class $\mathcal{H}_{\rm seq}(X,\mu,T)$ of scaling entropy sequences for the system $(X,\mu,T)$ is empty.

The main idea of the proof is that if the system $(X,\mu,T)$ has a scaling sequence $h$, then this sequence is the least upper bound for the family of sequences $h^{(k)}$, $k \in \mathbb{N}$. However, the set $\mathrm{Subadd}$ does not contain the least upper bound for a strictly increasing sequence of elements, and therefore the class $\mathcal{H}_{\rm seq}(X,\mu,T)$ is empty. This example explains the meaning of Definition 3.1: the asymptotic class of a function $\Phi$ can be identified with the least upper bound of the classes of sequences $\Phi(\varepsilon, \,\cdot\,)$.

3.3.2. Possible values of scaling entropy. The semilattice of functions

In this section we study the asymptotic classes that can be the scaling entropy of a dynamical system. The following two theorems, proved in [48], provide a complete description of the possible values of scaling entropy.

Theorem 3.18. For any system $(X,\mu,T)$, in the scaling entropy class $\mathcal{H}(X,\mu,T)$ one can always find a function $\Phi \colon \mathbb{R}_+\times \mathbb{N} \to \mathbb{R}_+$ with the following properties:

(a) $\Phi(\,\cdot\,, n)$ is non-increasing for any $n \in \mathbb{N}$;

(b) $\Phi(\varepsilon, \,\cdot\,)$ is non-decreasing and subadditive for any $\varepsilon>0$.

Theorem 3.19. For any function $\Phi \colon \mathbb{R}_+\times \mathbb{N} \to \mathbb{R}_+$ satisfying properties (a) and (b) from the previous theorem there exists an ergodic system $(X,\mu,T)$ such that $\Phi \in \mathcal{H}(X,\mu,T)$.

The proofs of Theorems 3.18 and 3.19 rely crucially on the corresponding results for the stable case (Theorems 3.13 and 3.14). It is worth noting that an analogue of Theorem 3.19 for actions of amenable groups (see § 3.5) is unknown to the authors: the question of describing the set of possible values of scaling entropy for group actions remains open.

One can consider the partially ordered set of equivalence classes of all functions of two variables satisfying conditions (a) and (b) from Theorem 3.18. This set forms an upper semilattice. The semilattice $\mathrm{Subadd}$ of subadditive non-decreasing sequences is naturally embedded into this semilattice. Unlike $\mathrm{Subadd}$, the semilattice of functions under consideration possesses the following property: any countable subset of it has a least upper bound. Moreover, this semilattice of functions is the minimal semilattice containing $\mathrm{Subadd}$ with this property, as each scaling entropy $[\Phi(\,\cdot\,{,}\,\cdot\,)]$ is the least upper bound for the countable set of sequences $h^m=\{\Phi(1/m,n)\}_n$, $m \in \mathbb{N}$.

Proposition 3.20. Let $(X_k,\mu_k,T_k)$ be a (finite or countable) sequence of systems and $(X,\mu,T)$ be their joining. Then $\mathcal{H}(X,\mu,T)$ is the least upper bound of the sequence $\mathcal{H}(X_k,\mu_k,T_k)$.

3.3.3. Generic scaling entropy

The group $\operatorname{Aut}(X,\mu)$ of all automorphisms of the space $(X,\mu)$ equipped with the weak topology is a Polish topological space. This allows us to examine the genericity of the automorphisms with given properties. It turns out that the scaling entropy of a generic automorphism $T\in \operatorname{Aut}(X,\mu)$ is not comparable to an arbitrary fixed function (except for the trivial extreme cases of bounded and linear growth). The following theorem was proved in [49].

Theorem 3.21. Let $\Phi(\varepsilon,n)$ be a function that decreases in $\varepsilon$ for each $n$ and is increasing, unbounded, and sublinear in $n$ for each $\varepsilon>0$. Then the set of automorphisms whose scaling entropy is not comparable to $\Phi$ is comeager in $\operatorname{Aut}(X,\mu)$.

Remark 3.22. Theorem 3.21 can be reformulated as two statements: the set of automorphisms $T$ satisfying $\mathcal{H}(T) \prec \Phi$ is meager, and the set of automorphisms $T$ satisfying $\mathcal{H}(T) \succ \Phi$ is also meager.

The proof of Theorem 3.21 uses the connection between scaling entropy and Kirillov–Kushnirenko sequential entropy (see [31]) and the results in [42] on the genericity of infinite Kirillov–Kushnirenko entropy. We discuss the connection between scaling entropy and sequential entropy in § 3.7.2. We note that results on generic values of similar invariants were obtained in [1] and [3].

3.3.4. Scaling entropy and ergodic decomposition

The scaling entropy of a non-ergodic system can grow more rapidly than the scaling entropy of all ergodic components. Indeed, consider a classical example of a non-ergodic transformation of the torus (see, for instance, [31]):

$$ \begin{equation*} (x,y)\mapsto (x,y+x),\quad\text{where}\quad (x, y) \in \mathbb{T}^2. \end{equation*} \notag $$
The ergodic components of this transformation are rotations of the circle, which have a bounded scaling entropy. However, the system itself has unbounded scaling entropy $\mathcal{H} = [\log n]$. Nevertheless, there exists an estimate in the opposite direction.

Proposition 3.23. Let $(X,\mu,T)$ be a dynamical system, and let

$$ \begin{equation*} \mu=\int \mu_\alpha\,d\nu(\alpha) \end{equation*} \notag $$
be its decomposition into ergodic components. Let the sequence $h = (h_n)$ be such that for any $\alpha$ in some set of positive measure,
$$ \begin{equation*} \mathcal{H}(X,\mu_\alpha,T) \succeq h_n. \end{equation*} \notag $$
Then $\mathcal{H}(X,\mu,T) \succeq h_n$.

The proof of Proposition 3.23 is given in Appendix A.4.
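Returning to the torus transformation $(x,y)\mapsto (x,y+x)$ considered above, the following sketch gives a numerical heuristic (not a proof) for why its scaling entropy is unbounded. We assume, for illustration only, the semimetric $\rho$ equal to the circle distance between the $y$-coordinates; the function names are ours. Two points $(x,y)$ and $(x+dx,y)$ then have $n$-step averaged distance $\frac{1}{n}\sum_{i<n}\|i\,dx\|$, which is of order one when $dx \approx 1/n$ and small when $dx \ll 1/n$, so the averaged semimetric resolves roughly $n$ distinct fibres.

```python
def circle_dist(t):
    """Distance from t to the nearest integer, i.e. the distance on R/Z."""
    t = t % 1.0
    return min(t, 1.0 - t)

def averaged_fibre_distance(n, dx):
    """n-step average of the y-distance between the orbits of (x, y) and
    (x + dx, y) under the map (x, y) -> (x, y + x)."""
    return sum(circle_dist(i * dx) for i in range(n)) / n

if __name__ == "__main__":
    n = 100
    # Fibres at distance 1/n are well separated by the averaged semimetric,
    # while fibres at distance 1/(100 n) are almost indistinguishable.
    print(averaged_fibre_distance(n, 1.0 / n))
    print(averaged_fibre_distance(n, 0.01 / n))
```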

3.4. Examples: computing the scaling entropy

In this section we present results on the explicit computation of the scaling entropy for several automorphisms. In general, finding the scaling entropy can be a challenging computation. We state several open questions on computing the scaling entropy for some well-known transformations, such as the Pascal automorphism (see [65]), in Appendix B.

We mentioned in § 3.2.2 that transformations with positive Kolmogorov entropy and only they have scaling entropy $\mathcal{H} = [n]$, while transformations with pure point spectrum and only they have a bounded scaling entropy, that is, $\mathcal{H} = [1]$.

3.4.1. Substitution dynamical systems

Let $A$ be an alphabet of finite size, $|A| > 1$. We denote by $A^*$ the set of all words of finite length over $A$. A substitution is an arbitrary mapping $\xi\colon A \to A^*$. The mapping $\xi$ naturally extends to a mapping $\xi\colon A^* \to A^*$ and, moreover, to a mapping $\xi\colon A^{\mathbb{N}} \to A^{\mathbb{N}}$, where $A^{\mathbb{N}}$ is the space of one-sided sequences of elements of $A$, equipped with the standard product topology. We assume that the substitution $\xi$ is such that there exists an infinite word $u\in A^{\mathbb{N}}$ invariant under $\xi$: $\xi(u) = u$. Let $T\colon A^{\mathbb{N}} \to A^{\mathbb{N}}$ be the left shift. A substitution dynamical system is defined as the pair $(X_\xi,T)$, where $X_\xi$ is the closure of the orbit of the point $u$ under the action of the transformation $T$. A substitution is called primitive if for some $n \in \mathbb{N}$ and all $\alpha, \beta \in A$ the character $\beta$ appears in the word $\xi^n(\alpha)$. If the substitution $\xi$ is primitive, then there exists a unique $T$-invariant Borel probability measure $\mu^\xi$ on the compact topological space $X_\xi$.

We say that $\xi$ is a substitution of constant length if there exists a natural number $q \in \mathbb{N}$ such that $|\xi(\alpha)| = q$ for any $\alpha \in A$. The height $h(\xi)$ of a substitution is defined as the largest natural number $k$ coprime with $q$ such that if $u_n = u_0$, then $k \mid n$. The column number $c(\xi)$ is defined by

$$ \begin{equation*} c(\xi)=\min \bigl\{|\{\xi^k(\alpha)_i \colon \alpha \in A \}| \colon k \in \mathbb{N}, i < q^k \bigr\}. \end{equation*} \notag $$
For more details about substitution dynamical systems, see, for example, [38]. The scaling entropy of a substitution dynamical system corresponding to a substitution of constant length was computed in [84].

Theorem 3.24. Let $\xi$ be an injective primitive substitution of constant length. Then $\mathcal{H}(X_\xi, \mu^\xi, T) = [\log n]$ if $c(\xi) \ne h(\xi)$, and $\mathcal{H}(X_\xi, \mu^\xi, T) = [1]$ if $c(\xi) = h(\xi)$.

One special case of substitution systems described by Theorem 3.24 is the Morse automorphism.

Corollary 3.25. The Morse automorphism has scaling entropy $\mathcal{H}(T) = [\log n]$.
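The hypotheses of Theorem 3.24 can be explored by hand for small substitutions. The sketch below is a rough illustration under stated assumptions: the cut-offs `max_power` and `prefix_power` are hypothetical, so `column_number` only returns an upper bound for $c(\xi)$ and `height` only a finite-prefix estimate of $h(\xi)$. For the Morse substitution $0 \to 01$, $1 \to 10$ it returns $c = 2$ and $h = 1$, in agreement with the case $c(\xi) \ne h(\xi)$; the period-doubling substitution is included only as a second test case.

```python
from functools import reduce
from math import gcd

MORSE = {"0": "01", "1": "10"}            # Thue-Morse substitution, q = 2
PERIOD_DOUBLING = {"0": "01", "1": "00"}  # extra illustrative example, q = 2

def iterate(sub, word, k):
    """Apply the substitution k times to a finite word."""
    for _ in range(k):
        word = "".join(sub[ch] for ch in word)
    return word

def is_primitive(sub, max_power=8):
    """True if for some n <= max_power every letter occurs in every xi^n(a)."""
    alphabet = set(sub)
    return any(
        all(set(iterate(sub, a, n)) == alphabet for a in alphabet)
        for n in range(1, max_power + 1)
    )

def column_number(sub, max_power=6):
    """Minimum of |{xi^k(a)_i : a in A}| over k <= max_power and i < q^k."""
    alphabet = sorted(sub)
    best = len(alphabet)
    for k in range(1, max_power + 1):
        images = [iterate(sub, a, k) for a in alphabet]
        for i in range(len(images[0])):
            best = min(best, len({w[i] for w in images}))
    return best

def height(sub, q, prefix_power=8):
    """Largest k coprime with q dividing every n with u_n = u_0,
    estimated on the prefix xi^prefix_power(u_0) of the fixed point u."""
    start = next(a for a in sorted(sub) if sub[a][0] == a)
    u = iterate(sub, start, prefix_power)
    g = reduce(gcd, (n for n in range(1, len(u)) if u[n] == u[0]), 0)
    if g == 0:
        raise ValueError("u_0 does not recur in the chosen prefix")
    while gcd(g, q) > 1:  # strip all prime factors shared with q
        g //= gcd(g, q)
    return g
```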

The Chacon automorphism does not correspond to a substitution of constant length; however, it also exhibits logarithmic scaling entropy. The following theorem follows from the results in [9].

Theorem 3.26. The Chacon automorphism has scaling entropy $\mathcal{H}(T) = [\log n]$.

The connections between substitutions and stationary adic transformations were studied in [73]. One can also find there examples of adic realizations of substitution dynamical systems, including the Chacon automorphism.

3.4.2. Horocycle flows

Another example of a classical automorphism with logarithmic scaling entropy is a horocycle flow. Related entropy-like invariants for classical flows were considered in [20], [22], and [31]. The following theorem follows from the results in [22].

Theorem 3.27. The horocycle flow on a compact surface with constant negative curvature has scaling entropy $\mathcal{H}(T) = [\log n]$.

It is noteworthy that, despite the significant difference in scaling entropy, the horocycle flow and Bernoulli automorphism share the same multiple Lebesgue spectrum.

Also note that, for transformations with logarithmic scaling entropy, it makes sense to compute a refinement of our invariant, exponential scaling entropy (see § 3.7.1).

3.4.3. The adic transformation on the graph of ordered pairs

Adic transformations (Vershik automorphisms), including the adic transformation on the graph of ordered pairs, were considered in [53], [54], [75]–[77], [85], and [51]. The construction presented below is a key to Theorems 3.14 and 3.19. It provides an explicit realization of any subadditive and increasing scaling entropy sequence.

Consider an infinite graded graph $\Gamma = (V, E)$. The vertex set $V$ of the graph $\Gamma$ is a disjoint union of the sets $V_n = \{0,1\}^{2^n}$, where $n \geqslant 0$. The edge set $E$ is defined, together with a colouring $\mathfrak{c} \colon E \to \{0,1\}$, as follows. Let $v_n \in V_n$ and $v_{n+1} \in V_{n+1}$. An edge $e = (v_n, v_{n+1})$ belongs to $E$ if the word $v_n$ is a prefix or a suffix of the word $v_{n+1}$, and it is labelled with the symbol $0$ or $1$ respectively.

A Borel measure on the space $X$ of all infinite paths in the graph $\Gamma$ is called central if, given a fixed tail of a path, all of its possible initial segments are equiprobable.

Let us define the adic transformation $T$ on the space of paths $X$. Let $x = \{e_i\}_{i=1}^\infty$ be an infinite path. We find the smallest $n$ such that $\mathfrak{c}(e_n) = 0$. We define the path $T(x) = \{u_i\}$ as follows. For $i \geqslant n+1$, we have $u_i = e_i$; $\mathfrak{c}(u_n) = 1$, and $\mathfrak{c}(u_i) = 0$ for all $i < n$. For any central measure $\mu$ the transformation $T$ is an automorphism of the measure space $(X, \mu)$.
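The definition of $T$ can be made concrete on paths truncated at a finite level $N$. In the sketch below (an illustration with our own names `adic_step`, `base_letter`, and `read_orbit`), a truncated path is encoded by its top vertex $v_N \in \{0,1\}^{2^N}$ and the colours $\mathfrak{c}(e_1),\dots,\mathfrak{c}(e_N)$; the lower vertices are recovered by repeatedly taking the prefix half (colour $0$) or the suffix half (colour $1$). On such truncations $T$ acts as a binary odometer on the colours, and the initial vertices $v_0$ along the orbit spell out the word $v_N$ letter by letter.

```python
def adic_step(colours):
    """One application of T to the colour list [c(e_1), ..., c(e_N)].
    Returns None when all colours equal 1 (T is not defined on the truncation)."""
    if 0 not in colours:
        return None
    n = colours.index(0)                  # smallest index with colour 0
    return [0] * n + [1] + colours[n + 1:]

def base_letter(top_word, colours):
    """Vertex v_0 of the truncated path with top vertex top_word."""
    word = top_word
    for c in reversed(colours):           # descend from level N to level 0
        half = len(word) // 2
        word = word[half:] if c == 1 else word[:half]
    return word

def read_orbit(top_word):
    """Letters v_0 seen along the T-orbit starting from the all-zero colours."""
    levels = len(top_word).bit_length() - 1   # top_word has length 2^levels
    colours = [0] * levels
    letters = []
    while colours is not None:
        letters.append(base_letter(top_word, colours))
        colours = adic_step(colours)
    return "".join(letters)
```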

Fix a sequence $\sigma = \{\sigma_n\}$ of zeros and ones. We construct the corresponding central measure $\mu^\sigma$ on the space $X$. A Borel measure $\mu$ on $X$ is uniquely determined by a system of measures $\mu_n$ on the cylindrical sets corresponding to finite paths of length $n$. In terms of $\mu_n$, the centrality of the measure $\mu$ means that for any $n$ the measure $\mu_n$ depends only on the terminal point of the path. Let $\nu_n$ be the projection of $\mu_n$ onto the vertex set $V_n$ in accordance with the terminal point of the path. The system of measures $\nu_n$ determines uniquely the central measure $\mu$.

Let us construct a sequence of sets $V_n^\sigma$, where $V_n^\sigma \subset V_n$. We set $V_0^\sigma = V_0$. For $n \geqslant 1$ we define

$$ \begin{equation*} V_n^\sigma=\begin{cases} \{ab \colon a,b \in V_{n-1}^\sigma\} & \text{if}\ \sigma_n=1; \\ \{aa \colon a \in V_{n-1}^\sigma\} & \text{if}\ \sigma_n=0. \end{cases} \end{equation*} \notag $$
Let $\nu_n^\sigma$ denote the uniform measure on the set $V_n^\sigma \subset V_n$. The measure $\mu^\sigma$ constructed from this system is consistently defined and central.

Theorem 3.28. The adic transformation on the paths of the graph of ordered pairs with measure $\mu^\sigma$ has scaling entropy

$$ \begin{equation*} \mathcal{H}(T)=[2 ^{\sum_{i=0}^{\log n} \sigma_i}]. \end{equation*} \notag $$
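As a sanity check on the shape of this formula, one can count the words in $V_n^\sigma$ directly. Each step with $\sigma_n=1$ squares the number of words and each step with $\sigma_n=0$ preserves it, so $\log_2|V_n^\sigma| = 2^{\sigma_1+\dots+\sigma_n}$ for words of length $2^n$. The sketch below (with hypothetical helper names) verifies this count for short $\sigma$; it is only a counting heuristic and not a substitute for the proof of Theorem 3.28.

```python
def build_levels(sigma):
    """Return [V_0, V_1, ..., V_N] for sigma = [sigma_1, ..., sigma_N]."""
    level = {"0", "1"}                    # V_0: the two words of length 1
    levels = [level]
    for s in sigma:
        if s == 1:
            level = {a + b for a in level for b in level}
        else:
            level = {a + a for a in level}
        levels.append(level)
    return levels

def log2_size_formula(sigma, n):
    """Predicted value of log_2 |V_n^sigma|, namely 2^(sigma_1 + ... + sigma_n)."""
    return 2 ** sum(sigma[:n])

def log2_sizes(sigma):
    """Exact values of log_2 |V_n^sigma| for n = 0, ..., N (sizes are powers of 2)."""
    return [len(level).bit_length() - 1 for level in build_levels(sigma)]
```

Keep $\sigma$ short when experimenting, since $|V_n^\sigma|$ grows doubly exponentially in the number of ones.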

3.5. Scaling entropy of a group action

In this section we present some introductory facts and examples related to the generalization of the notion of scaling entropy for actions of general discrete groups. The classical entropy theory can be applied to a large extent to actions of amenable groups (see [35]). The aforementioned theory of scaling entropy can only be partially extended from the case of a single automorphism to group actions. Most of the results we discuss in this subsection deal with amenable groups. However, we provide a definition of scaling entropy for actions of arbitrary countable groups which is invariant under changes of the admissible generating semimetric. The scaling entropy for group actions was studied in [85], [87], [51], and [50]. Related invariants of probability measure preserving actions of amenable groups were also considered in [24] and [33].

3.5.1. Definition of scaling entropy of a group action

Consider a countable group $G$ acting by automorphisms on the standard probability space $(X,\mu)$.

Definition 3.29. Equipment of a countable group $G$ is a sequence $\sigma = \{G_n\}_{n \in \mathbb{N}}$ of finite subsets of the group such that $|G_n| \to +\infty$. We denote a group with fixed equipment by $(G,\sigma)$.

For a semimetric $\rho$ on $(X,\mu)$ and a subset $H \subset G$ we denote by $H^{\rm av} \rho$ the averaging of the semimetric $\rho$ over the shifts by elements $g \in H$:

$$ \begin{equation*} H^{\rm av} \rho (x,y)=\frac{1}{|H|}\sum_{g \in H} \rho(gx,gy). \end{equation*} \notag $$
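On a finite toy model the averaging operation is straightforward to compute. The following sketch assumes a hypothetical example, the cyclic group $\mathbb{Z}/4$ acting on itself by shifts, together with the cut semimetric of the set $\{0\}$; the names `average_semimetric`, `SHIFTS`, and `cut` are ours. Exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def average_semimetric(rho, maps):
    """Return the semimetric (1/|H|) * sum over g in H of rho(g x, g y),
    where the finite set H is given as a list of maps."""
    def averaged(x, y):
        return sum(Fraction(rho(g(x), g(y))) for g in maps) / len(maps)
    return averaged

# Hypothetical toy data: shifts of Z/4 and the cut semimetric of the set {0}.
SHIFTS = [lambda x, s=s: (x + s) % 4 for s in range(4)]

def cut(x, y):
    """Cut semimetric |1_A(x) - 1_A(y)| for A = {0}."""
    return abs(int(x == 0) - int(y == 0))

averaged_cut = average_semimetric(cut, SHIFTS)
```

In this example the cut semimetric does not separate the points $1$ and $2$, whereas its average over all shifts separates every pair of distinct points.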

Given an action of an equipped group $G$ with equipment $\sigma=\{G_n\}$ on the standard probability space $(X,\mu)$ and a semimetric $\rho \in \mathcal{A}dm(X,\mu)$, we define the function $\Phi_\rho$ on $\mathbb{R}_+\times \mathbb{N}$ similarly to formula (3.1):

$$ \begin{equation*} \Phi_\rho(\varepsilon, n)=\mathbb{H}_\varepsilon(X,\mu,G_n^{\rm av}\rho). \end{equation*} \notag $$

For group actions the cases of admissible metrics and semimetrics are different. For the averages of admissible metrics, an analogue of Lemma 3.3 holds, which was proved in [85]. The corresponding statement for semimetrics (Theorem 3.35) is presented in what follows.

Lemma 3.30. Let $\rho_1,\rho_2 \in \mathcal{A}dm(X,\mu)$. If $\rho_1$ is a metric, then for any $\varepsilon>0$ there exists $\delta>0$ such that

$$ \begin{equation*} \Phi_{\rho_2}(\varepsilon,n) \preceq \Phi_{\rho_1}(\delta,n), \qquad n \to \infty. \end{equation*} \notag $$
In other words,
$$ \begin{equation*} \Phi_{\rho_2} \preceq \Phi_{\rho_1}. \end{equation*} \notag $$

Corollary 3.31. If $\rho_1,\rho_2 \in \mathcal{A}dm(X,\mu)$ are metrics, then $\Phi_{\rho_1} \asymp \Phi_{\rho_2}$.

Thus, as before, we can define the scaling entropy of an action of an equipped group $(G,\sigma)$ on $(X,\mu)$.

Definition 3.32. The scaling entropy of an action of an equipped group $(G,\sigma)$ on $(X,\mu)$ is the equivalence class $[\Phi_{\rho}]$ for some (hence any) metric $\rho \in \mathcal{A}dm(X,\mu)$. We denote this class by $\mathcal{H}(X,\mu,G, \sigma)$.

Let us emphasize that the asymptotic class $\mathcal{H}(X,\mu, G, \sigma)$ is a measure-theoretic invariant of an action.

Definition 3.33. We say that equipment $\sigma = \{G_n\}_{n \in \mathbb{N}}$ is suitable if for any $g \in \bigcup G_n$ and any $\delta>0$ there exists a number $k \in \mathbb{N}$ such that for each $n$ there exist elements $g_1, \dots, g_k \in G$ such that

$$ \begin{equation*} \biggl|gG_n \setminus \bigcup_{j=1}^{k}G_ng_j \biggr| \leqslant \delta|G_n|. \end{equation*} \notag $$

Remark 3.34. Equipment is suitable if it is

Note that not every group has suitable equipment by subsets that together generate the whole group. For instance, the free group $F_A$ over an infinite alphabet $A$ does not have such equipment.

We say that a semimetric $\rho \in \mathcal{A}dm(X,\mu)$ is generating for the action of an equipped group $(G,\sigma)$ (or $G$-generating) if its translations $g^{-1}\rho$, $g \in \bigcup G_n$, by elements of the union of the equipment $\sigma$ separate points in some subset of full measure. In the case of an amenable group $G$ equipped with a Følner sequence, we call a semimetric $\rho$ generating if its translations, taken together, separate points in some subset of full measure. The following theorem was proved in [85].

Theorem 3.35. Let $\sigma = \{G_n\}$ be suitable equipment of a group $G$. Let $\rho_1,\rho_2 \in \mathcal{A}dm(X,\mu)$. If $\rho_1$ is a $G$-generating semimetric, then

$$ \begin{equation*} \Phi_{\rho_2} \preceq \Phi_{\rho_1}. \end{equation*} \notag $$
In particular,
$$ \begin{equation*} [\Phi_{\rho_1}]=\mathcal{H}(X,\mu,G,\sigma). \end{equation*} \notag $$

3.5.2. Properties of the scaling entropy of a group action

A natural question is whether the scaling entropy of a group action depends on the choice of equipment. A simple observation shows that a small change of equipment does not change the scaling entropy.

Remark 3.36. If the equipment $\sigma_1=\{G_n^{(1)}\}$ and the equipment $\sigma_2=\{G_n^{(2)}\}$ of a group $G$ are such that

$$ \begin{equation*} |G_n^{(1)}\mathbin{\Delta} G_n^{(2)}|=o(|G_n^{(1)}|), \qquad n \to \infty, \end{equation*} \notag $$
then the scaling entropies of the actions of $G$ equipped with $\sigma_1$ and $\sigma_2$ coincide:
$$ \begin{equation*} \mathcal{H}(X,\mu,G,\sigma_1)=\mathcal{H}(X,\mu,G,\sigma_2). \end{equation*} \notag $$

There is a simple upper bound for the scaling entropy.

Theorem 3.37. Let the group $G$ with equipment $\sigma=\{G_n\}$ act by automorphisms on a measure space $(X,\mu)$. Then for $\Phi \in \mathcal{H}(X,\mu,G,\sigma)$ the following inequality holds for every $\varepsilon>0$:

$$ \begin{equation*} \Phi(\varepsilon,n) \preceq |G_n|, \qquad n \to \infty. \end{equation*} \notag $$

In the case of an amenable group $G$ with Følner equipment $\sigma$ the previous theorem admits a refinement.

Theorem 3.38. Let the amenable group $G$ equipped with a Følner sequence $\sigma=\{G_n\}$ act by automorphisms on a measure space $(X,\mu)$. Then for $\Phi \in \mathcal{H}(X,\mu,G,\sigma)$ the asymptotic relation

$$ \begin{equation*} \Phi(\varepsilon, n)=o(|G_n|), \qquad n \to \infty, \end{equation*} \notag $$
holds for any $\varepsilon>0$ if and only if the Kolmogorov entropy of the action of $G$ on $(X,\mu)$ is zero.

For a finitely generated group the existence of a compact free action is equivalent to the group being residually finite. If the group is also amenable, then the compactness of the action is equivalent to the boundedness of the scaling entropy. The following generalization of Theorem 3.15 was proved in [81].

Theorem 3.39. Let the amenable group $G$ equipped with a Følner sequence $\sigma=\{G_n\}$ act by automorphisms on a measure space $(X,\mu)$. Then $\mathcal{H}(X,\mu,G,\sigma) = [1]$ if and only if the action of $G$ on $(X,\mu)$ is compact.

3.5.3. The scaling entropy of a generic action

Given a countable group $G$, the set of all of its measure-preserving actions $A(X, \mu, G)$ on a Lebesgue space $(X,\mu)$ forms a Polish topological space, allowing us to discuss the generic properties of actions of $G$. For more details on the theory of generic group actions, see [25]. The following theorem, proved in [50], generalizes a similar result concerning the absence of non-trivial upper bounds for the scaling entropy of a generic automorphism to the case of an arbitrary amenable group.

Theorem 3.40. Let $G$ be an amenable group and $\sigma=\{F_n\}$ be a Følner sequence of $G$. Let $\phi(n) = o(|F_n|)$ be a sequence of positive numbers. Then the set of actions $\alpha \in A(X, \mu, G)$ for which $\mathcal{H}(\alpha, \sigma) \not\prec \phi$ contains a dense $G_\delta$ subset.

The proof of Theorem 3.40 follows, with some adjustments, the proof of Theorem 3.21 and uses results from [42]. Related results for similar invariants in the context of generic extensions were obtained in [33].

Note that a direct analogue of Theorem 3.19, that is, a full description of the possible values of scaling entropy for group actions, is not known to the authors. However, a weak version of this theorem follows from Theorem 3.40: for any sequence $\phi(n) = o(|F_n|)$ there exists an ergodic action of the group $G$ with scaling entropy $\mathcal{H}$ that grows more rapidly than $\phi$ along some subsequence. For non-periodic amenable groups, explicit constructions of such measure-preserving actions can be obtained using coinduction from an action of a subgroup to the action of the ambient group (see [51]). Also note that explicit constructions of such actions for arbitrary amenable groups are not known to the authors.

Unlike the absence of non-trivial upper bounds for the scaling entropy of a generic system, the absence of non-trivial lower bounds requires certain conditions on the group. In particular, a sufficient condition is the existence of a compact free action of the group. Thus, the following theorem, proved in [50], holds.

Theorem 3.41. Let $G$ be a residually finite amenable group and $\sigma=\{F_n\}$ be a Følner sequence of $G$. Let $\phi(n)$ be a sequence of positive numbers increasing to infinity. Then the set of actions $\alpha \in A(X, \mu, G)$ for which $\mathcal{H}(\alpha, \sigma) \not\succ \phi$ contains a dense $G_\delta$ subset.

We show in § 3.5.4 that in order for a group $G$ to have no non-trivial lower bounds on the scaling entropy of a generic action it is necessary to impose certain conditions on $G$.

3.5.4. A scaling entropy growth gap

The following theorem, proved in [50], shows that there exist non-residually finite amenable groups for which the conclusion of Theorem 3.41 is not true.

Theorem 3.42. Let $G = \operatorname{SL}(2, \overline{\mathbb{F}}_p)$ be the group of all $2\times2$ matrices with determinant $1$ over the algebraic closure of the finite field $\mathbb{F}_p$, where $p > 2$. Let $\mathbb{F}_{p} = \mathbb{F}_{q_0} \subset \mathbb{F}_{q_1} \subset \cdots$ be a sequence of finite extensions that, taken together, cover the whole of $\overline{\mathbb{F}}_p$, and let $\sigma = \{\operatorname{SL}(2, \mathbb{F}_{q_n})\}$ be the equipment of $G$ by an increasing sequence of finite subgroups. Then any free action $\alpha \in A(X, \mu, G)$ satisfies

$$ \begin{equation*} \mathcal{H}(\alpha,\sigma) \succeq \log q_n. \end{equation*} \notag $$

The proof of Theorem 3.42 is based on the theory of growth in finite groups $\operatorname{SL}(2, \mathbb{F}_{q_n})$ and, in particular, on Helfgott’s theorem and its generalizations (see [15] and [37]), as well as on the representation theory of these groups (see [43] and [17]).

Definition 3.43. We say that an amenable group $G$ equipped with a Følner sequence $\sigma$ has a scaling entropy growth gap if there exists an increasing sequence $\phi(n)$ tending to infinity such that for any free action $\alpha$ of $G$ we have

$$ \begin{equation*} \mathcal{H}(\alpha,\sigma) \succeq \phi(n). \end{equation*} \notag $$

Proposition 3.44. For an amenable group, the property of having a scaling entropy growth gap does not depend on the choice of a Følner sequence.

Thus, the property of having a scaling entropy growth gap is a group property of an amenable group. It is unknown to the authors whether the group $S_\infty$ of all finite permutations of the natural numbers or infinite finitely generated simple amenable groups (see, for example, [18]) have this property.

3.6. Problem of a universal zero-entropy system

Universal systems in various contexts were studied by many authors; see, for example, [8], [44], [75]–[77], [48], and [50]. We follow the definition proposed in [8] and [44]. Let $\mathcal{S}$ be a class of measure-preserving actions of an amenable group $G$. A topological system $(X, G)$ is called universal for the class $\mathcal{S}$ if for any invariant measure $\mu$ on $X$ the system $(X, \mu, G)$ belongs to $\mathcal{S}$ and, conversely, any system in $\mathcal{S}$ can be realized using some invariant measure $\mu$ on $X$. In [44] Serafin addressed the question, going back to Weiss, about the existence of a universal dynamical system for the class $\mathcal{S}$ consisting of all actions with zero measure-theoretic entropy, and gave a negative answer in the case $G = \mathbb{Z}$. Serafin pointed out there that his approach, based on the theory of symbolic coding and the theory of algorithmic complexity, did not yield the required result for arbitrary amenable groups.

The theory of scaling entropy allows us to give a negative answer to Weiss’s question for all amenable groups. The following result was obtained in [51] and [50].

Theorem 3.45. No infinite amenable group $G$ admits a universal zero-entropy system.

The main role in proving this result is played by a special series of group actions that satisfy certain conditions on the growth of scaling entropy.

Definition 3.46. We say that a group $G$ with equipment $\sigma=\{G_n\}$ admits actions of almost complete growth if for any non-negative function $\phi(n) = o (|G_n|)$ there exists an ergodic system $(X, \mu, G)$ such that the following relations hold for any $\Phi \in \mathcal{H} (X, \mu, G, \sigma)$ and any sufficiently small $\varepsilon > 0$:

$$ \begin{equation*} \Phi(\varepsilon,n) \not \preceq \phi(n)\quad\text{and}\quad \Phi(\varepsilon,n)=o(|G_n|). \end{equation*} \notag $$

The existence of such actions is a sufficient condition for the absence of a universal system of zero entropy. Actions of almost complete growth for a non-periodic amenable group and any Følner sequence can be constructed explicitly (see [51]), using Vershik automorphisms on the graph of ordered pairs (see [75]–[77]) and the coinduction operation from a subgroup $\mathbb{Z}$ to the whole of $G$. For arbitrary amenable groups explicit constructions of such actions are not known to the authors. However, Theorem 3.40 guarantees the genericity of actions of almost complete growth and therefore their existence in the general case.

Theorem 3.47. Any infinite amenable group $G$ equipped with a Følner sequence $\sigma = \{G_n\}$ admits actions of almost complete growth.

3.7. Exponential scaling entropy and other related invariants

In this section we discuss several invariants related to scaling entropy.

3.7.1. Exponential scaling entropy

It turns out that Lemma 3.3, which plays a crucial role in the theory of scaling entropy, can be refined in the following way. A similar asymptotic relation holds not only for the function $\Phi_\rho(\varepsilon,n) = \mathbb{H}_\varepsilon(X,\mu, T_{\rm av}^n\rho)$, but also for $\exp(\Phi_\rho(\varepsilon,n))$, that is, for the size of the minimal $\varepsilon$-net of the semimetric $T_{\rm av}^n\rho$ on a set of measure $1-\varepsilon$, rather than for its logarithm (see Definition 2.11).

Lemma 3.48. Let $\rho_1,\rho_2 \in \mathcal{A}dm(X,\mu)$. If $\rho_1$ is $T$-generating, then for any $\varepsilon>0$ there exists $\delta>0$ such that:

$$ \begin{equation} \exp(\Phi_{\rho_2}(\varepsilon,n))\preceq\exp(\Phi_{\rho_1}(\delta,n)),\qquad n \to \infty. \end{equation} \tag{3.2} $$

Proof. First we show that relation (3.2) holds for the semimetric $\rho_2 = T_{\rm av}^k\rho_1$.

Lemma 3.49. For any positive integer $k$ and any $\varepsilon>0$, there exists a positive integer $N$ such that

$$ \begin{equation} \mathbb{H}_\varepsilon\bigl(X,\mu,T_{\rm av}^n(T_{\rm av}^k\rho_1)\bigr) \leqslant\mathbb{H}_{\varepsilon/4}(X,\mu,T_{\rm av}^n\rho_1),\qquad n > N. \end{equation} \tag{3.3} $$

Proof. Indeed,
$$ \begin{equation} T_{\rm av}^n(T_{\rm av}^k\rho_1)(x,y) \leqslant T_{\rm av}^n\rho_1(x,y)+ \frac{1}{n} \sum_{i=n}^{n+k-1} T^{-i}\rho_1(x,y). \end{equation} \tag{3.4} $$

The last term on the right-hand side of (3.4) is bounded by $(k/n)\|\rho_1\|_{\rm m}$ in the m-norm. Therefore, by Lemma 2.13, for sufficiently large $n$,

$$ \begin{equation*} \mathbb{H}_\varepsilon\bigl(X,\mu,T_{\rm av}^n(T_{\rm av}^k\rho_1)\bigr) \leqslant\mathbb{H}_\varepsilon\biggl(X,\mu,T_{\rm av}^n\rho_1+ \frac{1}{n} \sum_{i=n}^{n+k-1}T^{-i}\rho_1\biggr) \leqslant \mathbb{H}_{\varepsilon/4}(X,\mu,T_{\rm av}^n\rho_1). \end{equation*} \notag $$
The lemma is proved.
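Inequality (3.4) is a pointwise statement and can be checked exactly on a finite model. The sketch below assumes a hypothetical toy system, the rotation $x \mapsto x+1$ on $\mathbb{Z}/7$ with the semimetric $|F(x)-F(y)|$ for an arbitrary integer-valued $F$, and uses the convention $T^{-i}\rho(x,y)=\rho(T^ix,T^iy)$. It illustrates the counting argument behind (3.4): in the double average every shift $T^{-m}\rho$ with $0 \leqslant m \leqslant n+k-2$ occurs with weight at most $1/n$.

```python
from fractions import Fraction

M = 7                                  # size of the toy space Z/M (assumption)
F = [0, 3, 1, 4, 1, 5, 9]              # arbitrary integer-valued function on Z/M

def rho(x, y):
    """Semimetric |F(x) - F(y)| on Z/M."""
    return abs(F[x] - F[y])

def shifted(d, i):
    """T^{-i} d, that is, (x, y) -> d(T^i x, T^i y) for the rotation T."""
    return lambda x, y: d((x + i) % M, (y + i) % M)

def t_av(d, n):
    """Averaged semimetric T_av^n d = (1/n) * sum_{i<n} T^{-i} d."""
    return lambda x, y: sum(Fraction(shifted(d, i)(x, y)) for i in range(n)) / n

def check_3_4(d, n, k):
    """Verify inequality (3.4) for all pairs of points of Z/M."""
    lhs = t_av(t_av(d, k), n)
    main = t_av(d, n)
    for x in range(M):
        for y in range(M):
            tail = sum(Fraction(shifted(d, i)(x, y)) for i in range(n, n + k)) / n
            if lhs(x, y) > main(x, y) + tail:
                return False
    return True
```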

Note that the proof of Lemma 3.49 works without any changes in the case of an amenable group equipped with a Følner sequence.

Lemma 3.49 ensures that relation (3.2) holds for the admissible metric

$$ \begin{equation*} \rho_2=\rho=\sum_{i=0}^\infty\frac{1}{2^i}\,T^{-i}\rho_1. \end{equation*} \notag $$
Next, we follow the arguments in [84] with some refinements. The set $\mathcal{M}$ of all semimetrics $\rho_2 \in \mathcal{A}dm(X,\mu)$ that satisfy (3.2) is closed in the m-norm by Lemma 2.13. Let us show that the set $\widetilde{\mathcal{M}}$ of all semimetrics $\omega$ that for all $x,y\in X$ satisfy the inequality
$$ \begin{equation*} \omega(x,y) \leqslant C(\omega)\rho(x,y) \end{equation*} \notag $$
with some constant $C(\omega)$ is dense in $(\mathcal{A}dm(X,\mu), \|\,\cdot\,\|_{\rm m})$. Indeed, as shown in [84], any admissible integrable semimetric can be approximated in the m-norm by semimetrics which can be dominated by finite sums of cut semimetrics. Any cut semimetric can be approximated by a semimetric of the form $d[f](x,y) = |f(x) - f(y)|$ for some function $f \in L^1(X,\mu)$ that is Lipschitz with respect to the metric $\rho$. Clearly, $d[f] \in \widetilde{\mathcal{M}}$, and therefore $\widetilde{\mathcal{M}}$ is dense in $(\mathcal{A}dm(X,\mu), \|\,\cdot\,\|_{\rm m})$. Since $\widetilde{\mathcal{M}} \subset \mathcal{M}$ and $\mathcal{M}$ is closed, this implies that $\mathcal{M} = \mathcal{A}dm(X,\mu)$.

Lemma 3.48, as before, ensures the independence of the class $[\exp(\Phi_\rho)]$ of the $T$-generating semimetric $\rho$ and allows us to give the following definition.

Definition 3.50. The exponential scaling entropy of the system $(X,\mu,T)$ is the equivalence class $[\exp(\Phi_\rho)]$ for some (hence any) $T$-generating semimetric $\rho \in \mathcal{A}dm(X,\mu)$. We denote this equivalence class by $\mathcal{H}_{\exp}(X,\mu,T)$.

Note that Lemma 3.48 holds for probability measure preserving actions of amenable groups equipped with a Følner sequence. For an action $(X,\mu, G)$ of an amenable group $G$ with a Følner sequence $\lambda=\{F_n\}$ we define its exponential scaling entropy $\mathcal{H}_{\exp}(X,\mu,G,\lambda)$ as the equivalence class of the function

$$ \begin{equation*} \exp(\Phi_\rho(\varepsilon,n))= \exp(\mathbb{H}_\varepsilon(X,\mu, G_{\rm av}^n\rho)) \end{equation*} \notag $$
for some (hence any) generating semimetric $\rho$.

The exponential scaling entropy $\mathcal{H}_{\exp}(X,\mu, T)$ is a finer invariant of a dynamical system than the ordinary scaling entropy which was discussed before. For instance, the Bernoulli shift with Kolmogorov entropy $h > 0$ has exponential scaling entropy

$$ \begin{equation*} \mathcal{H}_{\exp} (X,\mu,T)=[e^{(1-\varepsilon) n h}]. \end{equation*} \notag $$
On the other hand, the ordinary scaling entropy $\mathcal{H}(X,\mu,T) = [n]$ does not provide the exact value of Kolmogorov entropy. Similarly, the exponential scaling entropy of a transformation with positive entropy $h$ is the class $[e^{(1-\varepsilon) n h}]$. For transformations with infinite entropy, $\mathcal{H}_{\exp}=[e^{n/\varepsilon}]$.
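
The following check illustrates why the class $[e^{(1-\varepsilon) n h}]$ retains the value of $h$. It assumes the comparison of functions in the form: $\Phi \preceq \Psi$ if for every $\varepsilon>0$ there exists $\delta>0$ such that $\Phi(\varepsilon,n)=O(\Psi(\delta,n))$ as $n\to\infty$; this convention is recalled here as an assumption, not quoted from the text above. Let $0<h<h'$ and $0<\varepsilon<1-h/h'$, so that $(1-\varepsilon)h'>h$. Then for every $\delta>0$

$$ \begin{equation*} \frac{e^{(1-\varepsilon)nh'}}{e^{(1-\delta)nh}}= \exp\bigl(n((1-\varepsilon)h'-(1-\delta)h)\bigr) \geqslant \exp\bigl(n((1-\varepsilon)h'-h)\bigr) \to \infty. \end{equation*} \notag $$
Hence $[e^{(1-\varepsilon)nh'}] \ne [e^{(1-\varepsilon)nh}]$, whereas the functions $nh$ and $nh'$ differ by a constant factor and define the same class $[n]$.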

This example also shows that even for the Bernoulli shift the class $\mathcal{H}_{\exp}(X,\mu,T)$ does not contain functions independent of $\varepsilon$. On the other hand, for transformations with pure point spectrum the class $\mathcal{H}_{\exp}$ consists of bounded functions and contains $\Phi(\varepsilon, n) = 1$. It is natural to assume that this is the only possible case where the class $\mathcal{H}_{\exp}$ is stable. The ‘instability’ of exponential scaling entropy makes it harder to compute.

The case of positive Kolmogorov entropy shows that exponential scaling entropy can provide an efficient refinement of the ordinary scaling entropy in the stable case. However, in the general case this refinement can be insignificant or even absent. If, given a non-stable system $(X,\mu, T)$, for $\Phi \in \mathcal{H}(X,\mu, T)$ and any $\varepsilon > 0$ there exists $\delta > 0$ such that $\Phi(\varepsilon, n)=o(\Phi(\delta, n))$, then the class $\mathcal{H}_{\exp}$ is completely determined by the class $\mathcal{H}$, since $[\exp(\Phi(\varepsilon,n))]$ does not depend on the choice of a representative $\Phi$ in $\mathcal{H}$.
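
The independence of the representative can be verified as follows; this is a sketch assuming that $\Phi \preceq \Psi$ means that for every $\varepsilon>0$ there exists $\delta>0$ with $\Phi(\varepsilon,n)=O(\Psi(\delta,n))$. Let $\Phi' \in \mathcal{H}(X,\mu,T)$ be another representative. Given $\varepsilon>0$, choose $\delta>0$ and $C \geqslant 1$ such that $\Phi'(\varepsilon,n) \leqslant C\Phi(\delta,n)$ for large $n$, and then $\delta'>0$ such that $\Phi(\delta,n)=o(\Phi(\delta',n))$. For all sufficiently large $n$ we have $C\Phi(\delta,n) \leqslant \Phi(\delta',n)$, and therefore

$$ \begin{equation*} \exp(\Phi'(\varepsilon,n)) \leqslant \exp(C\Phi(\delta,n)) \leqslant \exp(\Phi(\delta',n)), \end{equation*} \notag $$
that is, $\exp(\Phi') \preceq \exp(\Phi)$. In the opposite direction, choose $\delta$ with $\Phi(\varepsilon,n)=o(\Phi(\delta,n))$ and then $\delta''$ and $C$ with $\Phi(\delta,n) \leqslant C\Phi'(\delta'',n)$; for large $n$ this gives $\Phi(\varepsilon,n) \leqslant C^{-1}\Phi(\delta,n) \leqslant \Phi'(\delta'',n)$, hence $\exp(\Phi) \preceq \exp(\Phi')$.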

Establishing the properties of scaling entropy in the exponential variant is more challenging due to the complexity of its computation. A natural generalization of Theorem 3.18 concerning the monotonicity and subadditivity would be a theorem about the monotonicity and submultiplicativity of exponential scaling entropy. The proof of the monotonicity of scaling entropy does not change in the exponential case.

Theorem 3.51. For any system $(X,\mu,T)$, in the class $\mathcal{H}_{\exp}(X,\mu,T)$ of exponential scaling entropy one can always find a function $\Phi \colon \mathbb{R}_+ \times \mathbb{N} \to \mathbb{R}_+$ with the following properties:

(a) $\Phi(\,\cdot\,, n)$ is non-increasing for every $n \in \mathbb{N}$;

(b) $\Phi(\varepsilon, \,\cdot\,)$ is non-decreasing for every $\varepsilon>0$.

Note that not every function $\Phi$ (or, more precisely, the asymptotic class of not every $\Phi$) that is increasing and submultiplicative in $n$ (and decreasing in $\varepsilon$) can be obtained as the exponential scaling entropy of a measure-preserving transformation. An example of such a function is

$$ \begin{equation*} \Phi(\varepsilon,n)=\exp(n+(1-\varepsilon)n^{1/2}). \end{equation*} \notag $$
Indeed, the exponential growth of $\mathcal{H}_{\exp}(T)$ implies a positive Kolmogorov entropy $h$ of the automorphism $T$, and therefore $\mathcal{H}_{\exp}(T)$ is the class $[e^{(1-\varepsilon) n h}] \ne [\Phi]$. Thus, the direct analogue of Theorem 3.19 does not hold in the exponential case. The problem of a complete description of possible values of the class $\mathcal{H}_{\exp}$ remains open. However, a generic exponential scaling entropy has the same properties as a generic ordinary scaling entropy: for any subexponential sequence $\phi(n)$ increasing to infinity a generic transformation has $\mathcal{H}_{\exp}$ not comparable to $\phi$.
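
Both properties of this example can be checked directly; the comparison used below (for every $\varepsilon>0$ there exists $\delta>0$ with $\Phi(\varepsilon,n)=O(\Psi(\delta,n))$) is an assumption about the definition of $\preceq$. For $\varepsilon \in (0,1]$ the inequality $(n+m)^{1/2} \leqslant n^{1/2}+m^{1/2}$ yields submultiplicativity:

$$ \begin{equation*} \Phi(\varepsilon,n+m)=\exp\bigl(n+m+(1-\varepsilon)(n+m)^{1/2}\bigr) \leqslant \Phi(\varepsilon,n)\,\Phi(\varepsilon,m). \end{equation*} \notag $$
On the other hand, if $\Phi \preceq e^{(1-\varepsilon)nh}$, then for a fixed $\varepsilon \in (0,1)$ there is $\delta>0$ such that $n+(1-\varepsilon)n^{1/2}-(1-\delta)nh$ is bounded above, which forces $(1-\delta)h>1$ and hence $h>1$. If $e^{(1-\varepsilon)nh} \preceq \Phi$, then for every $\varepsilon>0$ there is $\delta>0$ such that $(1-\varepsilon)nh-n-(1-\delta)n^{1/2}$ is bounded above, which forces $(1-\varepsilon)h \leqslant 1$ for all $\varepsilon>0$ and hence $h \leqslant 1$. The two requirements are incompatible, so $[\Phi] \ne [e^{(1-\varepsilon)nh}]$ for every $h>0$.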

3.7.2. Connections with other invariants

In the survey [21] several measure-theoretic invariants similar to scaling entropy were discussed. All of them are effective for automorphisms with zero Kolmogorov entropy. We mention some of them in the context of their connections to scaling entropy. The entropy dimension (Ferenczi–Park, see [10]) and slow entropy (Katok–Thouvenot, see [24]) are based on the same idea of considering the asymptotic behaviour of $\varepsilon$-entropy as in scaling entropy. However, the invariant is defined by comparing a growing sequence with a fixed scale of growing sequences. Sequential entropy (Kirillov–Kushnirenko entropy, see [31]) is based on the idea of computing the entropies of refinements of a partition under the action of a certain sequence of shifts. The term ‘scaled entropy’ appeared in [88], where it was used for a related but different notion (see also § 5.3 of [21]).

Entropy dimension

Ferenczi, Park, and other authors introduced and studied the notion of entropy dimension.

Let $T \colon (X,\mu) \to (X,\mu)$ be a measure-preserving transformation. For a finite measurable partition $\alpha$ of the space $(X,\mu)$ and $\varepsilon>0$ consider the cut semimetric $\rho_\alpha$ generated by $\alpha$. The upper entropy dimension $\overline D(X, \mu,T)$ is defined as follows:

$$ \begin{equation} \begin{aligned} \, \notag \overline D(\alpha,\varepsilon)&=\sup\biggl\{s \in [0,1]\colon \varlimsup_{n \to \infty} \frac{\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha)}{n^s}>0\biggr\}, \\ \notag \overline D(\alpha)&=\lim_{\varepsilon \to 0}\overline D(\alpha,\varepsilon), \\ \overline D(X,\mu,T)&=\sup_\alpha \overline D(\alpha), \end{aligned} \end{equation} \tag{3.5} $$
where the supremum is computed over all finite measurable partitions $\alpha$.

The lower entropy dimension $\underline D(X, \mu,T)$ is defined in a similar way:

$$ \begin{equation} \begin{aligned} \, \notag \underline D(\alpha, \varepsilon)&=\sup\biggl\{s \in [0,1]\colon \varliminf_{n \to \infty} \frac{\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha)}{n^s}>0\biggr\}, \\ \notag \underline D(\alpha)&= \lim_{\varepsilon \to 0}\underline D(\alpha,\varepsilon), \\ \underline D(X,\mu,T)&=\sup_\alpha\underline D(\alpha). \end{aligned} \end{equation} \tag{3.6} $$
In the case where the upper and lower entropy dimensions coincide, this number is called the entropy dimension of the system. The following analogue of the Kolmogorov–Sinai theorem holds: the suprema in formulae (3.5) and (3.6) are attained at generating partitions.

It is easy to see that the upper and lower entropy dimensions can be computed from the scaling entropy, that is, the equivalence class $[\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho)]$ for a generating semimetric $\rho$. However, the converse is not true. The entropy dimension indicates the position of the scaling entropy on the scale of power functions. For more details about the properties of entropy dimension we refer the reader to § 5.4 of [21] and the papers cited there.
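
As a purely hypothetical illustration (no concrete system is claimed to realize these growth rates), suppose that for a generating partition $\alpha$, some $a \in (0,1)$, and every sufficiently small $\varepsilon>0$ there are constants $c_1,c_2>0$ depending on $\varepsilon$ such that

$$ \begin{equation*} c_1 n^a \leqslant \mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha) \leqslant c_2 n^a. \end{equation*} \notag $$
Then the ratio $\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha)/n^s$ is bounded away from zero for $s \leqslant a$ and tends to zero for $s>a$, so that $\overline D(\alpha,\varepsilon)=\underline D(\alpha,\varepsilon)=a$ and the entropy dimension equals $a$. The same value $a$ is obtained if the $\varepsilon$-entropy grows like $n^a\log n$, although $n^a\log n$ is not $O(n^a)$ and the corresponding asymptotic classes differ. This is one way to see that the entropy dimension does not determine the scaling entropy.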

Slow entropy

Katok and Thouvenot [24] introduced the following definition of slow entropy.

Let $\mathbf{a}=\{a_n(t)\}_{n \geqslant 0, t>0}$ be a family of positive increasing sequences tending to infinity and increasing in $t$ (a scale).

The upper slow entropy of a system $(X,\mu,T)$ with respect to the scale $\mathbf{a}$ is defined as follows. Given a finite measurable partition $\alpha$ of the space $(X,\mu)$ and $\varepsilon>0$, consider the corresponding cut semimetric $\rho_\alpha$ and the set

$$ \begin{equation*} \overline B(\varepsilon, \alpha)=\biggl\{t>0 \colon \varlimsup_{n\to \infty} \frac{\exp(\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha))}{a_n(t)} >0\biggr\}\cup \{0\}. \end{equation*} \notag $$
Then set
$$ \begin{equation*} \overline{\operatorname{ent}}_{\mathbf{a}}(T,\alpha)= \lim_{\varepsilon \to 0} \, \sup \overline B(\varepsilon,\alpha) \end{equation*} \notag $$
and
$$ \begin{equation*} \overline{\operatorname{ent}}_{\mathbf{a}}(X,\mu,T)= \sup_\alpha \overline{\operatorname{ent}}_{\mathbf{a}}(T,\alpha), \end{equation*} \notag $$
where the supremum is computed over all finite measurable partitions $\alpha$.

The quantity $\overline{\operatorname{ent}}_{\mathbf{a}}(X,\mu,T)$ is called the upper slow entropy of the system $(X,\mu,T)$ with respect to the scale $\mathbf{a}$.

In a similar way, the lower slow entropy $\underline{\operatorname{ent}}_{\mathbf{a}}(X,\mu,T)$ is defined by

$$ \begin{equation*} \begin{gathered} \, \underline B(\varepsilon,\alpha)=\biggl\{t>0 \colon \varliminf_{n\to \infty} \frac{\exp(\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha))}{a_n(t)} >0\biggr\}\cup \{0\}, \\ \underline{\operatorname{ent}}_{\mathbf{a}}(T,\alpha)= \lim_{\varepsilon \to 0} \, \sup \underline B(\varepsilon,\alpha), \\ \underline{\operatorname{ent}}_{\mathbf{a}}(X,\mu,T)= \sup_\alpha\,\underline{\operatorname{ent}}_{\mathbf{a}}(T,\alpha). \end{gathered} \end{equation*} \notag $$

Slow entropy is related directly to exponential scaling entropy. By comparing the class $\mathcal{H}_{\exp}$ with the scale $\mathbf{a}$, one can compute the quantities $\overline{\operatorname{ent}}_{\mathbf{a}}$ and $\underline{\operatorname{ent}}_{\mathbf{a}}$. Conversely, knowing the values of slow entropy for all possible scales $\mathbf{a}$ allows one to distinguish between systems with different $\mathcal{H}_{\exp}$ classes. Thus, slow entropy (more precisely, the collection of slow entropies with respect to all possible scales $\mathbf{a}$) distinguishes the same dynamical systems as exponential scaling entropy. However, no countable set of scales $\mathbf{a}$ is sufficient to fully recover the class $\mathcal{H}_{\exp}$.
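
As a hypothetical illustration of the comparison with a scale, take the polynomial scale $a_n(t)=n^t$, $n \geqslant 1$, and suppose that for some partition $\alpha$, some $\tau>0$, and every sufficiently small $\varepsilon>0$ the quantity $\exp(\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha))$ is bounded above and below by positive constant multiples of $n^\tau$. Then

$$ \begin{equation*} \frac{\exp(\mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^n\rho_\alpha))}{a_n(t)} \quad\text{is comparable to}\quad n^{\tau-t}, \end{equation*} \notag $$
so the upper and lower limits are positive exactly when $t \leqslant \tau$. Consequently, $\overline B(\varepsilon,\alpha)=\underline B(\varepsilon,\alpha)=[0,\tau]$ and $\overline{\operatorname{ent}}_{\mathbf{a}}(T,\alpha)=\underline{\operatorname{ent}}_{\mathbf{a}}(T,\alpha)=\tau$.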

Section 4 of [21] is dedicated to various properties of slow entropy and examples of its computation.

Sequential entropy

Sequential entropy was introduced by Kushnirenko [31]. Let $A = \{a_k\}_{k=1}^\infty$ be a fixed increasing sequence of natural numbers. The entropy $h_A(T)$ of the automorphism $T$ on the space $(X,\mu)$ is defined as follows:

$$ \begin{equation*} \begin{gathered} \, h_A(T,\alpha)=\varlimsup_{k \to \infty} \frac{1}{k}H\biggl(\,\bigvee_{j=1}^k T^{-a_{j}}\alpha\biggr), \\ h_A(T)=\sup_{\alpha}h_A(T,\alpha), \end{gathered} \end{equation*} \notag $$
where the supremum is computed over all finite measurable partitions $\alpha$.
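
For orientation, consider the simplest sequence $a_k=k$. Since $\mu$ is $T$-invariant, the partitions $\bigvee_{j=1}^k T^{-j}\alpha$ and $\bigvee_{j=0}^{k-1} T^{-j}\alpha$ have the same entropy, and therefore

$$ \begin{equation*} h_A(T,\alpha)=\lim_{k \to \infty} \frac{1}{k}H\biggl(\,\bigvee_{j=0}^{k-1} T^{-j}\alpha\biggr)=h(T,\alpha), \qquad h_A(T)=h(T), \end{equation*} \notag $$
where $h(T)$ is the Kolmogorov entropy. Thus sequential entropy yields new information only for sparser sequences $A$.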

It was proved in [31] that an automorphism $T$ has a pure point spectrum if and only if $h_A(T) = 0$ for each sequence $A$. Comparing this with Theorem 3.15, we conclude that the boundedness of scaling entropy is equivalent to this condition.

It turns out that in a certain ‘neighbourhood’ of a pure point spectrum there exist two-sided estimates that relate scaling entropy to sequential entropy.

Theorem 3.52. For any increasing sequence of integers $A$ there exists an unbounded increasing sequence $h = \{h_n\}_{n}$ such that for any automorphism $T$ of the space $(X,\mu)$, if the relation $\mathcal{H}(X,\mu,T) \prec h$ holds, then $h_A(T) = 0$.

Conversely, for any unbounded increasing sequence $h = \{h_n\}_{n}$ there exists a sequence $A$ such that if $h_A(T) = 0$ then $\mathcal{H}(X,\mu,T) \prec h$.

For more details on the properties of sequential entropy, we refer the reader to § 3 of [21].

Appendix A. Several proofs

A.1. The proofs of Lemmas 2.15 and 2.16

Proof of Lemma 2.15. Let $n=\exp\bigl(\mathbb{H}_\delta(X,\mu,\rho)\bigr)$, and let
$$ \begin{equation*} X=X_0\cup X_1\cup\dots \cup X_n \end{equation*} \notag $$
be the corresponding partition: $\mu(X_0)<\delta$ and $\operatorname{diam}_\rho(X_j)<\delta$ for $j=1,\dots, n$. We choose an arbitrary point $x_0 \in X$ and points $x_j \in X_j$ for $j=1,\dots, n$ and consider the discrete measure
$$ \begin{equation*} \nu=\sum_{j=0}^n \mu(X_j)\delta_{x_j} \end{equation*} \notag $$
and the transport plan $\gamma$ that transports the sets $X_j$ to $x_j$. Clearly, $\gamma$ is a pairing of measures $\mu$ and $\nu$, and we have
$$ \begin{equation*} \int_{X \times X} \rho\, d\gamma=\sum_{j=0}^n \int_{X_j}\rho(x,x_j)\, d\mu(x) \leqslant \int_{X_0}\rho(x,x_0)\, d\mu(x)+\delta. \end{equation*} \notag $$
The average value of the right-hand side, when we choose $x_0 \in X$ randomly in accordance with the measure $\mu$, is
$$ \begin{equation*} \int_{X_0\times X}\rho\, d(\mu\times\mu)+\delta < \varepsilon \end{equation*} \notag $$
in view of condition (2.1). Therefore, for an appropriate choice of $x_0$ we can obtain the inequality
$$ \begin{equation*} d_{\rm K}(\mu,\nu) \leqslant \int_{X \times X} \rho\, d\gamma < \varepsilon. \end{equation*} \notag $$
It remains to observe that $H(\nu) \leqslant \log(n+1)$.

Before Lemma 2.16, let us prove the following auxiliary lemma.

Lemma A.1. Let $P =(p_j)_{j=1}^N$ be a finite probability vector and

$$ \begin{equation*} H(P)=\sum_{j=1}^N p_j\log\frac{1}{p_j} \end{equation*} \notag $$
be its entropy. Let $\delta\in (0,1)$, and let $F=\exp\bigl((H(P)+1)/\delta\bigr)$. Then there exists a subset $J \subset \{1,\dots, N\}$ such that
$$ \begin{equation*} |J| \leqslant F \quad\textit{and}\quad \sum_{j \in J}p_j \geqslant 1-\delta. \end{equation*} \notag $$

Proof. Without loss of generality we can assume that the numbers $p_j$ decrease: $p_1\geqslant p_2\geqslant \dots \geqslant p_N$. Let $m$ be the smallest integer for which
$$ \begin{equation*} \sum_{j=1}^m p_j \geqslant 1-\delta. \end{equation*} \notag $$
Suppose that the lemma is false; then $m>F$. Consequently, $1/F> p_m \geqslant p_{m+1}$. Thus,
$$ \begin{equation*} \delta \geqslant \sum_{j=m+1}^{N} p_j > \delta-\frac{1}{F}\,. \end{equation*} \notag $$
Therefore,
$$ \begin{equation*} H(P) > \sum_{j=m+1}^{N} p_j \log\frac{1}{p_j} \geqslant \sum_{j=m+1}^{N} p_j \log\frac{1}{p_{m+1}} \geqslant \biggl(\delta-\frac{1}{F}\biggr)\log F \geqslant \delta\log F-1, \end{equation*} \notag $$
which contradicts the definition of $F$.
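
In more detail, the last steps of the chain use only the definition of $F$. Since $p_{m+1}<1/F$, we have $\log(1/p_{m+1})>\log F$, and since $\log F \leqslant F$,

$$ \begin{equation*} \biggl(\delta-\frac{1}{F}\biggr)\log F=\delta\log F-\frac{\log F}{F} \geqslant \delta\log F-1=(H(P)+1)-1=H(P). \end{equation*} \notag $$
Thus the chain would give $H(P)>H(P)$, which is the contradiction.
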
Proof of Lemma 2.16. Let $\nu$ be a discrete measure such that $d_{\rm K}(\mu,\nu)<\varepsilon^2$. Using Lemma A.1 for the distribution of $\nu$ and $\delta = \varepsilon$, we find $m \leqslant \exp\bigl((H(\nu)+1)/\varepsilon\bigr)$ and a set $M = \{x_1,\dots, x_m\} \subset X$ for which $\nu(M) \geqslant 1-\varepsilon$.

Let $\pi_1$ and $\pi_2$ be the projections from $X\times X$ onto the first and second factors, respectively. Let $\gamma$ be an optimal transport plan for the measures $\mu$ and $\nu$, that is, a probability measure on $X\times X$ such that $\pi_1(\gamma) = \mu$, $\pi_2(\gamma) = \nu$, and

$$ \begin{equation*} \int_{X\times X}\rho \,d\gamma=d_{\rm K}(\mu,\nu) < \varepsilon^2. \end{equation*} \notag $$
Consider the set $A=\{(x,y) \in X\times X \colon \rho(x,y)<\varepsilon\}$. By Chebyshev’s inequality, $\gamma(A)\geqslant 1-\varepsilon$. Notice that $\gamma(\pi_2^{-1}(M)) = \nu(M) \geqslant 1-\varepsilon$; thus,
$$ \begin{equation*} \gamma(A\cap \pi_2^{-1}(M)) \geqslant 1-2\varepsilon, \end{equation*} \notag $$
and consequently
$$ \begin{equation*} \mu(\{x \in X \colon \rho(x,M)<\varepsilon\}) \geqslant \gamma(A\cap \pi_2^{-1}(M)) \geqslant 1-2\varepsilon. \end{equation*} \notag $$
Therefore,
$$ \begin{equation*} \mathbb{H}_{2\varepsilon}(X,\mu,\rho) \leqslant \log|M| \leqslant \frac{H(\nu)+1}{\varepsilon}\,. \end{equation*} \notag $$
By minimizing the right-hand side over the discrete measures $\nu$ satisfying $d_{\rm K}(\mu,\nu)<\varepsilon^2$ we arrive at inequality (2.2).
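
For completeness, the two measure estimates used in this proof can be spelled out. Chebyshev's inequality applied to $\rho$ on $(X\times X,\gamma)$ gives

$$ \begin{equation*} \gamma\bigl((X\times X)\setminus A\bigr)=\gamma(\{\rho \geqslant \varepsilon\}) \leqslant \frac{1}{\varepsilon}\int_{X\times X}\rho\,d\gamma< \frac{\varepsilon^2}{\varepsilon}=\varepsilon, \end{equation*} \notag $$
and the bound $1-2\varepsilon$ follows by subtracting the measures of the two complements. Finally, if $(x,y) \in A\cap \pi_2^{-1}(M)$, then $y \in M$ and $\rho(x,y)<\varepsilon$, so that $A\cap \pi_2^{-1}(M) \subset \pi_1^{-1}(\{x \colon \rho(x,M)<\varepsilon\})$, and the $\gamma$-measure of the latter set equals $\mu(\{x \colon \rho(x,M)<\varepsilon\})$ because $\pi_1(\gamma)=\mu$.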

A.2. The proofs of Theorems 2.38 and 2.40

Theorem 2.38 is a consequence of the lemma below.

Lemma A.2. (i) Let $\rho_1$ and $\rho_2$ be two measurable semimetrics on $(X,\mu)$. Then for any $\delta>0$ there exists a semimetric space $(Y,\rho)$ and isometries $\phi_1\colon (X,\rho_1) \to (Y,\rho)$ and $\phi_2\colon (X,\rho_2) \to (Y,\rho)$, such that

$$ \begin{equation*} d_{\rm K}\bigl(\phi_1(\mu),\phi_2(\mu)\bigr) \leqslant \|\rho_1-\rho_2\|_{\rm m}+\delta. \end{equation*} \notag $$

(ii) Let $\mu_1$ and $\mu_2$ be two probability measures on a semimetric space $(Y,\rho)$. Then there exists a probability space $(X,\mu)$ and measurable mappings $\psi_1,\psi_2\colon X \to Y$ such that $\psi_1(\mu) = \mu_1$, $\psi_2(\mu) = \mu_2$, and

$$ \begin{equation*} \|\rho\circ\psi_1-\rho\circ\psi_2\|_{\rm m} \leqslant 2d_{\rm K}(\mu_1,\mu_2). \end{equation*} \notag $$

Proof. (i) Let $D$ be a semimetric on $(X,\mu)$ such that
$$ \begin{equation*} |\rho_1(x,y)-\rho_2(x,y)| \leqslant D(x,y) \quad \text{a.e.} \end{equation*} \notag $$
and $\displaystyle\int_{X^2}D \, d\mu^2 \leqslant \|\rho_1- \rho_2\|_{\rm m}+\delta$. We find a function $f$ on $(X,\mu)$ such that
$$ \begin{equation} D(x,y) \leqslant f(x)+f(y) \end{equation} \tag{A.1} $$
and
$$ \begin{equation} \int_X f\, d\mu \leqslant \int_{X^2} D\, d\mu^2. \end{equation} \tag{A.2} $$
In order to satisfy (A.1) we can choose $f = D(\,\cdot\,, x_0)$ for any $x_0 \in X$. Selecting $x_0$ so that the integral of $D(\,\cdot\,, x_0)$ does not exceed its mean value over $x_0 \in X$, we obtain (A.2).
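
Both properties follow from short computations. Inequality (A.1) is the triangle inequality for $D$:

$$ \begin{equation*} D(x,y) \leqslant D(x,x_0)+D(x_0,y)=f(x)+f(y). \end{equation*} \notag $$
For (A.2), note that by Fubini's theorem
$$ \begin{equation*} \int_X\biggl(\int_X D(x,x_0)\,d\mu(x)\biggr)d\mu(x_0)=\int_{X^2}D\,d\mu^2, \end{equation*} \notag $$
so the set of points $x_0$ for which the inner integral does not exceed $\displaystyle\int_{X^2}D\,d\mu^2$ has positive measure, and any such $x_0$ can be used.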

Consider the set $Y = X \times \{1,2\}$ and define a semimetric $\rho$ on $Y$ by

$$ \begin{equation*} \begin{aligned} \, \rho\bigl((x,i),(y,i)\bigr)&=\rho_i(x,y), \qquad i=1,2, \quad x,y \in X, \\ \text{and} \qquad \rho\bigl((x,2),(y,1)\bigr)&= \operatorname*{ess\,inf}_{z \in X}(\rho_2(x,z)+ f(z)+\rho_1(z,y)),\qquad x,y \in X. \end{aligned} \end{equation*} \notag $$
We define isometries $\phi_i\colon (X,\rho_i) \to (Y,\rho)$, $i=1,2$, by $\phi_i(x) = (x,i)$, $x \in X$. The Kantorovich distance $d_{\rm K}$ between the images of the measure $\mu$ under the mappings $\phi_i$ does not exceed $\displaystyle\int_X f\,d\mu \leqslant \|\rho_1-\rho_2\|_{\rm m}+\delta$.

(ii) Let $\mu$ be a measure on $X = Y^2$ that is a coupling of $\mu_1$ and $\mu_2$ and an optimal (Kantorovich) transport plan between these measures:

$$ \begin{equation*} \int_{Y^2} \rho\, d\mu=d_{\rm K}(\mu_1,\mu_2). \end{equation*} \notag $$
Let $\psi_1$ and $\psi_2$ be the projections of $X=Y^2$ onto the first and second factors. Then the semimetrics $\rho_i = \rho\circ \psi_i$, $i=1,2$, on $X$ are given by the formulae
$$ \begin{equation*} \rho_i\bigl((x_1,x_2),(y_1,y_2)\bigr)=\rho(x_i,y_i). \end{equation*} \notag $$
We define a semimetric $D$ on $X$ as follows:
$$ \begin{equation*} D\big((x_1,x_2),(y_1,y_2)\big)=\rho(x_1,x_2)+\rho(y_1,y_2),\qquad (x_1,x_2) \ne (y_1,y_2). \end{equation*} \notag $$
Then $|\rho_1-\rho_2| \leqslant D$, hence
$$ \begin{equation*} \|\rho_1-\rho_2\|_{\rm m} \leqslant \int_{X^2} D\,d\mu^2= 2\int_{Y^2} \rho\,d\mu=2 d_{\rm K}(\mu_1,\mu_2). \end{equation*} \notag $$
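
The two facts used in part (ii) can be verified directly. Applying the triangle inequality for $\rho$ twice, we obtain

$$ \begin{equation*} |\rho(x_1,y_1)-\rho(x_2,y_2)| \leqslant \rho(x_1,x_2)+\rho(y_1,y_2), \end{equation*} \notag $$
which is the inequality $|\rho_1-\rho_2| \leqslant D$. Since $\mu$ is a probability measure on $X=Y^2$, each of the two summands of $D$ depends on only one of the two arguments and contributes the same amount to the integral:
$$ \begin{equation*} \int_{X^2} D\,d\mu^2=\int_{Y^2}\rho(x_1,x_2)\,d\mu(x_1,x_2)+ \int_{Y^2}\rho(y_1,y_2)\,d\mu(y_1,y_2)=2\int_{Y^2}\rho\,d\mu. \end{equation*} \notag $$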

Proof of Theorem 2.40. In one direction the statement of the theorem can be proved quite easily. Assume that the set $M$ is precompact. Fix $\varepsilon>0$ and find a finite subset $J\subset I$ such that the triples $\{(X_j,\mu_j,\rho_j)\colon j \in J\}$ form a finite $(\varepsilon^2/32)$-net in the metric $\operatorname{Dist}_{\rm m}$. For each $i \in I$ we find $j \in J$ such that
$$ \begin{equation*} \operatorname{Dist}_{\rm m}\bigl((X_i,\mu_i,\rho_i),(X_j,\mu_j,\rho_j)\bigr) < \frac{\varepsilon^2}{32}\,. \end{equation*} \notag $$
Therefore, there exist a coupling $(X,\mu)$ and projections $\psi_i\colon (X,\mu) \to (X_i,\mu_i)$ and $\psi_j\colon (X,\mu) \to (X_j,\mu_j)$ such that
$$ \begin{equation} \|\rho_i\circ \psi_i-\rho_j\circ \psi_j\|_\mathrm{m} < \frac{\varepsilon^2}{32}\,. \end{equation} \tag{A.3} $$
Then
$$ \begin{equation*} \mathbb{H}_\varepsilon(X_i,\mu_i,\rho_i)= \mathbb{H}_\varepsilon(X,\mu,\rho_i\circ \psi_i)\leqslant \mathbb{H}_{\varepsilon/4}(X,\mu,\rho_j\circ \psi_j)= \mathbb{H}_{\varepsilon/4}(X_j,\mu_j,\rho_j). \end{equation*} \notag $$
Hence
$$ \begin{equation*} \sup_{i \in I}\mathbb{H}_\varepsilon(X_i,\mu_i,\rho_i) \leqslant \max_{j \in J}\mathbb{H}_{\varepsilon/4}(X_j,\mu_j,\rho_j)<\infty; \end{equation*} \notag $$
and condition (b) in Theorem 2.40 is proved.

Let us prove uniform integrability. For $i \in I$ and the corresponding $j \in J$ let $d_i$ be a semimetric on $(X,\mu)$ such that

$$ \begin{equation} \rho_i\circ \psi_i \leqslant \rho_j\circ \psi_j+d_i\quad\text{and} \quad \int_{X^2} d_i \, d\mu^2 < \frac{\varepsilon^2}{32}\,. \end{equation} \tag{A.4} $$
For any $R>0$, let $L_{i,R} = \{\rho_i\circ \psi_i > 2R\} \subset X^2$. Then
$$ \begin{equation} \begin{aligned} \, \notag \mu^2(L_{i,R}) &\leqslant \mu^2(\{\rho_j\circ \psi_j>R\})+\mu^2(\{d_i>R\}) \\ &\leqslant \frac{1}{R} \biggl(\int_{X^2}\rho_j\circ \psi_j \, d\mu^2+ \frac{\varepsilon^2}{32} \biggr)= \frac{1}{R}\biggl(\int_{X_j^2}\rho_j \, d\mu_j^2+ \frac{\varepsilon^2}{32}\biggr). \end{aligned} \end{equation} \tag{A.5} $$
Let
$$ \begin{equation*} Q(\varepsilon)=\max_{j\in J} \int_{X_j^2}\rho_j \, d\mu_j^2+ \frac{\varepsilon^2}{32}\,. \end{equation*} \notag $$
Then we obtain
$$ \begin{equation*} \begin{aligned} \, \int_{\rho_i>2R} \rho_i\, d\mu_i^2&= \int_{L_{i,R}}\rho_i\circ \psi_i \,d\mu^2 \leqslant \int_{L_{i,R}}\rho_j\circ \psi_j \,d\mu^2+\frac{\varepsilon^2}{32} \\ &\leqslant\sup\biggl\{\int_{L} \rho_j \, d\mu_j^2\colon L \subset X_j^2,\ \mu_j^2(L)\leqslant\frac{Q(\varepsilon)}{R}\biggr\}+ \frac{\varepsilon^2}{32}\,. \end{aligned} \end{equation*} \notag $$
The right-hand side of the last inequality converges to $\varepsilon^2/32$ as $R$ tends to infinity (for each $j$, thus uniformly over the finite set $J$). Therefore,
$$ \begin{equation*} \varlimsup_{R \to \infty}\,\sup_{i \in I}\int_{\rho_i>2R}\rho_i\, d\mu_i^2 \leqslant\frac{\varepsilon^2}{32}\,. \end{equation*} \notag $$
The left-hand side of the inequality obtained does not depend on $\varepsilon$, so it is zero. Condition (a) of uniform integrability is thus proved.

The proof of the theorem in the opposite direction is more involved. Suppose that conditions (a) and (b) are satisfied. For each $\varepsilon>0$ we want to find a finite $\varepsilon$-net in $M$ with respect to the metric $\operatorname{Dist}_{\rm m}$. Using the condition of uniform integrability, we can choose $R>0$ such that $\displaystyle\int_{\rho_i>R}\rho_i\,d\mu_i^2<\frac{\varepsilon}{2}$ for all $i \in I$. Consequently,

$$ \begin{equation*} \|\rho_i-\min(\rho_i,2R)\|_{\rm m}\leqslant \int_{\rho_i>R}\rho_i\,d\mu_i^2<\frac{\varepsilon}{2}\,. \end{equation*} \notag $$
Thus, it suffices to find a finite $(\varepsilon/2)$-net with respect to the metric $\operatorname{Dist}_{\rm m}$ in the set of triples $\{(X_i,\mu_i,\min(\rho_i,2R))\}_{i \in I}$. Therefore, without loss of generality, we can assume that the semimetrics $\rho_i$ are uniformly bounded. In view of homogeneity we can assume that all the $\rho_i$ do not exceed 1.

Lemma A.3. Let $\varepsilon>0$ be fixed, and let $\sup_{i\in I}\mathbb{H}_\varepsilon(X_i,\mu_i,\rho_i) <\infty$. Then on the standard probability space $(X,\mu)$ one can find a finite partition $\xi = (A_1,\dots, A_n)$ with the following property: for each $i \in I$ there exists a homomorphism of measure spaces $\psi_i \colon (X,\mu) \to (X_i,\mu_i)$ and a set $B_i \subset X$, $\mu(B_i)<2\varepsilon$, such that the sets $A_k\setminus B_i$, $k=1,\dots, n$, have diameters less than $\varepsilon$ in the semimetric $\rho_i\circ \psi_i$.

Proof. Let $N$ be such that $\mathbb{H}_\varepsilon(X_i,\mu_i,\rho_i) \leqslant \log(N)$ for all $i \in I$. For each $i \in I$, there exists a measurable finite partition $\xi^i = \{A_1^i,\dots,A_N^i\}$ of the space $(X_i,\mu_i)$ and a set $E^i \subset X_i$, $\mu_i(E^i)< \varepsilon$, such that the diameters of the sets $A_j^i \setminus E^i$ in the semimetric $\rho_i$ are smaller than $\varepsilon$. Let $P^i = (\mu_i(A_k^i))_{k=1}^N$ be the probability vector corresponding to the partition $\xi^i$.

Let $P=(p_1,\dots, p_N)$ be a fixed probability vector of length $N$. On the space $(X,\mu)$ we choose a partition $\xi_P = (A_1,\dots, A_N)$ into $N$ parts with measures $\mu(A_k)= p_k$. If the vector $P^i=(\mu_i(A_k^i))_{k=1}^N$ satisfies $\sum_{k=1}^N |p_k - \mu_i(A_k^i)|<\varepsilon$, then the space $(X_i,\mu_i)$ can be realized on $X$ in such a way that the partition $\xi^i$ differs only slightly from $\xi_P$. Specifically, one can find a homomorphism of measure spaces $\psi_i\colon (X,\mu) \to (X_i,\mu_i)$ such that

$$ \begin{equation*} \sum_{k=1}^N \mu(\psi_i^{-1}(A_k^i)\mathbin{\Delta} A_k)<\varepsilon. \end{equation*} \notag $$
Then there exists a set $B_i \subset X$ such that $\mu(B_i) < 2\varepsilon$ and the diameters of the sets $A_k\setminus B_i$ in the semimetric $\rho_i \circ \psi_i$ are less than $\varepsilon$.

The set of vectors $\{P^i\colon i \in I\}$ in a finite-dimensional space is bounded, thus we can find a finite $\varepsilon$-net, say $\{P_1, \dots, P_m\}$, with respect to the $L^1$ metric in this set. For each element $P_j$ of this $\varepsilon$-net we construct a partition $\xi_{P_j}$ on $(X,\mu)$ corresponding to it. As the required partition $\xi$, we can take a refinement of the partitions $\xi_{P_j}$. The lemma is proved.

The rest of the proof of the theorem follows the proof of the corresponding implication of Theorem 2.26. Using Lemma A.3 we work with the semimetrics $\widetilde{\rho}_i = \rho_i\circ \psi_i$ on $(X,\mu)$ and the partition $\xi = \{A_1,\dots, A_n\}$. We show that one can find a finite $(7\varepsilon)$-net with respect to the m-norm in the set $\{\widetilde{\rho}_i\colon i \in I\}$. To do this we will demonstrate that these semimetrics can be approximated in the m-norm by a bounded set in the finite-dimensional space of semimetrics that are constant on the sets $A_j\times A_k$, $j,k \in\{1,\dots,n\}$.

Let $\widetilde{\rho} \in \{\widetilde{\rho}_i\colon i \in I\}$. We find a set $B\subset X$, $\mu(B)<2\varepsilon$, such that each set $A_j \setminus B$ has diameter less than $\varepsilon$ in the semimetric $\widetilde{\rho}$. If necessary, by adding a subset of measure zero to $B$ we can assume that each set $A_j\setminus B$ is either empty or of positive measure. Choose a point $x_j$ in each $A_j$ such that the functions $\widetilde{\rho}(\,\cdot\,, x_j)$ are measurable on $X$, and if $\mu(A_j \setminus B)>0$, then $x_j \in A_j \setminus B$. We define a semimetric $\overline{\rho}$ on $X$ as follows: for $x\ne y$, if $x \in A_k$ and $y \in A_j$, then we set $\overline{\rho}(x,y) = \widetilde{\rho}(x_k,x_j)$. If $x, y \notin B$, then, obviously,

$$ \begin{equation*} |\widetilde\rho(x,y)-\overline\rho(x,y)| < 2\varepsilon. \end{equation*} \notag $$
Since $\overline{\rho}$ and $\widetilde{\rho}$ are pointwise bounded by 1, for all $x,y \in X$ we have the estimate
$$ \begin{equation*} |\widetilde\rho(x,y)-\overline\rho(x,y)|\leqslant \bigl(2\varepsilon+ \chi_{B}(x)+\chi_B(y)-\chi_{B}(x)\chi_B(y)\bigr)\chi_{\{x\ne y\}}. \end{equation*} \notag $$
On the right-hand side of the last inequality there is a semimetric whose integral does not exceed $2\varepsilon + 2\mu(B) < 6 \varepsilon$, hence
$$ \begin{equation*} \|\widetilde\rho-\overline\rho\|_{\rm m}<6\varepsilon. \end{equation*} \notag $$
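
The two estimates above can be checked as follows. If $x \in B$ or $y \in B$, then $\chi_{B}(x)+\chi_B(y)-\chi_{B}(x)\chi_B(y)=1$, which dominates $|\widetilde\rho-\overline\rho| \leqslant 1$; otherwise the bound $2\varepsilon$ applies. Dropping the factor $\chi_{\{x\ne y\}} \leqslant 1$ and integrating, we obtain

$$ \begin{equation*} \int_{X^2}\bigl(2\varepsilon+\chi_{B}(x)+\chi_B(y)-\chi_{B}(x)\chi_B(y)\bigr)\,d\mu^2= 2\varepsilon+2\mu(B)-\mu(B)^2 \leqslant 2\varepsilon+2\mu(B)<6\varepsilon, \end{equation*} \notag $$
where the last inequality uses $\mu(B)<2\varepsilon$.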

We have shown that each semimetric $\widetilde{\rho}_i$, $i \in I$, can be approximated by the corresponding semimetric $\overline{\rho}_i$ to within $6\varepsilon$ in the m-norm. Moreover, all semimetrics $\overline{\rho}_i$ are contained in a finite-dimensional space and are uniformly bounded, thus forming a precompact set. Therefore, we can find a finite $(7\varepsilon)$-net with respect to the m-norm in the set $\{\widetilde{\rho}_i\}_{i \in I}$. The theorem is proved.

A.3. Proof of Proposition 3.8

The proof of the inequality in one direction is quite straightforward:

$$ \begin{equation*} \Omega_T\biggl(\rho,1-\frac{1}{n}\biggr) \geqslant \frac{1}{e} T_{\rm av}^n\rho. \end{equation*} \notag $$
Thus,
$$ \begin{equation*} \mathbb{H}_\varepsilon\biggl(X,\mu,\Omega_T\biggl(\rho,1- \frac{1}{n}\biggr)\biggr) \geqslant \mathbb{H}_{e\varepsilon}(X,\mu,T_{\rm av}^n\rho). \end{equation*} \notag $$
Hence $\widetilde\Phi_\rho(\varepsilon,1-1/n) \succeq \Phi_\rho$.
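
The first inequality can be verified by a direct computation. The sketch below assumes the conventions $\Omega_T(\rho,z)=(1-z)\sum_{k \geqslant 0}z^kT^{-k}\rho$ and $T_{\rm av}^n\rho=\frac{1}{n}\sum_{k=0}^{n-1}T^{-k}\rho$, which agree with the formulae used in this appendix but are not restated here. For $n \geqslant 2$ and $0 \leqslant k \leqslant n-1$ we have $(1-1/n)^k \geqslant (1-1/n)^{n-1}=(1+1/(n-1))^{-(n-1)} \geqslant e^{-1}$, and therefore

$$ \begin{equation*} \Omega_T\biggl(\rho,1-\frac{1}{n}\biggr)= \frac{1}{n}\sum_{k=0}^{\infty}\biggl(1-\frac{1}{n}\biggr)^k T^{-k}\rho \geqslant \frac{1}{e}\cdot\frac{1}{n}\sum_{k=0}^{n-1}T^{-k}\rho= \frac{1}{e}\,T_{\rm av}^n\rho. \end{equation*} \notag $$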

To obtain the reverse estimate, for each $\varepsilon > 0$ we find a constant $c = c(\varepsilon, \rho)$ such that for any natural number $n$,

$$ \begin{equation*} \biggl\|(1-z)\sum_{k > cn}z^k T^{-k}\rho\biggr\|_{\rm m} < \frac{\varepsilon^2}{32}\,, \end{equation*} \notag $$
where $z=1-1/n$. Then, due to Lemma 2.13,
$$ \begin{equation*} \mathbb{H}_\varepsilon \biggl(X,\mu,\Omega_T\biggl(\rho,1-\frac{1}{n}\biggr)\biggr) \leqslant\mathbb{H}_{\varepsilon/4} \biggl(X,\mu,(1-z)\sum_{k=0}^{cn} z^k T^{-k}\rho\biggr) \leqslant \mathbb{H}_{\varepsilon/(4c)}(X,\mu,T_{\rm av}^{cn}\rho), \end{equation*} \notag $$
where $z=1-1/n$ again. However, because of the subadditivity of scaling entropy (see § 3.3.2), the function $\Psi(\varepsilon, n)=\Phi_\rho(\varepsilon/(4c),cn)$ is equivalent to the function $\Phi_\rho$. Thus, $\widetilde\Phi_\rho(\varepsilon,1-1/n) \preceq \Phi_\rho$, and therefore the function $\widetilde\Phi_\rho(\varepsilon,1-1/n)$ belongs to the class $\mathcal{H}(T)$.
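
An admissible constant $c$ can be written explicitly. The following sketch assumes that the m-norm is $T$-invariant and coincides with the integral $\displaystyle\int_{X^2}\rho\,d\mu^2$ on integrable semimetrics. Let $k_0$ be the smallest integer greater than $cn$. Summing the geometric series and using $z^{k_0} \leqslant (1-1/n)^{cn} \leqslant e^{-c}$, we obtain

$$ \begin{equation*} \biggl\|(1-z)\sum_{k > cn}z^k T^{-k}\rho\biggr\|_{\rm m}= (1-z)\sum_{k \geqslant k_0}z^k\,\|\rho\|_{\rm m}= z^{k_0}\|\rho\|_{\rm m} \leqslant e^{-c}\|\rho\|_{\rm m}. \end{equation*} \notag $$
Hence any $c>\log\bigl(32\|\rho\|_{\rm m}/\varepsilon^2\bigr)$ works for all $n$ simultaneously.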

A.4. Proof of Proposition 3.23

Let $E$ be a set of positive measure such that for any $\alpha \in E$, $\mathcal{H}(X,\mu_\alpha,T)$ grows not slower than $h$.

Suppose the contrary. Then there exists a subsequence $n_j$ such that $\Phi(\varepsilon, n_j) \prec h_{n_j}$ for any $\Phi \in \mathcal{H}(X, \mu, T)$ and any $\varepsilon > 0$. Consider an admissible metric $\rho$ on $X$ and an index $n_j$. Let $X_0, X_1, \dots, X_k$ be sets that realize the $\varepsilon$-entropy of the triple $(X, \mu, T_{\rm av}^{n_j} \rho)$. Then

$$ \begin{equation*} \varepsilon > \mu(X_0)=\int\mu_\alpha(X_0)\,d\nu(\alpha). \end{equation*} \notag $$
It is clear that there exists a positive constant $r$ depending only on $\nu(E)$ such that the inequality $\mu_\alpha(X_0) < \varepsilon/r$ holds on some set $E_j \subset E$ of measure $r$. For such $\alpha$ we have
$$ \begin{equation*} \mathbb{H}_{\varepsilon/r}(X,\mu_\alpha,T_{\rm av}^{n_j}\rho) \leqslant \mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^{n_j} \rho). \end{equation*} \notag $$
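
One admissible choice of the constant is $r=\nu(E)/2$; this specific value is given here only as an illustration. Indeed, by Chebyshev's inequality,

$$ \begin{equation*} \nu\biggl(\biggl\{\alpha \colon \mu_\alpha(X_0) \geqslant \frac{\varepsilon}{r}\biggr\}\biggr) \leqslant \frac{r}{\varepsilon}\int\mu_\alpha(X_0)\,d\nu(\alpha)<r, \end{equation*} \notag $$
so the set of $\alpha \in E$ with $\mu_\alpha(X_0)<\varepsilon/r$ has measure greater than $\nu(E)-r=r$, and $E_j$ can be taken inside it.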

The set of those $\alpha$ that belong to $E_j$ for infinitely many indices $j$ has positive measure. We choose such an $\alpha$ and a subsequence of indices $j_m$ for which $\alpha \in E_{j_m}$. Then we obtain

$$ \begin{equation*} h_{n_{j_m}} \preceq \mathbb{H}_{\varepsilon/r}(X,\mu_\alpha,T_{\rm av}^{n_{j_m}}\rho)\leqslant \mathbb{H}_\varepsilon(X,\mu,T_{\rm av}^{n_{j_m}}\rho) \prec h_{n_{j_m}}. \end{equation*} \notag $$
This is a contradiction.
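
The positivity of the measure of the set of $\alpha$ lying in infinitely many of the sets $E_j$ follows from the continuity of the finite measure $\nu$ along decreasing sequences of sets: if $\nu(E_j) \geqslant r$ for all $j$, then

$$ \begin{equation*} \nu\biggl(\,\bigcap_{J \geqslant 1}\,\bigcup_{j \geqslant J}E_j\biggr)= \lim_{J \to \infty}\nu\biggl(\,\bigcup_{j \geqslant J}E_j\biggr) \geqslant r>0. \end{equation*} \notag $$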

Appendix B. Some open questions

We list several problems related to the new concepts referred to in this survey, without aiming for a comprehensive coverage of the topics. Some of these problems have already been mentioned in the main text.

1. The theory of mm-spaces, the classification of metric triples, and problems about matrix distributions

Perhaps the most important general question is as follows: to what extent does the matrix distribution, regarded as a measure on the space of distance matrices, enable one to describe various properties of mm-spaces (spaces with measure and metric)?

On the one hand, as we have seen, the matrix distribution is a complete invariant up to measure-preserving isometries. However, its practical use as a tool for studying spaces must be developed further. In particular, the question of what can be said about the random spectra of matrix distributions for the most natural mm-spaces is still open, even though it was posed a long time ago (see § 1.1.4). A similar question can be posed for metric triples: can a metric triple and, in particular, a metric, be recovered from the asymptotic behaviour of the random spectra of the consecutive minors of a matrix distribution? Most likely, the answer is in the negative, but it is interesting to identify the properties of a triple that are ‘spectral’, that is, depend only on the spectrum. Here it is appropriate to recall the extensive literature on the spectral geometry of graphs, metric spaces, and so on. However, the task mentioned above is fundamentally different, as we consider random spectra, that is, stochastic rather than individual characteristics of the sets of eigenvalues of the minors. These statistics are fundamentally different from the statistics of random Gaussian matrices, that is, from the semicircle law and similar ones.

The limiting distributions of spectra for the most natural manifolds are also interesting and currently unknown. These include spheres and Stiefel manifolds, as well as non-compact manifolds with probability measures: see the forthcoming paper [74].
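To make the objects in this question concrete, here is a minimal numerical sketch of a random spectrum. It assumes one particular metric triple chosen purely for illustration: the circle $\mathbb{R}/\mathbb{Z}$ with the arc-length metric and the Lebesgue measure. A sample of the matrix distribution is obtained by drawing i.i.d. points $x_1,x_2,\dots$ from the measure and forming the matrix $(\rho(x_i,x_j))_{i,j}$. The function names and the division of the eigenvalues by the minor size $k$ (the normalization under which spectra of such matrices are usually compared with spectra of integral operators, cf. [28]) are assumptions of this sketch, not notation from the survey.

```python
import numpy as np


def sample_distance_matrix(n, rng):
    """Draw n i.i.d. uniform points on the circle R/Z and return the
    n x n matrix of arc-length distances between them."""
    x = rng.random(n)                      # i.i.d. points, uniform on [0, 1)
    d = np.abs(x[:, None] - x[None, :])    # |x_i - x_j| for all pairs
    return np.minimum(d, 1.0 - d)          # arc-length metric on R/Z


def minor_spectra(dist, sizes):
    """Eigenvalues (ascending) of the upper-left k x k minors, divided by k."""
    return {k: np.linalg.eigvalsh(dist[:k, :k]) / k for k in sizes}


rng = np.random.default_rng(0)
dist = sample_distance_matrix(200, rng)
spectra = minor_spectra(dist, sizes=(50, 100, 200))

for k, eig in spectra.items():
    # Print the largest and the smallest normalised eigenvalue of each minor.
    print(k, eig[-1], eig[0])
```

Repeating the experiment with different seeds gives an empirical picture of how the spectra of consecutive minors fluctuate. For this homogeneous example the largest normalized eigenvalue should stay near $\int\rho(x,y)\,d\mu(y)=1/4$, since constant functions are eigenfunctions of the integral operator with kernel $\rho$. The sketch says nothing about the limiting distributions asked for above.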

2. The calculation of scaling entropy

A significant part of this survey is devoted to the relatively new concept of scaling entropy. However, we do not know yet how to compute it even in the most natural cases. It is important to keep in mind that the unbounded growth of scaling entropy indicates the presence of a continuous part in the spectrum of an automorphism, which can sometimes be very difficult to establish directly. This question was posed by Vershik for adic automorphisms and, specifically, for the Pascal automorphism, one of the first non-trivial examples of adic transformations defined by Vershik in the late 1970s (see [53], [54], and [65]). Subsequently, it was found out that this transformation had been used by Kakutani [19] in a problem about partitions. An attempt to compute the scaling entropy for the Pascal automorphism was made in [32]. However, the unbounded growth of scaling entropy has not been established yet, despite the efforts of many mathematicians. Thus, it has not been proved that the Pascal automorphism has a purely continuous spectrum. The title of [65] expresses the confidence that this is true.

On the other hand, calculations have been carried out for the Morse and Chacon transformations, that is, substitutions (which are stationary adic shifts on infinite graphs: see [73]); see the details in § 3.4.1. As a generalization of the result for the Morse transformation, it could be interesting to find the scaling entropy for more general skew products over transformations with discrete spectra.

The scaling entropy has not yet been computed for numerous adic transformations on graph paths (Young, Fibonacci, and others). It is also of interest to compute the scaling entropy of Gaussian automorphisms with simple singular or purely singular spectra (see [12] and [52]). Surprisingly, the technique of approximations (ranks) still does not help in this case.
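The quantity discussed in this subsection can be explored numerically on toy examples. The sketch below assumes that the averaged metric is $T_{\rm av}^n\rho(x,y)=\frac1n\sum_{i=0}^{n-1}\rho(T^ix,T^iy)$ and replaces the $\varepsilon$-entropy by a crude proxy: the base-2 logarithm of the number of $\varepsilon$-balls selected greedily until less than an $\varepsilon$ fraction of a finite random sample is left uncovered. The precise definition is the one in the main text; the proxy, the function names, and the two test maps (an irrational rotation and the doubling map on the circle) are assumptions made for illustration only.

```python
import numpy as np


def circle_dist(a, b):
    """Arc-length distance on the circle R/Z."""
    d = np.abs(a - b)
    return np.minimum(d, 1.0 - d)


def averaged_metric(points, step, n):
    """Matrix of (1/n) * sum_{i<n} rho(T^i x, T^i y) over all sample pairs."""
    total = np.zeros((len(points), len(points)))
    x = points.copy()
    for _ in range(n):
        total += circle_dist(x[:, None], x[None, :])
        x = step(x)                        # move every sample point by T
    return total / n


def greedy_eps_entropy(dist, eps):
    """log2 of the number of eps-balls picked greedily until fewer than
    an eps fraction of the sample points remain uncovered."""
    m = len(dist)
    near = dist < eps                      # near[i, j]: j lies in the ball of i
    uncovered = np.ones(m, dtype=bool)
    balls = 0
    while uncovered.sum() >= eps * m:
        gain = (near & uncovered[None, :]).sum(axis=1)
        centre = int(np.argmax(gain))      # ball covering most uncovered points
        uncovered &= ~near[centre]
        balls += 1
    return float(np.log2(balls))


def rotation(x):
    return (x + (np.sqrt(5.0) - 1.0) / 2.0) % 1.0   # irrational rotation


def doubling(x):
    return (2.0 * x) % 1.0                           # doubling map


rng = np.random.default_rng(1)
sample = rng.random(400)
results = {"rotation": {}, "doubling": {}}
for n in (1, 4, 16):
    for name, step in (("rotation", rotation), ("doubling", doubling)):
        results[name][n] = greedy_eps_entropy(averaged_metric(sample, step, n), 0.1)
    print(n, results["rotation"][n], results["doubling"][n])
```

The rotation is an isometry, so its averaged metric coincides with $\rho$ and the estimate does not grow with $n$, in line with the bounded scaling entropy of systems with discrete spectrum. For the doubling map, which has positive Kolmogorov entropy, the estimate grows until it is capped by the sample size. For the adic transformations mentioned above, no such experiment can replace a proof: a finite sample and a greedy cover only give a heuristic indication of the growth rate.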

3. Description of an unstable automorphism

A description of unstable automorphisms (see § 3.2.1) is of undoubted interest. In particular, it would be of interest to provide an example of a symbolic model for some unstable automorphism.

4. Developing the theory of scaling entropy for countable groups

The theory of scaling entropy for actions of amenable groups, which we described in § 3.5, must be developed further. For non-amenable groups virtually nothing is known beyond the definition itself; it is not even known whether the invariant in question is non-trivial. In particular, an undoubtedly interesting question is the relationship between the definition of scaling entropy presented in § 3.5 and other entropy definitions for such groups.

The scaling entropy growth gap phenomenon described in § 3.5.4 seems to be important and interesting. It would be intriguing to understand for which amenable groups this phenomenon occurs. In particular, does it take place for the infinite symmetric group$^{4}$?

5. Non-Bernoulli automorphisms with completely positive entropy

For non-Bernoulli automorphisms with completely positive entropy (that is, non-Bernoulli K-automorphisms), the first examples of which were constructed by Ornstein in the 1970s, there are still no observable invariants. Such invariants should be related to non-entropy asymptotic invariants of stationary metric compact sets, but none are known at present.

6. Catalytic and relative invariants

The scheme for constructing catalytic absolute or relative invariants which was described in § 1.2.6 (also see § 3.1.3) has been realized so far only in the form of scaling entropy. In this case, the invariant is absolute (that is, does not depend on the metric). No other examples are currently known. This is explained by the fact that, besides $\varepsilon$-entropy, we lack a developed theory of invariants of compact metric spaces or mm-spaces themselves. In particular, there are no invariants of compact metric spaces enjoying some symmetry (for instance, invariant under an automorphism). There is no reason to doubt the existence of such invariants. This is suggested by the non-Bernoulli systems with positive Kolmogorov entropy mentioned in our survey.

Acknowledgement

The authors are grateful to an anonymous referee for their useful remarks and comments. We also thank Natalia Tsilevich for her help in translating the manuscript into English.


Bibliography

1. T. Adams, “Genericity and rigidity for slow entropy transformations”, New York J. Math., 27 (2021), 393–416
2. D. J. Aldous, “Exchangeability and related topics”, École d'été de probabilités de Saint-Flour XIII – 1983, Lecture Notes in Math., 1117, Springer, Berlin, 1985, 1–198
3. T. Austin, E. Glasner, J.-P. Thouvenot, and B. Weiss, “An ergodic system is dominant exactly when it has positive entropy”, Ergodic Theory Dynam. Systems, 2022, 1–15, Publ. online
4. V. I. Bogachev, “Kantorovich problem of optimal transportation of measures: new directions of research”, Russian Math. Surveys, 77:5 (2022), 769–817
5. V. I. Bogachev, A. N. Kalinin, and S. N. Popova, “On the equality of values in the Monge and Kantorovich problems”, J. Math. Sci. (N. Y.), 238:4 (2019), 377–389
6. E. Bogomolny, O. Bohigas, and C. Schmit, “Spectral properties of distance matrices”, J. Phys. A, 36:12 (2003), 3595–3616
7. P. J. Cameron and A. M. Vershik, “Some isometry groups of the Urysohn space”, Ann. Pure Appl. Logic, 143:1-3 (2006), 70–78
8. T. Downarowicz and J. Serafin, “Universal systems for entropy intervals”, J. Dynam. Differential Equations, 29:4 (2017), 1411–1422
9. S. Ferenczi, “Measure-theoretic complexity of ergodic systems”, Israel J. Math., 100 (1997), 187–207
10. S. Ferenczi and K. K. Park, “Entropy dimensions and a class of constructive examples”, Discrete Contin. Dyn. Syst., 17:1 (2007), 133–141
11. S. Gadgil and M. Krishnapur, “Lipschitz correspondence between metric measure spaces and random distance matrices”, Int. Math. Res. Not. IMRN, 2013:24 (2013), 5623–5644
12. I. V. Girsanov, “Spectra of dynamical systems generated by stationary Gaussian processes”, Dokl. Akad. Nauk SSSR, 119:5 (1958), 851–853 (Russian)
13. A. Greven, P. Pfaffelhuber, and A. Winter, “Convergence in distribution of random metric measure spaces ($\Lambda$-coalescent measure trees)”, Probab. Theory Related Fields, 145:1-2 (2009), 285–322
14. M. Gromov, Metric structures for Riemannian and non-Riemannian spaces, Transl. from the French, Progr. Math., 152, Birkhäuser Boston, Inc., Boston, MA, 1999, xx+585 pp.
15. H. A. Helfgott, “Growth and generation in $\operatorname{SL}_2(\mathbb Z/p\mathbb Z)$”, Ann. of Math. (2), 167:2 (2008), 601–623
16. L. Hogben and C. Reinhart, “Spectra of variants of distance matrices of graphs and digraphs: a survey”, Matematica, 1:1 (2022), 186–224
17. H. E. Jordan, “Group-characters of various types of linear groups”, Amer. J. Math., 29:4 (1907), 387–405
18. K. Juschenko and N. Monod, “Cantor systems, piecewise translations and simple amenable groups”, Ann. of Math. (2), 178:2 (2013), 775–787
19. S. Kakutani, “A problem of equidistribution on the unit interval $[0,1]$”, Measure theory (Oberwolfach 1975), Lecture Notes in Math., 541, Springer, Berlin, 1976, 369–375
20. A. Kanigowski, “Slow entropy for some smooth flows on surfaces”, Israel J. Math., 226:2 (2018), 535–577
21. A. Kanigowski, A. Katok, and D. Wei, Survey on entropy-type invariants of sub-exponential growth in dynamical systems, 2020, 47 pp., arXiv: 2004.04655v1
22. A. Kanigowski, K. Vinhage, and D. Wei, “Slow entropy of some parabolic flows”, Comm. Math. Phys., 370:2 (2019), 449–474
23. L. V. Kantorovich, “On the translocation of masses”, Dokl. Akad. Nauk SSSR, 37:7–8 (1942), 227–229; English transl. in J. Math. Sci. (N. Y.), 133:4 (2006), 1381–1382
24. A. Katok and J.-P. Thouvenot, “Slow entropy type invariants and smooth realization of commuting measure-preserving transformations”, Ann. Inst. H. Poincaré Probab. Statist., 33:3 (1997), 323–338
25. A. S. Kechris, Global aspects of ergodic group actions, Math. Surveys Monogr., 160, Amer. Math. Soc., Providence, RI, 2010, xii+237 pp.
26. A. N. Kolmogorov, “New metric invariant of transitive dynamical systems and automorphisms of Lebesgue spaces”, Selected works of A. N. Kolmogorov, v. III, Math. Appl. (Soviet Ser.), 27, Kluwer Acad. Publ., Dordrecht, 1993, 57–61
27. A. N. Kolmogorov, “The theory of transmission of information”, Selected works of A. N. Kolmogorov, v. III, Math. Appl. (Soviet Ser.), 27, Kluwer Acad. Publ., Dordrecht, 1993, 6–32
28. V. Koltchinskii and E. Giné, “Random matrix approximation of spectra of integral operators”, Bernoulli, 6:1 (2000), 113–167
29. I. P. Cornfeld, S. V. Fomin, and Ya. G. Sinai, Ergodic theory, Grundlehren Math. Wiss., 245, Springer-Verlag, New York, 1982, x+486 pp.
30. W. Krieger, “On entropy and generators of measure-preserving transformations”, Trans. Amer. Math. Soc., 149 (1970), 453–464
31. A. G. Kushnirenko, “On metric invariants of entropy type”, Russian Math. Surveys, 22:5 (1967), 53–61
32. A. A. Lodkin, I. E. Manaev, and A. R. Minabutdinov, “Asymptotic behavior of the scaling entropy of the Pascal adic transformation”, J. Math. Sci. (N. Y.), 174:1 (2011), 28–35
33. A. Lott, “Zero entropy actions of amenable groups are not dominant”, Ergodic Theory Dynam. Systems, 2023, 1–16, Publ. online
34. L. Motto Ros, “Can we classify complete metric spaces up to isometry?”, Boll. Unione Mat. Ital., 10:3 (2017), 369–410
35. D. S. Ornstein and B. Weiss, “Entropy and isomorphism theorems for actions of amenable groups”, J. Analyse Math., 48:1 (1987), 1–141
36. F. Petrov, “Correcting continuous hypergraphs”, St. Petersburg Math. J., 28:6 (2017), 783–787
37. L. Pyber and E. Szabó, “Growth in finite simple groups of Lie type”, J. Amer. Math. Soc., 29:1 (2016), 95–146
38. M. Queffélec, Substitution dynamical systems – spectral analysis, Lecture Notes in Math., 1294, 2nd ed., Springer-Verlag, Berlin, 2010, xvi+351 pp.
39. V. A. Rohlin (Rokhlin), On the fundamental ideas of measure theory, Amer. Math. Soc. Transl., 1952, no. 71, Amer. Math. Soc., Providence, RI, 1952, 55 pp.
40. V. A. Rokhlin, “Metric classification of measurable functions”, Uspekhi Mat. Nauk, 12:2(74) (1957), 169–174 (Russian)
41. V. A. Rokhlin, “Lectures on the entropy theory of measure-preserving transformations”, Russian Math. Surveys, 22:5 (1967), 1–52
42. V. V. Ryzhikov, “Compact families and typical entropy invariants of measure-preserving actions”, Trans. Moscow Math. Soc., 82 (2021), 117–123
43. J. Schur, “Untersuchungen über die Darstellung der endlichen Gruppen durch gebrochene lineare Substitutionen”, J. Reine Angew. Math., 132 (1907), 85–137
44. J. Serafin, “Non-existence of a universal zero-entropy system”, Israel J. Math., 194:1 (2013), 349–358
45. C. E. Shannon, “A mathematical theory of communication”, Bell System Tech. J., 27:3, 4 (1948), 379–423, 623–656
46. K.-T. Sturm, The space of spaces: curvature bounds and gradient flows on the space of metric measure spaces, 2020 (v1 – 2012), 88 pp., arXiv: 1208.0434
47. P. S. Urysohn, “Sur un espace métrique universel”, Bull. Sci. Math. (2), 51 (1927), 43–64, 74–96
48. G. A. Veprev, “Scaling entropy of unstable systems”, Representation theory, dynamical systems, combinatorial methods. XXXI, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 498, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2020, 5–17; English transl. in J. Math. Sci. (N. Y.), 255:2 (2021), 109–118
49. G. A. Veprev, “The scaling entropy of a generic action”, Representation theory, dynamical systems, combinatorial methods. XXXIII, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 507, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2021, 5–14; English transl. in J. Math. Sci. (N. Y.), 261:5 (2022), 595–600
50. G. Veprev, Non-existence of a universal zero entropy system via generic actions of almost complete growth, 2022, 12 pp., arXiv: 2209.01902
51. G. Veprev, “Non-existence of a universal zero entropy system for non-periodic amenable group actions”, Israel J. Math., 253 (2023), 715–743
52. A. M. Vershik, “A spectral and metrical isomorphism of certain normal dynamical systems”, Dokl. Akad. Nauk SSSR, 144:2 (1962), 255–257 (Russian)
53. A. M. Vershik, “Uniform algebraic approximation of shift and multiplication operators”, Soviet Math. Dokl., 24 (1981), 97–100
54. A. M. Vershik, “A theorem on the Markov periodic approximation in ergodic theory”, Boundary-value problems of mathematical physics and related problems of function, Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov., 115, Nauka, Leningrad branch, Leningrad, 1982, 72–82; English transl. in J. Soviet Math., 28:5 (1985), 667–674
55. A. M. Vershik, “The universal Urysohn space, Gromov metric triples and random metrics on the natural numbers”, Russian Math. Surveys, 53:5 (1998), 921–928
56. A. M. Vershik, “A random metric space is the universal Urysohn space”, Dokl. Math., 66:3 (2002), 421–424
57. A. M. Vershik, Distance matrices, random metrics and Urysohn space, The MPIM preprint series, no. 2002-8, Max-Planck-Institut für Mathematik, Bonn, 2002, 19 pp. https://archive.mpim-bonn.mpg.de/id/eprint/2119/
58. A. M. Vershik, “Classification of measurable functions of several variables and invariantly distributed random matrices”, Funct. Anal. Appl., 36:2 (2002), 93–105
59. A. M. Vershik, “Random and universal metric spaces”, Fundamental mathematics today, Moscow Center for Continuous Mathematical Education, Moscow, 2003, 54–88 (Russian)
60. A. M. Vershik, “Kantorovich metric: initial history and little-known applications”, Representation theory, dynamical systems. XI, Special issue, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 312, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2004, 69–85; English transl. in J. Math. Sci. (N. Y.), 133:4 (2006), 1410–1417
61. A. M. Vershik, “Random metric spaces and universality”, Russian Math. Surveys, 59:2 (2004), 259–295
62. A. M. Vershik, “Information, Entropy, Dynamics”, Mathematics of the XXth Century: A View from St. Petersburg, Moscow Center for Continuous Mathematical Education, Moscow, 2010, 47–76 (Russian)
63. A. M. Vershik, “Dynamics of metrics in measure spaces and their asymptotic invariants”, Markov Process. Related Fields, 16:1 (2010), 169–184
64. A. M. Vershik, “Scaling entropy and automorphisms with pure point spectrum”, St. Petersburg Math. J., 23:1 (2012), 75–91
65. A. M. Vershik, “The Pascal automorphism has a continuous spectrum”, Funktsional. Anal. i Prilozhen., 45:3 (2011), 173–186
66. A. M. Vershik, “Long history of the Monge–Kantorovich transportation problem”, Math. Intelligencer, 35:2 (2013), 1–9
67. A. M. Vershik, “The theory of filtrations of subalgebras, standardness, and independence”, Russian Math. Surveys, 72:2 (2017), 257–333
68. A. M. Vershik, “One-dimensional central measures on numberings of ordered sets”, Funct. Anal. Appl., 56:4 (2022), 251–256
69. A. M. Vershik, “Classification of measurable functions of several arguments and matrix distributions”, Funktsional. Anal. i Prilozhen., 57:4 (2023), 46–59 (Russian)
70. A. M. Vershik and U. Haböck, “Compactness of the congruence group of measurable functions in several variables”, Computational methods and algorithms. XIX, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 334, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2006, 57–67; English transl. in J. Math. Sci. (N. Y.), 141:6 (2007), 1601–1607
71. A. M. Vershik and U. Haböck, “On the classification problem of measurable functions in several variables and on matrix distributions”, Probability and statistics. 22, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 441, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2015, 119–143; English transl. in J. Math. Sci. (N. Y.), 219:5 (2016), 683–699
72. A. M. Vershik and M. A. Lifshits, “On mm-entropy of a Banach space with Gaussian measure”, Theory Probab. Appl., 68:3 (2023), 431–439
73. A. M. Vershik and A. N. Livshits, “Adic models of ergodic transformations, spectral theory, substitutions, and related topics”, Representation theory and dynamical systems, Adv. Soviet Math., 9, Amer. Math. Soc., Providence, RI, 1992, 185–204
74. A. M. Vershik and F. V. Petrov, “Limit spectral measures of the matrix distributions of the metric triples”, Funktsional. Anal. i Prilozhen., 57:2 (2023), 106–110 (Russian)
75. A. M. Vershik and P. B. Zatitskii, “Universal adic approximation, invariant measures and scaled entropy”, Izv. Math., 81:4 (2017), 734–770
76. A. M. Vershik and P. B. Zatitskii, “On a universal Borel adic space”, Representation theory, dynamical systems, combinatorial methods. XXIX, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 468, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2018, 24–38; English transl. in J. Math. Sci. (N. Y.), 240:5 (2019), 515–524
77. A. M. Vershik and P. B. Zatitskiy (Zatitskii), “Combinatorial invariants of metric filtrations and automorphisms; the universal adic graph”, Funct. Anal. Appl., 52:4 (2018), 258–269
78. A. M. Vershik, P. B. Zatitskiy (Zatitskii), and F. V. Petrov, “Geometry and dynamics of admissible metrics in measure spaces”, Cent. Eur. J. Math., 11:3 (2013), 379–400
79. A. M. Vershik, P. B. Zatitskiy (Zatitskii), and F. V. Petrov, “Virtual continuity of measurable functions of several variables and embedding theorems”, Funct. Anal. Appl., 47:3 (2013), 165–173
80. A. M. Vershik, P. B. Zatitskiy (Zatitskii), and F. V. Petrov, “Virtual continuity of measurable functions and its applications”, Russian Math. Surveys, 69:6 (2014), 1031–1063
81. Tao Yu, Guohua Zhang, and Ruifeng Zhang, “Discrete spectrum for amenable group actions”, Discrete Contin. Dyn. Syst., 41:12 (2021), 5871–5886
82. P. B. Zatitskiy (Zatitskii), “On a scaling entropy sequence of a dynamical system”, Funct. Anal. Appl., 48:4 (2014), 291–294
83. P. B. Zatitskii, Scaling entropy sequence as a metric invariant of dynamical systems, PhD thesis, St. Petersburg Department of Steklov Mathematical Institute, St Petersburg, 2014, 87 pp. (Russian)
84. P. B. Zatitskiy (Zatitskii), “Scaling entropy sequence: invariance and examples”, Representation theory, dynamical systems, combinatorial methods. XXIV, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 432, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2015, 128–161; English transl. in J. Math. Sci. (N. Y.), 209:6 (2015), 890–909
85. P. B. Zatitskiy (Zatitskii), “On the possible growth rate of a scaling entropy sequence”, Representation theory, dynamical systems, combinatorial methods. XXV, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 436, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2015, 136–166; English transl. in J. Math. Sci. (N. Y.), 215:6 (2016), 715–733
86. P. B. Zatitskiy (Zatitskii) and F. V. Petrov, “Correction of metrics”, Representation theory, dynamical systems, combinatorial methods. XX, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 390, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2011, 201–209; English transl. in J. Math. Sci. (N. Y.), 181:6 (2012), 867–870
87. P. B. Zatitskiy (Zatitskii) and F. V. Petrov, “On the subadditivity of a scaling entropy sequence”, Representation theory, dynamical systems, combinatorial methods. XXV, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI), 436, St. Petersburg Department of Steklov Mathematical Institute, St. Petersburg, 2015, 167–173; English transl. in J. Math. Sci. (N. Y.), 215:6 (2016), 734–737
88. Yun Zhao and Ya. Pesin, “Scaled entropy for dynamical systems”, J. Stat. Phys., 158:2 (2015), 447–475; “Erratum to: Scaled entropy for dynamical systems”, 162:6 (2016), 1654–1660

Citation: A. M. Vershik, G. A. Veprev, P. B. Zatitskii, “Dynamics of metrics in measure spaces and scaling entropy”, Russian Math. Surveys, 78:3 (2023), 443–499
Citation in format AMSBIB
\Bibitem{VerVepZat23}
\by A.~M.~Vershik, G.~A.~Veprev, P.~B.~Zatitskii
\paper Dynamics of metrics in measure spaces and scaling entropy
\jour Russian Math. Surveys
\yr 2023
\vol 78
\issue 3
\pages 443--499
\mathnet{http://mi.mathnet.ru//eng/rm10103}
\crossref{https://doi.org/10.4213/rm10103e}
\mathscinet{http://mathscinet.ams.org/mathscinet-getitem?mr=4673245}
\zmath{https://zbmath.org/?q=an:07794496}
\adsnasa{https://adsabs.harvard.edu/cgi-bin/bib_query?2023RuMaS..78..443V}
\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=001146055900002}
\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-85179940197}