M. R. Gabdullin, A. O. Radomskii, “Prime avoiding numbers form a basis of order $2$”, Sb. Math., 215:5 (2024), 612

Sbornik: Mathematics

RUS ENG

JOURNALS PEOPLE ORGANISATIONS CONFERENCES SEMINARS VIDEO LIBRARY PACKAGE AMSBIB

JavaScript is disabled in your browser. Please switch it on to enable full functionality of the website

	General information
	Latest issue
	Forthcoming papers
	Archive
	Impact factor
	Guidelines for authors
	License agreement
	Submit a manuscript

	Search papers
	Search references

	RSS
	Latest issue
	Current issues
	Archive issues
	What is RSS

Mat. Sb.:
Year:
Volume:
Issue:
Page:
	Find

Personal entry:
Login:
Password:
	Save password
	Enter
	Forgotten password?
	Register

Sbornik: Mathematics, 2024, Volume 215, Issue 5, Pages 612–633
DOI: https://doi.org/10.4213/sm9980e (Mi sm9980)

Prime avoiding numbers form a basis of order $2$

M. R. Gabdullin^ab, A. O. Radomskii^c

^a Steklov Mathematical Institute of Russian Academy of Sciences, Moscow, Russia
^b University of Illinois Urbana-Champaign, Champaign, IL, USA
^c HSE University, Moscow, Russia

English version PDF (617 kB) HTML full-text Russian version article

References:

PDF

HTML

DOI: https://doi.org/10.4213/sm9980e

Abstract: For a positive integer $n$ we denote by $F(n)$ the distance of $n$ to the nearest prime number. Using the technique from the recent paper “Long gaps in sieved sets” by Ford, Konyagin, Maynard, Pomerance and Tao (J. Eur. Math. Soc., 23:2 (2021), 667–700) we prove that every sufficiently large positive integer $N$ can be represented as a sum $N=n_1+n_2$, where $F(n_i) \geq (\log N)(\log\log N)^{1/325565}$ for $i=1,2$. This improves the corresponding ‘trivial’ statement where only the inequality $F(n_i)\gg \log N$ is assumed.
Bibliography: 17 titles.

Keywords: prime numbers, basis, sieving.

Funding agency	Grant number
Ministry of Science and Higher Education of the Russian Federation	075-15-2022-265
HSE Basic Research Program
The work of M. R. Gabdullin was performed at the Steklov International Mathematical Center and supported by the Ministry of Science and Higher Education of the Russian Federation (agreement no. 075-15-2022-265). The work of A. O. Radomskii was supported by the HSE University Basic Research Program.

Received: 12.07.2023 and 07.02.2024

Bibliographic databases:

Document Type: Article

MSC: 11B05, 11N35

Language: English

Original paper language: Russian

§ 1. Introduction

Let $p_n$ be the $n$th prime number and

$$ \begin{equation*} G(X)=\max_{p_{n+1}\leqslant X}(p_{n+1}-p_n) \end{equation*} \notag $$

denote the largest gap between two consecutive primes up to $X$. The Prime Number Theorem, together with a simple averaging argument, implies that $G(X)\geqslant (1+ o(1))\log X$, and Rankin [1] in 1938 was the first to prove the bound of the type

$$ \begin{equation*} G(X)\geqslant(c+o(1))\frac{\log X \log\log X\log\log\log\log X}{(\log\log\log X)^2}, \end{equation*} \notag $$

improving the previous results of Westzynthius [2] and Erdős [3]. Rankin proved the bound mentioned for $c=1/3$, and in about the next 80 years this constant was increased many times, the last constant being $c=2e^{\gamma}$ owing to Pintz [4]. In 2016 Ford, Green, Konyagin and Tao [5] and, independently, Maynard [6], using different approaches, showed that $c$ can be taken arbitrarily large, thus giving an affirmative answer to a long-standing conjecture of Erdős [7]. In 2018 all these five authors together, by combining the ideas from [5] and [6], made a further breakthrough [8] by establishing that

$$ \begin{equation} G(X)\gg \frac{\log X \log\log X \log\log\log\log X}{\log\log\log X}. \end{equation} \tag{1.1} $$

The expected size of $G(X)$ is of order $(\log X)^2$ (see [9] for the precise conjecture of Cramér based on a probabilistic model of primes); this model was also refined by Granville [10] and, more recently, by Banks, Ford and Tao [11]. We also note that the best known upper bound for $G(X)$, due to Baker, Harman and Pintz [12], is $G(X)\ll X^{0.525}$, and we refer the reader to [8] for a further discussion of the quantity $G(X)$.

In [13], Ford, Heath-Brown and Konyagin introduced the notion of prime avoidance. Given a positive integer $n$, let $F(n)$ denote the distance of $n$ to the nearest prime number (clearly, the maximum value of $F(n)$ taken over all $n\leqslant X$ has the same order as $G(X)$). They called $n$ a ‘prime avoiding number with constant $c$’ if

$$ \begin{equation*} F(n)\geqslant c\frac{\log n \log\log n \log\log\log\log n}{(\log\log\log n)^2}, \end{equation*} \notag $$

and proved that for any positive integer $k$ there exist $c=c(k)>0$ and infinitely many perfect $k$th powers that are prime avoiding with constant $c$. Using a method from [8] Maier and Rassias [14] extended this result to $k$th prime powers and a stronger version of prime avoidance (with lower bound of the type (1.1)); see also their recent work [15].

In this paper we consider the following additive problem related to prime avoidance: can one prove that each large positive integer $N$ can be represented as the sum of two numbers $n_1$ and $n_2$, where both $F(n_1)$ and $F(n_2)$ are large in terms of $N$?

The first instinct to prove such a result can be to use the technique from [8] (which, in general, follows the strategy from some previous papers, starting from [2] by Westzynthius). However, there is a general obstacle, which makes this idea unfit for our setting. In this approach one exploits a ‘smooth’ integer $m$, which is divisible by all small primes up to some (relatively small) $z$, to make sure that the majority of the $G(X)$ successive integers beginning from $m+2$ have a small prime factor; then the aim is to use larger primes to sieve out the remaining numbers and make this procedure as efficient as possible. But if we take $n_1$ and $n_2$ in accordance with this construction, then both are close to such smooth numbers, and their sum (which we want to be $N$) is also close to a smooth number and therefore not arbitrary. Thus one needs to use an entirely different method to attack the problem stated.

First, it turns out that some standard technique allows us to establish the following ‘trivial’ statement. Note that the Prime Number Theorem implies that the average value of $F(n)$ (taken over $n\leqslant N$) is at least of order $\log N$.

Proposition 1.1. Every sufficiently large positive integer $N$ can be represented as a sum $N=n_1+n_2$, where $F(n_i)\gg \log N$, $i=1,2$.

Our goal is to improve the lower bound from this proposition by obtaining a result where $\log N$ is multiplied by some growing function. For a number $\rho\in(0,1)$ we set

$$ \begin{equation} C(\rho)=\sup\biggl\{\delta\in\biggl(0,\frac 12\biggr)\colon \frac{6\cdot10^{2\delta}}{\log(1/(2\delta))}<\rho\biggr\}. \end{equation} \tag{1.2} $$

Our main result is as follows.

Theorem 1.1. Every sufficiently large positive integer $N$ can be represented as a sum $N=n_1+n_2$, where

$$ \begin{equation*} F(n_i) \geqslant (\log N)(\log\log N)^{C(1/2)-o(1)} \end{equation*} \notag $$

for $i=1,2$.

In fact, the proof of Theorem 1.1 implies that there are at least $\exp((\log N)^{1-o(1)})$ such representations (because of many ‘good’ choices of $\mathbf{b}$ in Theorem 4.1 ). Note that numerical calculations show that $C(1/2)>1/325565$.

Theorem 1.1 admits the following interpretation. Recall that a set $A\subseteq \mathbb{N}$ is called a basis of order $k$ if every sufficiently large positive integer can be represented as a sum of $k$ summands from $A$. Now consider the set

$$ \begin{equation*} \bigl\{n\geqslant3\colon F(n)\geqslant (\log n)(\log\log n)^\delta \bigr\} \end{equation*} \notag $$

(which is also kind of a set of prime avoiding numbers). Then Theorem 1.1 implies that this set is a basis of order $2$ for any $\delta<C(1/2)$.

To prove Theorem 1.1 we apply the technique from the recent paper [16] by Ford, Konyagin, Maynard, Pomerance and Tao, where the authors used a hypergraph covering lemma of Pippenger–Spencer type (which was introduced in [8]) to detect long gaps in general sieved sets. To state their result, we need the following definition (the symbol $p$ always denotes a prime number).

Definition 1.1. A sieving system is a collection $\mathcal{I}$ of sets $I_p \subset \mathbb{Z}/p\mathbb{Z}$ of residue classes modulo $p$ for each prime $p$. Moreover, we have the following definitions.

• We say that the sieving system is nondegenerate if $|I_p|\leqslant p-1$ for all $p$.
• Given $B>0$, we say that the sieving system is $B$-bounded if $|I_p| \leqslant B$ for all primes $p$.
• We say that the sieving system is one-dimensional if we have the weighted Mertens-type product estimate
$$ \begin{equation*} \prod_{p\leqslant x}\biggl(1-\frac{|I_p|}{p}\biggr)\sim \frac{C_1}{\log x}, \qquad x\to\infty, \end{equation*} \notag $$
for some positive constant $C_1$.
• Given $\rho>0$, we say that the sieving system is $\rho$-supported if the density of primes with $|I_p| \geqslant 1$ equals $\rho$, that is,
$$ \begin{equation*} \lim_{x\to\infty}\frac{|{p\leqslant x\colon |I_p| \geqslant 1}|}{x/\log x} = \rho. \end{equation*} \notag $$

The main result of [16] is that for such a sieving system the sieved set

$$ \begin{equation*} S_x=S_x(\mathcal{I})=\mathbb{Z}\setminus\bigcup_{p\leqslant x}I_p \end{equation*} \notag $$

(the set of integers which do not belong to any $I_p$ for all $p\leqslant x$) contains a gap of size $x(\log x)^{C(\rho)-o(1)}$, where $C(\rho)$ is defined¹ in (1.2) and the rate of decay in $o(1)$ depends on $\mathcal{I}$. Despite the fact that this general bound, as applied to the Eratosthenes sieve (that is, the sieving system with $I_p=\{0\}$ for all $p$), yields only the bound

$$ \begin{equation*} G(X) \gg (\log X)(\log\log X)^{C(1)-o(1)} \gg (\log X)(\log\log X)^{1/835}, \end{equation*} \notag $$

which is weaker than (1.1), it has the advantage of not dealing with ‘smooth’ numbers from the above discussion, and this is crucial for us.

To deduce Theorem 1.1 we also work with the one-dimensional Eratosthenes sieving system; however, rather than one set, we need to treat the two sets $S_x-n_1$ and $S_x-N+n_1$ (for some $n_1$) simultaneously; our main goal will be to guarantee inequality (2.2). To do this we use disjoint sets of (‘large’) primes of density $1/2$ (and this is why our result involves the exponent $C(1/2)$) to handle these two sets separately. Fortunately for us, the one-dimensionality is, in fact, needed only for ‘small’ primes, and we are able to make use of it before partitioning the set of large primes. Except for dealing with two sets simultaneously, our proof actually almost repeats the proof of the main result in [16]. However, owing to some new technical issues, there are not so many things from [16] we can use without any changes, and so we decide to provide the full proof in spite of a considerable intersection with the text of [16].

The paper is organized as follows. In § 2 we prove Proposition 1.1 (this is relatively short and based on some ideas we need for our main result); at the same time, we reduce Theorem 1.1 to the problem of sieving out two shifts of $S_{x/2}$. In § 3 we give an outline of the next part of the proof and introduce the necessary notation. The arguments in §§ 4–6 are analogous to those in §§ 3–5 of [16], and our Theorems 4.1 and 5.1 are modifications of Theorems 2 and 3 of [16].

Acknowledgement

The authors would like to thank Sergei Konyagin for introducing them to the problem.

§ 2. Preliminaries and proof of Proposition 1.1

In this section we provide a proof of Proposition 1.1 to illustrate some parts of the general strategy in a more simple context and also make the first reduction of Theorem 1.1. This proof is actually similar to the proof of (1.8) in [16] (which shows the existence of a gap of length $\gg x$ in $S_x$), with the difference that, again, we have to handle two sets instead of one.

Proof of Proposition 1.1. Given a number $x\geqslant 2$, let

$$ \begin{equation*} S_x=\{n\in \mathbb{Z}\colon n\not\equiv0 \ (\operatorname{mod}p) \text{ for each } p\leqslant x\}. \end{equation*} \notag $$

Clearly, $S_x$ is a periodic set with period $P(x)=\prod_{p\leqslant x}p$, which contains all primes larger than $x$. Let $z=x/2$, and let $b'\in \mathbb{Z}/P(z)\mathbb{Z}$ be chosen uniformly at random. We consider the random sets

$$ \begin{equation*} A_{b'}:=(S_z-b')\cap[-y,y] \quad\text{and}\quad A_{N-b'}:=(S_z-N+b')\cap[-y,y], \end{equation*} \notag $$

where $y=\lfloor 0.08x \rfloor$, and, for a set $S\subset\mathbb{Z}$ and an integer $b$, we write $S-b$ for the set $\{s-b\colon s\in S\}$. We have

$$ \begin{equation*} \begin{aligned} \, \mathbb{E}|A_{b'}| &=\mathbb{E}\sum_{|n|\leqslant y}1_{n\in S_z-b'}=\sum_{|n|\leqslant y}\prod_{p\leqslant z}\mathbb P(b'\not\equiv -n\ (\operatorname{mod}p)) \\ &=\sum_{|n|\leqslant y}\prod_{p\leqslant z}\biggl(1-\frac1p\biggr)=\frac{(2e^{-\gamma}+o(1))y}{\log z}, \end{aligned} \end{equation*} \notag $$

where $\gamma$ is the Euler constant and, similarly,

$$ \begin{equation*} \mathbb{E}|A_{N-b'}|=\frac{(2e^{-\gamma}+o(1))y}{\log z}. \end{equation*} \notag $$

Therefore, if $x$ is large enough, then

$$ \begin{equation*} \mathbb{E}(|A_{b'}|+|A_{N-b'}|) \leqslant \frac{5y}{\log x}. \end{equation*} \notag $$

Thus, there is a choice of $b'$ modulo $P(z)$ such that

$$ \begin{equation*} |A_{b'}|+|A_{N-b'}|\leqslant \frac{0.4x}{\log x}. \end{equation*} \notag $$

Let $P_{z,x}=\prod_{z<p\leqslant x}p$ and

$$ \begin{equation*} S_{z,x}=\{n\in\mathbb{Z}\colon n\not\equiv 0 \ (\operatorname{mod}p)\ \forall p\in(z,x]\}. \end{equation*} \notag $$

Now we choose a number $b$ modulo $P(x)$. We set $b\equiv b'\pmod{P(z)}$ and claim that there is a choice of $b \pmod{P_{z,x}}$ (denoted by $b''$) such that

$$ \begin{equation} (S_x-b)\cap[-y,y]=(S_x-N+b)\cap [-y,y] = \varnothing. \end{equation} \tag{2.1} $$

To see that this is possible, note that

$$ \begin{equation*} S_x-b=\{n\in \mathbb{Z}\colon n\not\equiv -b \ (\operatorname{mod}p) \ \forall\, p\leqslant x\} = (S_z-b')\cap (S_{z,x}-b''). \end{equation*} \notag $$

Further, for each element $m\in A_{b'}$ we take a prime $q\in(z,x]$ and, using the Chinese Remainder Theorem, define $b\equiv b_q\pmod q$ such that $m\equiv -b_q \pmod q$; thus, $m\notin S_{z,x}-b''$ and so $m\notin S_x-b$. We proceed similarly for each $m\in A_{N-b'}$. Since there are $(0.5+o(1))x/\log x$ primes in $(z,x]$ and at most $0.4x/\log x$ of the numbers $m\in A_{b'}\cup A_{N-b'}$ have survived, it is possible to perform this ‘clean-up’ stage.

To finish the proof we set $f(n)=\min\{|n-l|\colon l\in S_x\}$. Then equality (2.1) means that

$$ \begin{equation*} f(b)\geqslant y\quad\text{and} \quad f(N-b)\geqslant y \end{equation*} \notag $$

for our choice of $b\pmod{P(x)}$. Now we choose $x\approx \log (N/2)$ to be the maximum number such that $P(x)\leqslant N/2$. Thus we see that it is possible to select this $b$ so that $b\in [N/4, 3N/4]$; then $N-b\in [N/4,3N/4]$ too. Finally, since $\{{N^{1/2}<p\leqslant N}\} \subset S_x$, we obtain

$$ \begin{equation*} F(b)\geqslant f(b) \gg x \gg \log N, \end{equation*} \notag $$

and, similarly, $F(N-b)\gg \log N$. This completes the proof of Proposition 1.1.

Now we see that to prove Theorem 1.1 it is enough to show that for any fixed $\delta<C(1/2)$ and $y=\lceil x(\log x)^{\delta} \rceil$ there exists a choice of $b$ modulo $P(x/2)$ such that

$$ \begin{equation} \bigl|\bigl((S_{x/2}-b)\cup (S_{x/2}-N+b)\bigr)\cap [-y,y]\bigr| \leqslant \biggl(\frac12-\varepsilon\biggr)\frac{x}{\log x} \end{equation} \tag{2.2} $$

for some $\varepsilon>0$. Then arguing in accordance with the clean-up stage of the above proof, one can easily obtain that both $F(b)$ and $F(N-b)$ are $\gg (\log N)(\log\log N)^{\delta}$, and Theorem 1.1 follows. Note that the condition $\delta<C(1/2)$ is equivalent to (recall the definition (1.2) of $C(\rho)$)

$$ \begin{equation} \frac{6\cdot10^{2\delta}}{\log(1/(2\delta))}<\frac12; \end{equation} \tag{2.3} $$

we use it in this form in § 5.

§ 3. Notation and outline

Throughout the proof of our main result (Theorem 1.1) we use positive parameters $K$, $\xi$ and $M$, which we describe below; one can think of them as being fixed for most of the time (in fact, it is only at the end of § 5 that the precise choice of them is important). The implied constant in $\ll$ and related order estimates can depend on these parameters. We will rely on probabilistic methods; boldface symbols such as $\mathbf{S}'$, $\boldsymbol{\lambda}$ and $\mathbf{n}$ denote random variables (sets, functions, numbers and so on), and the corresponding regular symbols $S'$, $\lambda$ and $n$ denote deterministic counterparts of these variables.

For fixed $\delta\in (0,1/2)$ satisfying (2.3) we set

$$ \begin{equation} y=\lceil x(\log x)^{\delta} \rceil \end{equation} \tag{3.1} $$

and

$$ \begin{equation} z=\frac{y\log\log x}{(\log x)^{1/2}}. \end{equation} \tag{3.2} $$

Let $\xi>1$ be a real number (which we will eventually choose to be close to $1$), and let

$$ \begin{equation*} \mathfrak H=\biggl\{ H\in \{1,\xi, \xi^2,\dots\}\colon \frac{2y}{x} \leqslant H\leqslant \frac{y}{\xi z}\biggr\}, \end{equation*} \notag $$

so that each $H$ obeys the inequality

$$ \begin{equation} 2(\log x)^{\delta}\leqslant H\leqslant \frac{y}{z} = \frac{(\log x)^{1/2}}{\log\log x}. \end{equation} \tag{3.3} $$

For each $H$ and $i\in\{1,3\}$ let $\mathcal{Q}_{H,i}$ be the set of primes $i\pmod{4}$ in the interval $(y/(\xi H),y/H]$. Note that

$$ \begin{equation} |\mathcal Q_{H,i}|\sim \biggl(1-\frac 1\xi\biggr)\frac{y}{2H\log x} \end{equation} \tag{3.4} $$

whenever $x$ is large enough in terms of $\xi$. Let

$$ \begin{equation*} \mathcal Q=\bigcup_{H\in\mathfrak H}(\mathcal Q_{H,1}\cup\mathcal Q_{H,3}) \end{equation*} \notag $$

and for each $q\in \mathcal{Q}$ let $H_q$ denote the unique $H$ such that $q\in \mathcal{Q}_{H,1}\cup\mathcal{Q}_{H,3}$, which is equivalent to

$$ \begin{equation*} \frac{y}{\xi H}<q\leqslant \frac{y}{H}. \end{equation*} \notag $$

Also let $M$ be a number such that

$$ \begin{equation} 6<M\leqslant 7. \end{equation} \tag{3.5} $$

As in § 2, we use the notation

$$ \begin{equation*} S_z=\{n\in\mathbb{Z}\colon n\not\equiv 0 \ (\operatorname{mod}p) \text{ for all } p\leqslant z\} \end{equation*} \notag $$

and

$$ \begin{equation*} S_{z,u}=\{n\in\mathbb{Z}\colon n\not\equiv 0 \ (\operatorname{mod}p) \text{ for all } z<p\leqslant u\}. \end{equation*} \notag $$

We adopt the abbreviations

$$ \begin{equation} \begin{gathered} \, P=P(z)=\prod_{p\leqslant z}p, \qquad \sigma=\sigma(z)=\prod_{p\leqslant z}\biggl(1-\frac1p\biggr), \\ \mathbf S'=S_z-\mathbf b\quad\text{and} \quad \mathbf S''=S_z-N+\mathbf b, \end{gathered} \end{equation} \tag{3.6} $$

where $\mathbf{b}$ is a residue class chosen uniformly at random in $\mathbb{Z}/P(z)\mathbb{Z}$; thus, both $\mathbf{S}'$ and $\mathbf{S}''$ are random shifts of $S_z$. For fixed $H\in \mathfrak{H}$, we also set

$$ \begin{equation} \begin{gathered} \, P_1=\prod_{p\leqslant H^M}p, \qquad \sigma_1=\sigma(H^M), \qquad \mathbf b_1\equiv \mathbf b \ (\operatorname{mod}P_1), \\ \mathbf S'_1=S_{H^M}-\mathbf b_1, \qquad \mathbf S''_1=S_{H^M}-N+\mathbf b_1 \end{gathered} \end{equation} \tag{3.7} $$

and

$$ \begin{equation} \begin{gathered} \, P_2=\prod_{H^M<p\leqslant z}p, \qquad \sigma_2=\sigma(H^M,z), \qquad \mathbf b_2\equiv \mathbf b \ (\operatorname{mod}P_2), \\ \mathbf S'_2=S_{H^M,z}-\mathbf b_2, \qquad \mathbf S''_2=S_{H^M,z}-N+\mathbf b_2, \end{gathered} \end{equation} \tag{3.8} $$

where $\sigma(H^M,z)=\prod_{H^M<p\leqslant z}(1-1/p)$. For each $H\in\mathfrak{H}$ we obviously have

$$ \begin{equation} P=P_1P_2, \qquad \sigma=\sigma_1\sigma_2, \qquad \mathbf S'=\mathbf S'_1\cap \mathbf S'_2\quad\text{and} \quad \mathbf S''=\mathbf S''_1\cap \mathbf S''_2. \end{equation} \tag{3.9} $$

Note that all quantities defined in (3.7) and (3.8) depend on $H$ and $M$; however, we do not indicate this dependence for brevity (the values of $H$ and $M$ will always be clear from the context).

Finally, we set

$$ \begin{equation} \mathbf{AP}'(KH_q; q,n)=\{n+qh\colon 1\leqslant h\leqslant KH_q\}\cap \mathbf S_1' \end{equation} \tag{3.10} $$

and

$$ \begin{equation} \mathbf{AP}''(KH_q; q,n)=\{n+qh\colon 1\leqslant h\leqslant KH_q\}\cap \mathbf S_1'', \end{equation} \tag{3.11} $$

where $K$ is a positive integer which will be chosen large enough, and, for $q\equiv1\pmod4$,

$$ \begin{equation} \boldsymbol{\lambda}(H_q; q,n)= \begin{cases} \sigma_2^{-|\mathbf{AP}'(KH_q;q,n)|} &\text{if } \mathbf{AP}'(KH_q; q,n) \subset \mathbf S'_2, \\ 0 &\text{otherwise}, \end{cases} \end{equation} \tag{3.12} $$

while for $q\equiv3\pmod4$,

$$ \begin{equation} \boldsymbol{\lambda}(H_q; q,n)= \begin{cases} \sigma_2^{-|\mathbf{AP}''(KH_q;q,n)|} &\text{if } \mathbf{AP}''(KH_q; q,n) \subset \mathbf S''_2, \\ 0 &\text{otherwise}. \end{cases} \end{equation} \tag{3.13} $$

So for each $q\in\mathcal{Q}$, the weights $\boldsymbol{\lambda}(H_q;q,n)$ are random functions which depend on $\mathbf{b}$.

Now we give a brief outline of the proof of (2.2). As in [16], there are three main steps.

1. The uniform random stage. We choose $\mathbf{b}$ modulo $P(z)$ uniformly at random; this is equivalent to choosing $b \pmod p$ randomly with uniform probability, independently for each $p\leqslant z$. Then, first of all, we can easily guarantee that both sets $\mathbf{S}'\cap[-y,y]$ and $\mathbf{S}''\cap[-y,y]$ have size of about $2\sigma y$ (see Remark 5.1 below). We also show that with high probability the sets $\mathbf{S}_1'$, $\mathbf{S}_2'$, $\mathbf{S}_1''$ and $\mathbf{S}_2''$ behave as we need them to for all scales $H\in\mathfrak{H}$.

2. The greedy stage. Having chosen an appropriate $b\pmod{P(z)}$ we continue sieving out the sets $S'\cap[-y,y]$ and $S''\cap[-y,y]$. They have a small intersection, so we must work with both of them separately, by using disjoint subsets of ‘large’ (lying between $z$ and $x/2$) primes $\{q\in\mathcal{Q}\colon q\equiv1\pmod4\}$ and $\{q\in\mathcal{Q}\colon q\equiv3\pmod4\}$ of density $1/2$, respectively. To establish (2.2), we select $b$ modulo $P(z,x/2)$ randomly, but depending on the choice of $b$ modulo $P(z)$. Slightly more precisely, for each prime $q\in(z,x/2]$ such that $q\equiv1\pmod4$ we select $b\equiv b_q \pmod q$ so that $\{b_q+kq\colon k\in\mathbb{Z}\}$ knocks out nearly as many elements of $(S_z-b) \cap [-y,y]$ as possible; we do the same for the primes $q\in(z,x/2]$ such that $q\equiv3\pmod4$, to sieve out almost all of $(S_z-N+b)\cap[-y,y]$. This can be done using the so-called hypergraph covering theorem (Lemma 4.1 below).

3. The clean-up stage. Finally, as we saw in § 2, one can use the remaining primes in $(x/2,x]$ to ‘kill’ all numbers in both $S'\cap[-y,y]$ and $S''\cap[-y,y]$ that survived after the greedy stage, and this actually completes the proof of (2.2).

We refer the interested reader to [16] for a more detailed discussion of the method.

In § 4 we reduce inequality (2.2) to Theorem 4.1, which, in turn, is reduced to Theorem 5.1 in § 5. The final section (§ 6) is devoted to the proof of Theorem 5.1.

§ 4. Greedy sieving using hypergraph covering

Recall that $\mathbf{S}'$ and $\mathbf{S}''$ are the random sets $(S_z-\mathbf{b})$ and $(S_z-N+\mathbf{b})$, respectively, where $\mathbf{b}$ is chosen uniformly at random from $\mathbb{Z}/P\mathbb{Z}$. As mentioned above, by $S'$ and $S''$ we denote their realizations (with respect to some choice of $b$); the same is applied to the random weights $\boldsymbol{\lambda}$.

Theorem 4.1. Fix $\delta$ satisfying (2.3) and suppose that $M-6$, $1/K$ and $\xi-1$ are sufficiently small depending on $\delta$, and that $x$ is sufficiently large depending on $\delta$, $M$, $K$ and $\xi$. Then for any positive $\varepsilon<(M-6)/6$ there exist $b \pmod{P(z)}$ and sets $\mathcal{Q}'\subseteq\{q\in \mathcal{Q}\colon q\equiv1\pmod4\}$ and $\mathcal{Q}''\subseteq\{q\in \mathcal{Q}\colon q\equiv3\pmod4\}$ such that

(i) the inequality

$$ \begin{equation} \bigl|(S'\cup S'')\cap[-y,y]\bigr| \leqslant 9\sigma y \end{equation} \tag{4.1} $$

holds;

(ii) for each $q\in \mathcal{Q}'\cup \mathcal{Q}''$

$$ \begin{equation} \sum_{-(K+1)y<n\leqslant y}\lambda(H_q;q,n)=\biggl(1+O\biggl(\frac{1}{(\log x)^{\delta(1+\varepsilon)}}\biggr)\biggr)(K+2)y; \end{equation} \tag{4.2} $$

(iii) for all but at most $x/(10\log x)$ elements $n$ of $S'\cap[-y,y]$

$$ \begin{equation} \sum_{q\in \mathcal Q'}\sum_{h\leqslant KH_q}\lambda(H_q;q,n-qh)= \biggl(C_2'+O\biggl(\frac{1}{(\log x)^{\delta(1+\varepsilon)}}\biggr)\biggr)(K+2)y, \end{equation} \tag{4.3} $$

and for all but at most $x/(10\log x)$ elements $n$ of $S''\cap[-y,y]$

$$ \begin{equation} \sum_{q\in \mathcal Q''}\sum_{h\leqslant KH_q}\lambda(H_q;q,n-qh)= \biggl(C_2''+O\biggl(\frac{1}{(\log x)^{\delta(1+\varepsilon)}}\biggr)\biggr)(K+2)y, \end{equation} \tag{4.4} $$

where $C_2'$ and $C_2''$ are some quantities independent of $n$ and such that

$$ \begin{equation} 10^{2\delta} \leqslant C_2',C_2'' \leqslant 100. \end{equation} \tag{4.5} $$

Theorem 4.1 can be considered as a preparation to the greedy stage of sieving out the sets $S'$ and $S''$ by using large primes in $(z,x/2)$. After fixing an appropriate $b\pmod{P(z)}$ and obtaining disjoint sets $\mathcal{Q}'$ and $\mathcal{Q}''$ to work with $S'$ and $S''$, respectively, we will be in a position to apply the following lemma (which is Lemma 3.1 of [16]) to deduce Theorem 1.1 from Theorem 4.1.

Lemma 4.1 (on hypergraph covering). Suppose that $0<\delta\leqslant1/2$ and $K\geqslant1$, let $y\geqslant y_0(\delta,K)$ where $y_0(\delta,K)$ ia sufficiently large, and let $V$ be a finite set such that $|V|\leqslant y$. Let $1\leqslant s\leqslant y$, and suppose that $\mathbf{e}_1,\dots,\mathbf{e}_s$ are random subsets of $V$ satisfying the following:

$$ \begin{equation} |\mathbf e_i|\leqslant \frac{K(\log y)^{1/2}}{\log\log y}, \qquad 1\leqslant i\leqslant s, \end{equation} \tag{4.6} $$

$$ \begin{equation} \mathbb P(v\in \mathbf e_i) \leqslant y^{-1/2-1/100}, \qquad v\in V, \quad 1\leqslant i\leqslant s, \end{equation} \tag{4.7} $$

$$ \begin{equation} \sum_{i=1}^s\mathbb P(v,v'\in \mathbf e_i) \leqslant y^{-1/2}, \qquad v,v'\in V, \quad v\neq v', \end{equation} \tag{4.8} $$

and

$$ \begin{equation} \biggl|\sum_{i=1}^s\mathbb P(v\in \mathbf e_i)-C_2\biggr| \leqslant \eta, \qquad v\in V, \end{equation} \tag{4.9} $$

where $C_2$ and $\eta$ satisfy

$$ \begin{equation} 10^{2\delta} \leqslant C_2 \leqslant 100 \quad\textit{and}\quad \eta \geqslant \frac{1}{(\log y)^{\delta}\log\log y} . \end{equation} \tag{4.10} $$

Then there are subsets $e_i$ of $V$, $1\leqslant i\leqslant s$, such that $e_i$ lies in the support of $\mathbf{e}_i$ for every $i$ and

$$ \begin{equation} \biggl|V\setminus \bigcup_{i=1}^se_i\biggr| \leqslant C_3\eta|V|, \end{equation} \tag{4.11} $$

where $C_3$ is an absolute constant.

Deduction of Theorem 1.1 from Theorem 4.1. Let $b$, $\mathcal{Q}'$ and $\mathcal{Q}''$ be from Theorem 4.1. We use the hypergraph covering lemma (Lemma 4.1). Let

$$ \begin{equation*} V'=\{n\in S'\cap[-y,y]\colon (4.3) \text{ holds}\} \end{equation*} \notag $$

and

$$ \begin{equation*} V''=\{n\in S''\cap[-y,y]\colon (4.4) \text{ holds}\}. \end{equation*} \notag $$

For each $q\in \mathcal{Q}'\cup \mathcal{Q}''$ we define a random integer $\mathbf{n}_q$ by setting

$$ \begin{equation} \mathbb P(\mathbf n_q=n)=\frac{\lambda(H_q;q,n)}{\sum_{-(K+1)y<m\leqslant y}\lambda(H_q; q,m)}. \end{equation} \tag{4.12} $$

Note that by (4.2) the denominator is nonzero, so it is a well-defined probability distribution. Now, for $q\in \mathcal{Q}'$ let

$$ \begin{equation*} \mathbf e'_q:=V' \cap \{\mathbf n_q+hq\colon 1\leqslant h\leqslant KH_q\} \end{equation*} \notag $$

and for $q\in \mathcal{Q}''$ let

$$ \begin{equation*} \mathbf e''_q:=V'' \cap \{\mathbf n_q+hq\colon 1\leqslant h\leqslant KH_q\}. \end{equation*} \notag $$

We aim to show that there are choices $n_q$ of $\mathbf{n}_q$ such that the corresponding sets $e_q'$ and $e_q''$ obey the inequalities

$$ \begin{equation} \biggl|V'\setminus \bigcup_{q\in \mathcal Q'}e_q'\biggr| \leqslant \frac{x}{10\log x} \end{equation} \tag{4.13} $$

and

$$ \begin{equation} \biggl|V''\setminus \bigcup_{q\in \mathcal Q''}e_q''\biggr| \leqslant \frac{x}{10\log x}. \end{equation} \tag{4.14} $$

Once this is done, we can set $b\equiv -n_q \pmod{q}$ for $q\in \mathcal{Q}'$ and $b\equiv n_q+N \pmod{q}$ for $q\in Q''$ and make an arbitrary choice of $b\pmod{q}$ for $q\in(z,x/2]\setminus (\mathcal{Q}'\cup \mathcal{Q}'')$. Then, since for each $q\in \mathcal{Q}'$

$$ \begin{equation*} e_q' \subset \{n\in \mathbb{Z}\colon n\equiv n_q \ (\operatorname{mod}q) \} \end{equation*} \notag $$

and for each $q\in \mathcal{Q}''$

$$ \begin{equation*} e_q'' \subset \{n\in \mathbb{Z}\colon n\equiv n_q \ (\operatorname{mod}q) \}, \end{equation*} \notag $$

we obtain

$$ \begin{equation*} \begin{aligned} \, |(S_{x/2}-b)\cap[-y,y]| &\leqslant |(S'\cap[-y,y])\setminus V'| +\biggl|V'\setminus \bigcup_{q\in \mathcal Q'}e_q'\biggr| \\ &\leqslant \frac{x}{10\log x}+\frac{x}{10\log x}=\frac{x}{5\log x} \end{aligned} \end{equation*} \notag $$

and, similarly,

$$ \begin{equation*} \begin{aligned} \, |(S_{x/2}-N+b)\cap[-y,y]| &\leqslant |(S''\cap[-y,y])\setminus V''| +\biggl|V''\setminus \bigcup_{q\in \mathcal Q''}e_q''\biggr| \\ &\leqslant \frac{x}{10\log x}+\frac{x}{10\log x}=\frac{x}{5\log x}, \end{aligned} \end{equation*} \notag $$

and (2.2) follows.

We apply the hypergraph covering lemma twice, to $V'$ and $V''$, to obtain (4.13) and (4.14), respectively. These two applications are quite similar, so we consider only the one concerned with $V'$. We take $s=|\mathcal{Q}'|$, $\{\mathbf{e}_1,\dots,\mathbf{e}_s\}=\{\mathbf{e}_q'\colon q\in \mathcal{Q}'\}$ and $C_2'$ from Theorem 4.1, and

$$ \begin{equation*} \eta=\frac{1}{100C_3(\log x)^{\delta}}. \end{equation*} \notag $$

Then using (4.1) we obtain

$$ \begin{equation*} C_3\eta|V'| \leqslant C_3\eta |S'\cap[-y,y]| \leqslant \frac{9\sigma y}{100(\log x)^{\delta}}\sim \frac{9e^{-\gamma}y}{100(\log x)^{\delta}\log z}\leqslant \frac{x}{10\log x} \end{equation*} \notag $$

for $x$ large enough, and (4.13) follows from (4.11). Thus it suffices to verify the conditions of the hypergraph covering lemma.

First, by (3.1),

$$ \begin{equation*} |\mathbf e'_{q}| \leqslant KH_q \leqslant \frac{Ky}{z} \leqslant \frac{K(\log x)^{1/2}}{\log\log x} \leqslant \frac{K(\log y)^{1/2}}{\log\log y}, \end{equation*} \notag $$

so (4.6) follows. Further, let $n\in V'$ and $q\in \mathcal{Q}'$. By (4.12), (4.2) and (3.1),

$$ \begin{equation*} \begin{aligned} \, \mathbb P(n\in \mathbf e_q') &= \sum_{h=1}^{\lfloor KH_q \rfloor}\mathbb P(\mathbf n_q=n-qh) \ll y^{-1}\sum_{h\leqslant KH_q}\lambda(H_q; q,n-qh) \\ &\ll y^{-1}H_q\sigma_2^{-KH_q} \leqslant y^{-9/10}, \end{aligned} \end{equation*} \notag $$

and (4.7) follows. From (4.3) we also have

$$ \begin{equation*} \begin{aligned} \, \sum_{q\in \mathcal Q'}\mathbb P(n\in \mathbf e_q') &= \sum_{q\in \mathcal Q'}\sum_{h\leqslant KH_q}\frac{\lambda(H_q; q,n-qh)}{\sum_{-(K+1)y<n'\leqslant y}\lambda(H_q;q,n')} \\ &=C_2'+O\bigl((\log x)^{-\delta(1+\varepsilon)}\bigr), \end{aligned} \end{equation*} \notag $$

which confirms (4.9). Now we turn to (4.8). If both $v$ and $v'$ lie in $\mathbf{e}_q$, then $q$ divides $|v-v'|$, which is at most $2y$. But $q>z>\sqrt{2y}$, thus there can be at most one such $q$ for any fixed $v\neq v'$. Therefore, (4.8) follows from (4.7).

This completes the proof of (2.2), and thus Theorem 1.1 follows from Theorem 4.1.

§ 5. The third reduction

In this section we deduce Theorem 4.1 from the following theorem.

Theorem 5.1. Let $M\geqslant2$. Then

(i) the inequality

$$ \begin{equation} \mathbb{E}|\mathbf S'\cap[-y,y]|=\mathbb{E}|\mathbf S''\cap[-y,y]|=\sigma(2y+1) \end{equation} \tag{5.1} $$

holds;

(ii) for every $H\in \mathfrak{H}$, every $j\in\{0,1,2\}$ and $i\in\{1,3\}$,

$$ \begin{equation} \mathbb{E}\sum_{q\in \mathcal Q_{H,i}}\biggl(\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H;q,n)\biggr)^j = \biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)((K+2)y)^j|\mathcal Q_{H,i}|; \end{equation} \tag{5.2} $$

(iii) for every $H\in \mathfrak{H}$ and $j\in\{0,1,2\}$,

$$ \begin{equation} \begin{aligned} \, \notag &\mathbb{E}\sum_{n\in \mathbf S'\cap[-y,y]}\biggl(\sum_{q\in \mathcal Q_{H,1}}\sum_{h\leqslant KH}\boldsymbol{\lambda}(H;q,n-qh)\biggr)^j \\ &\qquad = \biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)\biggl(\frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor }{\sigma_2}\biggr)^j\sigma (2y+1), \end{aligned} \end{equation} \tag{5.3} $$

and

$$ \begin{equation} \begin{aligned} \, \notag &\mathbb{E}\sum_{n\in \mathbf S''\cap[-y,y]}\biggl(\sum_{q\in \mathcal Q_{H,3} }\sum_{h\leqslant KH}\boldsymbol{\lambda}(H;q,n-qh)\biggr)^j \\ &\qquad= \biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)\biggl(\frac{|\mathcal Q_{H,3}|\cdot\lfloor KH\rfloor}{\sigma_2}\biggr)^j\sigma (2y+1). \end{aligned} \end{equation} \tag{5.4} $$

We recall that in Theorem 5.1 the random variables $\mathbf{S}'$, $\mathbf{S}''$ and $\boldsymbol{\lambda}$ are defined in terms of the random variable $\mathbf{b}$, chosen uniformly in $\mathbb{Z}/P\mathbb{Z}$, rather than by means of the random variables $\mathbf{n}_q$ we used in § 4.

Remark 5.1. It can be shown that

$$ \begin{equation*} \mathbb{E}|\mathbf S'\cap[-y,y]|^2=\mathbb{E}|\mathbf S''\cap[-y,y]|^2=\biggl(1+O\biggl(\frac{1}{\log y}\biggr)\biggr)\sigma(2y+1) \end{equation*} \notag $$

(this is actually relation (4.2) in [16]). This implies that both the sets $S'\cap[-y,y]$ and $S''\cap[-y,y]$ have size $(2+o(1))\sigma y$ with probability $1-o(1)$, and thus, in fact, almost all the $\mathbf{b} \pmod{P(z)}$ are good for Theorem 4.1. However, we have decided to make the proof slightly shorter and leave out the proof of the above relation. Thus we use only the first moment in (5.1) and show that (at least) half the choices of $\mathbf{b} \pmod{P(z)}$ are good for our purpose.

Deduction of Theorem 4.1 from Theorem 5.1. First we show that (4.1) holds with probability at least $1/2$. From (5.1) we see that

$$ \begin{equation*} \mathbb{E}\bigl|(\mathbf S'\cup\mathbf S'')\cap[-y,y]\bigr|\leqslant \mathbb{E}|\mathbf S'\cap[-y,y]|+ \mathbb{E}|\mathbf S''\cap[-y,y]|=2\sigma(2y+1), \end{equation*} \notag $$

and thus, by Markov’s inequality

$$ \begin{equation*} \mathbb P\bigl(|(\mathbf S'\cup\mathbf S'')\cap[-y,y]| \geqslant 4\sigma(2y+1)\bigr) \leqslant \frac12. \end{equation*} \notag $$

Hence, we have

$$ \begin{equation} |(\mathbf S'\cup \mathbf S'')\cap[-y,y]| \leqslant 9\sigma y \end{equation} \tag{5.5} $$

with probability at least $1/2$.

Now we work on parts (ii) and (iii) of Theorem 4.1. Fix $H\in\mathfrak{H}$. From (5.2) we obtain

$$ \begin{equation} \mathbb{E}\sum_{q\in \mathcal Q_{H,1}}\biggl(\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H;q,n) - (K+2)y\biggr)^2 \ll \frac{y^2|\mathcal Q_{H,1}|}{H^{M-2}}. \end{equation} \tag{5.6} $$

Now let $\boldsymbol{\mathcal Q}_H'$ be the (random) set of $q\in \mathcal{Q}_{H,1}$ for which

$$ \begin{equation} \biggl|\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H;q,n) - (K+2)y\biggr| \leqslant \frac{y}{H^{1+\varepsilon}}. \end{equation} \tag{5.7} $$

By estimating the left-hand side of (5.6) from below by the sum over $q\in \mathcal{Q}_{H,1}\setminus\boldsymbol{\mathcal Q}'_H$, we find that

$$ \begin{equation} \mathbb{E}|\mathcal Q_{H,1}\setminus\boldsymbol{\mathcal Q}'_H| \ll \frac{|\mathcal Q_{H,1}|}{H^{M-4-2\varepsilon}}. \end{equation} \tag{5.8} $$

Now we set

$$ \begin{equation*} \boldsymbol{\mathcal Q}'=\bigcup_{H\in\mathfrak H}\boldsymbol{\mathcal Q}_H' \subseteq \{q\in\mathcal Q\colon q\equiv1\ (\operatorname{mod}4) \}. \end{equation*} \notag $$

In a quite similar way we define the random set

$$ \begin{equation*} \boldsymbol{\mathcal Q}''=\bigcup_{H\in\mathfrak H}\boldsymbol{\mathcal Q}_H'' \subseteq \{q\in\mathcal Q\colon q\equiv3\ (\operatorname{mod}4) \}, \end{equation*} \notag $$

where for each $H\in\mathfrak{H}$ we denote by $\boldsymbol{\mathcal Q}_H''$ the random set of $q\in \mathcal{Q}_{H,3}$ for which

$$ \begin{equation} \biggl|\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H;q,n) - (K+2)y\biggr| \leqslant \frac{y}{H^{1+\varepsilon}}; \end{equation} \tag{5.9} $$

again, we have

$$ \begin{equation} \mathbb{E}|\mathcal Q_{H,3}\setminus\boldsymbol{\mathcal Q}'_H| \ll \frac{|\mathcal Q_{H,3}|}{H^{M-4-2\varepsilon}} \end{equation} \tag{5.10} $$

for all $H\in\mathfrak{H}$.

Now we turn to condition (iii) in Theorem 4.1. Fix $H$. Similarly to (5.6), from (5.3) we have

$$ \begin{equation} \begin{aligned} \, \notag &\mathbb{E}\sum_{n\in \mathbf S'\cap[-y,y]}\biggl(\sum_{q\in \mathcal Q_{H,1}}\sum_{h\leqslant KH}\boldsymbol{\lambda}(H;q,n-qh)-\frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor}{\sigma_2}\biggr)^2 \\ &\qquad \ll \frac{1}{H^{M-2}}\biggl(\frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor}{\sigma_2}\biggr)^2\sigma y. \end{aligned} \end{equation} \tag{5.11} $$

Let $\boldsymbol{\mathcal E}'_H$ be the set of $n\in \mathbf{S}'\cap[-y,y]$ such that

$$ \begin{equation} \biggl|\sum_{q\in \mathcal Q_{H,1}}\sum_{h\leqslant KH}\boldsymbol{\lambda}(H;q,n-qh)-\frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor}{\sigma_2}\biggr| \geqslant \frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor}{\sigma_2H^{1+\varepsilon}}. \end{equation} \tag{5.12} $$

Then, since $M>6$ and $\varepsilon$ is small, (5.11) implies that

$$ \begin{equation*} \mathbb{E}|\boldsymbol{\mathcal E}'_H| \ll \frac{\sigma y}{H^{1+2\varepsilon}}, \end{equation*} \notag $$

and so $|\boldsymbol{\mathcal E}'_H|\leqslant {\sigma y}/{H^{1+\varepsilon}}$ with probability $1-O(H^{-\varepsilon})$.

Now we estimate the contribution of the ‘bad’ primes $q\in \mathcal{Q}_{H,1}\setminus \boldsymbol{\mathcal Q}'_H$. For each $h\leqslant KH$, from the Cauchy–Schwarz inequality (for vector functions) we obtain

$$ \begin{equation*} \begin{aligned} \, &\mathbb{E}\sum_{q\in \mathcal Q_{H,1}\setminus\boldsymbol{\mathcal Q}'_H}\sum_{n\in \mathbf S'\cap[-y,y]}\boldsymbol{\lambda}(H;q,n-qh) \\ &\qquad \leqslant\bigl(\mathbb{E}|\mathcal Q_{H,1}\setminus\boldsymbol{\mathcal Q}'_H|\bigr)^{1/2}\biggl(\mathbb{E}\sum_{\mathcal Q_{H,1}\setminus\boldsymbol{\mathcal Q}'_H}\biggl|\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H;q,n)\biggr|^2\biggr)^{1/2}, \end{aligned} \end{equation*} \notag $$

where we have extended the summation of the $\boldsymbol{\lambda}(H;q,\cdot\,)$ to the larger interval $(-(K+1)y,y]$ (note that the weights $\boldsymbol{\lambda}(H;q,\cdot\,)$ are nonnegative). Further, by the triangle inequality, (5.6) and (5.8)

$$ \begin{equation*} \begin{aligned} \, &\mathbb{E}\sum_{\mathcal Q_{H,1}\setminus\boldsymbol{\mathcal Q}'_H}\biggl|\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H;q,n)\biggr|^2 \\ &\qquad \leqslant2\mathbb{E}\sum_{\mathcal Q_{H,1}\setminus\boldsymbol{\mathcal Q}'_H}\biggl(\biggl|\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H;q,n)-(K+2)y\biggr|^2+(K+2)^2y^2\biggr) \\ &\qquad\ll \frac{y^2|\mathcal Q_{H,1}|}{H^{M-4-2\varepsilon}}. \end{aligned} \end{equation*} \notag $$

Combining the two last estimates (and using (5.8) again), after summing over all $h\leqslant KH$ we obtain

$$ \begin{equation*} \mathbb{E}\sum_{n\in \mathbf S'\cap[-y,y]}\sum_{q\in \mathcal Q_{H,1}\setminus \boldsymbol{\mathcal Q}'_H}\sum_{h\leqslant KH} \boldsymbol{\lambda}(H;q,n-qh) \ll \frac{y|\mathcal Q_{H,1}|}{H^{M-5-2\varepsilon}}. \end{equation*} \notag $$

Let $\boldsymbol{\mathcal F}'_H$ be the set of $n\in\mathbf{S}'\cap[-y,y]$ such that

$$ \begin{equation} \sum_{q\in \mathcal Q_{H,1}\setminus \boldsymbol{\mathcal Q}'_H}\sum_{h\leqslant KH} \boldsymbol{\lambda}(H;q,n-qh) \geqslant \frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor}{\sigma_2H^{1+\varepsilon}}. \end{equation} \tag{5.13} $$

Then

$$ \begin{equation*} \mathbb{E}|\boldsymbol{\mathcal F}'_H| \ll \frac{\sigma_2y}{H^{M-5-3\varepsilon}} \ll \frac{\sigma y \log H}{H^{M-5-3\varepsilon}}, \end{equation*} \notag $$

and by Markov’s inequality

$$ \begin{equation*} |\boldsymbol{\mathcal F}'_H| \leqslant \frac{\sigma y}{H^{1+\varepsilon}} \end{equation*} \notag $$

with probability $1-O(H^{-(M-6-5\varepsilon)})$. Since $\varepsilon<(M-6)/6$, we have $M-6-5\varepsilon>\varepsilon$, and the last probability becomes $1-O(H^{-\varepsilon})$.

Analogously, using (5.4), we can define the set $\boldsymbol{\mathcal E}''_H$ of $n\in \mathbf{S}''\cap[-y,y]$ such that

$$ \begin{equation} \biggl|\sum_{q\in \mathcal Q_{H,3}}\sum_{h\leqslant KH}\boldsymbol{\lambda}(H;q,n-qh)-\frac{|\mathcal Q_{H,3}|\cdot\lfloor KH\rfloor}{\sigma_2}\biggr| \geqslant \frac{|\mathcal Q_{H,3}|\cdot\lfloor KH\rfloor}{\sigma_2H^{1+\varepsilon}}, \end{equation} \tag{5.14} $$

and the set $\boldsymbol{\mathcal F}''_H$ of $n\in \mathbf{S}''\cap[-y,y]$ such that

$$ \begin{equation} \sum_{q\in \mathcal Q_{H,3}\setminus \boldsymbol{\mathcal Q}''_H}\sum_{h\leqslant KH} \boldsymbol{\lambda}(H;q,n-qh) \geqslant \frac{|\mathcal Q_{H,3}|\cdot\lfloor KH\rfloor}{\sigma_2H^{1+\varepsilon}}; \end{equation} \tag{5.15} $$

then we have

$$ \begin{equation*} |\boldsymbol{\mathcal E}''_H|, |\boldsymbol{\mathcal F}''_H| \leqslant \frac{\sigma y}{H^{1+\varepsilon}} \end{equation*} \notag $$

with probability $1-O(H^{-\varepsilon})$. Since $\sum_{H\in\mathfrak{H}}H^{-\varepsilon}\ll (\log x)^{-\delta\varepsilon}$, we see that the probability of the event that there exists $H\in\mathfrak{H}$ such that at least one of the sets $\boldsymbol{\mathcal E}'_H$, $\boldsymbol{\mathcal F}'_H$, $\boldsymbol{\mathcal E}''_H$ or $\boldsymbol{\mathcal F}''_H$ has size greater than $(\sigma y)H^{-1-\varepsilon}$ is $o(1)$.

Now we are ready to make a choice of $\mathbf{b} \pmod{P(z)}$. We consider the event that (5.5) holds and that for each $H\in\mathfrak{H}$ all the four sets $\boldsymbol{\mathcal E}'_H$, $\boldsymbol{\mathcal F}'_H$, $\boldsymbol{\mathcal E}''_H$ and $\boldsymbol{\mathcal F}''_H$ have size at most $(\sigma y)H^{-1-\varepsilon}$. By the above discussion this event holds with probability at least $1/2-o(1)$). From now on, we fix $\mathbf{b}\pmod{P(z)}$ such that this event holds, and thus all of our random sets and weights become deterministic.

Let

$$ \begin{equation*} \mathcal{N}'=S'\cap[-y,y]\setminus\bigcup_{H\in\mathfrak H}(\mathcal E'_H\cup\mathcal F'_H) \end{equation*} \notag $$

and

$$ \begin{equation*} \mathcal{N}''=S''\cap[-y,y]\setminus\bigcup_{H\in\mathfrak H}(\mathcal E''_H\cup\mathcal F''_H). \end{equation*} \notag $$

We verify (4.3) for $n\in\mathcal{N}'$, and (4.4) will follow from our construction in an absolutely similar way. The number of exceptional elements satisfies

$$ \begin{equation*} \biggl|\bigcup_{H\in\mathfrak H}(\mathcal E'_H\cup\mathcal F'_H)\biggr| \leqslant \frac{\sigma y}{(\log x)^{(1+\varepsilon)\delta}}, \end{equation*} \notag $$

which is smaller than $x/(10\log x)$ for large $x$. We fix an arbitrary $n\in \mathcal{N}'$. For such $n$ the inequalities reverse to (5.12) and (5.13) hold, and therefore, for each $H\in\mathfrak{H}$,

$$ \begin{equation*} \sum_{q\in \mathcal Q_{H,1}}\sum_{h\leqslant KH}\lambda(H;q,n-qh)=\biggl(1+O\biggl(\frac{1}{(\log x)^{(1+\varepsilon)\delta}}\biggr)\biggr) \frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor}{\sigma_2} \end{equation*} \notag $$

by our choice of $M$. Summing over all $H\in\mathfrak{H}$, we have

$$ \begin{equation} \sum_{q\in \mathcal Q'}\sum_{h\leqslant KH_q}\lambda(H_q;q,n-qh)=\biggl(1+O\biggl(\frac{1}{(\log x)^{(1+\varepsilon)\delta}}\biggr)\biggr)C_2'(K+2)y, \end{equation} \tag{5.16} $$

where (recall that $\sigma_2=\sigma_2(H)$)

$$ \begin{equation*} C_2'=\frac{1}{(K+2)y}\sum_{H\in\mathfrak H} \frac{|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor}{\sigma_2}. \end{equation*} \notag $$

Note that $C_2'$ depends on $x$, $K$, $M$, $\xi$ and $\delta$ but not on $n$. Since

$$ \begin{equation*} \lfloor KH \rfloor=KH\biggl(1+O\biggl(\frac 1H\biggr)\biggr)=KH(1+O(\log x)^{-\delta}) \end{equation*} \notag $$

and

$$ \begin{equation*} \sigma_2^{-1}=\prod_{H^M<p\leqslant z}\biggl(1-\frac 1p\biggr)^{-1}\sim\frac{\log z}{M\log H}, \end{equation*} \notag $$

using (3.5) we obtain

$$ \begin{equation*} C_2'\sim \frac{K}{(K+2)y} \cdot \frac12\biggl(1-\frac1\xi\biggr) \sum_{H\in\mathfrak H}\frac{y/H}{\log x}\cdot\frac{H\log z}{M\log H} \sim \frac{K(1-1/\xi)}{2M(K+2)}\sum_{H\in\mathfrak H}\frac{1}{\log H} \end{equation*} \notag $$

as $x\to\infty$. Recalling the definition of $\mathfrak{H}$, we see that

$$ \begin{equation*} C_2'\sim \frac{K(1-1/\xi)}{2M(K+2)\log \xi}\sum_{j}\frac{1}{j}, \end{equation*} \notag $$

where $j$ ranges over the interval

$$ \begin{equation*} \frac{\delta\log\log x}{\log \xi} \leqslant j \leqslant \frac{(1/2+o(1))\log\log x}{\log \xi}. \end{equation*} \notag $$

Thus we obtain

$$ \begin{equation*} C_2'\sim \frac{K(1-1/\xi)}{2M(K+2)\log \xi}\log\frac{1}{2\delta}. \end{equation*} \notag $$

Recall condition (2.3) on $\delta$ and the fact that $K$ and $x$ are sufficiently large, $M$ is sufficiently close to $6$ and $\xi$ is close enough to $1$ (all in terms of $\delta$). We may also assume without loss of generality that $\delta$ is sufficiently close to $C(1/2)$, which gives $\log(1/(2\delta))<13\cdot10^{2\delta}$. Then we see that

$$ \begin{equation*} 10^{2\delta}\leqslant C_2'\leqslant 100. \end{equation*} \notag $$

In combination with (5.16), this implies (4.3). Arguing similarly, one can obtain (4.4) for $n\in \mathcal{N}''$.

Theorem 4.1 is proved.

It remains to establish Theorem 5.1. This is the aim of § 6 of the paper.

§ 6. Computing correlations

First we introduce some notation. For $H\in\mathfrak{H}$ let $\mathcal{D}_H$ be the set of square-free integers $d$ all of whose prime divisors lie in $(H^M,z]$. Further, for $A>0$ let

$$ \begin{equation} E_A(m;H)=\sum_{d\in \mathcal D_H\setminus\{1\}}\frac{A^{\omega(d)}}{d}1_{m \equiv 0\,(\operatorname{mod}d)}. \end{equation} \tag{6.1} $$

Note that $E_A(m;H)=E_A(-m;H)$.

We need the following two lemmas (see Lemmas 5.1 and 5.2 in [16]).

Lemma 6.1. Let $10<H<z^{1/M}$ and $1\leqslant l\leqslant 10KH$, and let $\mathcal{U}\subset \mathcal{V}$ be two finite sets such that $|\mathcal{V}|=l$. Then

$$ \begin{equation*} \mathbb P(\mathcal U\subset \mathbf S'_2)=\mathbb P(\mathcal U\subset \mathbf S''_2)=\sigma_2^{|\mathcal U|}\biggl(1+O\biggl(|\mathcal U|^2H^{-M}+l^{-2}\sum_{\substack{v,v'\in\mathcal V \\ v\neq v'}}E_{2l^2}(v-v';H)\biggr)\biggr). \end{equation*} \notag $$

Remark 6.1. Lemma 5.1 in [16] is in fact formulated for the probability $\mathbb{P}(\mathcal{U}\subset \mathbf{S}_2)$, where $\mathbf{S}_2=S_{H^M,z}+\mathbf{b}_2$. However, it does not make any difference for us since it is easy to see (by making the change of variables $\mathbf{b}_2\mapsto -\mathbf{b}_2$ or $\mathbf{b}_2\mapsto -N+\mathbf{b}_2$) that $\mathbb{P}(\mathcal{U}\subset \mathbf{S}_2)=\mathbb{P}(\mathcal{U}\subset \mathbf{S}'_2)=\mathbb{P}(\mathcal{U}\subset \mathbf{S}''_2)$. Note also that a parameter $B$ is involved in this lemma, since it is related to a $B$-bounded sieving system (recall the definition of a sieving system in § 1); nevertheless, in our case $B=1$ and thus there is no dependence on $B$.

Lemma 6.2. Let $10<H<z^{1/M}$, and let $(m_t)_{t\in T}$ be a finite sequence such that

$$ \begin{equation} \sum_{t\in T}1_{m_t\equiv a\, (\operatorname{mod}d)} \ll \frac{X}{\varphi(d)}+R \end{equation} \tag{6.2} $$

for some $X,R>0$ and all $d\in \mathcal{D}_H\setminus\{1\}$ and $a\in\mathbb{Z}/d\mathbb{Z}$. Then for any $A$ such that $0<A\leqslant H^M$ and any integer $j$

$$ \begin{equation*} \sum_{t\in T}E_A(m_t+j;H) \ll \frac{XA}{H^M}+R\exp(A\log\log y). \end{equation*} \notag $$

Now we are ready to prove Theorem 5.1.

Proof of Theorem 5.1, (i). First, we notice that by making a change of variables it is easy to see that

$$ \begin{equation*} \mathbb{E}|\mathbf S'\cap[-y,y]|=\mathbb{E}|\mathbf S''\cap[-y,y]|. \end{equation*} \notag $$

Thus, it suffices to prove the claim concerned with $\mathbf{S}'$. By the linearity of expectation we obtain

$$ \begin{equation*} \mathbb{E}|\mathbf S'\cap[-y,y]|=\sum_{-y\leqslant n\leqslant y}\mathbb P(n\in \mathbf S'). \end{equation*} \notag $$

Now, since $\mathbf{b}$ is taken uniformly from $\mathbb{Z}/P(z)\mathbb{Z}$, for each fixed $n$, from the Chinese Remainder Theorem we obtain

$$ \begin{equation*} \mathbb P(n\in \mathbf S')=\prod_{p\leqslant z}\mathbb P(\mathbf b\not\equiv -n \ (\operatorname{mod}p))=\sigma, \end{equation*} \notag $$

and (5.1) follows.

Proof of Theorem 5.1, (ii). We prove the claim for $i=1$ (the case $i=3$ can be handled similarly). Fix $H\in\mathfrak{H}$. The case $j=0$ is trivial, and we turn to $j=1$, which is

$$ \begin{equation} \mathbb{E}\sum_{q\in \mathcal Q_{H,1}}\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H; q,n)= \biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)(K+2)y|\mathcal Q_{H,1}|. \end{equation} \tag{6.3} $$

By (3.12) the left-hand side expands as

$$ \begin{equation*} \mathbb{E}\sum_{q\in\mathcal Q_{H,1}}\sum_{-(K+1)y<n\leqslant y}\frac{1_{\mathbf{AP}'(KH;q,n)\subset\mathbf S'_2}}{\sigma_2^{|\mathbf{AP}'(KH;q,n)|}}. \end{equation*} \notag $$

Recall that, according to the definitions (3.8) and (3.10), $\mathbf{b}_1$ and $\mathbf{b}_2$ are independent, and so are $\mathbf{AP}'(KH; q,n)=\{n+qh\colon 1\leqslant h\leqslant KH\}\cap \mathbf{S}'_1$ and $\mathbf{S}'_2$. Then the above expression equals

$$ \begin{equation*} \sum_{q\in\mathcal Q_{H,1}}\sum_{-(K+1)y<n\leqslant y}\, \sum_{b_1\, (\operatorname{mod}P_1)}\frac{\mathbb P(\mathbf b_1=b_1)}{\sigma_2^{|\operatorname{AP}'(KH;q,n)|}}\mathbb P(\operatorname{AP}'(KH; q,n)\subset\mathbf S'_2). \end{equation*} \notag $$

For fixed $q$, $n$ and $b_1$ we apply Lemma 6.1 to the (deterministic) sets $\mathcal{U}=\operatorname{AP}'(KH;q,n)$ and $\mathcal{V}=\{n+qh\colon 1\leqslant h\leqslant KH\}$ and find that the left-hand side of (6.3) is equal to

$$ \begin{equation*} \sum_{q\in \mathcal Q_{H,1}}\sum_{-(K+1)y<n\leqslant y}\biggl(1+O\biggl(H^{-(M-2)}+H^{-2}\sum_{\substack{1\leqslant h<h'\leqslant KH}}E_{2K^2H^2}(qh-qh';H) \biggr)\biggr). \end{equation*} \notag $$

Now it is enough to show that, for any $1\leqslant h< h'\leqslant KH$,

$$ \begin{equation*} \sum_{q\in \mathcal Q_{H,1}}E_{2K^2H^2}(qh-qh';H) \ll \frac{|\mathcal Q_{H,1}|}{H^{M-2}}. \end{equation*} \notag $$

For future reference, we show the more general bound

$$ \begin{equation} \sum_{q\in Q_{H,i}} E_{8K^2H^2}(qr+s;H) \ll \frac{|Q_{H,i}|}{H^{M-2}} \end{equation} \tag{6.4} $$

for each $i\in\{1,3\}$, $0<|r|\leqslant KH$, and any integer $s$. Note that $E_A(m;H)$ is an increasing function of $A$.

To show (6.4) we fix $r$ and $s$. For each $d\in\mathcal{D}_{H\setminus\{1\}}$ all prime divisors of $d$ are greater than $H^M>KH\leqslant |r|$, and so $r$ and $d$ are coprime. Hence the congruence $qr\equiv a\pmod{d}$ holds for at most one residue class $q\pmod d$. Therefore, for $d\leqslant y^{1/2}$, by the Brun–Titchmarch inequality (recall that $H\leqslant (\log y)^{1/2}$ by (3.4)) we have

$$ \begin{equation*} \#\{q\in \mathcal Q_{H,i}\colon qr\equiv a\ (\operatorname{mod}d)\} \ll \frac{y/H}{\varphi(d)\log(y/d)} \ll \frac{y/H}{\varphi(d)\log y}. \end{equation*} \notag $$

For $d>y^{1/2}$ we can forget that $q$ is restricted to be prime and trivially obtain

$$ \begin{equation*} \#\{q\in \mathcal Q_{H,i}\colon qr\equiv a\ (\operatorname{mod}d)\} \ll \frac{y/H}{d} + 1 \ll \frac{y^{1/2}}{H}. \end{equation*} \notag $$

So for each $d$ we have

$$ \begin{equation*} \#\{q\in \mathcal Q_{H,i}\colon qr\equiv a\ (\operatorname{mod}d)\} \ll \frac{y/H}{\varphi(d)\log y}+\frac{y^{1/2}}{H}. \end{equation*} \notag $$

Hence, by Lemma 6.2, using (3.4) again we obtain

$$ \begin{equation*} \sum_{q\in Q_{H,i}} E_{8K^2H^2}(qr+s;H) \ll \frac{y/H}{\log y}\frac{H^2}{H^M}+\frac{y^{1/2}}{H}\exp(O(H^2\log\log y)) \ll \frac{|\mathcal Q_{H,i}|}{H^{M-2}}, \end{equation*} \notag $$

since $|\mathcal Q_{H,i}|\asymp (y/H)/\log y$ for each $i\in\{1,3\}$. So (6.4) is proved, and thus the case $j=1$ of Theorem 5.1, (ii), follows.

Now we turn to the case $j=2$ of (ii), which is

$$ \begin{equation*} \mathbb{E}\sum_{q\in \mathcal Q_{H,1}}\biggl(\sum_{-(K+1)y<n\leqslant y}\boldsymbol{\lambda}(H; q,n)\biggr)^2= \biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)(K+2)^2y^2|\mathcal Q_{H,1}|. \end{equation*} \notag $$

The left-hand side is expanded as

$$ \begin{equation*} \mathbb{E}\sum_{q\in\mathcal Q_{H,1}}\sum_{-(K+1)y<n_1,n_2\leqslant y}\boldsymbol{\lambda}(H;q,n_1)\boldsymbol{\lambda}(H;q,n_2). \end{equation*} \notag $$

First we note that for each fixed $q$ the contribution of the pairs $(n_1,n_2)$ for which $|n_1-n_2|\leqslant KH$ is negligible: indeed, there are $O(yH)$ such pairs, and each of them contributes at most $\sigma_2^{-2KH}=y^{o(1)}$, so the total contribution of such pairs is $O(y^{1+o(1)}|\mathcal{Q}_{H,1}|)$. Thus we can restrict our attention to those pairs $(n_1,n_2)$ for which the sets $\{n_\nu+qh\colon 1\leqslant h\leqslant KH\}$, $\nu=1,2$, do not intersect; let us call these pairs good. Then it is enough to show that

$$ \begin{equation} \begin{aligned} \, \notag &\mathbb{E}\sum_{q\in\mathcal Q_{H,1}}\,\,\sum_{\substack{-(K+1)y<n_1,\,n_2\leqslant y\\(n_1,n_2) \text { is good}}}\frac{1_{\mathbf{AP}'(KH;q,n_1)\cup \mathbf{AP}'(KH;q,n_2)\subset\mathbf S'_2}}{\sigma_2^{|\mathbf{AP}'(KH;q,n_1)|+|\mathbf{AP}'(KH;q,n_2)|}} \\ &\qquad =\biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)(K+2)^2y^2|\mathcal Q_{H,1}|. \end{aligned} \end{equation} \tag{6.5} $$

Arguing as in the case $j=1$, for any realization $b_1$ of $\mathbf{b}_1$ and any good pair $(n_1,n_2)$, we can apply Lemma 6.1 to

$$ \begin{equation*} \mathcal U=\operatorname{AP}'(KH;q,n_1)\sqcup \operatorname{AP}'(KH;q,n_2) \end{equation*} \notag $$

and

$$ \begin{equation*} \mathcal V=\{n_1+qh\colon 1\leqslant h\leqslant KH\}\sqcup \{n_2+qh\colon 1\leqslant h\leqslant KH\}. \end{equation*} \notag $$

Then, since

$$ \begin{equation*} |\mathcal U|=|\operatorname{AP}'(KH;q,n_1)|+|\operatorname{AP}'(KH;q,n_2)| \end{equation*} \notag $$

and $|\mathcal{V}|= 2\lfloor KH\rfloor$, we see that the left-hand side of (6.5) equals

$$ \begin{equation*} \begin{aligned} \, & \sum_{q\in\mathcal Q_{H,1}}\,\,\sum_{\substack{-(K+1)y<n_1,\,n_2\leqslant y\\(n_1,n_2)\text{ is good}}} \biggl(1\,{+}\,O\biggl(\frac{1}{H^{M-2}}\,{+}\,H^{-2}\!\!\!\!\!\!\sum_{1\leqslant h,h'\leqslant KH}1_{h\neq h'}E_{8K^2H^2}(qh\,{-}\,qh';H) \\ &\qquad\qquad\qquad\qquad\qquad\quad +1_{n_1\neq n_2}E_{8K^2H^2}(n_1-n_2+qh-qh';H)\biggr)\biggr). \end{aligned} \end{equation*} \notag $$

Recalling that all but $O(yH)$ pairs $(n_1,n_2)$ are good, we obtain from here the main term $(K+2)^2y^2|\mathcal{Q}_{H,1}|$. We also obtain acceptable error terms by using (6.4), except for the summands with $h=h'$. To handle them, we note that for any fixed $n_2$, any positive integer $d$ and any $a\pmod{d}$,

$$ \begin{equation*} \#\{-(K+1)y<n_1\leqslant y\colon n_1-n_2 \equiv a\ (\operatorname{mod}d)\} \ll \frac{y}{d}+1. \end{equation*} \notag $$

Thus, from Lemma 6.2 we obtain

$$ \begin{equation*} \sum_{-(K+1)y<n_1, n_2\leqslant y}E_{8K^2H^2}(n_1-n_2;H) \ll \frac{y^2}{H^{M-2}}+y\exp(O(H^2\log\log y)) \ll \frac{y^2}{H^{M-2}} \end{equation*} \notag $$

by (3.4). Now (6.5) and the claim for $j=2$ follow.

Proof of Theorem 5.1, (iii). Fix $H$. We prove only (5.3), since (5.4) can be handled in an absolutely similar manner (the only difference between these cases is that the weights $\boldsymbol{\lambda}(H_q; q,n)$ are defined in terms of $\mathbf{S}'$ for $q\equiv1\pmod4$ and in terms of $\mathbf{S}''$ for $q\equiv3\pmod4$). The case $j=0$ follows from part (i) (that is, (5.1)), so we focus on the case $j=1$, which is

$$ \begin{equation*} \begin{aligned} \, &\mathbb{E}\sum_{n\in \mathbf S'\cap[-y,y]}\sum_{q\in \mathcal Q_{H,1}}\sum_{h\leqslant KH}\boldsymbol{\lambda}(H;q,n-qh) \\ &\qquad= \biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)|\mathcal Q_{H,1}|\cdot\lfloor KH\rfloor\sigma_1 (2y+1). \end{aligned} \end{equation*} \notag $$

It is enough to show that, for any $h\leqslant KH$,

$$ \begin{equation} \mathbb{E}\sum_{n\in \mathbf S'\cap[-y,y]}\sum_{q\in \mathcal Q_{H,1}}\boldsymbol{\lambda}(H;q,n-qh) = \biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)|\mathcal Q_{H,1}|\sigma_1 (2y+1). \end{equation} \tag{6.6} $$

According to (3.12), the left-hand side is equal to

$$ \begin{equation*} \mathbb{E}\sum_{n\in\mathbf S'\cap[-y,y]}\sum_{q\in \mathcal Q_{H,1}}\frac{1_{\mathbf{AP}'(KH;q,n-qh)\subset \mathbf S'_2}}{\sigma_2^{|\mathbf{AP}'(KH;q,n-qh)|}}. \end{equation*} \notag $$

By (3.9) the condition $n\in \mathbf{S}'\cap[-y,y]$ implies that $n\in\mathbf{S}'_1\cap[-y,y]$. On the other hand, if $n\in\mathbf{S}'_1$, then $n\in\mathbf{AP}'(KH;q,n-qh)$, and thus the condition $n\in\mathbf{S}'_2$ is covered by the condition $\operatorname{AP}'(KH;q,n-qh)\subset\mathbf{S}'_2$. So the left-hand side of (6.6) can be rewritten as

$$ \begin{equation*} \mathbb{E}\sum_{n\in\mathbf S'_1\cap[-y,y]}\sum_{q\in \mathcal Q_{H,1}}\frac{1_{\mathbf{AP}'(KH;q,n-qh)\subset \mathbf S'_2}}{\sigma_2^{|\mathbf{AP}'(KH;q,n-qh)|}}. \end{equation*} \notag $$

Recalling that $\mathbf{S}_2'$ is independent of $\mathbf{S}_1'$ and $\mathbf{AP}'(KH;q,n-qh)$ we can apply Lemma 6.1 as before and find that the left-hand side of (6.6) is

$$ \begin{equation*} \mathbb{E}\sum_{n\in\mathbf S'_1\cap[-y,y]}\sum_{q\in \mathcal Q_{H,1}}\biggl(1+O\biggl(\frac{1}{H^{M-2}}+H^{-2}\sum_{\substack{h',h''\leqslant KH\\h'\neq h''}}E_{2K^2H^2}(qh'-qh'')\biggr)\biggr). \end{equation*} \notag $$

Now, since

$$ \begin{equation} \mathbb{E}|\mathbf S'_1\cap[-y,y]|=\sigma_1(2y+1), \end{equation} \tag{6.7} $$

we see that (6.6) follows from (6.4).

Now we turn to the case $j=2$ of (iii), which is

$$ \begin{equation*} \begin{aligned} \, & \sum_{h_1,h_2\leqslant KH}\mathbb{E}\sum_{n\in\mathbf S'\cap[-y,y]}\sum_{q_1,q_2\in\mathcal Q_{H,1}}\boldsymbol{\lambda}(H;q_1,n-q_1h_1)\boldsymbol{\lambda}(H;q_2,n-q_2h_2) \\ &\qquad =\biggl(1+O\biggl(\frac{1}{H^{M-2}}\biggr)\biggr)|\mathcal Q_{H,1}|^2\cdot\lfloor KH\rfloor^2\frac{\sigma_1}{\sigma_2}(2y+1). \end{aligned} \end{equation*} \notag $$

By (3.13) the left-hand side is

$$ \begin{equation} \sum_{h_1,h_2\leqslant KH}\mathbb{E}\sum_{n\in\mathbf S'\cap[-y,y]}\sum_{q_1,q_2\in \mathcal Q_{H,1}}\frac{1_{\mathbf{AP}'(KH;q_1,n-q_1h_1)\cup \mathbf{AP}'(KH;q_2,n-q_2h_2)\subset \mathbf S'_2}}{\sigma_2^{|\mathbf{AP}'(KH;q_1,n-q_1h_1)| +|\mathbf{AP}'(KH;q_2,n-q_2h_2)|}}. \end{equation} \tag{6.8} $$

Note that by (3.4) and (6.7) the contribution from $q_1=q_2$ is at most

$$ \begin{equation*} H^2\sigma_2^{-2KH}|\mathcal Q_{H,1}|\sigma_1y\leqslant |\mathcal Q_{H,1}|^2y^{o(1)}, \end{equation*} \notag $$

which is an acceptable error term. Now, if $q_1\neq q_2$, then the set

$$ \begin{equation*} \mathbf{AP}'(KH;q_1,n-q_1h_1)\cup \mathbf{AP}'(KH;q_2,n-q_2h_2) \end{equation*} \notag $$

has size $|\mathbf{AP}'(KH;q_1,n-q_1h_1)|+|\mathbf{AP}'(KH;q_2,n-q_2h_2)|-1$, since $n$ is the only common element of these two progressions (recall that $q_1$ and $q_2$ are prime numbers, $q_1, q_2\gg y/H$ and $h_1,h_2\ll H=y^{o(1)}$). As before, we can consider summation over $n\in\mathbf{S}'_1\cap[-y,y]$ in (6.8), and then use Lemma 6.1 to rewrite the terms with $q_1\neq q_2$ in (6.8) as

$$ \begin{equation*} \lfloor KH\rfloor^2\sigma_2^{-1}\mathbb{E}\!\!\sum_{n\in\mathbf S'_1\cap[-y,y]}\sum_{\substack{q_1,q_2\in \mathcal Q_{H,1}\\ q_1\neq q_2}}\!\!\biggl(1{\kern1pt}{+}\,O\biggl(\frac{1}{H^{M-2}}\,{+}\,\frac{E'(q_1)\,{+}\,E'(q_2)\,{+}\,E''(q_1,q_2)}{H^2}\biggr)\biggr), \end{equation*} \notag $$

where

$$ \begin{equation*} E'(q)=\sum_{\substack{h,h'\leqslant KH \\ h\neq h'}} E_{8K^2H^2}(qh-qh';H) \end{equation*} \notag $$

and

$$ \begin{equation*} E''(q_1,q_2)=\sum_{\substack{h_1',h_2'\leqslant KH \\ h_1'\neq h_1,h_2'\neq h_2}} E_{8K^2H^2}(q_1h_1'-q_1h_1-q_2h_2'+q_2h_2;H). \end{equation*} \notag $$

The contribution of $E'(q_1)+E'(q_2)$ is acceptably small as we already saw in the proof of the case $j=1$. Finally, we need to show that

$$ \begin{equation*} \sum_{q_1,q_2\in\mathcal Q_{H,1}}E_{8K^2H^2}(q_1h_1'-q_1h_1-q_2h_2'+q_2h_2;H) \ll \frac{|\mathcal Q_{H,1}|^2}{H^{M-2}} \end{equation*} \notag $$

for all $h_1'$ and $h_2'$ such that $h_1'\neq h_1$ and $h_2'\neq h_2$. Thus it follows from (6.4) as applied to the sum over $q_1$ for $r=h_1'-h_1$ and $s=-q_2h_2'+q_2h_2$ (after which we sum over all the $q_2$). This completes the proof of the case $j=2$, and Theorem 5.1 follows.



Bibliography

1.	R. A. Rankin, “The difference between consecutive prime numbers”, J. London Math. Soc., 13:4 (1938), 242–247
2.	E. Westzynthius, “Über die Verteilung der Zahlen, die zu den $n$ ersten Primzahlen teilerfremd sind”, Comment. Phys.-Math. Soc. Sci. Fenn., 5:25 (1931), 1–37
3.	P. Erdős, “On the difference of consecutive primes”, Quart. J. Math. Oxford Ser., 6 (1935), 124–128
4.	J. Pintz, “Very large gaps between consecutive primes”, J. Number Theory, 63:2 (1997), 286–301
5.	K. Ford. B. Green, S. Konyagin and T. Tao, “Large gaps between consecutive prime numbers”, Ann. of Math. (2), 183:3 (2016), 935–974
6.	J. Maynard, “Large gaps between primes”, Ann. of Math. (2), 183:3 (2016), 915–933
7.	P. Erdős, “Some of my favourite unsolved problems”, A tribute to Paul Erdős, Cambridge Univ. Press, Cambridge, 1990, 467–478
8.	K. Ford, B. Green, S. Konyagin, J. Maynard and T. Tao, “Long gaps between primes”, J. Amer. Math. Soc., 31:1 (2018), 65–105
9.	H. Cramér, “On the order of magnitude of the difference between consecutive prime numbers”, Acta Arith., 2 (1936), 23–46
10.	A. Granville, “Harald Cramér and the distribution of prime numbers”, Scand. Actuar. J., 1995:1 (1995), 12–28
11.	W. Banks, K. Ford and T. Tao, “Large prime gaps and probabilistic models”, Invent. Math., 233:3 (2023), 1471–1518
12.	R. C. Baker, G. Harman and J. Pintz, “The difference between consecutive primes. II”, Proc. London Math. Soc. (3), 83:3 (2001), 532–562
13.	K. Ford, D. R. Heath-Brown and S. Konyagin, “Large gaps between consecutive prime numbers containing perfect powers”, Analytic number theory, Springer, Cham, 2015, 83–92
14.	H. Maier and M. Th. Rassias, “Large gaps between consecutive prime numbers containing perfect $k$th powers of prime numbers”, J. Funct. Anal., 272:6 (2017), 2659–2696
15.	H. Maier and M. Th. Rassias, “Prime avoidance property of $k$th powers of prime numbers with Beatty sequence”, Discrete mathematics and applications, Springer Optim. Appl., 165, Springer, Cham, 2020, 397–404
16.	K. Ford, S. Konyagin, J. Maynard, C. B. Pomerance and T. Tao, “Long gaps in sieved sets”, J. Eur. Math. Soc. (JEMS), 23:2 (2021), 667–700
17.	K. Ford, S. Konyagin, J. Maynard, C. B. Pomerance and T. Tao, “Corrigendum: Long gaps in sieved sets”, J. Eur. Math. Soc. (JEMS), 25:6 (2023), 2483–2485

Citation: M. R. Gabdullin, A. O. Radomskii, “Prime avoiding numbers form a basis of order $2$”, Sb. Math., 215:5 (2024), 612–633

Citation in format AMSBIB

\Bibitem{GabRad24}

\by M.~R.~Gabdullin, A.~O.~Radomskii

\paper Prime avoiding numbers form a~basis of order~$2$

\jour Sb. Math.

\yr 2024

\vol 215

\issue 5

\pages 612--633

\mathnet{http://mi.mathnet.ru//eng/sm9980}

\crossref{https://doi.org/10.4213/sm9980e}

\mathscinet{http://mathscinet.ams.org/mathscinet-getitem?mr=4809223}

\zmath{https://zbmath.org/?q=an:07945687}

\adsnasa{https://adsabs.harvard.edu/cgi-bin/bib_query?2024SbMat.215..612G}

\isi{https://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcApp=Publons&SrcAuth=Publons_CEL&DestLinkType=FullRecord&DestApp=WOS_CPL&KeyUT=001312960500002}

\scopus{https://www.scopus.com/record/display.url?origin=inward&eid=2-s2.0-85204397133}

Linking options:

https://www.mathnet.ru/eng/sm9980

https://doi.org/10.4213/sm9980e

https://www.mathnet.ru/eng/sm/v215/i5/p47

Citing articles in Google Scholar: Russian citations, English citations
Related articles in Google Scholar: Russian articles, English articles

Что такое QR-код?

Registration to the website

Logotypes