Self-avoiding walk, spin systems and renormalization

The self-avoiding walk, and lattice spin systems such as the $|\varphi|^4$ model, are models of interest both in mathematics and in physics. Many of their important mathematical problems remain unsolved, particularly those involving critical exponents. We survey some of these problems, and report on recent advances in their mathematical understanding via a rigorous non-perturbative renormalization group method.


Introduction
The self-avoiding walk (SAW) is a combinatorial model of lattice paths without self-intersections. In addition to its intrinsic mathematical interest, it arises in polymer science as a model of linear polymers, and in statistical mechanics as a model that exhibits critical behaviour. The mathematical problems associated with the SAW are notoriously difficult and there remain longstanding unsolved problems that are central to the subject. A closely related model is the weakly self-avoiding walk (WSAW), which is predicted to exhibit the same critical behaviour as the SAW.
The critical behaviour of the SAW or WSAW is expressed in terms of critical exponents, which have a qualitative and quantitative relationship with the critical exponents in models of ferromagnetism including the Ising and $|\varphi|^4$ spin models. Within physics, the critical exponents are well understood, but they nevertheless present deep mathematical problems.
This article is a review of recent mathematical results about critical exponents for the WSAW and $|\varphi|^4$ models, with focus on the critical behaviour of the susceptibility. Some background on the SAW and Ising models is provided for motivation and context. The results we present involve a unified treatment of the WSAW and $|\varphi|^4$ models, via an exact relation between the WSAW and a 'zero-component' $|\varphi|^4$ model.

(a) Strictly self-avoiding walk

(i) Universality and scale invariance
An n-step SAW on the integer lattice $\mathbb{Z}^d$ is a map $\omega : \{0, 1, \ldots, n\} \to \mathbb{Z}^d$ such that the Euclidean distance between $\omega(i)$ and $\omega(i+1)$ equals 1 (nearest-neighbour steps), and such that $\omega(i) \neq \omega(j)$ for all $i \neq j$ (self-avoidance). Let $S_n$ denote the finite set of n-step SAWs with $\omega(0) = 0$ (the walk starts at the origin of $\mathbb{Z}^d$), and let $c_n$ be its cardinality. We declare each element of $S_n$ to have equal probability, which must therefore be $c_n^{-1}$. Random n-step SAWs on the square lattice $\mathbb{Z}^2$, with $n = 10^2$ and $n = 10^8$, are depicted in figure 1.
The $10^8$-step SAW in figure 1 would not be statistically distinguishable from a SAW on the hexagonal lattice, or on the triangular lattice, or indeed on any one of a wide variety of two-dimensional lattices. This feature is called universality. It is similar to the invariance principle for Brownian motion, which is the generalization of the central limit theorem that asserts that (ordinary) random walk with any finite-variance step distribution converges to Brownian motion. The search for a corresponding statement for SAW, i.e. the identification of a limiting probability law for SAW (a scaling limit), is one of the subject's big problems. A related and in general unproven feature is scale invariance: a $10^{10}$-step SAW, rescaled to the same size as the $10^8$-step SAW in figure 1, would be statistically indistinguishable from the $10^8$-step SAW. The scale invariance is quantified in terms of a universal critical exponent whose existence has not been proven in general.

(ii) The self-avoiding walk connective constant
Since $c_n c_m$ counts the number of ways that an n-step and an m-step SAW can be concatenated, with the two subwalks possibly intersecting each other, we have $c_{n+m} \le c_n c_m$. From this, it readily follows that there exists $\mu = \mu(d)$, the connective constant, such that $\lim_{n\to\infty} c_n^{1/n} = \mu$ and $c_n \ge \mu^n$ (e.g. [1]). Good numerical estimates and rigorous bounds on the connective constant are known, but the exact value for $\mathbb{Z}^d$ is not known for any $d \ge 2$. For SAWs defined instead on the hexagonal lattice, it has been proved that $\mu = \sqrt{2 + \sqrt{2}}$ [2]. As the dimension d goes to infinity, there is an asymptotic expansion $\mu \sim 2d - 1 + \sum_{n=1}^{\infty} a_n (2d)^{-n}$ with integer coefficients $a_n$ whose values are known up to and including $a_{11}$ [3]. The connective constant for SAWs in more general settings than $\mathbb{Z}^d$ is a topic of current research [4].
Our focus here is on the asymptotic behaviour of the ratio $c_n/\mu^n$, which, unlike the connective constant, is predicted to have a universal asymptotic behaviour.

(iii) Self-avoiding walk critical exponents
There is considerable evidence from numerical studies and from arguments from theoretical physics that there exists γ (depending on the dimension d) such that
$c_n \sim A \mu^n n^{\gamma - 1}$ $(n \to \infty)$. (2.1)
(The symbol ∼ denotes that the ratio of left-hand and right-hand sides has limit 1.) Since $c_n \ge \mu^n$, necessarily $\gamma \ge 1$. The susceptibility χ is the generating function of $c_n$:
$\chi(z) = \sum_{n=0}^{\infty} c_n z^n$ $(z \in \mathbb{C})$.
It has radius of convergence $z_c = \mu^{-1}$ since $c_n^{1/n} \to \mu$. In view of (2.1), χ can be expected to obey
$\chi(z) \sim A' (1 - z/z_c)^{-\gamma}$ $(z \uparrow z_c)$
for some constant $A'$.
Let $R_n^2$ be the average over $S_n$ of $\|\omega(n)\|_2^2$. Then $R_n$ is the root-mean-square displacement of n-step SAWs, which is a measure of the average end-to-end distance of an n-step SAW. There is again considerable evidence that there exists ν (depending on d) such that
$R_n \sim D n^{\nu}$ $(n \to \infty)$.
The exponent ν quantifies scale invariance.
This article is about critical exponents such as γ, ν for certain SAW and lattice spin models, with the emphasis on γ. There are other critical exponents that we do not discuss. The exponents are predicted to be universal, depending essentially only on the dimension of the lattice. For example, γ and ν should have the same values on the square lattice as on the hexagonal or triangular lattices, unlike the connective constant. A central problem in the subject is to prove the existence of the critical exponents and to show that they have the values listed in table 1.
The rational exponents for d = 2 in table 1 were computed by Nienhuis [5] using non-rigorous arguments based on spin systems like the ones we discuss later. An important breakthrough came with the identification of the stochastic process $\mathrm{SLE}_{8/3}$ (Schramm-Loewner Evolution with parameter 8/3) as the only plausible candidate for the scaling limit [6], which additionally provided an alternate explanation for the exponents 43/32 and 3/4. However, it remains an open problem to prove the existence of the critical exponents for d = 2, to prove that they have the rational values in table 1, and to prove that $\mathrm{SLE}_{8/3}$ truly is the scaling limit.
For d = 3, there is no currently known stochastic process to serve as a scaling limit for SAWs, and the best estimates for critical exponents come from numerical work. To compute the exponents, it is natural to attempt to enumerate SAWs for small n and then extrapolate. Fisher & Gaunt [7] found $c_n$ by hand for $n \le 11$, in all dimensions. More than half a century later, for d = 3 the enumeration has reached only n = 36 (table 2), which is insufficient for reliable high-precision estimation of the exponents. It is a challenging problem in enumerative combinatorics to produce a good algorithm to extend table 2 significantly. The Monte Carlo method known as the pivot algorithm gives more accurate estimates for critical exponents, and those appearing for d = 3 in table 1 are estimates using this method [8,9].
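To make the enumeration problem concrete, the following is a minimal sketch of brute-force counting by depth-first search. It is not the method of [7] nor that used for table 2; the function name `count_saws` and its interface are our own illustrative choices. It returns the counts $c_0, \ldots, c_n$ together with the mean-square displacements $R_n^2$, and its running time grows like $\mu^n$, which is why naive enumeration stalls quickly.

```python
def count_saws(n_max, d=2):
    """Count self-avoiding walks on Z^d starting at the origin.

    Returns (counts, msd), where counts[n] = c_n and msd[n] = R_n^2,
    the average of |omega(n)|^2 over all n-step SAWs, for 0 <= n <= n_max.
    """
    # The 2d unit steps +e_i and -e_i.
    steps = []
    for i in range(d):
        for sign in (1, -1):
            e = [0] * d
            e[i] = sign
            steps.append(tuple(e))

    counts = [0] * (n_max + 1)
    sq_sums = [0] * (n_max + 1)
    origin = (0,) * d
    visited = {origin}

    def extend(pos, n):
        # Every partial walk reached here is itself an n-step SAW.
        counts[n] += 1
        sq_sums[n] += sum(x * x for x in pos)
        if n == n_max:
            return
        for st in steps:
            nxt = tuple(p + q for p, q in zip(pos, st))
            if nxt not in visited:  # self-avoidance constraint
                visited.add(nxt)
                extend(nxt, n + 1)
                visited.remove(nxt)  # backtrack

    extend(origin, 0)
    return counts, [s / c for s, c in zip(sq_sums, counts)]


if __name__ == "__main__":
    counts, msd = count_saws(8, d=2)
    for n in range(1, len(counts)):
        # c_n^(1/n) decreases slowly towards the connective constant.
        print(n, counts[n], round(counts[n] ** (1.0 / n), 4), round(msd[n], 4))
```

On $\mathbb{Z}^2$ this gives $c_1, \ldots, c_6 = 4, 12, 36, 100, 284, 780$. The slow drift of $c_n^{1/n}$ at such small n illustrates why extrapolation from short enumerations gives only rough estimates.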
The exponents γ = 1 and ν = 1/2 for $d \ge 5$ in table 1 are the same as those of simple random walk. This is summarized by the statement that the upper critical dimension is 4. A heuristic argument for this is as follows: Brownian paths are two-dimensional, and since two two-dimensional objects generically do not intersect in dimensions d > 4, SAW should behave like simple random walk when d > 4. There is a full rigorous understanding of dimensions $d \ge 5$. The following theorem from [11] is an example of this.
Theorem 2.1. For $d \ge 5$, the scaling limit of SAW is Brownian motion, and γ = 1 and ν = 1/2 in the sense that as $n \to \infty$, $c_n \sim A\mu^n$ and $R_n \sim D n^{1/2}$.
Theorem 2.1 is proved using the lace expansion, which was originally introduced in [12] and has subsequently been extended to many other high-dimensional models including percolation [13,14].
For d = 4, the logarithms in table 1 reflect the prediction that the two asymptotic formulae in theorem 2.1 must be modified by an additional factor $(\log n)^{1/4}$ for $c_n$ and $(\log n)^{1/8}$ for $R_n$. We will return to such logarithmic factors later.
For d = 2, 3, 4, none of the entries in [...] Shortly thereafter, for $d \ge 3$, Kesten improved the $\sqrt{n}$ in the exponent to $n^{2/(d+2)} \log n$ [1,16]. For d = 2, the best improvements since 1962 are the replacement of B by o(1) [17], and a proof that the upper bound holds for infinitely many n when $B\sqrt{n}$ is replaced by $n^{0.4979}$ [18]. This is slow progress in over half a century.
For R n , the best results are in the following theorem. The lower bound was proved in [19] and the upper bound in [20].
The lower bound fails to prove that on average the endpoint of a SAW is at least as far away as it is for simple random walk, namely $n^{1/2}$, even though it appears obvious that the self-avoidance constraint must push the SAW farther than a walk without the constraint. The upper bound states that $R_n/n \to 0$ but there is no bound on the rate. In particular, it is not proved that there is a constant C such that $R_n \le C n^{0.99999}$. The large gap for d = 2, 3, 4 between the predicted results in table 1 and those proven in theorems 2.2-2.3 is an invitation to look for more tractable models that ought to be in the same universality class as SAW. The WSAW is such a model.

(b) Weakly self-avoiding walk
There are two versions of the WSAW: one based on discrete time (also known as the Domb-Joyce model) and one based on continuous time (also known as the lattice Edwards model). Our focus is on the latter. It differs from the SAW in two respects: (i) the underlying simple random walk model takes its steps at random times rather than after a fixed unit of time, and (ii) walks are allowed to have self-intersections but are weighted as less likely according to how much self-intersection occurs.
More precisely, let $\sigma_i$ be a sequence of independent exponential random variables with mean $1/(2d)$. Let $(X(t))_{t \ge 0}$ denote the random walk on $\mathbb{Z}^d$ which starts at the origin at time t = 0, waits until time $\sigma_1$ and then steps immediately to a randomly chosen one of the 2d neighbours of the origin, then waits an amount of time $\sigma_2$ until stepping to an independently randomly chosen neighbour of its current position, and so on. The self-intersection local time up to time T is the random variable
$I(T) = \int_0^T \int_0^T \mathbb{1}_{X(s) = X(t)} \, ds \, dt$,
which provides a measure of how much time the random walk has spent intersecting itself by time T. For fixed g > 0, let
$c_{T,g} = E(e^{-gI(T)})$,
where E denotes expectation for the random walk. As a function of T, $c_{T,g}$ is analogous to the sequence $c_n$ for SAW. Every walk contributes to $c_{T,g}$, but an exponential weight diminishes the role of walks with large self-intersection local time. The elementary argument which led to the existence of the connective constant generalizes to $c_{T,g}$, and yields the conclusion that there exists $\nu_c(g) \le 0$ such that $\lim_{T\to\infty} c_{T,g}^{1/T} = e^{\nu_c}$ and $c_{T,g} \ge e^{\nu_c T}$. Thus the susceptibility $\chi_g(\nu) = \int_0^\infty c_{T,g} e^{-\nu T} \, dT$ is finite if and only if $\nu > \nu_c$.
WSAW is predicted to be in the same universality class as SAW for all g > 0, meaning that it has the same critical exponents and scaling limits as SAW. The following theorem is an example of this, for the upper critical dimension d = 4 and for sufficiently small g > 0 [21]. It reveals that γ = 1 with a modification by a logarithmic correction as indicated in table 1. In the physics literature, the computation of logarithmic corrections for d = 4 goes back half a century [22][23][24][25]. A number of related results have been proved for the four-dimensional WSAW [26][27][28], all of which are consistent with the predictions for SAW.
The proof of theorem 2.4 is based on a rigorous and non-perturbative implementation of the renormalization group (RG) approach [29]. The RG has for decades been one of the basic tools of theoretical physics, for which Wilson was awarded the Nobel Prize in Physics in 1982. Its reach extends across critical phenomena, many-body theory, and quantum field theory. We make no attempt to survey the vast physics literature; see, e.g. [30].
In a 1972 paper with the intriguing title 'Critical exponents in 3.99 dimensions' [31], Wilson and Fisher considered the dimension d as a continuous variable $d = 4 - \epsilon$, and applied the RG approach to compute critical exponents in dimension $4 - \epsilon$ for small $\epsilon > 0$. This captures the idea that the critical behaviour can be expected to vary in a continuous manner as the dimension varies, so dimensions below d = 4 can be regarded as a perturbation of d = 4. Within physics, this has become well developed and it is found that, e.g. $\gamma = 1 + \frac{1}{8}\epsilon + \cdots + (\text{known})\,\epsilon^6 + \cdots$. Although presumably a divergent asymptotic expansion, such ε-expansions have been used to obtain numerical estimates of critical exponents for d = 3. However, from the perspective of mathematics, the dimension is not a continuous variable and this raises more questions than it answers.

(c) Long-range walks
A different idea to move slightly below the upper critical dimension was also proposed in 1972 [32,33]. In this framework, the upper critical dimension (formerly d = 4) assumes a continuous value $d_c \in (0, 4)$. In particular, for d = 1, 2, 3 we can choose $d_c = d + \epsilon$, and thereby study integer dimension d below $d_c$ without the need to define the WSAW in any non-integer dimension. In our present context, this idea can be formulated in terms of walks taking long-range steps, as follows.
The long-range steps are defined in terms of a parameter $\alpha \in (0, 2)$. Let d = 1, 2, 3, and consider the random walk on $\mathbb{Z}^d$ that takes independent steps of length r (in any direction) with probability proportional to $r^{-(d+\alpha)}$. This step distribution has infinite variance, a heavy tail. A convenient choice of such a step distribution is the fractional power $-(-\Delta)^{\alpha/2}$ of the discrete Laplace operator
$(\Delta f)_x = \sum_{y : \|y - x\|_2 = 1} (f_y - f_x)$. (2.7)
Thus we consider the random walk on $\mathbb{Z}^d$ with transition probabilities
$p_{x,y} \asymp \|x - y\|_2^{-(d+\alpha)}$ $(x \neq y)$ (2.8)
(the notation $f \asymp g$ means $cg \le f \le Cg$ for some constants c, C). This heavy-tailed random walk converges to an α-stable process. The paths of an α-stable process have dimension α [34], so two such paths generically do not intersect in dimensions $d > 2\alpha$. This suggests that in dimensions $d > d_c = 2\alpha$, long-range SAW or WSAW should behave like the α-stable process.
Figure 2 shows a two-dimensional long-range simple random walk with α = 1.1, next to a nearest-neighbour walk for comparison. The heavy tail of the long-range walk produces big jumps, which in turn create fewer self-intersections, thereby making it easier for a walk to be self-avoiding and lowering the upper critical dimension.
We can define a long-range model of SAW as follows. A long-range n-step SAW is any sequence with $p_{x,y}$ given by (2.8). The following theorem [35] proves that this SAW does behave like the unconstrained long-range random walk in dimensions $d \ge 1$ as long as $\alpha < d/2$. This is a long-range version of theorem 2.1; its proof is also based on the lace expansion. A technical point is that the theorem actually applies to a so-called spread-out version of the long-range SAW, a small modification.
Theorem 2.5. For $\alpha \in (0, 2)$ and $d > 2\alpha$, the scaling limit of spread-out long-range SAW is an α-stable process, and the critical exponents are γ = 1, ν = 1/α.
However, our primary interest here is to go below $d_c$ to observe scaling behaviour that is different from that of the α-stable process.
For this, we consider WSAW and its susceptibility χ defined as in §2b but with the expectation E now taken with respect to the continuous-time long-range random walk. We choose $\alpha = \frac{1}{2}(d + \epsilon)$, so that $d = d_c - \epsilon$ is below the critical dimension $d_c = 2\alpha$. The following theorem [36] gives an example of an ε-expansion. It is proved using a rigorous RG method. The restriction on g in the hypothesis of the theorem is used in the proof, but the statement is expected to be true for all g > 0. Further results are obtained in [37]. A related paper which is focused on renormalization rather than critical exponents is [38].
Theorem 2.6. Let d = 1, 2, 3. For small $\epsilon > 0$, for $\alpha = \frac{1}{2}(d + \epsilon)$, and for $g \in [c\epsilon, c'\epsilon]$ for some $c < c'$, there is a constant C such that as $t = \nu - \nu_c \downarrow 0$,

Spin systems
Spin systems are basic models in statistical mechanics. We discuss two examples here: the Ising and $|\varphi|^4$ models. At first sight, spin systems appear to be unrelated to SAW, but a connection will be made in §4.

(a) Ising model
The most fundamental spin system is the Ising model of ferromagnetism, which is defined as follows. Let $\Lambda \subset \mathbb{Z}^d$ be a finite box. An Ising spin configuration on Λ is an assignment of +1 or −1 to each site in Λ, i.e. $\sigma = (\sigma_x)_{x \in \Lambda}$ with $\sigma_x \in \{-1, 1\}$. An example for d = 2 is depicted in figure 3. Spin configurations are random, with a probability distribution parametrized by temperature T and determined by the energy $H_\Lambda(\sigma)$ of σ, which is defined in terms of the discrete Laplace operator Δ of (2.7), restricted to Λ. Apart from an unimportant constant, $H_\Lambda(\sigma)$ is equal to $-\sum_{x \sim y} \sigma_x \sigma_y$, where the sum is over all pairs of neighbouring sites in Λ. At temperature T, the probability of σ is proportional to the Boltzmann weight $e^{-H_\Lambda(\sigma)/T}$. Thus spin configurations with more alignment between neighbouring spins are more likely than those with less alignment, and this effect is magnified for small T compared to large T.
For dimensions $d \ge 2$, there is a critical temperature $T_c$ such that when $T > T_c$ typical spin configurations are disordered, whereas for $T < T_c$ there is long-range order. This is depicted in figure 4, where the +/− symmetry is broken by a boundary condition. The behaviour at $T_c$, and as T approaches $T_c$, is of great current interest and there is a vast literature, particularly for d = 2 where the model is exactly solvable and exciting connections with SLE have been discovered, e.g. [39]. At the critical temperature, the rich geometric structure apparent in figure 4 is scale invariant. Critical exponents are rigorously known for d = 2 and for d > 4 but not for d = 3, although in the physics literature the conformal bootstrap has been used to compute exponents to high accuracy for d = 3 [40].
A recent survey of mathematical work on the Ising and related models can be found in [...] the box used for the Ising model by one with periodic boundary conditions, i.e. Λ is a discrete d-dimensional torus. The a priori or single-spin distribution of $\varphi_x$ is set to be proportional to $e^{-V(\varphi_x)} \, d\varphi_x$, where $d\varphi_x$ is Lebesgue measure on $\mathbb{R}^n$ and
$V(\varphi_x) = \frac{1}{4} g |\varphi_x|^4 + \frac{1}{2} \nu |\varphi_x|^2$,
with g > 0, $\nu \in \mathbb{R}$, and with $|\varphi_x|$ the Euclidean norm of $\varphi_x \in \mathbb{R}^n$. We are primarily interested in ν < 0, in which case for n = 1 the potential V has the double-well shape of figure 5. The probability density of a spin configuration $(\varphi_x)_{x \in \Lambda} \in (\mathbb{R}^n)^{|\Lambda|}$ is then proportional to the Boltzmann weight (3.4). For n = 1, spins are more likely to assume values near one of the two minima of the double well. For n > 1, there is a continuous set of minima. The Laplacian term in (3.4) discourages large differences between neighbouring spins and is thus a ferromagnetic interaction. For example, for n = 1 it encourages the spins to break the symmetry and primarily prefer one minimum over the other.
Now ν plays the role played by the temperature T in the Ising model, and there is a phase transition and corresponding critical exponents associated with a critical value $\nu_c(g) < 0$. For $\nu < \nu_c$, spins are typically aligned, whereas they are disordered for $\nu > \nu_c$. The existence of a phase transition is proved for $d \ge 3$ for general $n \ge 1$ in [42], and for d = 2 and n = 1 in [43]; the Mermin-Wagner theorem states that there is no phase transition for n > 1 when d = 2.
The susceptibility is defined by (3.5), assuming the limit exists. It represents the sum over all x of the correlation of the spin at 0 with the spin at x. If ν is above the critical point $\nu_c$ then correlations remain summable, but there is divergence at $\nu = \nu_c$. The predicted behaviour of the susceptibility, as $t = \nu - \nu_c \downarrow 0$, is $\chi \sim A t^{-\gamma}$ with a universal critical exponent γ (depending on d, n, but not g), and with a logarithmic correction for d = 4.
It was proven in 1982 that γ = 1 for d > 4 [44,45]. The concept of universality was discussed in §2a. It is predicted that the one-component $|\varphi|^4$ model is in the same universality class as the Ising model, and that more generally the n-component $|\varphi|^4$ model lies in the universality class of the model in which the single-spin distribution $e^{-V(\varphi_x)} \, d\varphi_x$ is replaced by the uniform distribution on the sphere of radius $\sqrt{n}$ in $\mathbb{R}^n$. In addition, if the nearest-neighbour interaction given by the Laplacian is replaced by any other finite-range interaction respecting the symmetries of $\mathbb{Z}^d$, then the resulting model is predicted to be in the same universality class as the nearest-neighbour model.
The following theorem from [46] determines the asymptotic form of the susceptibility for d = 4. Its proof is via a rigorous renormalization group method. It and related work [27,28] give extensions of mathematical work from the 1980s [47][48][49]. (A caveat for theorems 3.1-3.2 is that the susceptibility is defined with the infinite volume limit taken through a sequence of tori of period $L^N$ with fixed large L, as $N \to \infty$.)
To go below the upper critical dimension, we again consider a long-range version of the model, by replacing the Laplacian term $\varphi_x \cdot (-\Delta \varphi)_x$ in (3.4) by a term $\varphi_x \cdot ((-\Delta)^{\alpha/2} \varphi)_x$ with fractional Laplacian and $\alpha \in (0, 2)$. The upper critical dimension is again $d_c = 2\alpha$. Several rigorous results use the lace expansion to prove mean-field behaviour for various long-range models when $d > d_c$, e.g. [50] for the Ising model. The following theorem from [36] concerns dimensions $d = d_c - \epsilon$ which lie slightly below $d_c = 2\alpha$. Related results are proved in [37], and earlier mathematical papers for long-range models are [51][52][53].

Supersymmetry and n = 0
An alternate idea from physics with a similar conclusion to de Gennes's was proposed independently in 1980 by Parisi & Sourlas [55] and by McKane [56]. Their idea was that while an n-component boson field ϕ (usual spin) contributes a factor n for every loop in the geometric representation of the susceptibility, an n-component fermion field contributes −n. When combined, all loops cancel, leaving the SAW. From a mathematical point of view, this realization of zero components as n − n is less problematic than setting n = 0 or considering the limit n ↓ 0, and it leads to a theorem. Some history of the mathematical work in this direction can be found in [57]. An important early step was [58], which was inspired by [59].
Fermion fields are often defined in terms of Grassmann variables, which multiply with an anticommuting product. A fermion field can also be constructed using differential forms with their anti-commuting wedge product, and we follow this route in the following.
Given any finite set Λ of cardinality M = |Λ|, we consider 2M real coordinates $(u_x, v_x)_{x \in \Lambda}$ and corresponding 1-forms $du_x$, $dv_x$. The wedge product ∧ is associative and anti-commuting, e.g. $du_x \wedge dv_y = -dv_y \wedge du_x$. Let $u = (u_1, \ldots, u_M)$ and similarly for v. A form is a function of (u, v) times a product of 1-forms, or a linear combination of these. A sum of forms which each contains a product of p distinct 1-forms is called a p-form. Owing to the anti-commutativity, p-forms are zero if p > 2M. Also, any 2M-form F can be written uniquely as [...] (4.2) and the integral of a p-form is defined to be zero if p < 2M. This definition of integration extends by linearity to arbitrary forms.
We write the 2M real coordinates in terms of M complex coordinates [...] The field $\phi_x$ is a two-component boson field on Λ, and $\psi_x$ is a two-component fermion field. We define the differential forms [...] Smooth functions of forms are defined by Taylor expansion in $\psi, \bar\psi$, which terminates as a Taylor polynomial due to the anti-commutativity. [...]
The susceptibility of the WSAW on a finite subset $\Lambda \subset \mathbb{Z}^d$ is then given by the remarkable identity (4.6), with the integral on the right-hand side evaluated according to the definition of the integral as in (4.2) after conversion of the complex coordinates to real coordinates [58]. The identity (4.6) is discussed in detail in [57], where a proof is given based on supersymmetry, which is a symmetry that relates the boson and fermion fields. Replacement of $-\Delta$ by $(-\Delta)^{\alpha/2}$ in the definition of $\tau_{\Delta,x}$ in (4.4) and (4.6) gives a corresponding identity for the long-range model.
The right-hand side of (4.6) is reminiscent of the right-hand side of the definition of the susceptibility in (3.5). For example, the bosonic part of the exponent on the right-hand side of (4.6) matches the exponent on the right-hand side of the Boltzmann weight (3.4) for the $|\varphi|^4$ model. The RG method discussed in §5 applies equally well with or without the presence of the fermion field.
This allows a treatment of WSAW simultaneously with the n-component $|\varphi|^4$ model, as the n = 2 − 2 = 0 case, and provides a mathematically rigorous implementation of de Gennes's idea, via the supersymmetric formulation introduced by Parisi and Sourlas and by McKane.
The 'group' operation in the term 'renormalization group' is the operation of composition of maps. The maps are generally not invertible, so this is a semigroup with identity rather than a group. The terminology renormalization 'group' has nevertheless become commonplace.

Renormalization group method
The proofs of theorems 3.1-3.2 for the |ϕ| 4 model are based on the above strategy, with some adaptation due to lattice effects. As discussed in §4, the proofs of theorems 2.4 and 2.6 for the WSAW require relatively minor modifications of the proofs for |ϕ| 4 .
In the remainder of the paper, we flesh out the above strategy as it is employed in our context. To focus on the main ideas, we consider only the long-range $|\varphi|^4$ model with n = 1 component in dimensions d = 1, 2, 3. The essential problem is one of a certain Gaussian integration.

(b) Multi-scale Gaussian integration
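Let d = 1, 2, 3. Let Λ be the discrete d-dimensional torus of period $L^N$, where L > 1 is a fixed integer. The infinite-volume limit is achieved by $N \to \infty$. Let C be a positive-definite |Λ| × |Λ| matrix. The Gaussian expectation $\mathbb{E}_C$ with covariance C of a function $F : \mathbb{R}^{|\Lambda|} \to \mathbb{R}$ is defined by
$\mathbb{E}_C F = (\det 2\pi C)^{-1/2} \int_{\mathbb{R}^{|\Lambda|}} F(\zeta) \, e^{-\frac{1}{2}(\zeta, C^{-1}\zeta)} \, d\zeta$.
Fix $\alpha \in (0, 2)$. Let $m^2 > 0$ and let C be the positive-definite |Λ| × |Λ| matrix
$C = ((-\Delta)^{\alpha/2} + m^2)^{-1}$. (5.2)
For $g_0 > 0$ and $\nu_0 \in \mathbb{R}$, let
$Z_0(\varphi) = e^{-V_0(\Lambda, \varphi)}$ with $V_0(\Lambda, \varphi) = \sum_{x \in \Lambda} (\frac{1}{4} g_0 \varphi_x^4 + \frac{1}{2} \nu_0 \varphi_x^2)$.
The essential problem is to compute the convolution of the Gaussian expectation $\mathbb{E}_C$ with $Z_0$, namely
$Z_N(\varphi) = \mathbb{E}_C Z_0(\varphi + \zeta)$, (5.4)
uniformly as $m^2 \downarrow 0$ and $N \to \infty$. For example, it is an exercise in calculus to see that the finite-volume susceptibility is given by [...] (5.5) where the directions 1 in the directional derivative are the constant function $1_x = 1$.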
To evaluate (5.4), the Gaussian integration is carried out incrementally, or progressively, with each increment effecting integration over a single length scale. For this, we use the elementary property of Gaussian integration that if $C = C' + C''$ then
$\mathbb{E}_C F(\varphi + \zeta) = \mathbb{E}_{C''} \mathbb{E}_{C'} F(\varphi + \zeta' + \zeta'')$, (5.8)
where on the right-hand side the inner Gaussian integral integrates with respect to $\zeta'$ (holding $\varphi + \zeta''$ fixed), and the outer Gaussian integral then integrates with respect to $\zeta''$. The choice of $L^N$ as the period of the torus allows for the partition of the torus into disjoint j-blocks of side $L^j$, for j = 0, 1, . . . , N. The 0-blocks are simply the points of Λ, and the unique N-block is Λ itself. In general, the set $\mathcal{B}_j$ of j-blocks has $L^{(N-j)d}$ elements. Small values of j are depicted in figure 6. The scales j = 0, 1, 2, . . . , N for the progressive integration correspond to the block side lengths $L^0, L^1, L^2, \ldots, L^N$.
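The convolution property can be checked exactly in the simplest setting of a single real variable and a polynomial F. The sketch below is a toy illustration only, not part of the method of [36]: it uses the standard identity $\mathbb{E}_C F(\varphi + \zeta) = e^{(C/2)\, d^2/d\varphi^2} F(\varphi)$ for a one-dimensional centred Gaussian ζ of variance C, and the helper names (`gaussian_convolve`, `progressive`) are our own.

```python
from fractions import Fraction


def second_derivative(poly):
    """Coefficients of F'' given coefficients poly[k] of phi^k in F."""
    return [k * (k - 1) * poly[k] for k in range(2, len(poly))]


def gaussian_convolve(poly, C):
    """Return the polynomial phi -> E_C F(phi + zeta), zeta ~ N(0, C).

    Uses E_C F(phi + zeta) = sum_m (C/2)^m / m! * F^(2m)(phi), a finite
    sum when F is a polynomial. Exact rational arithmetic throughout.
    """
    result = [Fraction(0)] * len(poly)
    term = [Fraction(c) for c in poly]  # m = 0 term is F itself
    m = 0
    while term:
        for k, c in enumerate(term):
            result[k] += c
        m += 1
        # Next term: apply (C/2) d^2/dphi^2 and divide by m.
        term = [Fraction(C) / (2 * m) * c for c in second_derivative(term)]
    return result


def progressive(poly, covariances):
    """Integrate out one fluctuation covariance at a time, as in (5.8)."""
    for C in covariances:
        poly = gaussian_convolve(poly, C)
    return poly


if __name__ == "__main__":
    F = [0, 0, 0, 0, 1]  # F(phi) = phi^4
    parts = [Fraction(1, 8), Fraction(1, 4), Fraction(1, 2)]
    one_step = gaussian_convolve(F, sum(parts))
    multi_step = progressive(F, parts)
    print(one_step == multi_step, one_step)
```

For $F(\varphi) = \varphi^4$ a single integration with covariance C gives $\varphi^4 + 6C\varphi^2 + 3C^2$, and integrating in several steps with covariances summing to C gives the same polynomial. Note that the $\varphi^4$ term generates a $\varphi^2$ term and a constant, a simple instance of how lower-order monomials are produced as scales are integrated out.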
We use a carefully constructed covariance decomposition, such that in the corresponding decomposition $\zeta = \zeta_1 + \cdots + \zeta_N$ of the field, the fluctuation field $\zeta_j$ captures the fluctuations of the field ζ on scale j − 1. This is quantified by estimates on the covariances in the decomposition, which express the fact that a typical Gaussian field with covariance $C_{j+1}$ is roughly constant on j-blocks and has size of order $L^{-j(d-\alpha)/2}$. These estimates hold until the mass scale $j_m$, which is the smallest value of j for which $L^{\alpha j} m^2 \ge 1$; for scales $j > j_m$ the covariance is smaller and the integrations for such covariances are subject to a simpler analysis. An additional finite-range property of the covariances plays an important simplifying role by making the field values in non-contiguous blocks independent [36,[60][61][62]. In view of (5.8), we define a sequence iteratively by
$Z_{j+1}(\varphi) = \mathbb{E}_{C_{j+1}} Z_j(\varphi + \zeta)$ and $Z_0(\varphi) = e^{-V_0(\Lambda, \varphi)}$. (5.9)
Each step in the sequence performs integration of a fluctuation field on a single scale. Then $Z_N$ is the final element of the sequence, and we are interested in the limit $N \to \infty$.
We wish to start the sequence with $Z_0$ defined in terms of $V_0$ with $\nu_0$ slightly above the critical value $\nu_c$. However, we do not have a useful a priori description of $\nu_c$; its identification is part of the problem. To deal with this issue, we enlarge the focus, and consider a Gaussian convolution as a mapping on a space of functions of the field, defined on a suitable domain. In other words, given a covariance $C_+ = C_{j+1}$, we write $\mathbb{E}_+ = \mathbb{E}_{C_+}$ and define a scale-dependent map $Z \mapsto Z_+$ by
$Z_+(\varphi) = \mathbb{E}_+ Z(\varphi + \zeta)$ (5.10)
for integrable Z. Given a function F of the field, and given a field ϕ, we define a new function $\theta_\varphi F$ by $(\theta_\varphi F)(\zeta) = F(\varphi + \zeta)$. Then we can rewrite (5.10) compactly as
$Z_+ = \mathbb{E}_+ \theta Z$. (5.11)
We wish to capture the scale invariance at the critical point as a 'fixed point' of the mapping $Z \mapsto Z_+$. We do not achieve this literally, because of lattice effects.
Indeed, the mapping is between different spaces, with different norms that implement rescaling. Nevertheless, the notion of a fixed point provides vital guidance.

(c) Relevant and irrelevant monomials
The mapping $Z \mapsto Z_+$ is a transformation of one function of the field to another, and we wish to identify which are the important aspects of the map to track carefully, and which parts can be regarded as remainders.
For small ϕ, an approximation of Z(ϕ) involves monomials $\varphi_x^p$. The relative importance of such monomials is assessed by calculating their size when summed over a block $B \in \mathcal{B}_j$, when $\varphi_x$ is a typical Gaussian field for the covariance $C_+$. Such a field has size of order $L^{-j(d-\alpha)/2}$ and a j-block contains $L^{dj}$ points, so for the specific choice $\alpha = \frac{1}{2}(d + \epsilon)$ in the covariance (5.2), this leads to
$\sum_{x \in B} |\varphi_x|^p \approx L^{dj} L^{-pj(d-\alpha)/2} = L^{dj}$ (p = 0), $L^{\alpha j}$ (p = 2), $L^{\epsilon j}$ (p = 4). (5.12)
For powers p > 4, a negative power of $L^j$ instead occurs, so such monomials scale down as the scale is advanced. The monomials $1, \varphi^2, \varphi^4$ are said to be relevant or expanding, while $\varphi^6, \varphi^8, \ldots$ are irrelevant. The relevant monomials $\varphi^2, \varphi^4$ appear already in $V_0$. The monomial 1 plays a relatively insignificant role for the analysis of the susceptibility. Monomials containing spatial gradients need also to be considered, in general, but for the long-range model such monomials are irrelevant.
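The power counting behind (5.12) is simple arithmetic and can be tabulated directly. The sketch below assumes only the two facts used above (a j-block has $L^{dj}$ points and a typical field has size $L^{-j(d-\alpha)/2}$); the function names `block_exponent` and `classify` are hypothetical helpers introduced for illustration.

```python
from fractions import Fraction


def block_exponent(p, d, alpha):
    """Exponent e with sum_{x in B} |phi_x|^p of order L^(e*j) on a j-block.

    The block has L^(d*j) points and |phi_x| is of order L^(-j(d-alpha)/2),
    so e = d - p*(d - alpha)/2.
    """
    return Fraction(d) - Fraction(p) * (Fraction(d) - Fraction(alpha)) / 2


def classify(p, d, eps):
    """Classify the monomial phi^p for alpha = (d + eps)/2."""
    alpha = (Fraction(d) + Fraction(eps)) / 2
    e = block_exponent(p, d, alpha)
    if e > 0:
        kind = "relevant"
    elif e == 0:
        kind = "marginal"
    else:
        kind = "irrelevant"
    return e, kind


if __name__ == "__main__":
    d, eps = 3, Fraction(1, 10)
    for p in (0, 2, 4, 6, 8):
        e, kind = classify(p, d, eps)
        print(f"p={p}: L^({e} j)  {kind}")
```

With d = 3 and ε = 1/10 the exponents for p = 0, 2, 4 come out as d, α and ε respectively, in agreement with (5.12), while p = 6 gives a negative exponent. Setting ε = 0 makes $\varphi^4$ marginal, which corresponds to sitting exactly at the upper critical dimension $d_c = 2\alpha$.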

(d) Perturbation theory
With the classification of monomials as relevant or irrelevant in mind, we treat Z as approximately equal to $e^{-V(\Lambda)}$ with V given by a local polynomial $V(\Lambda) = \sum_{x \in \Lambda} (\frac{1}{4} g \varphi_x^4 + \frac{1}{2} \nu \varphi_x^2 + u)$ with coupling constants g, ν, u. We seek to find $V_+$, defined with new coupling constants $g_+, \nu_+, u_+$, such that $Z_+$ is well approximated by $e^{-V_+(\Lambda)}$. Then the map $Z \mapsto Z_+$ is approximately captured by the map $V \mapsto V_+$. We refer to V as the perturbative coordinate. The term 'perturbation theory' refers to the evaluation of the map $V \mapsto V_+$ to some specific order in V, together with the analysis of this approximate map to compute critical exponents. We consider second-order perturbation theory here.
It is straightforward to compute $\mathbb{E}_+ e^{-\theta V(\Lambda)}$ as a formal power series in V to within an error of order $V^3$. Details of a way to do this are laid out in [63]. Up to irrelevant terms, the upshot is that $\mathbb{E}_+ e^{-\theta V(\Lambda)} \approx e^{-V_+(\Lambda)}$ with $V_+(\Lambda) = \sum_{x \in \Lambda} (g_+ \varphi_x^4 + \nu_+ \varphi_x^2 + u_+)$ and with the coupling constants $g_+, \nu_+, u_+$ given by an explicit quadratic polynomial in g, ν, u with coefficients determined by the covariance $C_+$.
In order to maintain the approximation of $Z$ by $e^{-V}$ over all scales, a critical $m^2$-dependent choice $\nu_0 = \nu_0^c(m^2)$ is required. With the wrong choice, $V$ would grow exponentially and not remain small as the scale advances. As $m^2 \downarrow 0$, $\nu_0(m^2)$ approaches the critical value $\nu_c$. If we are able to control the above approximations over all scales, then we finally arrive at $Z_N(\varphi) \approx e^{-V_N(\Lambda,\varphi)}$. Substitution of this approximation into the right-hand side of (5.5) leads, after a small calculation, to the approximation (5.13). The proof of theorem 3.2 in [36] verifies that the approximation (5.13) is indeed valid, and that, moreover, for small $m^2 > 0$ the limits (5.14) hold for general $n$; these are governed by the power
\[ m^{2\frac{n+2}{n+8}\frac{\epsilon}{\alpha} + O(\epsilon^2)}. \tag{5.14} \]
Together, (5.13)-(5.14) imply the differential inequalities (5.15), and integration then yields the statement of theorem 3.2.
Thus the proof of theorem 3.2 reduces to the validation of the approximation (5.13) with careful choice of $\nu_0^c(m^2)$, and the computation of the limits in (5.14). The first of these two problems is significantly more difficult than the second.
A change of variables is helpful to understand the flow of coupling constants under the RG map. To incorporate the effect of the growth of relevant monomials in (5.12), it is natural to rescale the coupling constants at scale $j$ as $\hat g_j = L^{\epsilon j} g_j$ and $\hat\nu_j = L^{\alpha j}\nu_j$. A further explicit change of variables $(\hat g, \hat\nu) \mapsto (s, \mu)$ creates a simpler triangular system. In terms of the new variables, the map $V \mapsto V_+$ is described by
\[ s_+ = L^{\epsilon} s (1 - \beta s) \tag{5.16} \]
and by a companion recursion for $\mu_+$, with a remainder that does not play an important role. The coefficient $\beta = \beta_j(m^2)$ is given explicitly in terms of the accumulated covariance $w_k = \sum_{i=1}^{k} C_i$. Properties of the covariance decomposition imply that, for $m^2 = 0$, the limit $a = \lim_{j\to\infty} \beta_j(0)$ exists; this permits $\beta$ to be replaced by $a$ in the flow equations. On its own, however, the perturbative analysis leaves uncontrolled non-perturbative errors as the volume parameter $N$ goes to infinity or as the field $\varphi$ becomes large.
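The qualitative behaviour of the recursion for $s$ can be explored numerically. The sketch below iterates $s \mapsto L^{\epsilon} s(1 - a s)$, with $\beta$ replaced by a constant $a$ and the remainder dropped. The values of `L`, `eps` and `a` are arbitrary illustrative choices, not taken from [36], and the helper names are hypothetical.

```python
def s_step(s, L, eps, a):
    """One step of the truncated flow s_+ = L^eps * s * (1 - a*s)."""
    return L**eps * s * (1.0 - a * s)

def fixed_point(L, eps, a):
    """Nonzero solution of s = L^eps * s * (1 - a*s)."""
    return (1.0 - L**(-eps)) / a

def iterate(s0, L, eps, a, n_steps):
    """Return the trajectory [s_0, s_1, ..., s_n] of the truncated flow."""
    traj = [s0]
    for _ in range(n_steps):
        traj.append(s_step(traj[-1], L, eps, a))
    return traj

# Illustrative parameters (assumptions, chosen only for the demonstration).
L, eps, a = 2.0, 0.1, 1.0
traj = iterate(1e-3, L, eps, a, 400)
s_star = fixed_point(L, eps, a)

# Multipliers of the map at the two fixed points.
mult_at_zero = L**eps          # derivative at s = 0
mult_at_star = 2.0 - L**eps    # derivative at s = s_star
print(traj[-1], s_star, mult_at_zero, mult_at_star)
```

In this truncated one-dimensional model the multiplier at $s = 0$ is $L^{\epsilon} > 1$, while at the nonzero fixed point it is $2 - L^{\epsilon}$, which lies in $(0,1)$ for small $\epsilon > 0$. A small positive initial value therefore moves away from $0$ and settles at the nonzero fixed point.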

(e) Non-perturbative renormalization group coordinate
The perturbative coordinate V is supplemented by a non-perturbative coordinate K which controls all errors in the above approximations. A description of K requires the introduction of the following concepts.
We fix a scale $j$ which we drop from the notation; scale $j+1$ is denoted by $+$. A polymer is a union (possibly empty) of blocks from $\mathcal{B}$. We write $\mathcal{P}$ for the set of polymers, and $\mathcal{B}(X)$ and $\mathcal{P}(X)$ for the sets of blocks and polymers contained in the polymer $X \in \mathcal{P}$. Let $\mathcal{N}$ denote the algebra of smooth functions of the field $\varphi$. We consider maps $F : \mathcal{P} \to \mathcal{N}$, e.g. $F(X) = e^{-V(X)}$ with $V(X,\varphi) = \sum_{x\in X} V(\varphi_x)$.
Given $F, G : \mathcal{P} \to \mathcal{N}$, we define the circle product $F \circ G : \mathcal{P} \to \mathcal{N}$ by
\[ (F \circ G)(X) = \sum_{Y \in \mathcal{P}(X)} F(X \setminus Y)\, G(Y). \tag{5.19} \]
The circle product depends on the scale, since $\mathcal{P}$ does. It is commutative and associative, with unit $1$ which takes the value $1$ on the empty polymer and the value $0$ on any non-empty polymer. We say that $F : \mathcal{P} \to \mathcal{N}$ factorizes over blocks if $F(X) = \prod_{B\in\mathcal{B}(X)} F(B)$ for all $X \in \mathcal{P}$, e.g. $F(X) = e^{-V(X)}$ factorizes over blocks. If $F$ and $G$ both factorize over blocks then
\[ (F \circ G)(X) = \prod_{B \in \mathcal{B}(X)} \bigl( F(B) + G(B) \bigr), \tag{5.20} \]
since in this case expansion of the product on the right-hand side produces the sum in (5.19).
Instead of the approximation $Z(\Lambda) \approx e^{-V(\Lambda)}$ used in perturbation theory, we use an exact formula
\[ Z(\Lambda) = e^{-u|\Lambda|} (I \circ K)(\Lambda). \tag{5.21} \]
Here $I = I(V)$ factorizes over blocks; it may be regarded for present purposes as $I(X) = e^{-V(X)}$ but in fact an additional term must be included. The $K$ appearing on the right-hand side of (5.21) is a non-perturbative quantity which encapsulates all errors in perturbation theory, much as the Taylor remainder formula expresses the error in a Taylor approximation. Initially, at scale 0 we have $Z_0(\Lambda) = e^{-V_0(\Lambda)} = (e^{-V_0} \circ 1)(\Lambda)$, so (5.21) holds with $K_0 = 1$.
We seek to preserve the form of $Z$ after the Gaussian expectation:
\[ Z_+(\Lambda) = \mathbb{E}_+ \theta Z(\Lambda) = e^{-u_+|\Lambda|} (I_+ \circ K_+)(\Lambda), \tag{5.22} \]
with a scale-$(j+1)$ circle product, and with the operator $\theta$ as in (5.11). The choice of $I_+$ is determined by perturbation theory. Given any choice of $I_+$, there is a $K_+$ such that (5.22) holds.
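The algebra of the circle product can be checked on a toy example. The sketch below represents a polymer as a frozenset of block labels, uses real numbers in place of functions of the field, and assumes the definition $(F \circ G)(X) = \sum_{Y \in \mathcal{P}(X)} F(X \setminus Y)\,G(Y)$. All helper names are hypothetical.

```python
from itertools import combinations
from math import prod, isclose

def subpolymers(X):
    """Yield every polymer contained in X, including the empty polymer."""
    blocks = sorted(X)
    for r in range(len(blocks) + 1):
        for Y in combinations(blocks, r):
            yield frozenset(Y)

def circle(F, G):
    """Circle product: (F o G)(X) = sum over Y in P(X) of F(X minus Y) * G(Y)."""
    return lambda X: sum(F(X - Y) * G(Y) for Y in subpolymers(X))

def from_blocks(values):
    """Map that factorizes over blocks: F(X) = product of values[B] for B in X."""
    return lambda X: prod(values[B] for B in X)

def unit(X):
    """Unit of the circle product: 1 on the empty polymer, 0 otherwise."""
    return 1.0 if len(X) == 0 else 0.0

# Arbitrary block values for two block-factorizing maps on three blocks.
f = {0: 0.5, 1: 2.0, 2: -1.0}
g = {0: 0.1, 1: 0.3, 2: 0.7}
F, G = from_blocks(f), from_blocks(g)
X = frozenset(f)

lhs = circle(F, G)(X)                 # sum over sub-polymers
rhs = prod(f[B] + g[B] for B in X)    # product over blocks of F(B) + G(B)
print(lhs, rhs)
```

The two printed numbers agree up to rounding, which illustrates why expanding the product over blocks reproduces the sum over sub-polymers when both maps factorize over blocks.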
In fact, there are many, as the representation $Z_+(\Lambda) = e^{-u_+|\Lambda|} (I_+ \circ K_+)(\Lambda)$ does not uniquely determine $K_+$. The following proposition is a prototype for an effective choice of $K_+$. For its statement, the closure of a polymer $X \in \mathcal{P}$ is defined to be the smallest polymer $\bar X \in \mathcal{P}_+$ such that $X \subset \bar X$.
The right-hand side is (5.23) with $\tilde K_+$ given by (5.24), and the proof is complete.
The non-perturbative coordinate $K$ must have two features: it must be $O(V^3)$, and it must contract as the scale advances. Each of these demands requires a norm on $K : \mathcal{P} \to \mathcal{N}$; we do not describe the delicate choice of norm here [64]. The $\tilde K_+$ produced by proposition 5.1 is a start, but it is insufficient: it can be shown to be $O(V^2)$ rather than $O(V^3)$, and it is not contractive. Delicate adjustments are required to achieve these two goals [64].
On the other hand, $\tilde K_+$ does preserve a good factorization property. We say that polymers $X, Y \in \mathcal{P}_j$ are disconnected if they are separated by distance at least $L^j$. A polymer is connected if it is not the union of two disconnected polymers, and any polymer $X$ partitions into connected components $\mathrm{Comp}(X)$ which are separated by at least distance $L^j$. We say that $F : \mathcal{P} \to \mathcal{N}$ factorizes over connected components if $F(X) = \prod_{Y\in\mathrm{Comp}(X)} F(Y)$. The finite-range property of the covariance decomposition is the statement that $C_{j;xy} = 0$ if $\|x-y\|_1 \geq \frac12 L^j$. This ensures that
\[ \mathbb{E}_+\bigl(F(X)G(Y)\bigr) = \mathbb{E}_+\bigl(F(X)\bigr)\,\mathbb{E}_+\bigl(G(Y)\bigr) \quad \text{if } X, Y \in \mathcal{P}_+ \text{ are disconnected}, \tag{5.28} \]
because uncorrelated Gaussian random variables are independent. Suppose that $K$ factorizes over connected components at scale $j$. It can be verified, using (5.28), that $\tilde K_+$ then factorizes over connected components at scale $j+1$. This factorization functions in parallel with the norm, which has the property that the norm of a product is at most the product of the norms. The geometry of the identity (5.24) defining $\tilde K_+(U)$ is illustrated in figure 7, which is helpful for the verification of factorization.
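The decomposition of a polymer into connected components can be sketched in a few lines. The code below labels each block by integer coordinates in units of the block side, and treats two blocks as adjacent when their coordinates differ by at most 1 in every direction. This adjacency rule is an assumption standing in for "separated by distance less than $L^j$", and the function names are hypothetical.

```python
from itertools import product
from math import prod

def components(X):
    """Split a polymer X (a set of block coordinates) into connected components.

    Assumption: blocks are adjacent when they touch, possibly only at a
    corner, i.e. their integer coordinates differ by at most 1 in each axis.
    """
    remaining = set(X)
    comps = []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            b = stack.pop()
            # Visit every neighbouring block position around b.
            for shift in product((-1, 0, 1), repeat=len(b)):
                nb = tuple(c + s for c, s in zip(b, shift))
                if nb in remaining:
                    remaining.remove(nb)
                    comp.add(nb)
                    stack.append(nb)
        comps.append(frozenset(comp))
    return comps

def factorized(F_connected):
    """Extend a map on connected polymers to F(X) = product over Comp(X)."""
    return lambda X: prod(F_connected(Y) for Y in components(X))

# A two-dimensional polymer made of five blocks.
X = {(0, 0), (1, 1), (3, 0), (3, 1), (6, 6)}
comps = components(X)
F = factorized(lambda Y: 0.5 ** len(Y))  # toy weight per connected polymer
print(sorted(len(Y) for Y in comps), F(X))
```

In this example the blocks at $(0,0)$ and $(1,1)$ touch at a corner and form one component, the two blocks in column 3 form a second, and the block at $(6,6)$ is a component on its own.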

(f) Renormalization group map and phase portrait
The RG map is a scale-dependent map
\[ \mathrm{RG} : (s, \mu, K) \mapsto (s_+, \mu_+, K_+), \tag{5.29} \]
defined on a suitable domain. It is defined in such a way that $K_+$ is third order in $(s,\mu)$ if $K$ is, and the $K$ component is contractive under change of scale. The values of $(s_+, \mu_+)$ depend on $K$ as well as $(s,\mu)$, and this dependence is engineered to remove the relevant parts from $K$. This extraction is responsible for the contraction of $K$ under change of scale, and is indispensable for the iteration of the RG map over all scales. As long as $K$ is third order, its effect on the flow of the coupling constants does not change the second-order perturbation theory that determines the asymptotic behaviour of $\nu_N$ and the critical exponent $\gamma$ for the susceptibility. The RG map is used to define a map $T$ on a space of sequences $(s_j, \mu_j, K_j)_{j\geq 0}$, such that a fixed point of $T$ corresponds to a sequence which provides a recursive solution to (5.29) for all scales $j$. The $j=0$ value of this global RG flow identifies the critical point $\nu_c$.
The RG flow is depicted schematically by the phase portrait shown in figure 8. For the long-range model with $\alpha = \frac12(d+\epsilon)$ in dimensions $d = 1, 2, 3$, the non-Gaussian Wilson-Fisher hyperbolic fixed point is stable and the Gaussian fixed point is unstable. The critical point lies on the stable manifold from which the flow converges to the non-Gaussian fixed point. The one-dimensional unstable manifold reflects the growth of $\mu$ for a non-critical choice of the initial condition. For the four-dimensional nearest-neighbour model, the two fixed points merge into a single stable non-hyperbolic Gaussian fixed point.

Conclusion
The creation of a comprehensive theory of phase transitions and critical phenomena is one of the great achievements of theoretical physics during the second half of the last century. The mathematical problems posed by that theory remain a very active topic of current research. Wilson's RG approach is a cornerstone of the physical theory. Mathematical theorems based on the RG approach began to appear decades ago, but a great deal remains to be done to provide a complete and non-perturbative understanding of critical phenomena, without uncontrolled