Abstract
Models used in simulation may give accurate short-term trajectories but distort long-term (statistical) properties. In this work, we augment a given approximate model with a control law (a ‘thermostat’) that gently perturbs the dynamical system to target a thermodynamic state consistent with a set of prescribed (possibly evolving) observations. As proof of concept, we provide an example involving a point vortex fluid model on the sphere, for which we show convergence of equilibrium quantities (in the stationary case) and the ability of the thermostat to dynamically track a transient state.
1. Introduction
In applications of modern computational science, the underpinning physical laws (and equations of motion) are often well established, yet the detailed behaviour is unpredictable on long time scales due to the presence of deterministic chaos. Examples of this arise in molecular dynamics [1,2], in polymer simulation [3] and in the modelling of turbulent fluids in the atmosphere and ocean [4,5], where long simulations are used to extract statistical information (e.g. the statistics of rare transitions between basins in molecular dynamics, or slow relaxation processes in fluids). In this type of dynamical sampling, the common requirement is that simulated paths are sufficiently accurate to calculate measures of dynamical mixing such as 2-point temporal correlation functions [6].
In Hamiltonian systems such as molecular dynamics, it is common to run ensembles of microcanonical (i.e. constant energy) simulations. For such simulations, backward error analysis [7,8] suggests that a long simulation should not be viewed as the approximation of a particular trajectory of the system but rather as an exact solution of a perturbed continuum process described by modified equations. In the case of dynamical sampling of complex systems, the statistics of simulation data are biased because they sample an invariant measure of the modified equations (rather than those of the target system), i.e. the time-discretization error induces an effective statistical bias.
Statistical bias may also arise due to spatial discretization. For example, in the setting of geophysical fluid dynamics, a comparison of discretizations of the quasi-geostrophic equations reveals that the long-time mean potential vorticity field and pointwise fluctuation statistics are heavily dependent on discrete conservation laws such as energy, enstrophy and material conservation of vorticity [9–11]. The discretization bias may be controlled by reducing timestep or by incorporating a Metropolis condition [12,13], possibly increasing the computational cost. The use of a Metropolis condition can also increase the difficulty of computing accurate dynamical properties. The need for computations to be accurate with respect to the dynamical process as well as to the long-term statistics thus poses difficult challenges to the simulator.
In this paper, we consider a method of perturbing the dynamics of the system to correct for statistical bias. Our method is based on the concept of a thermostat, by which we mean a scheme based on a control law designed to facilitate sampling of a target probability density function (pdf). Although originally proposed in molecular dynamics (see, e.g. the Nosé–Hoover method [14,15]), thermostats can be extended to a wide variety of systems and applications. For example, in Dubinkina et al. [16], a thermostat was used as a model reduction technique for a vortex model of a fluid (suppressing the detailed interactions of a few strong vortices with a weak vortex field), an approach we draw on in the numerical experiments of this article. In another recent article [17], thermostats have been suggested as a means of sampling incompletely specified systems (with noisy gradients), with applications in machine learning for Bayesian inference in the ‘big data’ context.
In the general modelling scenario, thermostats are significantly hampered by the need to specify a functional form of the pdf a priori. In this article, we assume that, instead of the pdf, what is actually available is a partial set of expectations of observables with respect to the unknown probability measure, which may arise from experiment or other types of modelling. In this setting, information theory (in particular entropy maximization [18,19]) offers tools for constructing densities consistent with observations that are close to a given prior distribution (see [20–22] for examples in geophysical modelling, [23,24] for discussion in molecular modelling). Our algorithm for computing the parameters finds Lagrange multipliers (one for each observable) that modify the probability density, using iterative techniques like those in Agmon et al. [25], Haken [26], Davis & Gutiérrez [27], where each step of iteration involves ensemble simulation that reweights samples from the prior distribution appropriately. The density resulting from such an entropy maximization is then used as the basis for designing a thermostat that ensures the extended dynamical system samples the distribution and consequently exhibits the correct (long-term) statistical averages, while mildly perturbing the short-term dynamical behaviour of the system. The complexity and generality of our iterative procedure (sampling → parametrization → sampling → …) means that convergence has to be verified through computer experiment.
Entropy maximization also forms the basis for closure approximations in the works of Ilg et al. [3] and Samaey et al. [28], where the result from entropy maximization is used in determining evolution equations for resolved macroscopic quantities. This approach is distinct in philosophy from the current work in which the result of entropy maximization is used to define a statistical ensemble and evolution equations for microscopic dynamics that are consistent with macroscopic observations.
While we motivate the method in the setting of equilibrium sampling of a fixed target distribution, we can easily apply our scheme to the case of a more general model with unknown or even undefined steady state. For this purpose, an ensemble of simulations is run in finite-duration bursts, with the Lagrange multipliers approximated via an iterative procedure based on the simulation history. This procedure allows entropy maximization to be performed with ensemble averages that evolve in time, providing a flexible method of nonlinear data assimilation.
The remainder of this article is organized as follows. In §2, we review the idea of a thermostat which is the key tool we employ to control the invariant distribution. Section 3 describes the combination of thermostats with the maximum entropy framework for correcting the density to reflect thermodynamic constraints. Section 4 describes the adaptive procedure. In §5, we apply and evaluate the method in the setting of a system of point vortices on the surface of a sphere, which represents a simple geophysical model with multiple statistically relevant first integrals.
2. Thermostats
We introduce the thermostat technique for generic Hamiltonian dynamical systems of the form
dy/dt = f(y), f(y) = J(y)∇H(y), (2.1)
where y ∈ Rd, H(y) is the energy (Hamiltonian) function and J(y) is a skew-symmetric structure matrix.
Thermostats were introduced in molecular dynamics to model the trajectories of molecules in a fluid at constant temperature. In the common statistical mechanical framework, it is assumed that the trajectories of a system of particles in thermal equilibrium with a reservoir at constant temperature sample the canonical distribution, with smooth invariant density proportional to exp(−βH), where β is the reciprocal of temperature scaled by Boltzmann’s constant, β = 1/(kBT). The trajectories of the Hamiltonian system (2.1) are restricted to a level set of energy. Hence, to model a system at constant temperature, it is necessary to perturb the vector field to make trajectories ergodic with respect to the canonical distribution. The most common way of achieving this is by adding stochastic and dissipative terms satisfying a fluctuation–dissipation relation (Langevin dynamics), which is rigorously ergodic [29].
However, DelSole [30] points out that additive stochastic forcing of trajectories leads to inaccurate dynamical quantities as autocorrelation functions are strongly perturbed. For smooth deterministic Hamiltonian dynamics, normalized velocity autocorrelation functions are of the form 1−cτ2, c>0, in the zero-lag limit τ→0, whereas the autocorrelation of a variable that is directly forced by white noise must take the form 1−κ|τ|, κ>0, in the same limit. This implies that direct stochastic perturbation leads to autocorrelation functions that have non-zero slope and opposite curvature at zero lag [31].
An alternative approach, due to Nosé [14,15] and Hoover [32], augments the phase space with an additional thermostat variable ξ(t) (driven by a differential equation) in order to ensure that the extended dynamics on Rd×R preserves an equilibrium density whose marginal on Rd is the target (e.g. Gibbs) density. The deterministic thermostats of Nosé and Hoover are not ergodic, but they can be combined with stochastic forcing of the thermostat variable ξ, leading to the so-called Nosé–Hoover–Langevin method [33]. For generic divergence-free dynamical systems, the generalized Bulgac–Kusnezov (GBK) [34,35] thermostat provides the augmented system:
dy/dt = f(y) + εξg(y), dξ = ε[∇⋅g(y) − g(y)⋅∇A(y)]dt − γξdt + √(2γ)dW(t), (2.2)
where g(y) is a perturbation vector field, A(y) defines the target density ρ(y) ∝ exp(−A(y)), γ>0 and W(t) is a standard Wiener process.
The parameter ε controls the strength of the thermostat relative to that of the vector field f. This affects the rate at which the invariant measure is sampled. It has been demonstrated in Bajars et al. [36,37] and Leimkuhler et al. [31] that GBK/NHL thermostatting gives a weak perturbation of microcanonical dynamics: typical autocorrelation functions match the exact ones to leading order, i.e. have the form 1−cτ2+O(τ3), as τ→0.
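The thermostatted dynamics can be sketched on a toy system. The following is a minimal illustration, not the paper’s model: we take the divergence-free planar rotation f(y) = (−y2, y1), which conserves H(y) = |y|2/2, target the canonical density with A(y) = βH(y), and choose the constant perturbation field g = (1, 0), so that the GBK-type thermostat drift ∇⋅g − g⋅∇A reduces to −βy1. The scheme, parameter values and Euler–Maruyama discretization are all our own assumptions for the demonstration.

```python
import numpy as np

def gbk_mean_square(beta=1.0, eps=1.0, gamma=1.0,
                    dt=1e-2, nsteps=400_000, burn=50_000, seed=0):
    """Estimate <|y|^2> for the toy thermostatted oscillator described above."""
    rng = np.random.default_rng(seed)
    y1, y2, xi = 1.0, 0.0, 0.0
    noise = np.sqrt(2.0 * gamma * dt)
    total, count = 0.0, 0
    for n in range(nsteps):
        # unperturbed drift f(y) = (-y2, y1): divergence-free, conserves H
        dy1 = -y2 + eps * xi          # thermostat acts through g(y) = (1, 0)
        dy2 = y1
        # thermostat drift: div g - g . grad A = -beta * y1 for A = beta |y|^2 / 2
        xi += dt * (-eps * beta * y1 - gamma * xi) + noise * rng.standard_normal()
        y1 += dt * dy1
        y2 += dt * dy2
        if n >= burn:
            total += y1 * y1 + y2 * y2
            count += 1
    return total / count

mean_sq = gbk_mean_square()   # canonical target value is <|y|^2> = 2 / beta
```

A long trajectory gives ⟨|y|2⟩ close to the canonical value 2/β, illustrating that the thermostat corrects the sampled distribution while only weakly perturbing the underlying rotation.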
Designing a thermostat relies on knowing the functional form of the target distribution, which may not always be available. In this article, we relax this requirement by constructing the target distribution iteratively and adaptively, based on Jaynes’ principle of least-biased ensemble prediction.
3. Bias correction method
Suppose that we are given a simplified dynamical model for the evolution of some projection, i.e. a ‘coarse graining’ of the phase variables (coarse-grained variables y(t)∈Rd) which is in the form (2.1). Although the original system is complex and its details unknown, we assume that we can obtain in some way (e.g. through measurement) a collection of observations of mean values of functions of the reduced variables. That is, there are functions Ck(y), k=1,2,…,K, and given values ck, k=1,2,…,K, such that
⟨Ck⟩ρ = ∫Ck(y)ρ(y)dy = ck, k=1,2,…,K, (3.1)
where ρ denotes the (unknown) probability density of the full process.
(a) Entropy maximization
Empirical information theory generalizes the principle of insufficient reason by proposing the least-biased probability density consistent with a set of observations. This idea was first proposed by Jaynes [18,19] and is treated in the monographs of Haken [26], Dewar et al. [38] and in the context of geophysical fluids by Majda & Wang [39]. The least-biased density is defined as the probability density ρ(y) that maximizes the information entropy functional S(ρ) = −∫ρ(y)lnρ(y)dy subject to a set of constraints given by observations. When the phase space is a compact set and there are no observations, the maximizer is the uniform density. The entropy is the unique measure of uncertainty that is positive valued, monotonically increasing as a function of uncertainty, and additive for independent random variables. With observable functions {Ck(y) | k=1,2,…,K}, the constraints take the form ∫Ck(y)ρ(y)dy = ck, k=1,2,…,K.
In some cases, besides the observations, we may be given prior statistical information on the process y(t). The Kullback–Leibler divergence, or relative entropy, D(ρ∥π) = ∫ρ(y)ln[ρ(y)/π(y)]dy, measures the deviation of a density ρ from a prior density π.
Suppose y is a random variable with distribution (law) y∼ρ, where ρ is unknown. Suppose further that we are given a prior distribution π, presumed to be close to ρ, and a set of K observations (3.1). Following Jaynes [18,19], the least-biased distribution ρ consistent with the observations ck and prior π solves the constrained minimization problem of minimizing D(ρ∥π) subject to (3.1), with solution of the form
ρ(y) ∝ π(y)exp(−∑k λkCk(y)), (3.3)
where the λk are Lagrange multipliers, one for each observation.
Calculation of the Lagrange multipliers is discussed in [25–27]. We use the following algorithm based on re-weighting. Assume we are given a sequence of samples yn, n=1,…,N, distributed according to a known prior distribution π(y), i.e. yn∼π. The expectation under π(y) of a function Φ(y) has the consistent and unbiased estimator
Êπ[Φ] = (1/N)∑n=1..N Φ(yn).
Given the posterior distribution ρ(y) of the form (3.3), compute the expectation by re-weighting of the integral
Eρ[Φ] = ∫Φ(y)ρ(y)dy = Eπ[wΦ]/Eπ[w], w(y) = exp(−∑k λkCk(y)),
leading to the self-normalized estimator Êρ[Φ] = ∑n wnΦ(yn)/∑n wn with weights wn = w(yn).
We wish to ensure that the estimators for observables Ck match their empirical values ck, i.e. that the residuals
rk(λ) = Êρ[Ck] − ck, k=1,2,…,K,
vanish. The multipliers λ = (λ1,…,λK) solving r(λ) = 0 are computed by Newton’s method.
The Jacobian matrix J=(Jkj) of the vector function r is determined as
Jkj = ∂rk/∂λj = −(Êρ[CkCj] − Êρ[Ck]Êρ[Cj]),
i.e. the negative of the estimated covariance matrix of the observables under ρ.
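As a concrete illustration of the residual/Jacobian iteration, the following toy example (our own, not from the paper) fits a single multiplier by self-normalized re-weighting: the prior π is a standard normal, the observable is C(y) = y2 and the prescribed value is c = 1.5. The reweighted density is then normal with variance 1/(1+2λ), so the exact multiplier is λ = −1/6. The step damping is an assumption added to keep the importance weights well behaved.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.standard_normal(200_000)          # samples y_n ~ pi (standard normal prior)
C = y ** 2                                # observable values C(y_n) = y_n^2
c = 1.5                                   # prescribed observation <C>_rho

lam = 0.0                                 # Lagrange multiplier, rho ∝ pi exp(-lam C)
for _ in range(100):
    w = np.exp(-lam * C)
    w /= w.sum()                          # self-normalized importance weights
    mC = w @ C                            # reweighted estimate of <C>_rho
    r = mC - c                            # residual r(lam)
    Jac = -((w @ C**2) - mC**2)           # dr/dlam = -Var_rho(C)
    step = float(np.clip(r / Jac, -0.1, 0.1))   # damped Newton step
    lam -= step
    if abs(step) < 1e-12:
        break
# exact answer: 1/(1 + 2*lam) = 1.5  =>  lam = -1/6
```

The iteration converges in a handful of steps; for constraints far from the prior the weights degenerate, which motivates the adaptive scheme of §4.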
4. Adaptive determination of Lagrange multipliers
When the observations deviate far from the prior, the Lagrange multipliers take larger values and the weights of the samples diverge, i.e. some weights become very large and others become very small. This results in larger variance in the estimates and consequently in poorer results for the Lagrange multipliers. Moreover, the approach of the previous section excludes applications where (i) the statistical knowledge is expected to improve as the simulation progresses, (ii) the average observables are known to vary slowly with time or (iii) it is infeasible to construct a large enough ensemble distributed in the prior. For these cases we consider using the simulation data of a small ensemble (propagated in short bursts of M timesteps) for updating the Lagrange multipliers for mean observation data. This results in an adaptive algorithm for obtaining the Lagrange multipliers on-the-fly during simulation. In this situation, the sample set used is drawn from a distribution closer to the target, resulting in better estimates of the residuals and Jacobian.
Let us assume that we have a (moderately sized) ensemble of P initial conditions drawn from some initial distribution.
Let us consider two limiting cases to justify the approach. (i) By increasing the number of ensemble members P, one recovers the non-adaptive scheme, where the Lagrange multipliers are found based on the initial ensemble after a single burst. On the other hand, (ii) if we have a modest ensemble P, it is necessary that the burst is long enough for the thermostatted dynamics with the adaptive parameters to equilibrate on this time scale. The error due to the transient of the thermostat is removed in the limit of many timesteps per burst, i.e. as M→∞. We present numerical verification of the practical use of the scheme in a specific application setting in §6c.
(a) Adaptive algorithm
There are two important modifications to the algorithm described above that are included in the numerical implementation of this method:
— If the Lagrange multipliers change rapidly, the thermostat may require a long time to equilibrate. This requires a larger value for M, increasing the simulation time required before including new samples. The effect is notable at the beginning of a simulation, due to two factors: (i) the small sample size leads to inaccurate expectations for the observables, and (ii) the initial values for Lagrange multipliers may be far from their equilibrium. By limiting the rate of change of the Lagrange multipliers, these problems are circumvented.
— In equations (4.4) and (4.5), all previous values are included. In a long simulation, this leads to a growing computational demand. By taking only a fixed number (q) of recent steps, the cost is reduced. In the case that the initial samples cannot be drawn accurately from the prior, this has the further advantage that those inaccuracies are eventually forgotten.
The resulting method is summarized in algorithm 1.
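The flavour of algorithm 1 can be conveyed with a schematic stand-in (entirely hypothetical, not the paper’s code): we replace each thermostatted burst by samples whose ensemble mean responds to the current multiplier as ⟨C⟩ = −λ, and retain the two modifications above, namely a limited rate of change for the multiplier and a fixed window of q recent bursts.

```python
import numpy as np

rng = np.random.default_rng(2)
c = 1.2                      # prescribed observation for the single observable
lam = 0.0                    # Lagrange multiplier, adapted on-the-fly
q = 10                       # window: keep only the q most recent bursts
gain, max_step = 0.2, 0.1    # damping and rate limit for the multiplier update
history = []

for burst in range(600):
    # stand-in for one thermostatted burst equilibrated to the current target:
    # the burst mean of the observable responds as <C> = -lam (plus noise)
    samples = -lam + 0.3 * rng.standard_normal(50)
    history.append(samples.mean())
    history = history[-q:]                        # forget old bursts
    r = np.mean(history) - c                      # residual <C> - c
    # damped quasi-Newton update; dr/dlam = -1 for this stand-in dynamics
    lam += float(np.clip(gain * r, -max_step, max_step))
# fixed point: <C> = -lam = c, i.e. lam = -c
```

The rate limit prevents large early steps when the windowed average is still inaccurate, and the finite window lets the scheme track a slowly varying target, at the cost of a short lag.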
5. Application to reduced modelling of point vortices
Here we apply the least-biased correction method to a system of point vortices on the sphere, which has a Poisson structure and multiple conserved quantities, including the total energy, the angular momentum and a set of Casimirs.
(a) Point vortex system
Point vortices represent singular solutions to two-dimensional, incompressible fluid flow in which each vortex, positive or negative, is centred at a specified position. Point vortices have been studied extensively in [41–43]. We work on a spherical domain [44–48], viewing the i-th vortex position xi as a vector in R3 with unit norm. The equations of motion have a Lie–Poisson structure:
dxi/dt = (1/4π)∑j≠i Γj(xj × xi)/(1 − xi⋅xj), i=1,…,M, (5.1)
where Γj denotes the circulation (strength) of vortex j.
By introducing the concatenated state vector y = (x1,…,xM) ∈ R3M, equation (5.1) has the form (2.1) with a block-diagonal structure matrix whose i-th 3×3 block is proportional to the cross-product matrix of xi, scaled by Γi−1.
The vortex positions are defined in Cartesian coordinates, but initial positions xi(0) are chosen on the sphere. Because each |xi| is a Casimir of the Poisson bracket, the vortices remain on the sphere. Furthermore, the rotational symmetry of the sphere gives rise to three Noether momenta, which are expressed by the angular momentum vector
J = ∑i Γixi.
The GBK thermostat (2.2) is only applicable to divergence-free systems, ∇⋅f≡0. It is straightforward to check that this condition holds for the spherical point vortex model.
A numerical integrator for the point vortex system is constructed as in Myerscough & Frank [48] based on splitting the differential equations into integrable subproblems (see related ideas in [53,54]). The backward error analysis of symplectic integrators has an analogous development for Poisson systems, implying approximate conservation of the Hamiltonian [8]. The Casimirs (|xi|≡1) and the momentum are exactly preserved by the integrator.
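The conservation structure discussed above can be checked numerically. The sketch below assumes the standard spherical point-vortex velocity field dxi/dt = (1/4π)∑j≠i Γj(xj × xi)/(1 − xi⋅xj); this particular formula is our reading of the model, not code from the paper. With this form, each velocity is orthogonal to xi (so the spheres |xi| = 1 are invariant) and the contributions to dJ/dt cancel in pairs.

```python
import numpy as np

def vortex_rhs(x, gamma):
    """Velocity of each vortex under the assumed spherical point-vortex model."""
    M = len(gamma)
    f = np.zeros_like(x)
    for i in range(M):
        for j in range(M):
            if i != j:
                f[i] += gamma[j] * np.cross(x[j], x[i]) / (1.0 - x[i] @ x[j])
    return f / (4.0 * np.pi)

rng = np.random.default_rng(3)
M = 8
gamma = np.array([1.0] * 4 + [-1.0] * 4)            # equal numbers of +/- vortices
x = rng.standard_normal((M, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)       # place vortices on the unit sphere

f = vortex_rhs(x, gamma)
radial = np.abs(np.einsum('ij,ij->i', x, f)).max()  # x_i . dx_i/dt: sphere invariance
dJ = np.linalg.norm((gamma[:, None] * f).sum(axis=0))  # d/dt sum_i Gamma_i x_i
```

Both `radial` and `dJ` vanish to rounding error: the cross product makes each velocity tangent to the sphere, and the (i,j) and (j,i) terms in dJ/dt are equal and opposite.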
(b) Thermostat perturbation vector
The choice of the perturbation vector field g in (2.2) is flexible, but the vector fields f and g should satisfy a Hörmander condition [36]. It is vital that the Casimirs be respected by the perturbation vector field. Double-bracket dissipation [55] preserves Casimirs of the original system and is thus an ideal candidate for g:
gi(y) = xi × (xi × ∇xiH), i=1,…,M, (5.4)
which is tangent to the sphere |xi| = 1 for each vortex.
The thermostat (2.2) is designed to sample a target density ρ(y)∝exp(−A(y)) on the phase space of y. The thermostat variable ξ is normally distributed, yielding the extended distribution ρ̂(y,ξ) ∝ exp(−A(y))exp(−ξ2/2). The perturbation vector field g must additionally ensure that the thermostatted system is ergodic in the target density. Because the target measure is positive for all open sets on the phase space, hypoellipticity of the Fokker–Planck equation associated with (2.2) is sufficient to prove uniqueness of the invariant measure [56]. Hypoellipticity follows from Hörmander’s controllability condition [57]. The condition has been tailored to GBK thermostats in Bajars et al. [36], but it is difficult to verify in practice. Here we verify using simulation that single trajectories have statistics that agree with the target distribution.
(c) Maximum entropy model
To apply the methods proposed in §3 in the setting of a reduced model for point vortices, we use point vortices distributed evenly over the surface of the sphere as the prior π. The use of a thermostat for the bias correction limits us to using observables that are functions of the first integrals. Furthermore, we assume the angular momentum has no directional preference, a choice we motivate in the numerical comparisons of §6. In that case, only the magnitude |J| of the momentum vector needs to be considered.
To accurately reproduce statistics from a full model with a moderate number of point vortices, it is necessary to modify the canonical density with a term quadratic in the Hamiltonian, that is, a density of the form ρ(y) ∝ exp(−βH(y) − γH(y)2) [16]. Motivated by Dubinkina et al. [16], we choose observations that include linear and quadratic functions of H and |J|2.
We consider the following set of observables:
C(y) = (H, |J|2, H2, |J|4, H|J|2), (5.5)
i.e. linear and quadratic functions of the energy and of the squared momentum magnitude.
The least-biased density consistent with observations of these quantities is given by
ρ(y) ∝ exp(−βHH − βJ|J|2 − γHH2 − γJ|J|4 − γHJH|J|2). (5.6)
6. Numerical comparison
Here we apply algorithm 1 to a reduced model of point vortices similar to the configuration used in Bühler [58] and Dubinkina et al. [16]. We distinguish between three models. The full model consists of a system (5.1) of Mfull=288 point vortices, of which eight are strong vortices of circulation Γj=±1 and 280 are weak vortices of smaller circulation. Both strong and weak classes comprise equal numbers of positively and negatively oriented point vortices. The reduced model consists of (5.1) with just M=8 strong vortices. Finally, the corrected model consists of a thermostatted system (2.2) with unperturbed vector field f given by (5.1) for M=8 strong vortices, perturbation vector field g given by (5.4), and equilibrium measure defined by the least-biased density (5.6). Additionally, we compare with Metropolis–Hastings samples from the least-biased density (5.6) to help distinguish between errors incurred due to the maximum-entropy model and those due to sampling bias of the (discretized) thermostatted dynamics.
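The Metropolis–Hastings comparison can be sketched as follows. This is an illustrative reconstruction, not the paper’s code: the interaction Hamiltonian H = −(1/4π)∑i<j ΓiΓj ln(2(1 − xi⋅xj)) is an assumed standard form, the perturb-and-project proposal is our choice, and the multipliers are taken from the Hfull=0 row of table 2.

```python
import numpy as np

def hamiltonian(x, gam):
    """Assumed pairwise interaction energy of point vortices on the sphere."""
    H = 0.0
    for i in range(len(gam)):
        for j in range(i + 1, len(gam)):
            H -= gam[i] * gam[j] * np.log(2.0 * (1.0 - x[i] @ x[j])) / (4.0 * np.pi)
    return H

def neg_log_density(x, gam, bH, bJ, gH, gJ, gHJ):
    """A(y) for the least-biased density rho ∝ exp(-A)."""
    H = hamiltonian(x, gam)
    J2 = np.linalg.norm((gam[:, None] * x).sum(axis=0)) ** 2
    return bH * H + bJ * J2 + gH * H**2 + gJ * J2**2 + gHJ * H * J2

rng = np.random.default_rng(4)
M = 8
gam = np.array([1.0] * 4 + [-1.0] * 4)
x = rng.standard_normal((M, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)
params = (-0.76, 0.20, 3.38, 9.97e-3, -0.37)    # multipliers, Hfull=0 row of table 2

A = neg_log_density(x, gam, *params)
accepted = 0
nsteps = 5000
for step in range(nsteps):
    i = rng.integers(M)
    prop = x.copy()
    prop[i] = prop[i] + 0.2 * rng.standard_normal(3)   # perturb one vortex ...
    prop[i] /= np.linalg.norm(prop[i])                 # ... and project to the sphere
    A_new = neg_log_density(prop, gam, *params)
    if rng.random() < np.exp(min(0.0, A - A_new)):     # Metropolis acceptance test
        x, A = prop, A_new
        accepted += 1
rate = accepted / nsteps
```

The perturb-and-project proposal density depends only on the angle between old and new positions and is therefore symmetric, so the plain Metropolis ratio applies.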
We run five long simulations of the full model with total energies chosen from the set Hfull∈{−2,−1,0,1,2}. The total angular momentum vector is fixed at Jfull=0 such that there is no directional preference for the momentum of the reduced model embedded in the full model. For each run, we determine the time averages of the observables (5.5) for the subset of strong vortices. When computing the Hamiltonian H of this subsystem, we include only the internal coupling between strong vortices. The time averages are tabulated in table 1.
Table 1. Time averages of the observables (5.5) for the strong vortices in the full model; values in parentheses are from the thermostat-corrected model.

| | 〈H〉 | 〈|J|2〉 | 〈H2〉 | 〈|J|4〉 | 〈H|J|2〉 |
| --- | --- | --- | --- | --- | --- |
| Hfull=−2 | −0.33 (−0.38) | 4.59 (4.45) | 0.22 (0.23) | −0.63 (−0.98) | 34.58 (31.45) |
| Hfull=−1 | −0.11 (−0.18) | 4.78 (4.68) | 0.10 (0.12) | 0.38 (−0.01) | 37.55 (35.88) |
| Hfull=0 | 0.02 (−0.04) | 4.63 (4.56) | 0.08 (0.08) | 0.90 (0.60) | 35.26 (34.30) |
| Hfull=1 | 0.17 (0.15) | 4.74 (4.75) | 0.13 (0.12) | 1.73 (1.61) | 37.76 (37.44) |
| Hfull=2 | 0.31 (0.28) | 4.87 (5.00) | 0.22 (0.21) | 2.49 (2.46) | 39.26 (41.74) |
Using these averages, we compute the Lagrange multipliers using the algorithm described in §4 with prior distribution π taken to be uniform on the sphere. The Lagrange multipliers are recorded in table 2. The magnitudes of the multipliers γH, γJ and γHJ indicate that including these observables affects the resulting density. In other words, all observables add information to the least-biased density, with the possible exception of the lowest-energy case Hfull=−2.
Table 2. Lagrange multipliers of the least-biased density (5.6) computed from the full-model averages of table 1.

| | βH | βJ | γH | γJ | γHJ |
| --- | --- | --- | --- | --- | --- |
| Hfull=−2 | 5.98 | −0.20 | 0.69 | 0.41×10−3 | −0.04 |
| Hfull=−1 | 2.89 | −0.03 | 2.67 | 9.77×10−3 | −0.33 |
| Hfull=0 | −0.76 | 0.20 | 3.38 | 9.97×10−3 | −0.37 |
| Hfull=1 | −3.54 | 0.37 | 4.29 | 15.31×10−3 | −0.54 |
| Hfull=2 | −6.42 | 0.53 | 4.45 | 14.05×10−3 | −0.51 |
We then run simulations of the corrected model using the computed parameters. Table 1 also records expectations from the thermostat-corrected model.
By analogy with canonical statistical mechanics, we may think of the weak vortices that are ignored in the reduced model as forming a reservoir with which our reduced model exchanges energy and angular momentum. Experience with canonical statistical mechanics of point vortices in the plane [16,58] suggests that for small reservoir sizes, the canonical distribution must be modified with higher order terms to agree with the full system statistics. Table 3 gives the multipliers for different numbers Mfull of vortices in the full system, confirming that those corresponding to γH, γJ and γHJ are more significant for smaller Mfull; as the number of weak vortices grows, the strong vortex system approaches the canonical density.
Table 3. Lagrange multipliers for varying numbers Mfull of vortices in the full system (the Mfull=288 row coincides with the Hfull=0 case of table 2).

| | βH | βJ | γH | γJ | γHJ |
| --- | --- | --- | --- | --- | --- |
| Mfull=36 | −1.51 | 1.35 | 27.75 | 117.15×10−3 | −3.07 |
| Mfull=72 | −4.27 | 0.82 | 8.71 | 37.79×10−3 | −1.12 |
| Mfull=144 | −0.97 | 0.32 | 6.70 | 25.80×10−3 | −0.82 |
| Mfull=288 | −0.76 | 0.20 | 3.38 | 9.97×10−3 | −0.37 |
| Mfull=576 | −1.09 | 0.13 | 0.87 | 3.08×10−3 | −0.10 |
The form of the distribution (5.6) implies a non-zero probability for any configuration of the strong vortex system with finite energy and momentum. When the strong vortex system is in contact with a reservoir of weak vortices, its energy is bounded by the total energy plus or minus the energy that can be stored in the reservoir and in the interactions between strong and weak vortices. A similar condition holds for the momentum, but without interaction terms. If the reservoir cannot supply or absorb sufficient energy, the subsystem of strong vortices cannot attain all configurations with non-zero probability. For the energy, it is readily verified that as long as the reservoir contains three vortices not all of equal sign, the reservoir may supply or remove an arbitrary amount of energy, as close approaches of two like-signed (respectively, opposite-signed) vortices produce arbitrarily large positive (respectively, negative) energies. For the momentum, the result is more interesting. The system of M strong vortices, all with strength ±Γstrong, may exist in configurations with angular momentum |J| < MΓstrong. Note the supremum, as a configuration with exactly |J| = MΓstrong requires coincident like-signed vortices and hence has infinite energy. For the reservoir, it holds analogously that |Jres| < MresΓweak, where Mres is the number of weak vortices. For the reservoir to supply sufficient angular momentum, it is necessary that
MresΓweak ≥ MΓstrong.
Remark 6.1
(a) Equilibrium results
In this section, we compare statistical properties of the corrected model with those of the full and reduced models. In figures 1–3, we show histograms of a number of solution features for the eight vortex model: the distributions of H and |J|, as well as typical distances between like- and opposite-signed vortices, a metric also used by Bühler [58]. In each histogram, the statistics corresponding to the strong vortices in the full model, the reduced model, thermostat-corrected reduced model and Metropolis–Hastings samples are displayed. Figures 1–3 correspond to approximate total energies Hfull≈−2, 0 and 2, respectively.
The full and reduced model simulations are performed with a timestep of 5×10−3 and run up to T=5×104, taking 105 samples spaced evenly in time. For the Metropolis–Hastings method, we use 106 samples. The same figures also show results from the thermostatted system (dashed-dotted lines), run with a timestep of 10−3 up to T=106, taking 106 samples. The parameters in (2.2) were set to ε=10 and γ=0.1. These results confirm that the thermostatted system samples the least-biased density closely. The complete simulation trajectories are available in the electronic supplementary material.
The reduced model is Hamiltonian, and the Poisson integrator ensures that the energy is conserved with a standard deviation of order 10−3 and the angular momentum is constant to machine precision. Both cases correspond to approximate delta-distributions in the upper histograms in figures 1–3. Note that due to the high skewness of the distribution for |J|, the observed mean differs significantly from the median and mode, implying some ambiguity in choosing the angular momentum for an appropriate initial condition for the reduced model.
A simple Hamiltonian reduced model is incapable of sampling the energy and angular momentum spectra, as these are first integrals; hence the reduced model shows significant bias in statistics such as vortex separation. On the other hand, the thermostat-corrected model faithfully samples the least-biased probability density, as indicated by the good agreement between the histograms of the corrected model and the Metropolis–Hastings samples. The least-biased density approximates the strong-vortex statistics well, particularly in the negative to moderate total energy regime. At large positive total energies, the strong vortex energy and angular momentum distributions are still well represented by the least-biased pdf, but some bias in the vortex separations can be observed. The closeness of the thermostat results to those from the Metropolis–Hastings sampling indicates that the error lies in the choice of least-biased density, not in the thermostat sampling.
(b) Dynamic consistency
The results in the previous section confirm that the thermostatted simulations lead to equilibrium distributions of observables H and |J| similar to those of the full system. In this section, we address the issue of the degree to which our equilibrium correction mechanism disturbs dynamics, as encoded in autocorrelation functions and diffusivity. Diffusivity was considered by Chavanis [59] for a system of identical point vortices and by Cotter & Pavliotis [60] for a wide array of problems with scale separation. We emphasize that the values of the thermostat parameters ε and γ have no impact on the equilibrium statistics presented in the previous section, and only affect the rate at which the least-biased pdf is sampled. Parameter values that perturb the dynamics more strongly lead to faster equilibration of the distribution, presenting a modelling trade-off.
(i) Autocorrelation functions
Given a sequence of L equally spaced observation times ti∈[0,T], i=1,…,L, and the values ui=u(ti) of the relevant observable (in our case vortex position) at those times, the discrete autocorrelation function is defined by
C(jΔt) = ∑i=1..L−j (ui−ū)(ui+j−ū) / ∑i=1..L (ui−ū)2, ū = L−1∑i ui.
We average the autocorrelation function over all 3M (strong) vortex coordinates {xm,ym,zm}m=1…M. Three symmetries in the problem justify this averaging: the vortex numbering is arbitrary; the choice of reference frame is arbitrary; and the sign of the vortices enters the dynamics as a reversal of time, to which the autocorrelation is insensitive. Additionally, the observables H and |J| are isotropic.
Furthermore, we ensure that the phase space is well sampled by averaging the autocorrelation functions over an ensemble of P solutions. We then find the average autocorrelation function as the mean, over ensemble members and coordinates, of the per-trajectory estimates.
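A per-series estimator with ensemble averaging might look as follows; the AR(1) test signal and all constants are our own stand-ins for the vortex coordinate series.

```python
import numpy as np

def autocorrelation(u, max_lag):
    """Discrete autocorrelation of a 1-d series, normalized so C(0) = 1."""
    u = np.asarray(u, dtype=float)
    u = u - u.mean()
    var = u @ u
    return np.array([(u[:len(u) - j] @ u[j:]) / var for j in range(max_lag)])

rng = np.random.default_rng(5)
P, L, phi = 64, 4000, 0.95        # ensemble size, series length, AR(1) coefficient
series = np.empty((P, L))
for p in range(P):
    x = 0.0
    for i in range(L):            # AR(1) test signal with known C(j) = phi**j
        x = phi * x + rng.standard_normal()
        series[p, i] = x

# ensemble-averaged autocorrelation, as in the averaging over 3M coordinates
C = np.mean([autocorrelation(s, 50) for s in series], axis=0)
```

For the AR(1) signal the ensemble average recovers the known geometric decay, with the ensemble averaging suppressing the noise of any single-series estimate.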
In figure 4, we compare autocorrelation functions for the strong vortices in the full and reduced models as well as for the thermostat-corrected model over a range of parameters ε and γ. The thick solid black line represents the result for an (unthermostatted) system in contact with 280 weak vortices, with a total energy Hfull=0. The results present the average over an ensemble of 1000 runs. For each simulation, the initial placement of each strong vortex was taken uniformly over the sphere and the weak vortices were placed such that the full system satisfied Hfull=0 and Jfull=0. The thick dashed black line represents the results for an ensemble of simulations of the isolated system, with everything else unchanged. The other lines represent results for thermostatted simulations using the Lagrange multipliers as given in table 2 for the case of H=0. As expected, when the thermostat is weak, the autocorrelation functions are similar to those of the isolated system. The stronger thermostat parameters presented here match the autocorrelation well for a short time period, but all result in excessive decorrelation for longer lag times. In the following section on the diffusivity constant, we discuss the performance of different parameter values in more detail.
(ii) Diffusivity
For general multiscale dynamical systems with a separation of slow and fast dynamics, it is often desirable to model the fast forces by a diffusion process, resulting in a stochastic differential equation of the form [61,62]
dY = a(Y)dt + √(2D)dW(t),
where D is the diffusivity.
If we take the average diffusivity for each vortex coordinate, we find the estimator
D̂(Δτ) = (1/3M)∑m=1..3M ⟨(um(t+Δτ) − um(t))2⟩/(2Δτ),
where the sum runs over the 3M Cartesian vortex coordinates and Δτ is the sampling interval.
Figure 5 presents the diffusivity constant as a function of the sampling interval Δτ for different values of the parameters ε and γ. The results are taken from the same simulations used for the autocorrelation functions, as described in the previous section.
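The mean-square-displacement estimator of the diffusivity can be sketched as follows; this is a toy check on exact Brownian motion with a known diffusivity, not the vortex data.

```python
import numpy as np

def diffusivity(u, dt, interval):
    """Estimate D from the mean-square displacement over a sampling interval."""
    k = int(round(interval / dt))
    du = u[k:] - u[:-k]                  # displacements over the interval
    return np.mean(du ** 2) / (2.0 * interval)

rng = np.random.default_rng(6)
dt, n, D_true = 1e-3, 200_000, 0.7
# exact Brownian path with dW increments of variance 2 * D_true * dt
w = np.concatenate([[0.0],
                    np.cumsum(np.sqrt(2.0 * D_true * dt)
                              * rng.standard_normal(n))])
est = diffusivity(w, dt, 0.1)
```

For a true diffusion the estimate is independent of the sampling interval; for thermostatted trajectories it instead varies with Δτ, which is exactly the behaviour figure 5 examines.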
For ε small, the thermostat perturbation is weak, and both autocorrelation functions and diffusivity approach those of the reduced model with constant H, J. Also, the autocorrelations are insensitive to the parameter γ in this regime. For values of ε<1, the autocorrelations and diffusivities become indistinguishable from those of the isolated model. Hence even though the dynamics samples the least-biased density on long time scales, its short-time dynamics is similar to an unperturbed model. For moderate ε, dependence on γ becomes more pronounced, and a diffusivity closer to that of the full model can be achieved. For even larger values of ε, the diffusivity becomes much more sensitive to the value of γ, as indicated in figure 5c.
Figure 5d illustrates two important properties by plotting the diffusion parameter against the sampling interval on a log–log scale. First, for large sampling intervals the estimator shows an inverse linear tendency; this corresponds simply to the decorrelation of the vortex dynamics. Second, as the sampling interval goes to zero, the estimator of the diffusivity constant shows polynomial behaviour. This is in agreement with known results for the GBK thermostat [63] and is an improvement on Langevin thermostats, which tend to a constant value for short sampling intervals, in disagreement with a deterministic reference [30]. See also the discussion in §2.
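As a concrete illustration of the estimator behind figure 5, the diffusivity at sampling interval Δτ can be computed from the mean-squared displacement of the trajectory. A minimal sketch follows; the exact estimator used in the paper may differ, e.g. in how the vortex coordinates on the sphere are handled.

```python
import numpy as np

def diffusivity_estimate(traj, dt, interval):
    """Estimate D(dtau) = <|x(t+dtau) - x(t)|^2> / (2 d dtau)
    from one trajectory sampled with spacing dt.

    traj     : array of shape (n_steps, d) of coordinates
    interval : sampling interval dtau, assumed a multiple of dt
    """
    stride = int(round(interval / dt))
    disp = traj[stride:] - traj[:-stride]      # displacements over dtau
    d = traj.shape[1]
    return np.mean(np.sum(disp**2, axis=1)) / (2.0 * d * interval)
```

Scanning `interval` over several decades and plotting the result on a log–log scale reproduces the kind of diagnostic shown in figure 5d: ballistic (deterministic) motion gives a diffusivity growing linearly in Δτ, while decorrelated motion gives the inverse linear tail.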
(c) Adaptive determination of multipliers
In this section, we apply the scheme of §4 to the point vortex system. Consider the same reduced point vortex model of eight vortices with Γ=±1. We perform a thermostatted simulation of this system, fed with observation data taken from a simulation of the 288-vortex system including the weak vortices that form the thermal bath. The observations from the thermal bath simulation may be replaced by observations from another source, if available, such as empirical measurements or a predicted trend for the observables. In particular, we use the observed data in three different ways for computing the Lagrange multipliers that determine the equilibrium distribution used by the thermostat:
(i) The first approach mirrors the non-adaptive method of §3. We use the long time averages from the equilibrated thermal bath simulation. The equilibrium distribution varies during the thermostatted simulation, but over the course of the simulation it converges to the same equilibrium density as before (like those given in table 2).
(ii) The second approach does not assume that long time averages are available a priori. Instead, the running mean from the thermal bath simulation is used. This means that the iterative procedure (4.4) drives the system towards a target that varies during the simulation. Eventually, the target approaches the long time average used above.
(iii) In the final implementation, we relax the notion of statistical equilibrium for the strong system and consider the statistical state to be slowly varying. This is implemented by using time-local averages of the observations from the thermal bath simulations. As a result, there is no stationary equilibrium density.
For testing the adaptive scheme, we only consider the thermal bath simulations where Hfull=0. Also, we consider just two observables C1=H and C2=|J|2, associated with the Lagrange multipliers βH and βJ. Including more observables is possible, but the Newton–Raphson iteration is sensitive to noise in the estimators, requiring a larger ensemble. We initialize with an ensemble of P=100 initial conditions drawn from the uniform prior, applying the algorithm of §4a. The timestep in the thermostatted simulation is chosen as 1×10−2, and the method described in §4 for updating the Lagrange multipliers is applied every time unit, i.e. M=100. Between subsequent updates of the multipliers, the maximum difference is limited by |Δλk|≤0.1. When using equilibrium statistics, this limit only affects the beginning of the simulation, when the small sample size leads to a large variance in the estimators.
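The clipped multiplier update |Δλk|≤0.1 described above can be sketched as follows. Here `newton_step` stands in for the proposed change computed by the Newton–Raphson iteration of §4, which is not reproduced; only the componentwise limiter is shown, and the interface is our assumption.

```python
import numpy as np

def update_multipliers(lam, newton_step, max_change=0.1):
    """Apply one clipped update to the Lagrange multipliers.

    lam         : current multipliers, e.g. np.array([beta_H, beta_J])
    newton_step : proposed change from the Newton-Raphson iteration
    max_change  : componentwise limit |dlambda_k| <= max_change
    Returns the new multipliers and a per-component flag indicating
    whether the limiter was active.
    """
    step = np.asarray(newton_step, dtype=float)
    limited = np.abs(step) > max_change        # which components were clipped
    clipped = np.clip(step, -max_change, max_change)
    return lam + clipped, limited
```

The returned flag corresponds to the indicator dots in the figures below, marking the time units during which the rate of change was limited.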
We present the results for the three different uses of the observed data. The complete simulation trajectories are available in the electronic supplementary material.
(i) In figure 6, the long time mean is taken and used throughout.
(ii) In figure 7, the running mean is used. This reflects the situation where we have no a priori knowledge of the observations, and are continuously feeding new real-time data into the simulation.
(iii) In figure 8, a time-localized average of the observable is used. The averaging has a time-scale of 100 time units.
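The three ways of forming the target from the observed data can be summarized in a small helper. This is a sketch under assumptions: the time-local kernel here is a simple sliding mean, whereas the paper specifies only the 100-time-unit averaging scale.

```python
import numpy as np

def target_observations(obs, mode, window=100):
    """Build the target time series for the thermostat from observed data.

    obs    : array of shape (n_steps,), one observable from the bath run
    mode   : 'long_time' | 'running' | 'time_local'
    window : averaging time-scale (in samples) for the time-local mode
    """
    n = len(obs)
    if mode == 'long_time':                    # case (i): constant target
        return np.full(n, obs.mean())
    if mode == 'running':                      # case (ii): running mean
        return np.cumsum(obs) / np.arange(1, n + 1)
    if mode == 'time_local':                   # case (iii): sliding mean
        kernel = np.ones(window) / window
        return np.convolve(obs, kernel, mode='same')
    raise ValueError(mode)
```

In the running-mean mode, new data can be folded in incrementally during the simulation, matching the real-time feeding scenario of case (ii).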
In each of the figures, panel (a) shows the target mean for the energy in solid cyan (light), the instantaneous ensemble mean as black dots and the running mean of the ensemble in dashed red (dark). Panel (b) shows the same means for the momentum magnitude, and panel (c) shows the Lagrange multipliers for the energy (cyan, light) and momentum (black). Superimposed on the Lagrange multiplier for the energy is an indicator (red, darker dots) showing when the rate of change in the estimate for the Lagrange multiplier has been limited.
When using either a long time mean observation or a running mean observation, the simulation results tend towards the correct long-time averages. When using time-local averages the simulation averages appear to tend towards a similar value. In all three cases, the instantaneous ensemble mean remains close to the (moving) target for both energy and momentum. This is especially notable for the third case, where the target varies over time, but the simulation ensemble mean follows closely, with only a little lag.
The inaccuracies during the first approximately 100 time units indicate that the prior does not match the observed state well. This results in the (negative) growth of βH being limited briefly at the beginning of each simulation. Subsequently, both Lagrange multipliers appear to oscillate irregularly about some mean value for the first two cases. In the case of a shifting target, the Lagrange multipliers vary in time more erratically, resulting in the limiter being active for a few brief periods of the simulation.
7. Conclusion
We have proposed a thermostat-based method for perturbing trajectories of numerical simulations to correct for stationary or time-dependent thermodynamic observations. We have also described an adaptive procedure that uses the thermostatted simulation data in finding the least-biased density, removing the need to compute the density a priori.
Our numerical experiments with the vortex model confirm that the distributions of the observed energy H and angular momentum magnitude |J| can be well approximated using the thermostat technique. Other equilibrium metrics such as the distribution of distances between like- and opposite-signed vortices are also in agreement across a range of total energy values of the full system, although some discrepancies occur at large positive energies.
We also investigated the degree to which correction of trajectories for expectations may affect dynamical information in the form of autocorrelation functions and diffusivity. By decreasing the perturbation parameter ε of the thermostat, the autocorrelation functions of the unperturbed, reduced system may be precisely recovered. As ε is increased, one may increase the diffusivity to values that agree with the full system. This is consistent with the results reported by Frank & Gottwald [63] in the context of molecular dynamics, where it was shown that the GBK thermostat used here approaches Langevin dynamics in the limit of large stochastic forcing.
Data accessibility
The code for thermostatted simulation of the point vortex system can be found at https://www.dropbox.com/sh/ekhj4ohtetj35ri/AABNMpHki63cn16hpnaQ.OYa?dl=0.
Authors' contributions
The authors have contributed equally to this work.
Competing interests
We declare we have no competing interests.
Funding
The work of K.W.M. was supported by a grant financed by the Netherlands Organisation for Scientific Research (NWO), as part of research programme no. 613.001.009. B.L. was supported by grant no. EP/K039512/1, grant no. SI2-CHE: ExTASY (Extensible Tools for Advanced Sampling and analYsis) from the Engineering and Physical Sciences Research Council (UK) and a faculty fellowship from the Alan Turing Institute (ATI).
Footnotes
1 The latter condition is automatic for systems (2.1) with constant B. Strictly speaking, the approach described here is applicable to any system with divergence-free vector field ∇⋅f≡0 possessing one or more first integrals.
Appendix A. Time integration of the thermostatted system
In this appendix, we detail the method of integration for the perturbation dynamics and thermostat variable.
The unperturbed dynamics are formed by the combination of pairwise interactions:
The integration of the perturbed dynamics for a vortex pair (i,j) means integrating the system of ordinary differential equations given by
By introducing αS=(Γi+Γj)/4π, the symmetric part may be written as
The change in the distance between the two vortices can be represented by the change of their inner product
In the antisymmetric part, we introduce αA=(Γi−Γj)/4π to write
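Under the standard point vortex equations on the sphere, with ẋi = Γj (xj×xi)/(4π(1−xi·xj)) for a pair, the pair dynamics is a rigid rotation of both unit vectors about Ω = (Γi xi + Γj xj)/(4π(1−xi·xj)); this conserves the inner product xi·xj, consistent with the discussion above. The sketch below integrates one pair exactly via the Rodrigues rotation formula; the sign and normalization conventions are our assumptions and may differ from those used in the paper.

```python
import numpy as np

def rotate(x, axis, theta):
    """Rodrigues rotation of x about the unit vector `axis` by angle theta."""
    return (x * np.cos(theta)
            + np.cross(axis, x) * np.sin(theta)
            + axis * np.dot(axis, x) * (1.0 - np.cos(theta)))

def pair_step(xi, xj, gi, gj, dt):
    """Advance one vortex pair exactly over a time step dt.

    Both unit vectors rotate rigidly about
    Omega = (gi*xi + gj*xj) / (4*pi*(1 - xi.xj)),
    which conserves the inner product xi.xj.
    """
    c = np.dot(xi, xj)
    omega = (gi * xi + gj * xj) / (4.0 * np.pi * (1.0 - c))
    speed = np.linalg.norm(omega)
    if speed == 0.0:
        return xi, xj
    axis = omega / speed
    return rotate(xi, axis, speed * dt), rotate(xj, axis, speed * dt)
```

Decomposing Γi xi + Γj xj = (αS/2)(xi+xj)·4π/2-type symmetric and antisymmetric parts recovers the αS, αA splitting introduced above, with the symmetric part generating rotation about xi+xj and the antisymmetric part about xi−xj.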