A hybrid continuous-discrete method for stochastic reaction–diffusion processes

Stochastic fluctuations in reaction–diffusion processes often have substantial effect on spatial and temporal dynamics of signal transductions in complex biological systems. One popular approach for simulating these processes is to divide the system into small spatial compartments assuming that molecules react only within the same compartment and jump between adjacent compartments driven by the diffusion. While the approach is convenient in terms of its implementation, its computational cost may become prohibitive when diffusive jumps occur significantly more frequently than reactions, as in the case of rapid diffusion. Here, we present a hybrid continuous-discrete method in which diffusion is simulated using continuous approximation while reactions are based on the Gillespie algorithm. Specifically, the diffusive jumps are approximated as continuous Gaussian random vectors with time-dependent means and covariances, allowing use of a large time step, even for rapid diffusion. By considering the correlation among diffusive jumps, the approximation is accurate for the second moment of the diffusion process. In addition, a criterion is obtained for identifying the region in which such diffusion approximation is required to enable adaptive calculations for better accuracy. Applications to a linear diffusion system and two nonlinear systems of morphogens demonstrate the effectiveness and benefits of the new hybrid method.

Stochastic fluctuations in reaction-diffusion processes often have substantial effect on spatial and temporal dynamics of signal transductions in complex biological systems. One popular approach for simulating these processes is to divide the system into small spatial compartments assuming that molecules react only within the same compartment and jump between adjacent compartments driven by the diffusion. While the approach is convenient in terms of its implementation, its computational cost may become prohibitive when diffusive jumps occur significantly more frequently than reactions, as in the case of rapid diffusion. Here, we present a hybrid continuous-discrete method in which diffusion is simulated using continuous approximation while reactions are based on the Gillespie algorithm. Specifically, the diffusive jumps are approximated as continuous Gaussian random vectors with time-dependent means and covariances, allowing use of a large time step, even for rapid diffusion. By considering the correlation among diffusive jumps, the approximation is accurate for the second moment of the diffusion process. In addition, a criterion is obtained for identifying the region in which such diffusion approximation is required to enable adaptive calculations for better accuracy. Applications to a linear diffusion system and two nonlinear systems of morphogens demonstrate the effectiveness and benefits of the new hybrid method.

Introduction
Many biological systems are subject to stochastic fluctuations when the copy number of the molecules is relatively small [1,2]. For spatially inhomogeneous systems that involve both reactions and diffusions, the spatial distributions of the molecules are important because local copy number fluctuations may result in phenotypic differences, though the total number of molecules of the relevant species may be high [3].
For a spatially homogeneous system, the Gillespie stochastic simulation algorithm (SSA) tracks each reaction event and updates the state of the system after each occurrence of reactions [4]. Numerous methods have been developed to improve the efficiency of this method for cases involving large sizes for certain species or frequent reactions; examples of such methods include the next reaction method [5], τ -leaping [6][7][8], the hybrid methods for simulating fast and slow reactions [9][10][11] and the adaptive multi-level simulation algorithm [12].
For a spatially inhomogeneous system, the SSA can be applied by first partitioning the spatial domain into many compartments. In each compartment, reactions are treated as in the homogeneous case; however, molecules may jump between adjacent compartments through diffusion [13]. In this approach, the size of each compartment must be sufficiently small that diffusive jumps occur more rapidly than reactions and the inhomogeneity inside each compartment can be ignored [14,15]. Also, the time scale for the molecule to diffuse throughout a compartment should be much faster than the time scale for the fastest bimolecular chemical reaction [14,16,17].
In the case of frequent diffusive jumps, the spatial SSA may become computationally inefficient. Consequently, modifications have been made to accelerate the SSA, as in the next sub-volume method [18] and the null process [19], or to develop a new computational algorithm that optimizes the search process in the SSA [20], or to approximate diffusion, as in the multinomial simulation algorithm [21], the diffusive finite state projection method [22] and adaptive algorithms [23,24]. These approximation methods share a common feature: reaction and diffusion processes are simulated independently, and the diffusive jumps occur at predetermined time intervals, between which the reactions are simulated. As a result, the time step depends on the fastest diffusion rate, leading to excessive computational cost, in particular, for the case of fast diffusion.
Several hybrid methods were developed for solving stochastic reaction-diffusion systems [25][26][27][28]. The most common approach is a spatially hybrid method between the SSA and deterministic approach. Noise effect in high concentration region is relativity small so deterministic approach such as using partial differential equations is sufficient to ensure the accuracy of simulations. However, small noise effect in high concentration region may still result in large fluctuations in low concentration region because of communication between the regions, so the development of a hybrid method which combines two or more stochastic approaches without involving any deterministic approach becomes an important and effective approach.
Here, we introduce a hybrid continuous-discrete method to simulate spatially inhomogeneous systems with better efficiency and accuracy. As in the Central Limit Theorem, a continuous Gaussian random variable can be used to approximate the change in the number of molecules introduced by a large number of independent diffusive jumps. A brief overview of the algorithm is listed below: 1. Calculating the time of the next occurring reaction based on the SSA. 2. Approximating the numbers of diffusive jumps using Gaussian random vectors with timedependent means and covariances by assuming no reactions during the period of time. 3. If the approximated number of diffusive jumps at some compartments is large enough, the approximation is then applied at that locations; otherwise the SSA is applied. Since the number of diffusive jumps may be different at each time period between reactions, so the locations at which the approximation is applied are set adaptively over time.
As a result, the updating takes place only when a reaction occurs, and the time step is then determined by the reaction rates. Because the correlation among diffusive jumps is considered in our method, the approximation of the diffusion is reliable, and its effect on the reactions is small. The method is applied to 3 one-dimensional reaction-diffusion systems for morphogen gradients in addition to a two-dimensional morphogen system. The efficiency of the method improves as the diffusion coefficients increase or the number of diffusing molecules increases, yielding more than 60% savings in computational cost compared with the standard spatial SSA. In addition, because of the new time-adaptive criterion for identifying the region for the diffusion approximation, our hybrid method allows for zero initial conditions (i.e. an initial state without any molecules) in the simulations, unlike many other existing stochastic methods, which need to start with a certain level of molecules in order to obtain good approximations.
Here, s r ji and s p ji are the stoichoimetric coefficients of the reactant and product species, respectively, and γ j is the macroscopic rate constant of R j . Although we describe the method on a one-dimensional domain, it is worth noting that it is easy to extend this method in a higher dimensional domain. The examples in one-and two-dimensional domains will be discussed in the Simulation results section.
If the system is spatially inhomogeneous, then the domain is partitioned into K identical compartments with uniform length h, where h = L/K. The subsystem in each compartment is assumed to be homogeneous. Molecules in different compartments are treated as different species, denoted by {S 11 , S 12 , . . . , S ki , . . . , S KN }, where S ki is the ith species in the kth compartment. The system state is denoted by where X ki is the number of molecules of S ki .
Only molecules in the same compartment can react. The jth reaction in the kth compartment R kj is as follows: where γ kj is the reaction rate constant of reaction R kj . Diffusion is treated as a reaction in which a molecule in one compartment jumps to one of its neighbouring compartments at a constant rate. Without loss of generality, we assume that only species S 1 diffuses in the algorithm description. Assume that species S 1 diffuses with the coefficient D 1 with reflective boundary conditions on the boundary of the domain. Then, the diffusive jumps obey the following chain reactions: Multiple diffusive species can be easily treated using the same approach, and an example of such a case is presented in the Simulation results section.
Consider X(t) as a variable x, the probability that the reaction R kj will fire in the next time interval [t, t + dt) is a kj (x) dt, where a kj is called the propensity function of R kj . The state of the system transfers from one state to another through reaction firing. The net change of the state of the system caused by one occurrence of reaction R kj is denoted as ν kj and so In addition, denote as a kL (x) and a kR (x) the propensity functions of diffusion jumps J kL : S k1 → S (k−1)1 and J kR : S k1 → S (k+1)1 , respectively. Denote as ν kL and ν kR the net change of the state of the system caused by J kR and J kL , respectively. As diffusion is treated as mono-molecular reactions, we have The chemical master equation (CME) that governs the temporal evolution of the probability density function p(x, t) that the state of the system is x at time t is as follows: (2.1)

Diffusion approximation
For most biological systems, the state space and the dimension of the CME are large or infinite, rendering the CME impossible to solve. The stochastic process underlying the CME can be simulated by the spatial SSA (details can be found in appendix A). Similar to the existing methods [14][15][16][17], the compartment size is first appropriately chosen such that diffusive jumps are usually more frequent than reactions because of the assumption of the homogeneity of reactions in each compartment. In the case of frequent diffusive jumps, the spatial SSA may become computationally inefficient. Here we shall approximate the diffusion processes using Gaussian random vectors with time-dependent means and covariances. Given X(t 0 ) = x 0 = (x 0 11 , x 0 12 , . . . , x 0 ki , . . . , x 0 KN ), let n kL (t) and n kR (t) denote the numbers of leftward and rightward diffusive jumps of molecules from the kth compartment between time t 0 and time t. Under the assumption that a large number of diffusive jumps occurs between reactions in each compartment, we can approximate the numbers of diffusive jumps as follows: and for k = 2, 3, . . . , K. The functions φ kL (t) and φ kR (t) characterize the macroscopic features of the diffusive jumps. When the copy number of molecules is large enough, the system of the average of molecule concentration will approach a deterministic diffusion system. The macroscopic features of the diffusive jumps are determined by the solution of the deterministic system. The second terms ξ kL and ξ kR are real random numbers, measuring the fluctuations around φ kL (t) and φ kR (t). The means of these random numbers are zero and the covariances can be determined by the formula presented later in this section. The constant Λ is the number of molecules per unit concentration in a compartment. For example, if the concentration 1 µM corresponds to 600 molecules in each compartment, then Λ = 600 µM −1 . When Λ 1, φ kL (t) and φ kR (t) are considered to be continuous functions of t.
Let Π (ξ 1L , ξ 2L , ξ 2R , . . . , ξ nL , t) be the probability density function of ξ 1L , ξ 2L , ξ 2R , . . . , ξ nL at time t. The function Π for molecule concentrations can be obtained from a rescaling of the probability density function p, which is for the number of molecules. The formula for Π is as follows: is the probability density function that the state of the system is x at time t with the condition X(t 0 ) = x 0 .
By performing the standard systematic expansion of the master equation equation (2.1) [29], we obtain the following equation: and comparing the order term Letting and defining the vector variable (y 1 , y 2 , y 3 , . . . , y 2K−2 ) = (ξ 1R , ξ 2L , ξ 2R , . . . , ξ KR ) , we obtain the following Fokker-Planck equation: where So the corresponding covariance matrix is This approximation is valid under the assumption (2.4), which is true when the number of diffusive molecules and the value of diffusion coefficient are large. Overall, we approximate the numbers of diffusive jumps by obtaining (i) the macroscopic features of the diffusive jumps, Φ : . . , ξ KL ) and (iii) the probability, P R (t), that no reaction will occur until time t. These three components can be solved as follows: 1. Φ satisfies the following equations: Based on the study of Othmer & Scriven [30], the eigenvalues and eigenvectors of A can be obtained analytically and used to solve the system directly. 2. Ξ is a Gaussian random vector with a mean of zero and a covariance of where Y(t) = exp(At) and B(t) = diag(∂Φ/∂t) is a diagonal matrix with the diagonal elements equal to ∂Φ/∂t.

P R (t) takes the form
where a lj is the propensity function of reaction R lj for l = 1, 2, . . . , K and j = 1, 2, . . . , M.

Adaptivity of the time step and the spatial partitioning of the compartments
To determine the next time step after time t 0 , we generate a random number r that is uniformly distributed in [0, 1] and find t such that P R (t) = r. To update the numbers of diffusive jumps at time t, we calculate Φ and generate Ξ , a random vector that has a multivariate normal distribution with a mean of zero and a covariance of ρ. However, because t is chosen as a random number, the interval between t and t 0 might be short; this would allow only a small number of diffusive jumps to occur among compartments with a low copy number of molecules, leading to poor accuracy of the Gaussian approximation. In such compartments where the Gaussian approximation fails due to a low copy number of molecules, we simulate the diffusive jumps using the SSA method. To determine which compartments fit this criterion, we compare the Poisson distribution with mean μ to the Gaussian distribution with mean μ and variance μ. It is known that as μ increases, the difference between the two distributions decreases. For example, when μ is larger than 10, we have Therefore, we apply the SSA method for diffusion in those compartments where the mean number of diffusive jumps is small (e.g. fewer than 10). For the case in which the amount of species S 1 decreases monotonically towards the Kth compartment in the deterministic model, we use the SSA method in all compartments with an index larger than or equal to k g (t), which is the smallest integer such that where Λφ approximates the mean value of diffusive jump and T A is the cut-off threshold for determining the region for the Gaussian approximation. We set T A = 10 to ensure that the absolute difference between the Poisson and Gaussian distributions is less than 0.01, as seen in (2.8). When T A increases, the SSA is applied in more compartments and it may increase the CPU time cost in our hybrid method. For two or more types of diffusive molecules, we can still apply (2.9) as long as the diffusion process of each type of molecules is independent of each other. As a result, different types of diffusive molecules have different k g values.

Algorithm overview
Given x(t 0 ) = x 0 , we perform the following steps: 1. Generate two independent random numbers r 1 and r 2 that are uniformly distributed in [0, 1]. 2. Find t such that P R (t) = r 1 and letᾱ 3. Evaluate k g using equation (2.9) to determine the compartments for which the SSA method will be used. If k g = 1, simulate diffusion in all compartments using the SSA method; if k g > 1, perform the following steps: (a) simulate diffusion between the k g th compartment and the Kth compartment using the SSA method, and treat the k g th compartment as a reflective boundary; (b) calculate Φ and ρ in equation (2.6) and equation (2.7) at time t; remove rows 2k g − 2, 2k g − 1, . . . , 2K − 2 and columns 2k g − 2, 2k g − 1, . . . , 2K − 2 in the matrix ρ, and let ρ s be the remaining submatrix; (c) generate a (2k g − 3)-variate Gaussian random vector Ξ with a mean of zero and a covariance of ρ s ; (d) evaluate n kL and n kR in equations (2.2) and (2.3) at time t, and update x k1 using n kL and n kR for 1 ≤ k ≤ k g − 2; (e) add n (k g −2)R − n (k g −1)L − n (k g −1)R + Λφ k g L to x (k g −1)1 , and add n (k g −1)R − Λφ k g L to x k g 1 ; and (f) round the x k1 values to their nearest non-negative integers.

Find the smallest values of m and q such that
Then, let the mth reaction occur in the qth compartment, and update x in accordance with reaction R qm . 5. Advance the time to t. Then, go back to step 1 until the simulation time reaches the stop criterion.
In this algorithm, the calculations ofᾱ lj in step 2 and ρ in step 3(b) are the most computationally expensive steps among all steps.ᾱ lj and ρ can be estimated using simple numerical integration quadratures such as the trapezoidal rule, which we use in the test cases presented below. In step 3(b), x pro production region of receptor [N] [ C]

Example I: morphogen system with linear degradation
First, we consider a one-dimensional morphogen system in which morphogens diffuse out from a local production region and undergo degradation throughout the entire domain (figure 1a). This model was used to study the stochastic effect on patterning of the Drosophila wing disc [31,32]. The deterministic dynamics of morphogen concentration [M] can be described by the following PDE system with no-flux boundary conditions: where D M is the diffusion coefficient, d M is the degradation rate coefficient and V(x) is the production rate of morphogen. We define the production rate as

Example II: morphogen-receptor system
Here we consider a one-dimensional morphogen-receptor system in which receptors are produced in the entire region and bind with morphogen to induce morphogen degradation (figure 1b). This model involves the stochastic effect in a binding process between two kinds of molecules [32,33]. The deterministic dynamics of three kinds of molecule concentrations (morphogen [M], receptor [R] and morphogen-receptor complex [W]) can be described by the following PDE system with no-flux boundary conditions: The morphogen production function V(x), the domain and its discretization are defined as in example I.       Table 2 shows that the CPU time cost of the SSA is much higher than the cost of the hybrid method when D M = 5 µm 2 s −1 , 10 µm 2 s −1 or 20 µm 2 s −1 . In particular, the CPU time cost of the hybrid method is around 20% of that of the SSA when D M = 20 µm 2 s −1 .

Example III: morphogen system with two types of diffusive molecules
In this example, we want to study the performance of the hybrid method for the system with two types of diffusive molecules. Many studies on morphogen system were based on reaction-diffusion models with only one diffusion term in either free or non-signalling bound-morphogens [31][32][33]. However, this type of one-diffusion model is not biologically complete because it is possible to have more than one type of diffusion transport. Morphogen models with two types of diffusive molecules have been proposed and studied in [34][35][36]. Here we consider a morphogen model with both diffusion of free morphogens and movement of non-signalling morphogen complexes through a 'bucket brigade' process [35] (figure 1c and

Example IV: two-dimensional morphogen system
Here, we study the performance of hybrid method for a two-dimensional morphogen system which is based on example I ( figure 1a). The two-dimensional domain [0, x max ] × [−y max , y max ] is divided into 250 compartments with uniform dimension 2 × 2 µm, which is based on the cell size of Drosophila wing disc.  The domain size we consider is x max = 100 µm and y max = 5 µm. We define the morphogen production function as The initial condition for simulations is [M](0, x, y) = V(x, y)/d M . The parameters for this example are equal to the set we used in example I. Five hundred simulations were used to calculate the statistical quantities.
In this case, the number of morphogens decreases monotonically along the x-axis in the deterministic model, so the condition (2.9) is applied on x-direction.

Example V: non-monotone pattern
In general, the adaptive method can be applied for some types of non-monotone pattern. Similar to example I, when the morphogen production region is considered at the centre of the domain as in the one-dimensional domain [0, 200], the same setting used in example I can generate a single-peak solution as in figure 7a. We reformulate the condition (2.9) by considering the largest region [k g1 , k g2 ] near the peak such that for all j ∈ [k g1 , k g2 ], we have  This condition ensures the accuracy of the Gaussian approximation in the region at the centre. Figure 7a,b displays the means and standard deviations of the number of molecules in each compartment at simulation time 20 s with D = 40 µm 2 s −1 . In the results, 500 simulations were used to calculate the statistical quantities for each case. Figure 7c,d shows that the relative differences of means and standard deviations between the simulations by the hybrid method and the SSA are less than 0.15. For efficiency, the CPU time cost of the hybrid method is around 57% of that of the SSA. The adaptive method can be extended to patterns consisting of multiple peaks if the peaks can be determined by the deterministic system.

Example VI: Turing system
In the last example, we consider an activator-substrate Turing system consisting of a short-range diffusion for the activator and a long-range diffusion for the substrate. The normalized one-dimensional activator-substrate Turing system [37] can be described by the following two-equation system with the no-flux boundary conditions: In the hybrid method, the Gaussian approximation is applied for simulating the diffusion processes of substrates which have rapid diffusion processes.
The Gaussian approximation is used only for the fast diffusion in substrates. Since the chemical gradients may not be monotonic in space, the time-adaptive criterion (2.9) is not used in this example, and instead, the Gaussian approximation for the diffusion process of substrates is applied everywhere in space. Since the number of substrates is large enough in each compartment, the approximation has a good accuracy in the entire space.
The study using different methods (figure 8) indicates that two stochastic methods show a similar pattern for A at t = 10. The averages of CPU time costs in the 500 simulations for different methods indicate that the SSA costs 5.61 × 10 3 s for each simulation and the hybrid method costs 2.16 × 10 3 s, suggesting a 50% speed-up of the hybrid method.

Discussion
We have introduced a new algorithm to accelerate the stochastic simulation of reaction-diffusion systems. In this hybrid approach, the numbers of diffusive jumps in regions with a high level of molecules are calculated using Gaussian vectors, whereas the diffusive jumps in regions with a low level of molecules, along with the reaction events, are simulated using the SSA. Because of the diffusion approximation, the size of the time step, which predominantly depends on rate of reactions, can be significantly larger than that allowed by the existing approaches. Thus, the hybrid method is particularly effective for diffusion-dominant systems. Moreover, the diffusion approximations for different diffusing species are performed independently, which makes the method particularly advantageous for application to systems with multiple diffusing species.
To determine the compartments for which the diffusion approximation is applied, we use a macroscopic quantity that can be easily calculated. In addition, the approximation region can be adaptively specified in both space and time with a good control of approximation errors. Thus, the hybrid method is effective in dealing with spatial distributions of molecules that may vary significantly over time or may not be monotonic in space.
The Gaussian random vector is directly related to the matrix whose non-zero location is determined by the communication among compartments. The partitioning of a domain with an irregular geometry may be achieved using existing software (COMSOL or URDME [38]) to enable application to complex domains. To extend our method for the system incorporating rapid and slow reactions, we can accelerate the simulation of rapid reactions by coupling our method with the hybrid SSA/τ -Leaping strategy [6] to create a hybrid diffusion approximation + SSA/τ -Leaping strategy. The overall approach introduced in this work will be an efficient and accurate algorithm for simulating biological and physical systems in which number of molecular copies has a large range within the spatial domain and diffusion is dominant comparing with other types of reactions.