Efficient analysis of stochastic gene dynamics in the non-adiabatic regime using piecewise deterministic Markov processes

Single-cell experiments show that gene expression is stochastic and bursty, a feature that can emerge from slow switching between promoter states with different activities. In addition to slow chromatin and/or DNA looping dynamics, one source of long-lived promoter states is the slow binding and unbinding kinetics of transcription factors to promoters, i.e. the non-adiabatic binding regime. Here, we introduce a simple analytical framework, known as a piecewise deterministic Markov process (PDMP), that accurately describes the stochastic dynamics of gene expression in the non-adiabatic regime. We illustrate the utility of the PDMP on a non-trivial dynamical system by analysing the properties of a titration-based oscillator in the non-adiabatic limit. We first show how to transform the underlying chemical master equation into a PDMP where the slow transitions between promoter states are stochastic, but whose rates depend upon the faster deterministic dynamics of the transcription factors regulated by these promoters. We show that the PDMP accurately describes the observed periods of stochastic cycles in activator and repressor-based titration oscillators. We then generalize our PDMP analysis to more complicated versions of titration-based oscillators to explain how multiple binding sites lengthen the period and improve coherence. Last, we show how noise-induced oscillation previously observed in a titration-based oscillator arises from non-adiabatic and discrete binding events at the promoter site.


Introduction
Gene expression is fundamentally a stochastic biochemical process that arises from thermal fluctuations. An important source of stochastic noise comes from the discrete and random binding and unbinding events between the regulating transcription factors (TFs) and the promoter sites of the regulated genes. Conventionally, these DNA binding and unbinding events are thought to be fast compared with the downstream processes of transcription, translation and degradation [1]. This separation of timescales leads to an approximation, known as a quasi-steady state or adiabatic approximation, where the mean transcription rate simplifies to a function of the concentrations and protein-DNA dissociation constants at the promoter [2,3]. The adiabatic approximation is commonly used to reduce the number of dynamical variables (e.g. promoter states) in gene regulatory networks. However, it is also a bold assumption because experiments [4][5][6][7] show that promoter dynamics (e.g. the binding and unbinding events of TFs) can take place at a comparable, or even slower, timescale than the downstream processes of gene expression. This observation has motivated theoretical studies into the effects of slow or non-adiabatic binding on gene regulatory networks.
Many of these studies focused on the system properties at stationarity and mostly ignored the effects of non-adiabatic binding on the non-equilibrium dynamics of gene regulatory networks. In this article, we address the following questions: what are the dynamical consequences of non-adiabatic binding? What kind of modelling framework accurately describes the non-stationary dynamics of gene regulatory networks in the non-adiabatic regime? To answer these questions, we use a model of titration-based clocks to illustrate the effects of nonadiabatic binding on dynamics (e.g. oscillation) and to show how an analytical framework, known as a piecewise deterministic Markov process (PDMP), accurately describes the stochastic dynamics of the full model in the non-adiabatic regime. The PDMP is an efficient method of analysis that is valid regardless of the mechanism (e.g. slow chromatin, DNA looping, unbinding kinetics) responsible for slow-switching promoters.
This article is organized as follows. In §2.1, we introduce two idealized models of titration-based circuits commonly found in circadian clocks and immune signalling. We prove in §2.2 that limit cycles are impossible in the fast-binding (adiabatic) limit. In §2.3, we simulate the full chemical master equation (CME) to demonstrate that the titration-based circuits exhibit stochastic cycles in the slow-binding (non-adiabatic) limit. We then transform the CME into a PDMP where transitions between discrete promoter states are stochastic but the rates depend upon the faster deterministic dynamics of the transcription factor concentrations regulated by these promoters ( §2.4). The PDMP makes no assumptions regarding the timescales of promoter switching and is valid for both slow or fast switching. It is an exact formulation of the CME in the thermodynamic limit for systems where protein numbers are large. The thermodynamic limit and, hence, PDMP analysis, is well suited for stochastic gene dynamics in eukaryotic cells where cell sizes and the number of regulatory proteins can be large. We show that the PDMP framework accurately describes the observed periods and coherence of stochastic cycles in the non-adiabatic regime. We also demonstrate that the PDMP can be readily applied to more detailed and mechanistic models in §3. We conclude in §4 by discussing our results and PDMP analysis in the context of previous work on non-adiabatic binding and oscillation in gene networks.

Mathematical framework
We begin by introducing two idealized models of titrationbased gene regulatory networks commonly found in biological oscillators. These models are 'idealized' in the sense that transcription and translation are lumped into a single-stage of 'production', and the intermediate mRNA populations are not explicitly modelled. We further simplify the cis-regulatory architecture of each promoter to the fewest number of possible binding states. The purpose of these idealized models is to illustrate how PDMP analysis can be used to understand the origin and properties of the stochastic cycles that emerge in the non-adiabatic regime. We will relax some of these assumptions in later sections.

Idealized models
Both idealized models consist of two genes, which produce two kinds of regulatory proteins X (a TF) and Y (an inhibitor that titrates X into an inactive complex; figure 1). Our first model is called the activator-titration circuit (ATC) because protein X is a transcriptional activator [14,15]. In this model, X increases the production rate of inhibitor Y by binding to cis-regulatory binding sites in the promoter of gene Y with an association rate k Y . There are a total of N Y cis-regulatory binding sites in the promoter of gene Y and we assume that binding of X is sequential, such that there are a total of N Y þ 1 promoter states. Bound X dissociates sequentially from each binding site with a rate u Y . The production rate of gene Y depends nonlinearly on the number of X bound to the promoter because the production rate is b b Y (bound) when any of the binding sites are occupied; otherwise, the production rate is b f Y (free). We note that b b Y . b f Y because X is an activator. Gene X is unregulated and, thus, activator X is constitutively produced at a constant rate b X . Last, inhibitor Y inhibits the activity of TF X by titration, where one Y molecule irreversibly binds to one X molecule with  Figure 1. Schematic diagrams of the idealized (a) activator-titration circuit (ATC) and (b) repressor-titration circuit (RTC). Protein X is a transcription factor and Y is an inhibitor that can irreversibly associate with X to form an inactive complex. In the ATC, X is an activator that can sequentially bind multiple DNA sites in the promoter of gene Y and increase transcription of the inhibitor. In the RTC, X is a repressor that can bind its own promoter and repress transcription. (Online version in colour.) rsif.royalsocietypublishing.org J. R. Soc. Interface 15: 20170804 a bimolecular rate of association (a) and forms a non-functional heterodimer. The idealized ATC can be modelled by the following elementary reactions: ð2:1Þ Here, s Y ¼ 0, 1, . . . , N Y identifies the promoter state by its number of bound X. A schematic diagram of the ATC can be found in figure 1a. All the elementary rate constants are defined in the sense of mass action kinetics, and x denotes the concentration of TF X in the thermodynamic limit. In the stochastic models that consider discrete molecules (detailed in §2.3), the rates have to be properly scaled by V, which is a parameter that quantifies the system size and is related to cell volume. The scaling relationship between the mass-action rate constants and the stochastic model rates can be found in appendix A. The second model is called a repressor-titration circuit (RTC) because X is a transcriptional repressor [15] (figure 1b). This model differs from the ATC in two ways. First, the inhibitor Y is now constitutively expressed at a constant rate b Y . Secondly, X negatively auto-regulates itself where X decreases its own production rate by binding to cis-regulatory binding sites in the promoter of gene X with an association rate k X . There are a total of N X cis-regulatory binding sites in the promoter of gene X and we assume that binding of X is sequential, such that there are a total of N X þ 1 promoter states. Bound X dissociates sequentially from each binding sites with a rate u X . The production rate of gene X depends nonlinearly on the number of X bound to the promoters, where the production rate of X is b b X (bound) when any of the binding sites are occupied; otherwise, the production rate is b f X (free). We note that b f X . b b X because X is a repressor. The rest of the process and parameters are similarly defined as in the ATC. The idealized RTC can be modelled by the following elementary reactions: and s X À ! uX s X À 1, if 1 s X N X (unbinding): Similarly, s X ¼ 0, 1, . . . , N X identifies the promoter state by its number of bound X.

No limit cycle in the adiabatic limit
The aim of this section is to show that mass action kinetics describing the ATC and RTC in the fast-switching (adiabatic) limit do not allow deterministic limit cycles. Below, we generically use Z ¼ X or Y as the gene index and Z ¼ X or Y as the protein index. The discrete switching events between the bound TF at the promoter sites, s Z ! s Z + 1, are a random birth-and-death process [16,17] where the birth rate k Z depends on the concentration (x) of the transcription factor (TF) X. In the fast-switching (adiabatic) limit, formally expressed as O(xk Z ),O(u Z ) ) O (any other transition rates), the variable x is a slow variable and is treated as approximately constant. In this limit, the birth and death rates are approximately constant, and the quasi-stationary distribution (QSD) of s Z is obtained using detailed balance of this one-dimensional birth-death process [16]: The effective production rate of the regulated gene, b eff Z , can be derived using the QSD (2. ð2:4Þ In the thermodynamic limit, we denote the concentrations of X and Y by x and y respectively, and the resulting mass action kinetics of the ATC and RTC are described by the following deterministic differential equations: For simplicity, we unified the expressions for the idealized ATC and RTC where b eff X (x):¼ constant b X in the ATC and b eff Y (x):¼ constant b Y in the RTC. Equations (2.5) constitute a two-dimensional dynamical system. The Bendixson criterion [18] states that limit cycles do not exist when the trace of the Jacobian, @ x F (x, y) þ @ y G(x, y), does not change sign on a simply connected domain. On the biologically relevant domain x ! 0 and y ! 0, The trace of the ATC is always negative because db eff X (x)/ dx ¼ 0. The trace of the RTC is also always negative because db eff X (x)/dx , 0. Thus, there are no deterministic limit cycles for the idealized ATC and RTC in the adiabatic limit. If we were to modify the ATC such that the activator X also stimulates its own production (i.e. positive feedback), then db eff X (x)/dx . 0 and it would be possible to have limit cycles in the adiabatic limit.

Stochastic cycles in the non-adiabatic regime
We first develop a full stochastic model that describes the dynamics where the population of molecules and the number of bound promoter sites are all discrete. We will then use this model to show the emergence of stochastic rsif.royalsocietypublishing.org J. R. Soc. Interface 15: 20170804 cycles with well-defined periods in the non-adiabatic regime. The state of the model is determined by (i) the population of X, N X , (ii) the population of Y , N Y , (iii) the bound promoter state of gene X, s X and (iv) the bound promoter state of gene Y, s Y . The probability of having N X ¼ i, N Y ¼ j, s X ¼ k, and s Y ¼ l at time t is given by P i,j,k,l (t). The discrete-state stochastic process of the idealized model is described by the CME [16,17]: where we have suppressed writing the t-dependence of P i,j,k,l for brevity. The boundary conditions P i,j,k,l ¼ 0 when i , 0, j , 0, k , 0, l , 0, k . N X or l . N Y are imposed. We unified the model descriptions of ATC and RTC; for ATC, N X , k X , u X : 1 fconditiong is the characteristic function: it is equal to 1 when the condition is true; otherwise 0. The different rates and the protein population scale as a function of system size V, as outlined in appendix A [11,16]. The concentrations x and y in previous sections are interpreted as the normalized population density N X /V and N Y /V, where N X and N Y are the discrete populations of regulatory proteins X and Y .
We refer to the model (2.7) as the full CME. Standard continuous time Markov chain simulations were constructed to generate exact sample paths of the full CME of the ATC and RTC [19,20]. We chose two sets of parameters listed in table 1, where the ATC and RTC promoters have a single binding site (N Z ¼ 1Þ and the only source of nonlinearity is the titration of X by Y . We chose this parameter set because it is simple and it illustrates the fundamental ingredients of stochastic cycling in the non-adiabatic regime. We will consider more complicated cis-regulatory promoters in later sections. For each parameter set, we introduce a scaling factor l, such that the binding and unbinding rates are, respectively, parametrized by k Z :¼ l k Z and u Z :¼ l u Z . We fixed k Z and u Z and systematically change the value of l in order to examine the dynamics of the same model in both the adiabatic and non-adiabatic regime. In figure 2a,b, we present sample paths of the full CME. We do not observe limit cycles in (x, y) for the idealized ATC or RTC in the fast binding and unbinding limit (i.e. adiabatic regime, l ¼ 1000), as predicted by our analysis in §2.2. When we decreased the parameter l to 1, the system entered a regime where the timescale of binding and unbinding between the TF and gene is comparable to other processes. In this non-adiabatic regime, figure 2b shows alternating high-amplitude expression of X and Y that appears oscillatory. We measured the 'period' of each stochastic cycle using a protocol detailed in appendix B. The measured period of stochastic cycles exhibits a unimodal distribution with a dominant frequency, as shown in figure 2e.

Derivation of the PDMP approximating gene expression dynamics in the non-adiabatic regime
In this section, we develop the PDMP framework [21,22] of the full CME to analyse and understand the observed stochastic cycling in the non-adiabatic regime. The idea of PDMP is to basal production rate of gene X when the number of bound X ¼ 0 2 10 zeroth order b b X repressed production rate of gene X when the number of bound X .
the binding rate of TF X to an empty target promoter site on gene X 0 0.2l second order k Y the binding rate of TF X to an empty target promoter site on gene Y l 0 second order u X the dissociation rate of a TF X bound to promoter sites of X 0 0.4l first order u Y the dissociation rate of a TF X bound to promoter sites of Y 0.5l 0 first order a the association rate of X and Y Then, for any fixed promoter states (k, l ), we approximate the stochastic dynamics in the TF population space using a set of ordinary differential equations (ODEs), thus leaving the discrete and Markovian stochastic switching in the (k, l )-space. This approximation is accurate for large system size (V) or the thermodynamic limit [23,24]. The PDMP framework makes no assumptions regarding relative timescales and is equally valid for adiabatic and non-adiabatic regimes in the thermodynamic limit. To derive the PDMP, we first defined a continuum-limit probability density p k,l (x, y, t) / P i,j,k,l (t) with the scaled variables x: ¼ i/V and y: ¼ j/V. After inserting the p k,l (x, y, t) into the master equation (2.7), performing a Kramers-Moyal expansion [16,17] with respect to large system size V and collecting terms to the lowest order (O(V 0 )), we arrived at the coupled partial differential equations for the probability density p k,l ; p k,l (x, y, t): ð2:8Þ The coupled partial differential equations describe the evolution of joint probability density p k,l (x, y, t). Again, the 'boundary conditions' in the (k, l )-space, p k,l ¼ 0 if k , 0, l , 0, k . N X or l . N Y , are imposed. Note that the evolution contains two parts: some terms contain @ x or @ y and describe the Liouvillian flow, whereas other terms contain k Z or u Z and describe the Markovian switching between discrete promoter states (k, l ). Because the total state follows the deterministic Liouvillian flow between stochastically switching discrete state (k, l ), the resulting process is referred to as the PDMP [25,26]. Equation (2.8) describes the evolution of the joint probability density. Equivalently, the sample paths of the PDMP can be described by a set of ODEs with randomly switching parameters (which is detailed in appendix D). Using the sample-path representation, the PDMP of the ATC and RTC models with a single promoter site (N Z ¼ 1) are summarized in the schematic diagrams presented in figure 3a,b.
Kinetic Monte Carlo simulations using the algorithm described in appendix C were implemented to generate the sample paths of the PDMP in the non-adiabatic (l ¼ 1) regime (figure 2c). These PDMP sample paths capture the salient features of the dynamics of the full CME in figure 2b. For example, the measured distribution of stochastic cycle periods using the PDMP is in perfect agreement with that of the full CME (figure 2e). Numerically, the advantage of the PDMP framework is that the kinetic Monte Carlo simulations are faster than the continuous time Markov chain simulations of the full CME because x, y are determined by numerically integrating ODEs, when the system size V ) 1.

Linearization of the PDMP
While the PDMP can be numerically simulated for any given state, the evolution of the TF concentrations is described by a set of nonlinear ODEs that do not allow for analytic solutions. The nonlinearity comes from the term axy, which describes the titration (second-order reaction). We developed a linearization approximation, which uses the fast titration limit (i.e. large axy compared to any other reactions) to reduce the nonlinear ODEs into linear ones. A consequence of this linearization is that each PDMP state is described by two regimes (either x . 0, y ¼ 0 or x ¼ 0, y . 0). In the 'linearized PDMP', the ODEs are linear (represented in figure 3c,d), and the transition between these two regimes is determined by determinstic titration of x or y. In each of the compartments, the linear ODE's allow analytic solutions and facilitate quantification of the x y  random switching times. We used kinetic Monte Carlo simulations with the algorithm described in appendix C to generate sample paths of the linear PDMP in the non-adiabatic (l ¼ 1) regime. The measured distribution of stochastic cycle periods using the linear PDMP is similar to that of the full CME (figure 2e), although it tended to underestimate the shorter cycles that occur in the PDMP and full CME.

Origin of stochastic cycles
The PDMP schematic in figure 3 suggests that the stochastic cycles arise from the two-state nature of the regulated promoter, which must follow cyclical Markovian dynamics . To demonstrate, we consider a two-state promoter with constant transition rates (k þ and k 2 ), e.g. a promoter with a single binding site and a fixed concentration of a regulating TF. This trivial two-state promoter system generates stochastic cycles with a unimodal distribution of 'period' (t) given by a hypoexponential distribution: with a mean period m t ¼ 1/k þ þ 1/k 2 and variance The mean and variance of the period are a sum of the mean and variance of the individual transitions in the two-state cycle because the waiting times are independent. The period distribution is qualitatively similar to figure 2e, which suggests that stochastic dynamics of a two-state promoter explain much of the stochastic cycling observed in the single binding site ATC and RTC model. In the non-adiabatic regime, the faster protein dynamics faithfully track the underlying promoter state dynamics and generate large-amplitude stochastic cycles in (x, y). This raises the question of whether stochastic cycling between two promoter states can be called oscillation. This is difficult to answer because the distinction between stochastic cycling and oscillation is ill-defined. For example, by including mechanisms that reduce variance in the timing of individual transitions, one can produce cycles that are more coherent. In the extreme limit where each transition has no variance, the period of the two-state cycle has no variance and is indistinguishable from a deterministic limit cycle. In the following section, we will investigate potential mechanisms that reduce the variance of the stochastic transition times and make the stochastic cycles more 'deterministic'.

Increased coherence of stochastic cycles in ATC and RTC
The linearized ATC and RTC in figure 3c,d shows that the system often cycles through four discrete states that alternate between stochastic promoter switching and deterministic titration of x, y. The only state where the system has more than one 'option' is s Z ¼ 1 and x . 0 (bottom right box): it can transit to either s Z ¼ 0 and x . 0 by a dissociation event of bound X or to s Z ¼ 1 and x ¼ 0 by deterministic titration of x. When u Z is sufficiently small (i.e. slow dissociation rate in the non-adiabatic regime), the system favours the latter route, which induces a 'full cycle' through all four discrete states in the anticlockwise order (green arrow in figure 3). As described below, this 'full cycle' and the x-dependence of the association rate conspire to reduce variance and produce more coherent stochastic cycles.
The predominant resource of uncertainty in the 'full cycle' of the ATC and RTC is the stochastic promoter switching (horizontal transitions) because the titration of x (upward arrow) and y (downward arrow) are deterministic and exhibit little variance. As before, the transition rate from s Z ¼ 1 ! 0 is constant and, thus, the waiting time for dissociation is a simple exponential where r(t) ¼ ue 2ut (figure 4a). Unlike the previous model, the transition from s Z ¼ 0 ! 1 is not constant and depends on x(t), which can be quantified by computing the survival function [27]. Using the linearized ATC, x(t) can be exactly solved for the s Z ¼ 0 state: and the survival probability starting with t ¼ t 0 is equal to The distribution of binding times is uniquely determined by this survival function, which we plot for different initial conditions x 0 in figure 4b. A similar calculation can be performed for the linearized RTC (figure 4c). The variance in binding time is reduced when initial x 0 is close to zero because the system must wait until the population of x increases to a value above which binding is likely to take place. This explains why the 'full cycle' reduces variance of the total period because the system always starts at s Z ¼ 0 and x ¼ 0 (top left box in figure 3c,d) due to the previous titration and dissociation of x. Thus, x 0 ¼ 0 and the waiting time before x binds the promoter will have reduced variance. We remark that the binding times are not exponentially distributed and are dependent on the concentration of the activator (x) in general. Hence, the transition is not Markovian as some of the 'memory' is stored in the TF space (x, y).

Alternative deterministic limit without invoking the adiabatic approximation
The deterministic dynamics in equation (2.5) describe the mean (x, y) concentrations in the adiabatic limit where the effective protein synthesis rates are determined by the stationary distribution of promoter states. The PDMP framework explicitly models the stochastic binding and unbinding events and is valid in both the adiabatic and non-adiabatic limits. Here, we consider an alternative 'deterministic limit' (ADL) of the linear PDMP, where the remaining variability due to stochastic binding and unbinding is artificially set to zero and the stochastic cycle becomes a 'deterministic' limit cycle. We use the first moments of the random waiting times as a deterministic residence time of a promoter state and, thus, the dynamics in (x, y) will be deterministic. The first moments can be easily computed numerically from equation (2.11) for the linear PDMP: P{T binding . t} dt: ð2:12Þ When there is more than one possible reaction, we choose the reaction with the minimal deterministic waiting time. This is analogous to Gillespie's 'first reaction' method [19]. For the parameter set of the idealized ATC and RTC in figure 2, the average time to titrate all the X is shorter than the average dissociation time and the system cycles through all four states. The time series of the ATC and RTC in the ADL is shown in figure 2d.

Analyses of more detailed mechanistic models
The PDMP framework will now be applied to more sophisticated models of the ATC. A previous model of the ATC [15], which we call the KB model, showed that multiple binding sites lengthened the period and improved coherence of stochastic cycling. Using the PDMP, we will show that multiple binding sites per se are insufficient to improve coherence. Rather, slow mRNA dynamics and multiple binding sites conspire to push the dynamics across all the promoter states and improve the coherence of stochastic cycling. The second model of the ATC [28], which we call the VKBL model, has an additional positive feedback loop where the activator activates itself in addition to activating the inhibitor. The authors previously showed that the VKBL model exhibits excitation -relaxation or noise-induced oscillation beyond the Hopf bifurcation. We will use PDMP analysis to reveal that the fluctuations in the random rsif.royalsocietypublishing.org J. R. Soc. Interface 15: 20170804 unbinding events of the bound TF on the promoter sites are essential for inducing transcriptional noise, which in turn drives the excitable system away from its stable fixed point and induces large excursions in a semi-periodic manner.

Multiple binding sites do not improve coherence of stochastic cycling in idealized ATC and RTC
One explanation for the improved coherence of stochastic cycling in the KB model is that the coefficient of variance (CV) of the total period is reduced by increasing the number of steps in the 'full cycle'. For example, if there are N independent stochastic steps in the full cycle and if the means and variances at each step are of equal magnitude, then the mean and variance of the total period scales with N but the CV decreases as ffiffiffiffi N p . To test this idea, we simulated the idealized ATC and RTC with multiple promoter sites (N Z ¼ 3) using the full CME with the parameters listed in table 1. As before, we only see stochastic cycling in the non-adiabatic limit ( figure 11a,b). Strikingly, the distribution of periods was similar to that of simulations for single binding sites; compare figures 2e -11e. To understand why multiple binding sites did not increase the period or improve the coherence of stochastic cycles, we first transformed the full CME into a PDMP (appendix D). We confirmed that simulations of the PMDP accurately reproduced the results of the full CME; see figure 11c. We then reduced the PDMP into a linearized PDMP framework, which explains why multiple binding sites in the idealized ATC and RTC do not significantly alter the length of the period or improve coherence. The linear PDMP shows that the system becomes trapped in a 'mini-cycle' between the s Z ¼ 0 and s Z ¼ 1 promoter states at the blue and red boundaries ( figure 6). The production rate changes instantaneously upon promoter state switching across the boundary, and deterministic titration of x will immediately start pushing the system upwards (red box). The timescale of titration is typically faster than that of the next stochastic binding and, thus, produces a stochastic mini-cycle around the boundary. This mini-cycle dynamic is also reflected in the ADL of the ATC and RTC, as shown in figure 11d.

Origins of improved coherence in the KB model
The KB model has several additional features compared to the idealized ATC, which could explain the observed increase in the period and coherence of stochastic cycles. First, the dynamics of mRNA transcription, degradation and protein translation are explicitly modelled. Second, the activators form homodimers before they can bind to the promoter sites and regulate gene expression. Third, the homodimers bind to the promoter sites independently and no longer need to bind sequentially (i.e. distributive binding). Last, the activator and inhibitor heterodimer is no longer irreversible and can dissociate to form monomers.
Below, we describe PDMP analysis of the KB model [15] for the ATC shown in figure 5a. Beginning with the master equation governing the KB model, we performed the system-size expansion presented in §2.4 and arrived at the following PDMP: and G À ! uG G À 1: We used the same variables and parameter set in figure 6 of the original paper [15].  9a,b. In both cases, the stochastic cycles exhibit a well-defined distribution of periods with reduced CV, as previously observed. The schematic of the linear PDMP in figure 7 suggests that the period and coherence improved because the mRNA dynamics introduce a time lag between the change in mRNA production rate and the resulting protein synthesis rate. Thus, even though the transcription rate changes instantaneously upon crossing the boundary when G ¼ 0 ! 1, the mRNA levels will respond and reach a new state on the timescale set by the mRNA degradation rate d m . This lag delays the process of deterministic titration, which requires new inhibitor synthesis, such that G can reach saturation before x is titrated. As a consequence, the KB model now goes through the largest cycle from G ¼ 0 to G ¼ 3. Given the importance of the lag, we expect the coherence of stochastic cycling to decrease upon increasing the rate of mRNA degradation and, thus, making the mRNA more responsive to changes in transcription. We tested this idea by rescaling the mRNA degradation d m ¼ 4 d m and protein translation b ¼ 4 b, such that the total protein levels stayed fixed, but the mRNA degradation rate could be varied through 4. Our results in figure 9c confirm that increasing the mRNA degradation rate via larger 4 created shorter and less coherent stochastic 'mini-cycles', similar to the idealized ATC which had a measured CV ¼ 0.622.
We noted that the variance of the waiting times of binding events in the linear PDMP was much less than those of unbinding events. This motivated us to keep only the stochastic unbinding events whose waiting times are all exponentially rsif.royalsocietypublishing.org J. R. Soc. Interface 15: 20170804 distributed and take a deterministic waiting time for the binding events by evaluating the first moment of the cumulative distribution (E 2). The resulting model is summarized in figure 8 and is referred to as the reduced PDMP. In the reduced model, the only stochasticity-the random unbinding events-results in a random duration in a series of promoter states which actively produce the inhibitor I (top row of figure 8). The excellent agreement between the reduced PDMP and full CME of the KB model suggests that the variability in stochastic cycle times is mostly determined by the Figure 5. Schematic diagrams of the KB model [15] and the VKBL model [28] of the activator-titration circuit (ATC). The parametrizations were adopted from the original papers. (Online version in colour.) deterministic titration deterministic titration deterministic titration deterministic titration Figure 6. Schematic diagram of the linearized PDMP describing the idealized ATC with multiple binding sites (N Y ¼ 3) in the non-adiabatic regime. The green (circular) arrows indicates the emergent cycles which are predominantly observed in the simulations. The path of the dotted arrow is also observed, but less frequently (figure 11b). (Online version in colour.) rsif.royalsocietypublishing.org J. R. Soc. Interface 15: 20170804 stochasticity of unbinding events (figure 9d). Although the serial nature of unbinding events helps reduce the overall CV and improve the coherence of stochastic cycling, the randomness of unbinding events propagates nonlinearly and contributes to the overall uncertainty of the stochastic cycle period. For example, in each 'episode' of serial stochastic unbinding, the number of synthesized inhibitors will be a random quantity that subsequently determines the time to titrate the produced I back to zero (downward arrow) before the promoter state can deterministically cycle back to the actively producing I state (G ¼ G max and a ¼ 0).

Noise-induced oscillation in the VKBL model
We then turned our attention to the ATC model studied by Vilar et al. [28], whose schematic is shown in figure 5b. In the VKBL model, the activator activates itself in addition to the inhibitor and, thus, can exhibit deterministic limit cycles. However, the authors deliberately studied the VKBL model for a parameter set where there were no deterministic limit cycles but the system exhibited excitation -relaxation or noise-induced oscillations. Below, we will use PDMP analysis to show that stochastic promoter fluctuations are responsible for kicking the stable q 2q 3q deterministic titration deterministic titration deterministic titration deterministic titration 3ay q 2ay 2q ay 3q G = 0 (a = 0, z > 0) Figure 7. The linearized PDMP for the KB model [15]. To make the expressions compact, we define r 1 :¼ ½r A ; r 2 :¼ ½r I ; z :¼ ½I; and a :¼ ½A ; ½A þ 2½A 2 . Green (circular) arrows indicate the direction of the emergent cycle which are frequently observed. (Online version in colour.) deterministic titration deterministic titration deterministic activation ð3:2Þ We adopt the same symbols and parameters of figure 5 from the original work [28], except for discrete G A [ f0, 1g and G R [ f0, 1g, which represent the number of bound activators on the promoters of A or R. The sample path of the PDMP faithfully captures the signature of the dynamics of the full CME in the parameter regime with noise-induced oscillations; cf. figure 10a,b. The PDMP only takes into account the stochasticity of the binding and unbinding events (i.e. G Z ¼ 0 N 1); the rest of the processes are described by deterministic evolutionary equations. Thus, we can conclude that the noise-induced oscillations in the full CME are due to the discrete binding and unbinding events at the promoter site. In both the full CME and PDMP, the system constantly switches back and forth between G A ¼ 0 N 1 and produces a bursty activator mRNA population ( figure 10a,b). However, occasionally, an unbinding event takes longer than usual, which leads to a larger-than-average number of activator mRNAs. This larger-than-average number of activators titrates all the inhibitors (R) and the critical accumulation of activator excites the system through a large excursion in the phase space; see figure 10b. The ADL of the VKBL model does not exhibit any excitable excursions, as shown in figure 10c. By definition, the ADL does not exhibit any variability in the binding and unbinding events. The lack of excitable excursions in the ADL is consistent with the idea that rare fluctuations in the unbinding times are the critical ingredient for generating enough activators (A) to titrate all the inhibitors (R) in the system.

Discussion and future outlook
Dynamical models of gene expression often assume that switching between promoter states (e.g. binding and unbinding of regulatory proteins) takes place at a much shorter timescale than any other processes in the model. This idealization, known as the quasi-steady-state or adiabatic approximation [2,3], uses an effective rate inferred from the QSD of the promoter states. When the timescale of promoter state switching is  Figure 9. Numerically measured data of the KB model [15]. (a) A sample path of the full CME, (b) a sample path of the linear PDMP ( figure 7). The stochastic cycles of the model were measured using the protocol provided in appendix B. The full CME exhibited a minor fraction of short period cycles in the genetic states. To quantify the predominant longer periods (100 period), we discarded any periods less than 50 to generate panels (c,d ). Panel (c) presents the coefficient of variation (CV) of the stochastic periods measured in the full CME as a function of the scaling factor 4. Each point was computed from 10 4 stochastic cycles.
The larger the 4, the more short-lived is the mRNA. The long-lived mRNA introduces a delay in protein production and improves the CV. Panel (d ) presents the probability distribution of the stochastic periods when the scaling factor 4 ¼ 1, as measured from 10 5 stochastic cycles of the full CME, PDMP and reduced PDMP ( figure 8). (Online version in colour.) comparable to other processes, which is typically the case for natural systems, this approach fails to describe the resulting dynamics accurately.
In this article, we investigated the stochastic dynamics of biological clocks in the non-adiabatic regime. Previous work [14,15] demonstrated that time delays, which arise from slow promoter switching in the non-adiabatic regime, are important for the emergence of deterministic limit cycles. These studies modelled the transcription rate as an ensemble-averaged transcription rate of the discrete promoter states. Such a treatment would be precise if one had a large copy number of independent promoters, e.g. models in [29]. However, the copy number of genomic DNA is small in many biological systems and the transcription rate at any given time can only be one of the two discrete values b b Z and b f Z . When the switching timescale is fast (i.e., adiabatic), the promoter state goes through a large number of cycles between consecutive transcription events, and the effective rate of transcription converges to the ensemble-averaged transcription rate. However, when the switching timescale is slow (i.e. non-adiabatic), the averaged transcription rate cannot capture the nature of alternating rates of transcription events.
The effects of non-adiabatic promoter fluctuations have been investigated, mostly numerically, in the literature of biological clocks. Both the studies of Potoyan & Wolynes [30] and Gonze et al. [31] reported that slower switching rates compromise the coherent oscillation seen in the deterministic limit. In a slightly different model, Feng et al. [32] observed coherent oscillation when the binding and unbinding events were either very fast (adiabatic) or very slow (non-adiabatic). Last, stochastic resonance was reported by Li & Li [33], who showed that there exists a 'sweet spot' where the promoter switching is neither fast or slow. In the above-mentioned studies, the adopted methods range from direct computation of the eigenvalues of the truncated CME [30], direct computation of the stationary distribution [32], and direct continuous-time Markov simulations and power spectral analyses of the generated sample paths [31,33]. Although it is straightforward to carry out these analyses, they reveal little about the mechanisms of stochastic oscillations. For example, these methodologies could not answer why a system with more promoter binding sites exhibits more coherent oscillation, or quantify the impact of mRNA or post-translational reactions, e.g. dimerization.
We present a mathematical framework to analyse the stochastic dynamics of gene expression in the non-adiabatic regime. In this framework, we begin with the most detailed description of the individual molecular-based and stochastic dynamics, the CME and systematically construct the PDMPs, which retains the discrete and stochastic switching nature of the genetic states. This framework is a natural generalization of our previous work [11,21,22], and the derived PDMP has been shown to be a powerful mathematical tool to model coloured noise in stochastic gene expression [12,[34][35][36][37][38][39][40][41][42]. Our analyses showed that, for the models we investigated, the PDMP faithfully captures dynamical features of the individual molecular-based models. We further proposed a scheme to construct an alternative 'deterministic description' of the dynamics without invoking the adiabatic assumption. These analytical tools revealed the emergent non-equilibrium transitions between the discrete genetic states in the nonadiabatic regime. In the idealized models, both the ATC and RTC exhibited stochastic cycling in the discrete genetic states. We showed that a more robust and coherent oscillation (the full cycle) occurred in a regime of slower dissociation rate (small u Z in the idealized model). The analysis also revealed the interactions between the TF population and the transition between the discrete genetic states, showing that it is necessary to consider the joint process describing the TF dynamics and gene switching dynamics. While the joint process (PDMP) is Markovian, it is known that the TF dynamics alone [43] is non-Markovian. In this work, we showed that the gene switching dynamics alone is also non-Markovian.
To illustrate the practicality of these analytical tools, we analysed more sophisticated and detailed models. Interestingly, in the two models we investigated, the analysis revealed different mechanisms of to induce oscillations. In the KB model [15], we found that the oscillation was induced by the slow-transitions between the discrete genetic states, similar to the idealized models. Nevertheless, in contrast to the idealized model which exhibits similar dynamics when we changed the number of promoter sites N Z , the inclusion of the mRNA in the KB model pushes the system to transit through more genetic states. This is because the product of the gene, i.e. mRNA, no longer directly (and abruptly) regulates its own production rate and there is a delay. As the predominant stochasticity of the system arises from the random unbinding events, the ability to travel through more internal stages decreases the coefficient of variance. Consequently, more coherent oscillations were observed in the KB model, compared to the idealized models. By applying the analytical tools to the VKBL model [28], we were able to show that the average (deterministic) genetic switching events were not sufficient to induce the oscillation. Instead, longer-than-average binding events were responsible for exciting the FitzHugh-Nagumo-like system to go through a large excursion in the phase space. Biologically, these longer-thanaverage binding events are called transcriptional bursting noise. It is straightforward to show that by increasing translational bursting, achieved by simultaneously scaling up the translation rate and scaling down the transcription rate, one can also induce similar oscillations (data not shown). On a final note, the PDMP is derived from the detailed CME and can be viewed as a hybrid model which combines the continuous and deterministic TF dynamics and the discrete and stochastic promoter switching dynamics. The PDMP is related to other stochastic-hybrid approaches [44][45][46][47][48][49]. However, on a conceptual level, we explicitly demonstrate how the PDMP arises from the fully discrete CMEs in the limit of large protein numbers (e.g. eukaryotic cells). We further show that the PDMP accurately captures stochastic gene dynamics of the full CME in the non-adiabatic regime. The PDMP is therefore a promising 'bridge model' connecting detailed and mechanistic computational models and highly idealized discrete-state oscillators [50][51][52], phase oscillators [53][54][55][56] or one-dimensional delay-induced oscillators [57,58] which were previously proposed ad hoc.