Cell-cycle coupled expression minimizes random fluctuations in gene product levels

Expression of many genes varies as a cell transitions through different cell-cycle stages. How coupling between stochastic expression and cell cycle impacts cell-to-cell variability (noise) in the level of protein is not well understood. We analyze a model, where a stable protein is synthesized in random bursts, and the frequency with which bursts occur varies within the cell cycle. Formulas quantifying the extent of fluctuations in the protein copy number are derived and decomposed into components arising from the cell cycle and stochastic processes. The latter stochastic component represents contributions from bursty expression and errors incurred during partitioning of molecules between daughter cells. These formulas reveal an interesting trade-off: cell-cycle dependencies that amplify the noise contribution from bursty expression also attenuate the contribution from partitioning errors. We investigate existence of optimum strategies for coupling expression to the cell cycle that minimize the stochastic component. Intriguingly, results show that a zero production rate throughout the cell cycle, with expression only occurring just before cell division minimizes noise from bursty expression for a fixed mean protein level. In contrast, the optimal strategy in the case of partitioning errors is to make the protein just after cell division. We provide examples of regulatory proteins that are expressed only towards the end of cell cycle, and argue that such strategies enhance robustness of cell-cycle decisions to the intrinsic stochasticity of gene expression.

Mathematical models have played a key role in predicting the impact of bursty expression on noise in the level of a given protein. However, these studies have primarily relied on models where synthesis rates are assumed to be constant and invariant of cell-cycle processes. While such an assumption is clearly violated for cell-cycle regulated genes, replication-associate changes in gene dosage can alter expression parameters genome wide [30][31][32][33]. It is not clear how such cell-cycle dependent expression affects the stochastic dynamics of protein levels in single cells. To systematically investigate this question, we formulate a model where a cell passes through multiple cell-cycle stages from birth to division. Cell cycle is coupled to bursty expression of a stable protein and the rate at which bursts occur depend arbitrarily on the cell-cycle stage (Fig. 1).
In addition to stochastic expression in bursts, the model incorporates other physiological noise sources, such as, variability in the duration of cell-cycle times and random partitioning of molecules between daughter cells at the time of division [34][35][36][37][38][39][40][41][42].
In the proposed model, some cell-to-cell variability or noise in the protein level is simply a result of cells being in different cell-cycle stages (i.e., asynchronous population). We illustrate a novel approach that takes into account such cell-cycle effects, and quantifies the noise contribution just from bursty expression and partitioning errors. Formulas obtained using this approach reveal that cell-cycle dependent expression considerably alters noise levels, always affecting contributions from bursty expression and partitioning errors in opposite ways. Intriguingly, our results show existence of optimal strategies to synthesize a protein within the cell cycle that minimize noise contributions for a fixed mean protein level. For example, the noise contribution from bursty expression is minimal when the protein is synthesized only towards the end of cell cycle. We discuss intuitive reasoning behind these optimal strategies, and provide examples of proteins that are expressed in this fashion to enhance fidelity of cell-cycle decisions.

Cell division
Cell-cycle stage 1 Cell-cycle stage n Cell-cycle stage 2 Cell-cycle stage 3 Top: The outer loop shows an individual cell from birth to division passing through cell-cycle stages C 1 , C 2 , . . . , C n , with transition rates between stages give by λ i , i ∈ {1, 2, . . . , n}. The cell is born in stage C 1 and division is initiated in C n . The inner loop (transcriptional cycle) represents the rate at which protein expression bursts occur and is given by k i in cell-cycle stage C i . Bottom: Representative trajectory of the protein level in an individual cell through multiple cell cycles (dashed lines). In this case, the transcription rate is assumed to double at the cell-cycle midpoint due to replication-associated increase in gene dosage. The spike train above represents the firing times of burst events. Steady-state distribution of the protein copy numbers obtained from running a large number of Monte Carlo simulations is shown on the right. The cell cycle was modeled by choosing n = 20 stages with equal transition rates between. Protein expression was assumed to occur in geometric bursts with B = 10 and molecules were partitioned between daughter cells based on a binomial distribution.

Model coupling cell cycle to gene expression
We adopt a phenomenological approach to model cell cycle and divided it into n stages C 1 , C 2 , . . ., C n . A newborn cell is in stage C 1 and transitions from C i to C i+1 with rate λ i . In stage C n , cell division is initiated with rate λ n , and upon division the cell returns to C 1 . In the stochastic formulation of this model, the cell resides in stage C i for an exponentially distributed time interval with mean 1/λ i , and cell-cycle duration is a sum of n independent, but not necessarily identical, exponential random variables. These stages can be mathematically characterized by Bernoulli processes c 1 (t), c 2 (t), . . ., c n (t), where c i (t) = 1 when the cell is in stage C i and c i (t) = 0 otherwise. Based on the model structure, these processes satisfy The latter equality results from the fact that only one of the c i can be equal to 1 at any given time. In addition, where the symbol denotes the expected value. Next, we describe the coupling between the cell cycle and stochastic expression models.
We assume that gene-expression bursts occur at a Poisson rate k i in cell-cycle stage C i . Using the abovedefined Bernoulli processes, the burst arrival rate can be compactly written as n i=1 k i c i (t). Let x(t) denote the number of protein molecules in a singe cell at time t. Then, whenever burst events occur, the protein level is reset as where the burst size B ∈ {0, 1, 2, . . . } is a random variable independently drawn from an arbitrary distribution, and reflects the net contribution of transcriptional and translational bursting. As is true for most proteins in E. coli and S. cerevisiae, we assume a stable protein without any active degradation between burst events [43][44][45]. At the time of cell division (as dictated by the cell-cycle model), the protein molecules are randomly partitioned between daughters. This corresponds to the following reset that is activated during division where the mean and variance of x + (level just after division) conditioned on x (level just before division) are given by respectively. The first equation in (5) shows that the number of molecules is approximately halved during division, while the second equation quantifies the stochasticity in the partitioning process through the parameter α. The ideal case of zero partitioning errors corresponds at α = 0, where x + (t) = x(t)/2 with probability one. Binomial partitioning, where each molecule has an equal chance of ending up in one of the two daughter cells, is given by α = 1 [46][47][48]. Finally, values of α > 1 represent additional noise in the partitioning process that arise when protein molecules form multimers, or reside in organelles that are themselves subject to binomial partitioning [34,49]. The overall model coupling cell cycle to expression is illustrated in Fig. 1 together with a representative trajectory of x(t).

Mean protein level for cell-cycle driven expression
We illustrate an approach based on closing moment dynamics for deriving an exact analytical formula for the mean protein level. The first step is to obtain differential equations describing the time evolution of the statistical moments for x(t) and c i (t). These equations can be derived using the Chemical Master Equation (CME) corresponding to the stochastic model presented in the previous section (see Appendix A in SI). In particular, time evolution of the means (first-order moments) is given by Steady-state analysis of (6a) yields the average value of Bernoulli processes as which can be interpreted as the fraction of time spent in the cell-cycle stage C i . We use to denote the expected value of a stochastic process as t → ∞.
Note that the dynamics of x in (6b) is "not closed", in the sense that it depends on second-order moments xc n . This leads to the well-known problem of moment closure that often arises in stochastic chemical kinetics [50][51][52][53][54][55][56][57]. It turns out that in this case, the model structure can be exploited to automatically close moment equations. This is done by augmenting the system of equations in (6) with the time evolution of moments of the form xc i At the first look, these equations are unclosed and depend on third-order moments of the form xc i c n .
However, exploiting the fact that c i c j = 0 from (1) leads to trivial closure After using (9) in (8a), the mean protein level can be computed exactly by solving a linear dynamical system given by (6) and (8). At steady-state, the linear equations can be solved recursively to yield where B is the mean protein burst size. Since c i 's are binary random variables, the mean protein level conditioned on the cell-cycle stage (i.e., synchronized cell population) can be obtained as Furthermore, using (10) and the fact that n i=1 c i = 1, Next, we investigate the mean protein level x in some limiting cases. Consider equal transition rates between cell-cycle stages λ i = n/T , which corresponds to an Erlang distributed cell-cycle durations with mean T and shape parameter n. In this scenario and further reduces to when the rate of expression bursts k i = k is constant throughout the cell cycle. Finally, in the limit of

Protein noise level for cell-cycle driven expression
The mathematical approach illustrated above is now used to obtain the noise in protein copy numbers. By noise, we mean the magnitude of fluctuations in x(t) that can be attributed to two stochastic mechanisms: bursty expression and random partitioning. Note that even in the absence of these mechanisms, there will be cell-cycle related fluctuations with protein molecules accumulating over time and dividing by half at random cell-division times. To correct for such cell-cycle driven fluctuations, we define another stochastic process y(t) that estimates the protein level if expression and partitioning were modeled deterministically. More specifically, within the cell cycle y(t) evolves according to the following differential equatioṅ which is the deterministic counterpart to the stochastic expression model presented earlier. At the time of cell division, the level is divided exactly by half with zero partitioning errors, i.e., α = 0 in (5). This allows us to define a new zero-mean stochastic process z(t) corrected for cell-cycle effects that measures the deviation in the protein count in the original stochastic model (x) from its expected levels if noise mechanisms were modeled deterministically (y). The protein noise level can now be defined through the dimensionless quantify measuring the steady-state variance in z(t) normalized by the square of the mean level. Since x = y and xy = y 2 (see Appendix B in SI), it can be rewritten as In the context of prior work, y 2 / y 2 is interpreted as the "extrinsic noise" in gene expression resulting from cell-cycle effects. It is typically measured by the covariance in the singe-cell expression of two identical copies of a gene with common cell-cycle regulation [58,59]. In contrast, CV 2 is the "intrinsic noise" resulting from stochasticity in gene expression and partitioning processes, and is measured by subtracting the extrinsic noise from the total noise Having appropriately defined the noise level, we next compute it using moment equations. The time evolution of the moments z 2 and z 2 c i are given by (see Appendix C in SI) and depend on the fourth-order moments z 2 c 1 c n . Exploiting the model structure as before, it follows from (1) that z 2 c 1 c n = 0 , and (6), (8), (21) constitute a "closed" set linear differential equations. Steady-state analysis yields the following noise level (see Appendix C in SI) Partitioning errors (22) that is inversely proportional to the mean x . The noise can be decomposed into two terms: the first term represents the contribution from protein synthesis in random bursts and depends on the statistical moments of the burst size B. The second term is the contribution from partitioning errors and depends linearly on α.
Recall that α measures the degree of randomness in partitioning of molecules between daughter cells, and is defined through (5). Interestingly, results show that the effect of cell-cycle regulation on the noise level can be quantified through a single dimensionless parameter that is uniquely determined by the number of cell-cycle stages in the model (n), transition rates between stages (λ i ), and protein synthesis rates across stages (k i ). Note from (22) that β affects the noise terms in opposite ways -any coupling of cell-cycle to expression that increases β will attenuate the contribution from bursty expression but amplifies the contribution from partitioning errors. Finally, we point out that in the case of non-bursty expression (B = 1 with probability one) and binomial partitioning (α = 1) and the noise level is always consistent with that of a Poisson distribution 1 irrespective of the value of β, and hence the form of cell-cycle regulation.

Optimal cell-cycle regulation to minimize noise
We explore how different forms of cell-cycle regulation affect CV 2 and begin with the simplest case of a constant synthesis rate k i = k, i ∈ {1, 2, . . . , n} throughout the cell cycle. This case would correspond to a scenario where the net rate of expression (across all copies of a gene) remains invariant to replication- 1 The coefficient of variation squared for a Poisson distributed random variable is inverse of its mean associated changes in gene dosage, as has recently been shown in different organisms [32,33]. Further assuming equal transition rates λ i = n/T (Erlang distributed cell-cycle durations) which reduces to β = 2 as n → ∞. Thus, in this important limit of no cell-cycle regulation (equal k i 's) and deterministic cell-cycle duration (large n), Next, consider the following strategies for coupling cell cycle to gene expression: 1. The burst arrival rate is assumed to increase by two-fold at the cell-cycle midpoint due to gene duplication. Assuming even n, this corresponds to 2. Expression only occurs at the start of cell cycle, i.e., k 1 = k and all other k i 's are zero.
3. Expression only occurs at the end of cell cycle, i.e., k n = k and all other k i 's are zero.
4. Expression only occurs at the cell cycle midpoint, i.e., k n 2 = k and all other k i 's are zero.
For a mathematically controlled comparison, the parameter k is adjusted using (12) from case-to-case so as to maintain a fixed average number protein molecules. The noise levels corresponding to the different forms of cell-cycle regulation are illustrated in Fig. 2. Interestingly, duplication of the protein expression rate within the cell cycle leads to a lower noise contribution from bursty synthesis, as compared to a constant rate throughout the cell-cycle. Moreover, expressing the protein only at the start (end) of cell cycle yields the highest (lowest) noise contribution from bursty synthesis. As expected from (22), the noise contribution from partitioning errors exhibits a completely opposite trend (Fig. 2).
The above analysis begs an intriguing question: Is there an optimal way to express a protein during the cell cycle that maximizes/minimizes noise levels? Since the form of cell-cycle regulation impacts CV 2 through β, this amounts to choosing k i 's so as to maximize/minimize it. Our result show that β is bounded from both below and above (see Appendix D in SI) The minimal value of β = 1 is attained when expression only occurs at the start of cell cycle, i.e., a non-zero k 1 and all other k i 's are zero. In this case with the lowest noise contribution from partitioning errors, but the highest contribution from bursty synthesis. In contrast, the maximum value of β = β max is attained when expression only occurs at the end of cell cycle, i.e., a non-zero k n and all other k i 's are zero. Note form (28) that β max → ∞ as λ n → ∞ (time spent in stage C n approaches zero), in which case and the noise contribution from bursty synthesis is minimal.
In summary, if bursty expression is the dominant source of noise (high B and low α), then CV 2 in minimized for a given x when the protein is made in the shortest time window just before cell division (Fig. 3). On the other hand, if randomness in partitioning error is dominant (low B and high α), the optimal strategy is to make the protein just after cell division. Finally, we point out that these optimal strategies also minimize stochastic variation in protein counts among synchronized cells, where all cells are in the same cell-cycle stage (see Appendix E in SI).

Discussion
Theoretical model of stochastic gene expression have played a pivotal role in understanding how noise mechanisms and biologically relevant parameters generate differences in protein/mRNA population counts between isogenic cells [60][61][62][63][64][65][66]. Here we have expanded this theory to consider cell-cycle regulated genes.
Our approach involves a general model of cell cycle, where a cell transitioning through an arbitrary number of stages from birth to division. The protein is assumed to be expressed in random bursts, and the rate at which bursts arrive varies arbitrarily with cell-cycle stage. In the case of translational bursting of proteins from mRNA, the burst arrive rate corresponds to the mRNA synthesis (transcription) rate. In contrast, for transcriptional bursting of mRNAs, the burst arrive rate corresponds to the frequency with which a promoter become transcriptionally active. The key contribution of this work is derivation of (12) and (22) that predict the protein mean and noise levels for a given form of cell-cycle regulation.
Derivation of noise formulas enable uncovering of optimal cell-cycle regulation strategies to minimize CV 2 for a fixed mean protein level. In the physiological case of large bursts (B 1) and binomial partitioning of proteins between daughter cells (α = 1), the contribution from bursty synthesis dominates CV 2 . Our results show that in this scenario, expression of the protein just before division is the optimal strategy (Fig. 3). Intuitively, such a strategy can be understood in the context of the number of burst events from birth to division needed to maintain a given x throughout the cell-cycle. It turns out that this number Protein level in an individual cell across multiple cell cycles for two strategies: a fixed transcription rate throughout the cell cycle (top) and transcription only occurring just before cell division (bottom). Trajectories obtained via Monte Carlo simulations are shown for the stochastic model (blue) and a reduced model where noise mechanisms are modeled deterministically (gray). These levels are subtracted to obtain a zeromean stochastic process z(t), where fluctuations resulting from cell cycle are removed (black). Steady-state distribution of z obtained from 10, 000 MC simulation runs is shown on the right, and the bottom strategy leads to lower variability in z for the same mean protein level. Cell cycle and expression was modeled as in Fig. 1 and burst arrival rates were chosen so as to ensure a average protein copy number of 150 molecules per cell in both cases.
is highly dependent on the form of cell-cycle regulation. Hence, any strategy that requires more burst events to maintain the same mean protein level, lowers noise through more effective averaging of the underlying bursty process, albeit being more energy inefficient. For example, if protein production only occurs at the end of cell cycle, then on average, x number of proteins have to added just before cell division. This corresponds to x / B number of burst events per cell cycle. If proteins were only expressed at the start of cell cycle, then one needs to add only x /2 number of molecules, half as much as the earlier strategy. If proteins were made at a constant synthesis rate throughout the cell cycle, then on average, 2 x /3 number of protein are added per cell cycle, which is higher than the early-expression strategy but lower than the late-expression strategy. In summary, gene product synthesis just before division requires production of the most number of protein molecules to maintain a fixed mean level within the cell-cycle, and hence, provides the most effective noise buffering through averaging of burst events. Next, we provide two recent examples of proteins that are indeed expressed in this fashion.
The green alga C. reinhardtii has a prolonged G 1 phase, where the size of a newborn cell increases by more than 2-fold. This long G 1 phase is followed by an S/M phase. Here the cell undergoes multiple DNA replication and fission cycles creating 2 d daughter cells, where d is number of rounds of division.
Recent studies suggest that the number of rounds of division is controlled by a protein CDKG1, that is only expressed just before exit from G 1 [67]. Another example, is the protein Whi5 in budding yeast S. cerevisiae and its level controls the transition of cells past the Start checkpoint. This protein in not expressed in G 1 , and is only synthesizes late in the cell cycle [68,69]. While such selective expression of these proteins plays a critical role in coupling cell size to cell-cycle decision, it may also minimize intrinsic fluctuations in protein levels from the innate stochasticity in gene expression. Clearly, a more systematic study exploring the role of noisy expression on the fidelity of these cell-cyle decisions is warranted.
It is important to point out that our analysis made various simplifying assumptions, such as, i) Excluding time evolution of cell size and size-dependent expression; ii) Instantaneous transcriptional and translational bursts that correspond to short-lived mRNAs and active promoter states; iii) Cell-cycle durations being independent random variables, implying no correlation between the division times of mother and daughter cells.
While many of these assumptions are clearly violated for cellular systems, they were necessary to obtain exact analytical solutions that provide novel insights into noise control by synchronizing gene expression to cell cycle. Further work will focus on relaxing these assumptions, in particular, the first assumption of incorporating cell size into the model. This will allow investigation of both concentration and copy number of gene products in single cells, and some recent work on modeling stochastic dynamics of cell size has already been done [70][71][72].

Author contributions
AS defined the problem and formulated the approach. MS did the mathematical derivations and both authors collaborated on writing the paper.

Acknowledgments
AS is supported by the National Science Foundation Grant DMS-1312926. We thank Cesar Vargas-Garcia and Khem Ghusinga for feedback on the manuscript.

A Moment equations describing the model
Based on standard stochastic formulation of chemical kinetics [73,74], the model describing x contains the following stochastic events Protein production: x Cell stage evolution: where the probability of having a burst of j molecules is given by p j . Whenever an event occurs, the states of the system change based on the stochiometries given in (31). On top of the arrows we showed the event propensity function ψ(x, c i ), which determines how often reactions occur, i.e., the probability that an event occurs in the next infinitesimal time interval (t, t + dt] is ψ(x, c i )dt. Time derivative of the expected value of any function ϕ(x, c i ) for this system can be written as [75] d Choosing ϕ to be x and c i , i = {1, 2, . . . , n} results in the equation (6) in the main article.

B Moment dynamics of y
The model describing x and y includes the stochastic events Protein production: x Cell stage evolution: Cell division: and the deterministic production of yẏ Time derivative of the expected value of any function ϕ(x, y, c i ) for this system can be written as [75] d ϕ(x, y, where the first term in the right-hand side is contributed from stochastic events and the second one is contributed from (34). The propensity function of events is given by ψ(x, y, c i ). The mean dynamics of y can be written by choosing ϕ to be y Dynamics of y is not closed and depends to moments yc n , hence in order to have a closed set of equations we add new moments dynamics by selecting ϕ to be yc i Dynamics of y and yc i , j ∈ {1, . . . , n} are the same as dynamics of x and xc i , j ∈ {1, . . . , n} presented in (6b) and (8) in the main text, hence x = y and xc i = yc i .
Further, dynamics of xy can be written as In order to have a closed set of equations we add dynamics of xyc i By having a closed set of equations related to xy, in the next step we add dynamics of y 2 and y 2 c i Using the fact that x = y and xc i = yc i , equations (40), (38), and (39) in steady-state results in y 2 = xy and y 2 c i = xyc i .

C Calculation of z 2
The random variable z is governed via Further in the time of division, z + is defined as Hence the model by taking into account z contains the following stochastic events Protein production: x Cell stage evolution: Cell division: and deterministic dynamics of z given in (41b). Time derivative of the expected value of any function ϕ(x, z, c i ) for this system can be written as [75] d ϕ(x, z, where the first term in the right-hand side is contributed from stochastic events and the second one is contributed from (41b). The propensity function of events is given by ψ(x, z, c i ).
By choosing ϕ to be z 2 and z 2 c i , i = {1, . . . , i} we have the following moment dynamics Note that just one of the binary states c i can be 1 at a time, thus z 2 = n i=1 z 2 c i . In order to calculate the terms z 2 c i we need to express the term z 2 c n as the first step. This term can be calculated by analyzing equation (45a) in steady-state By using a recursive process we calculate moments z 2 c i : we calculate z 2 c 1 by substituting equation (46) in equation (45b). Then we use the definition of z 2 c 1 to calculate z 2 c 2 from equation (45c) and so on Summing up all the term in equation (47) results in z 2 Finally, protein noise level can be written as where
Note that a 1 > a 2 > . . . > a n ⇒ β ≤ a 1 a n , where equality happens when all k i s are zero except k n . Using the same methodology one can see that minimum of β happens when all the rates are zero except k 1 . The minimum value of β is one.

E Noise in synchronized cells
Statistical moments conditioned on the cell cycle stage C i can be obtained using In order to calculate noise in synchronized cells we need to calculate x 2 c i d x 2 c 1 dt = 2k 1 xc 1 + k 1 B 2 + 1 4 αλ n xc n + λ n 4 x 2 c n − λ 1 x 2 c 1 , (55a) In order to calculate x 2 c n we introduce the moment dynamics of x 2 hence in steady-state By having xc i and x 2 c i from (10) and (58), we can calculate mean and noise in synchronized cells.
Using (54) yields the following conditional mean Further, the protein noise level given that cells are in stage C i is given by where