Making sense of snapshot data: ergodic principle for clonal cell populations

Population growth is often ignored when quantifying gene expression levels across clonal cell populations. We develop a framework for obtaining the molecule number distributions in an exponentially growing cell population taking into account its age structure. In the presence of generation time variability, the average acquired across a population snapshot does not obey the average of a dividing cell over time, apparently contradicting ergodicity between single cells and the population. Instead, we show that the variation observed across snapshots with known cell age is captured by cell histories, a single-cell measure obtained from tracking an arbitrary cell of the population back to the ancestor from which it originated. The correspondence between cells of known age in a population with their histories represents an ergodic principle that provides a new interpretation of population snapshot data. We illustrate the principle using analytical solutions of stochastic gene expression models in cell populations with arbitrary generation time distributions. We further elucidate that the principle breaks down for biochemical reactions that are under selection, such as the expression of genes conveying antibiotic resistance, which gives rise to an experimental criterion with which to probe selection on gene expression fluctuations.


Introduction
Exploring the consequences of cell-to-cell variability is crucial to understand the functioning of endogenous and synthetic circuitry in living cells [1][2][3][4]. In clonal cell populations, differences between cells arise from several dynamical effects. An important source of heterogeneity is the intrinsic stochasticity in biochemical reactions [5][6][7]. Equally important, but often overlooked, is the substantial variability of individual cell division timings. In fact, two sister cells do not divide at the same time [8][9][10], which leads to more heterogeneous cell ages in clonal populations. Although both of these factors have been studied independently, their contributions cannot easily be separated when cells are growing and dividing [11,12]. Approaches that allow to investigate the interplay of these effects remain widely unexplored.
Stochasticity in the levels of molecules per cell is commonly modelled using the stochastic simulation algorithm [5,13]. While this approach fares well for non-growing cells, it does not take into account the fact that molecule levels per cell need to double over the cell division cycle, at least on average. A number of studies therefore considered lineages that track the biochemical dynamics inside a single dividing cell over many generations [14][15][16][17][18][19]. Such approaches are well equipped to model the statistics observed in mother machines [20,21], for example, or using cell tracking [22] that allow monitoring intracellular biochemistry in isolated cells over time. Modelling approaches that also account for the substantial variations observed in cell division timing are only currently being developed [23,24]. Generation times of single Escherichia coli [10] and budding yeast cells [25], for example, vary up to 40% and 30% from their respective means, and similar values have been observed in mammalian cells [26].
On the other hand, population snapshots are commonly used to quantify heterogeneity across clonal cell populations. Such data are obtained from flow cytometry [27] or smFISH [28], for instance. An important source of heterogeneity in these datasets stems from the unknown cell-cycle positions [29]. Sorting cells by physiological features-such as using cell-cycle markers, DNA content or cell size as a proxy for cell-cycle stage-are used to reduce this uncertainty [27,30,31]. It has also been suggested that simultaneous measurements of cell age, i.e. the time interval since the last division, could allow monitoring the progression of cells through the cell cycle from fixed images [30][31][32][33]. Presently, however, there exists no theoretical framework that addresses both cell-cycle variability and biochemical fluctuations measured across a growing cell population, and thus we lack the principles that allow us to establish such a correspondence.
In applications, it is often assumed that the statistics observed over successive cell divisions of a single cell equals the average over a population with marked cell-cycle stages at a single point in time [34]. In statistical physics, such an assumption is referred to as an ergodic hypothesis, which once it is verified leads to an ergodic principle. Such principles certainly fare well for non-dividing cell populations, but it is less clear whether they also apply to growing populations, in particular, in the presence of fluctuating division times of single cells. While this relationship can be tested experimentally [35,36], we demonstrate that it is also amenable to theoretical investigation.
In this article, we develop a framework to analyse the distribution of stochastic biochemical reactions across a growing cell population. We first note that the molecule distribution across a population snapshot sorted by cell ages disagrees with the statistics of single cells observed in isolation, similarly to what has been described for the statistics of cell-cycle durations [8,37,38]. We go on to show that a cell history, a single cell measure obtained from tree data describing typical lineages in a population [39][40][41][42][43], agrees exactly with agesorted snapshots of molecule numbers. The correspondence between histories and population snapshots thus reveals an ergodic principle relating the cell-cycle progression of single cells to the population. The principle gives important biological insights because it provides a new interpretation to population snapshot data.
In the results, we investigate the differences of the statistics of isolated cell lineages and population snapshots. Section 2.1 develops a novel approach to model the stochastic biochemical dynamics in a growing cell population. We derive the governing equations for an age-sorted population and formulate the ergodic principle. In §2.2, we demonstrate this principle using explicit analytical solutions for stochastic gene expression in forward lineages and populations of growing and dividing cells. Our results are compared with stochastic simulations directly sampling the histories of cells in the population. Finally, in §2.3, we elucidate using experimental fluorescence data of an antibiotic-resistance gene that testing the principle allows us to discriminate whether a biochemical process is under selection.

Results
Several statistical measures can be used to quantify the levels of gene expression in single cells and populations. Distributions obtained across a cell population, such as those taken from static images, represent the final state of a growing population (figure 1a, shaded green cells). Cells in isolation, by contrast, can be modelled as random paths in the lineage tree, denoted as forward lineages (figure 1a, black line), that follow either of the two daughter cells with equal probability. Stochastic simulations show that the distributions of these two statistics disagree (figure 1b), and thus, apparently, the population dynamics violates the ergodic hypothesis. In the following analysis, we provide a novel ergodic principle relating how single cells progress through the cell cycle to the distribution of a growing cell population. To formulate the principle, we introduce histories (figure 1a, red line), a single-cell statistic that has the same distribution as a population that is sorted by cell ages (figure 1c,d). Such histories are obtained by choosing an arbitrary cell in the population and tracing its evolution back to the ancestor from which the whole population originated.

Biochemical dynamics in cell populations
We model stochastic biochemical reactions in a population of growing and dividing cells. We assume that each cell contains a pool of interacting biochemical species X 1 , X 2 , . . ., X N S reacting via a network of R intracellular reactions of the form where r ¼ 1, . . ., R and n + j,r are the stoichiometric coefficients. To model the effect of cell divisions, we associate to each cell an age t measuring the time interval from cell birth. If cells divide with an age-dependent rate g(t), which is independent of the number of molecules x in the network under consideration, the division times t d of each cell are distributed according to Equivalently, for a given division-time distribution one obtains the corresponding rate function via : While the forward-lineage approach is summarized in appendix A, we here focus on the population dynamics. To this end, we associate with the state of the population the density of cells n(t, x, t) with molecule count x and age between t and t þ dt at time t. This quantity represents the outcome of repeated snapshots of the population growth process, which averages out variations in the total number of cells (see appendix B for a detailed derivation). The mean number of cells in the population is then given by ð2:2Þ Since the probability for a cell to divide at age t is given by g(t) dt, the rate of change in the cell density obeys where Q is the transition matrix for the biochemical reactions [w r (x À n r )n(x À n r , t, t) À w r (x)n(x, t, t)], ð2:3bÞ w r is the propensity and (n r ) i ¼ n þ ir 2 n 2 ir is the stochiometric vector of the r-th reaction. Equation (2.3a) must be supplemented by a boundary condition accounting for cell divisions. Because the number of newborn cells equals twice the number of dividing cells, the condition reads ð2:3cÞ The division kernel B(x j x 0 ) in equation (2.3c) models the inheritance of x daughter molecules from x 0 mother molecules and is given by  where B 1 (x j x 0 ) and B 1 (x 0 2 x j x 0 ) are the marginal distributions of inherited molecules for the two daughter cells (see appendix B for details). Note that the total molecule numbers are conserved during cell division. For example, when molecules are partitioned with equal probability between daughter cells, B 1 and B are binomial [16,44].
A simple algorithm that enables simulating the biochemical dynamics in the population exactly, which we will refer to as the First Division Algorithm, is given in box 1.
Step 2 simulates the transitions due to biochemical reactions, equation (2.3b), while step 3 implements the boundary condition (2.3c) for cell divisions. The density of cells with molecule count x and age t obtained from several snapshots then obeys equations (2.3).
While stochastic simulations are relatively easy to carry out, equations (2.3) are generally difficult to solve because they represent an integro-partial differential equation. To allow for analytical progress, we consider the long-term behaviour in which the population grows exponentially with rate l, n(t, x, t) N 0 e lt P(t, x) ð2:5Þ and identify P(t, x) as the joint distribution of cell ages and intracellular molecule counts in the population. We treat this distribution synonymously with the population snapshot. Using ansatz (2.5) in equation (2.3), we obtain Similarly, the corresponding boundary condition is given by equation (2.3c) but with replacing n(t, x) by P(t, x).

Age distribution in a population
The age distribution measures the fraction of cells in the population that reach a given age. It is obtained from marginalizing the population distribution over the molecule numbers Summing equation (2.6) and solving (see appendix C for details), we find that the age distribution obeys which recovers the results in [8,9,45]. The integral in the above equation denotes the probability that a cell has not divided before reaching age t. The population growth rate l needs to be determined from the boundary condition (appendix C), which leads to the characteristic equation ð2:9Þ also known as the Euler -Lotka equation [8,9,45]. The largest root of this equation yields the population growth rate l. For a discussion of the relation between the population growth rate and the mean division time, see [10,45].

Molecule-number distribution in an age-sorted population
Next, we consider the probability of observing x molecules in a cell of age t. This conditional probability is given by the number of cells with age t and molecule count x divided by the number of cells at that age It can be verified by plugging equation (2.10) into (2.6) that the distribution obeys the master equation Similarly, inserting equation (2.10) with (2.5) into boundary condition (2.3c), we deduce that the distribution of inherited molecules obeys We identify the density r with the ancestral division-time distribution [8,40] describing the past division-times in a growing cell population. Taken together, equations (2.11) describe the distribution in an age-sorted population. Combining these equations with the age distribution given in the previous section they contain the full statistical information of the population distribution. An extension of this result to heritable division times is provided in appendix C.
Since histories contain only ancestral cells, their division times are distributed according to the ancestral divisiontime distribution r, as has been pointed by Wakamoto et al. [40]. In figure 2a, we compare the division-time distributions in forward lineages and histories for variable division times. We observe that the distribution tails of w are suppressed in cell histories due to exponential dependence on the growth rate l. Specifically, we highlight the fact that cells with division times shorter than ln 2/l are over-represented in histories compared with forward lineages, while cells with longer division times are over-represented.
Box 1: First Division Algorithm to simulate a cell population up to time t f . 1. Initialization. At time t ¼ 0, initialize the cell population by assigning to each cell an age t i , a division time t d,i and molecule count x i . 2. Biochemical reactions. Determine the next dividing cell from j ¼ argmin i (t d,i 2 t i ) and Dt ¼ min i (t d,i 2 t i ). Advance the molecule numbers of each cell independently from age t i to t i þ Dt using the Gillespie algorithm and advance time from t to t þ Dt. 3. Cell division. Replace the dividing cell by two newborn daughter cells of zero age. Assign to one of these a molecule number distributed according to B 1 (x j x j ), depending on the mother's molecule count x j , and assign the remaining molecules to the other daughter. Assign to each daughter independently a division time distributed according to w(t d ).

Ergodic principle for cell populations
We now explain how these quantities can be computed from population data. To this end, we use the approach presented by Nozoe et al. [42] enumerating all lineages of a given population tree. We denote the molecule numbers in lineage j by (x 1,j (t 1,j ), . . ., x D,j (t d,j )) with D j completed cell divisions, where x i,j (t) is the time-course from birth to division of molecule numbers of the ith cell in that lineage. We then evaluate the average of a function f(x) over a lineage of cells at a given age t, is the number of cells in the lineage that reach age t before dividing.
Because forward lineages track either one of the two daughter cells with equal probability, the probability of choosing lineage j from the tree is 2 2Dj , which decreases with the number of cell divisions [42]. Because the division times in forward lineages follow the distribution w, we have The distribution under the integral is the lineage-probability described in appendix A. Next, we consider the distribution along histories that track an arbitrary cell from the population and trace back its evolution. The probability of choosing such a history is the same for every lineage and therefore equals the inverse number of cells N(t) in the population. The statistics of histories can thus be interpreted as a typical lineage. We note that equations (2.11) can be understood as describing the distribution along a lineage, similar to the forward-lineage approach develop in appendix A (cf. equations (A 1)), but by replacing the division-time distribution w with the one of the ancestral population r. Since histories are composed entirely of ancestral cells, their division times follow the ancestral division-time distribution r. It hence follows that histories posses the same statistics as in the age-sorted population, that is The distribution P(x j t) is the lineage distribution given by the solution of equations (2.11) and equals the distribution in an age-sorted population, as we have shown in §2.1.2. We thus formulate the ergodic principle as the average of the histories of single cells in a population obtained over many cell divisions equals the average over an age-sorted cell population at every time point. A more intuitive way of stating the result is that the endpoints of all lineages have the same distribution P(x j t) as cells of t in the population. Note that equations (2.13) and (2.14) are equal only for deterministic division times. In fact, most lineages f j are close to the history-average given by the right-hand side of equation (2.14), as we show in the following section through the use of examples ( §2.2). Before proceeding, we investigate the age distribution that yields the fraction of cells reaching age t in a history. Because division times in histories are distributed according to r and the probability that a cell has not divided before age t is where kt d l r is the mean of the distribution r, which arises as a normalizing constant (see appendix D). The result is notably different from the age-structure in a population P(t), compare equation (2.8). For example, the corresponding age distributions for gamma-distributed division times are compared in figure 2b. It thus follows that the average over a history of molecule numbers is not generally the same as the population average irrespective of age. The only exception to the rule are memoryless division times, i.e. constant division rates resulting in exponentially distributed division timings. In this case, equation (2.8) and (2.15) agree, but interestingly, the population distribution differs from the forward-lineage distribution.

Analytical solutions for forward lineages, histories and snapshot distributions
To obtain analytical solutions we make use of the generating function approach. For this purpose, we restrict ourselves to The boundary condition (2.11b) for the probability generating In the following, we demonstrate how explicit solutions for the lineage, histories and snapshot distributions can be obtained for simple examples of stochastic gene expression but arbitrary division-time distributions.

Molecule synthesis and degradation
As a first application, we consider a birth-death process.
Molecules are synthesized at a constant rate k 0 and are degraded via a first-order reaction with rate k 1 , To test whether differences between forward lineages and histories could be significant in a single population, we used the First Division Algorithm (box 1) to simulate a population tree corresponding to figure 1a. We then select histories by choosing cells at random from the population and tracing their evolution back through the population tree. Similarly, we choose forward lineages by following the first cell in the population over many cell cycles selecting either daughter cell with equal probability. A qualitative comparison of few representative time-courses shows that forward lineages exhibit fewer cell divisions but higher molecule counts than histories (figure 3a), consistent with the predicted agedistributions, which contain more old cells in forward lineages than in histories (figure 2b).
To quantify whether histories represent typical lineages in a finite population, we simulate 40 lineage trees up to a population size of N ¼ 100, 1000 and 10 000. We then calculate mean and second moments for each individual lineage in the tree and evaluate their distributions across all lineages (figure 3b,c). As the population size increases the distribution of first and second moments becomes centred about the moments of the history distribution which are smaller than the corresponding ones in forward lineages. In particular, the moments of histories correspond to the distribution modes verifying that histories are typical lineages, even for finite populations.
Next, we investigate how to analytically characterize the distributions of both processes. The generating function of the corresponding master equation obeys the evolution equation Because molecules are produced and degraded independently, their number x per cell can be separated into a sum of two independent contributions: (i) the amount of molecules newly produced and degraded up to age t, and (ii) the amount of molecules inherited after cell division. Clearly, the first contribution is Poissonian and its mean is given by m(t) ¼ (k 0 /k 1 )(1 2 e 2k1 t).
The second contribution needs to be determined from the boundary condition as we show in the following. The solution to equation (2.19) can be written as a product of generating functions of the aforementioned contributions where the first factor is the generating function of newly produced and degraded molecules until age t. Using this solution in equation (2.17) we find the condition ð2:21Þ If cell divisions occur at deterministic time intervals, the integral is straightforward. In this case, the distribution of inherited molecules p 0 is Poissonian with mean m p (t) ¼ ((1 2 e Àk1t d )/ (2 2 e Àk1t d ))(k 0 /k 1 ) e Àk1t and depends on the division-time t d . This result has been obtained earlier by Schwabe et al. [29], and as a particular case by Johnston et al. [19]. The solution turns out to be more involved in the presence of division-time variability. Certainly, equation (2.20) could be solved for particular choices of r(t d ). Since, however, the division-time variability is difficult to characterize, in general, we aim to solve this equation for arbitrary distributions. To this end, we expand G 0 (z À 1) ¼ X 1 n¼0 a n (z À 1) n ð2:22Þ and write an equation for the series coefficients a n . Plugging equation ( wherer is the Laplace transform of the division-time distribution in equation (2.11c). From the above formula all coefficients can be computed recursively using a 0 ¼ 1. It follows using G 0 ((z 2 1) e 2k1t ) in equation (2.22) and differentiating at z ¼ 1 that the probability of observing x molecules inherited after cell division is ) nÀx a n e Ànk1t : ð2:24Þ Note that the distribution depends on the cell age t because the inherited molecules are degraded over the cell cycle. Because the number of molecules produced up to age t is Poissonian, the total amount of molecules for a cell of age t obeys The distributions of forward lineages are obtained by substituting w for r in equation (2.23) (appendix A).
In figure 4a, we show the resulting distributions of inherited molecules (t ¼ 0) for different levels of cell-cycle variability (figure 4a,i) for the cases of an age-sorted population (solid lines) and for the lineage statistics (dashed lines). Division times are assumed to be gamma-distributed, which fits cell-cycle variability observed in many bacteria [8,10]. The molecule number distributions are obtained using either r or w in equation (2.23) and truncating equation (2.24) after the first 150 terms. In agreement with the theoretical predictions, lineage and histories distributions are similar for small division time variability (red lines) but become significantly different as variability increases (yellow lines). Also shown are the corresponding distribution for varying degradation rates (figure 4a,ii). We observe that forward lineages and history statistics are comparable for unstable molecules (fast degradation) but not for stable ones (slow degradation). This finding is in line with the intuition that rapid degradation averages out timing fluctuations.
To support the numerical results, we evaluate the mean number of molecules at birth E[x j 0] ¼ a 1 ¼ (k 0 =k 1 )(1 Àr(k 1 ))= (2 Àr(k 1 )). For fast degradation, the expression reduces to E[x j 0] 1 2 (k 0 =k 1 ) in both forward lineages and histories because it is independent of the division times. For slow degradation, however, we obtain E[x j 0] k 0 E r (t d ) in histories and with equality for deterministic division times [10], we conclude that cells in histories and the population contain fewer molecules than in forward lineages. One can further show that the difference between population and forward lineages E[ increases with the coefficient of variation of timing fluctuations.
In figure 4b, we illustrate the corresponding distributions for t ¼ 0.5 corresponding to half of the mean generation time.
We observe that both cell-cycle variation (figure 4b,i) and degradation (figure 4b,ii) mitigate the discrepancies between lineage and history statistics. Intuitively, this could also be concluded from the fact that equation (2.19), and similarly the distribution of molecules equation (2.11a), approaches a steady state independent of the number of inherited molecules for large t. For all cases, we verified the ergodic principle by stochastic simulations using the First Division  To investigate the effect of the overall distribution in the population irrespectively of cell age, we numerically integrate the analytical solution (2.25) over the agedistribution P(t) of the population given by equation (2.8), ; the result is shown in figure 4c. The corresponding age distribution in a forward lineage is given in appendix D. We find that these predictions (solid lines) are in excellent agreement with the simulated distributions across populations (shaded areas) but not with the distributions across forward lineages (dashed lines).

Expression of a stable protein
Characteristic mRNA half-lives in E. coli are of the order of 5 min, while doubling times range from 20 min to several hours. Given the findings of the previous section, this suggests that generation time variability has little effect on mRNA distributions. Protein half-lives in living cells, however, occur on timescales much longer than the doubling time, meaning that many proteins are essentially stable and diluted mainly through cell divisions.
To investigate protein variability in growing cell populations, we model the expression of such a protein including mRNA transcription via the reactions and mRNA ! k2 mRNA þ protein: ) ð2:26Þ Using the First Division Algorithm (box 1), we simulate a population of dividing cells and collect representative forward lineages and histories (figure 5a). Given that empirical distributions of rescaled log-division times in E. coli are invariant and bell-shaped across conditions [46], we assume lognormal distributed division times. We observe that protein time-courses corresponding to forward lineages have higher protein levels and are noisier than histories. Deriving analytical expressions for the distribution of proteins is more involved because the amount of produced protein generally depends on the amount of inherited mRNA. We use the simplifying assumption that the mRNA half-life is much shorter than the cell-cycle time, for which the reactions (2.26) reduce to a single reaction [47,48] ; ! k0 m Â protein; ð2:27Þ with proteins being produced in stochastic bursts of size m with distribution n m . For the reactions (2.26), the burst size distribution n m is geometric with mean b ¼ k 2 /k 1 corresponding to the mean number of proteins produced per mRNA lifetime [48]. The rate of change for the generating function of this process satisfies where g(z À 1) ¼ Àk 0 P 1 m¼0 n m (z m À 1) ¼ bk 0 z=ðbz À 1Þ. It can be verified using equation (2.28) and the boundary condition (2.17) that the exact generating function solution is ð2:29Þ where K(z) ¼ P 1 n¼1 lnr(g((z À 1)=2 n )) andr is the Laplace transform of r. Marginalizing the above equation over the age distribution, we obtain the generating function for the protein number distribution in the population G(z) ¼ e K(z)P (g(z À 1)): ð2:30Þ An explicit expression for the Laplace transform of the age-distributionP is given in appendix D, equation (D 3).
The distributions corresponding to the generating function, equation (2.29), are obtained numerically via the inverse transform duG(e iu j t) e Àiux : ð2:31Þ Note that because the moment-generating function does not exist for the lognormal distribution, we computed the Laplace transform of the division-time distribution directly. In practise, it was sufficient to truncate the sum in K(z) after the first 10 terms. In figure 5b, we show the resulting distribution of inherited protein in a population (t ¼ 0, red line). We observe excellent agreement between the theoretical solution and the history statistics obtained from stochastic simulations of 40 population trees (shaded red area), which verifies the ergodic principle. Instead, the distribution obtained across forward cell lineages shows a broader

Breakdown of the ergodic principle under selection
So far, we considered traits that do not affect the cell division rate. To test the validity of the principle in the opposite case, we simulate the production of a biomolecule whose levels increase division rate. We find that the distributions in histories do not coincide with the age-sorted population (figure 6a). The history distribution exhibits significantly larger tails resulting in a higher mean number of molecules (21.6 in histories versus 12.4 in population). This breakdown suggests that one could in principle discriminate the factors that affect cell fitness.
To investigate this effect, we consider Fisher's reproductive value [49], denoted by n, which counts the relative number of future offspring for a cell in a given phenotypic state. Interestingly, it can be defined as the ratio of the distribution of histories and the age-sorted population statistics [50]. For a given trait x observed in cells of age t, we here use their conditional distributions that provide the following factorization The first factor is the reproductive value due expression of x, involving the distributions across the age-sorted population P(x j t) and across histories r(x j t). The second factor, n(t) ¼ P h (t)/P(t), denotes the contribution from other factors affecting the division rate and is given by the ratio of the age distribution of histories P h (t) and the age distribution of the population P(t), which are generally different [10,40]. According to the ergodic principle the first contribution equals 1, when x does not affect the division rate, and thus the reproductive value is solely determined by 'other' factors that are not explicitly known and modelled via a stochastic interdivision time. When x increases division rate, higher levels of x can be selected upon and the reproductive value n(x j t) increases with x (figure 6a, inset). Thus, when the molecule levels are selected upon, ancestral cells contain more molecules on average than cells of the present population (figure 6a). We investigate this dependence using recently acquired tree data of the antibiotic-resistance gene smR conveying resistance to the antibiotic streptomycin [42]. The protein is fused to a fluorescent reporter allowing direct observation. We obtain age-sorted distributions using measured total fluorescence after cell division (t ¼ 0), while lineage-weighted distributions are computed using equations (2.13) and (2.14). In the absence of antibiotic treatment, the distributions of new-born cells in the age-sorted, histories and forward lineages are essentially the same (figure 6b). In agreement with the ergodic principle, the reproductive value due to the expression of the protein is constant (figure 6b, inset). In the presence of sub-inhibitory antibiotic doses, the distribution in histories differ from the one acquired across the population (figure 6c) and, moreover, the reproductive value increases with fluorescence (inset), which indicates selection on higher expression levels. These results are in agreement with the known function of the gene and highlight that testing the proposed ergodic principle can be used to probe selection in cell populations.

Discussion
We presented a modelling approach for stochastic biochemical reactions in dividing cell populations. The theory enabled us to identify an ergodic principle between single cells and the population, which can be stated as the average of the histories of single cells in a population obtained over many cell divisions equals the average over an age-sorted cell population at any single time point. Cell histories capture the statistics of typical lineages in a population, which are obtained by choosing an arbitrary individual of a clonal population and tracking back its evolution to the ancestor from which the population originated. This principle provides an interpretation of population snapshot data because it identifies the sample paths that are associated with age-sorted snapshot data with typical cell histories. Histories thus allow us to characterize the progression of single cells in the population through the cell cycle. We emphasized that the statistical difference between forward lineages and histories arises from variable division times, cells dividing faster than the population growth rate being over-represented in histories as compared to forward lineages. These deviations are expected to be significant since generation times of single cells are highly variable in microbial cell populations.
It is important to point out that the principle requires the cell age to be known. The exception to this rule is the case of exponential division times for which the time-average of histories equals the one across snapshots. In the general case, gating techniques as used in flow cytometry can be used to narrow the range of physiological differences and thus correspond to age-sorted distributions, which according to the principle agree with the histories of single cells. Mother machines, on the other hand, provide statistics of individual forward lineages assuming that cells divide symmetrically. Thus our findings suggest potential differences when studying the dynamics of gene expression in different experimental devices, even when cell-cycle positions are explicitly known.
Different notions of ergodicity have been discussed in the literature. Rocco et al. [17], for instance, challenged the ergodic hypothesis between an ensemble of individual lineages and their time-averages. They found that the hypothesis only holds true for fast degrading molecules. We found a similar dependence for the ergodicity between snapshots of a growing population and individual histories, which is justified when studying the dynamics of short-lived molecules such as mRNAs. The majority of proteins in E. coli and yeast, however, have half-lives much longer than typical population doubling times [51,52], for which the present ergodic principle accounts for. We demonstrated this dependence using analytical solutions to lineage, history and population distributions of simple stochastic models of gene expression.
Although we developed the approach for biochemical reactions, the general principle applies to heritable non-genetic traits, such as plasmid numbers or chromatin states. Specifically, it applies to any trait of interest that does not affect the cell division rate. A counterexample of such a trait is cell size to which the history analysis has recently been applied [43]. Here, we found the that the likelihood to find an ancestral cell with a positively selected trait value is higher than for cells of the present population. Thus testing the ergodic principle could be used to experimentally probe this effect on cell fitness for any trait of interest.

Apendix B. Master equation description of snapshot probabilities and snapshot densities
We here derive a stochastic description for the outcome of snapshots in a dividing cell population. The state of the system may be described by observations of molecule content and cellular age ft i , x i g i¼N , where N is the present number of cells. Note that t i and x i but also N are random variables for a single snapshot [9]. When cells are indistinguishable, a snapshot is equivalently characterized by the empirical distribution function Generally, more than a single snapshot is available, for instance by collecting snapshots from different microcolonies. Aggregating snapshots averages out the dependence on the individual stochastic realizations of ft i , x i g i¼N , and yields the snapshot density quantifying the frequency with which cells with molecule count x and age t are observed in the population. In the following, we first derive a master equation description for the functional probabilities P[s, t] of observing a particular realization of the snapshot s(x, t) at time t. Second, we perform the average over all possible snapshots to obtain the density n(x, t). An alternative approach to the one presented here would extend the kinetic framework given in [53] to incorporate biochemical reactions.
We consider two types of transitions: (i) cell division in which molecules are partitioned between the two daughter cells and (ii) biochemical reactions. The rate of cell division for a single cell of age t 0 is g(t 0 ) and hence the propensity of cell divisions in the population is ðB 3Þ At division, the mother cell is replaced with two new cells of zero age. The snapshot, therefore, changes instantaneously by where t 0 and x 0 are the respective age and molecule count of the mother cell, and x 1 and x 2 the molecule counts of the daughter cells. When a biochemical reaction with stoichiometry n r occurs, we replace the mother cell with molecule count x 0 by a cell of the same age but x 0 þ n r molecules. The snapshot therefore changes as follows: ðB 5Þ For brevity of notation, we define the step-operator, which acts as E +1 x0,t0 f[s( ] on any functional f of the snapshot function s(x, t). Combining the changes from divisions and R possible reactions, we obtain the change in the snapshot probability P[s, t], describing the probability of all possible snapshots, which obeys where we assume that B(x 1 , x 2 j x 0 ) is the probability of partitioning x 0 molecules into the proportions x 1 and x 2 of the daughter cells. The snapshot density n(x, t, t) from repeated experiments is obtained as follows: Thus multiplying equation (B 6) by s(x, t), integrating over all possible snapshots, and using the definition of the total derivative, we obtain the master equation for the snapshot density @ @t nðx; t, tÞ þ @ @t nðx, t, tÞ ¼ ÀgðxÞnðx, tÞ þ dðtÞ X ðw r ðx À n r Þnðx À n r , t, tÞ À w r ðxÞnðx, t, tÞÞ, where B 1 (x j x 0 ) ¼ P x2 B(x, x 2 j x 0 ) and B 2 (x j x 0 ) ¼ P x1 B(x 1 , x j x 0 ) are the marginal distributions of the partition probability. If the molecule numbers are conserved during cell division, we must have B 2 (x j x 0 ) ¼ B 1 (x 0 2 x j x 0 ). Integrating the above equation over a small interval Ð e Àe dt containing the newborn cells and using n(x, e) ! n(x, 0), n(x, 2 e) ! 0 and definition (2.4) of the main text, we find that the second and third term on the right-hand side of the above equation give the boundary condition (2.3c) of the main text. The remaining terms describe the rate of change in between divisions and are equal to equation (2.3a) with (2.3b) of the main text.
To account for cell divisions, we supplement the above equation by a boundary condition, which again replaces each mother cell by two daughter cells of zero age but with the trait x 0 replaced by x after cell division. The latter can be expressed as P(0, t d , x) ¼ 2 B(x j x 0 , t)P(t, t 0 d , x 0 ): ðC 5bÞ Summing equation (C 5a) over all possible trait values x and solving for P(t, t d ), we find that the age-division-time distribution is where u denotes the Heaviside function. The division times of newborn cells in the population follow P(t d j 0), which is obtained from The constant P(0) can be determined by integrating equations (C 5) over all possible values of t, t d and x and combining the results to obtain P(0) ¼ 2l: ðC 8Þ Specifically, marginalizing equation (C 6) over the division times t d , the age distribution becomes Interestingly, when division time is inherited this distribution is different from the one in a forward lineage. The above equation has first been derived by Powell [8] to investigate the effect of mother -daughter correlations, albeit using a different approach. The population growth rate is implicitly defined through the normalizing condition In particular, we note that the distribution below the integral is the ancestral division time distribution This distribution counts all divisions that occurred in a stationary population up to a certain point in time. The distribution of 'future' division times is P(t d ) which satisfies the relation

C.3. Distribution of a trait along histories
Finally, we turn our attention to the distribution of molecule numbers for cells of a given age. To this end, it is important to note that the molecule numbers are non-anticipating in that they are statistically independent of division time t d but they only depend on cell age P(x j t, t d ) ¼ P(x j t): ðC 15Þ It is worth pointing out that this statement assumes the division times to be independent of the trait of interest.
In effect, the distribution of molecules at a certain cell age evolves according to @ @t P(x j t) ¼ Q P(x j t): ðC 16aÞ The boundary condition can be derived by letting P(t, x, t d ) ¼ P(x j t)P(t, t d ) in equation (C 5b), which gives P(x j 0) ¼ B(x j x 0 , t d )r(t d )P(x 0 j t d ): ðC 16bÞ We observe that this is the same equation as obtained in a forward lineage but replacing the lineage distribution of division times with the ancestral distribution r(t d ), equation (C 12). Because histories contain only ancestral cells, the above equation also determines the distribution in cell histories. In summary, we have shown that the distribution of a cellular trait along a history equals the distribution in a dividing cell population with inherited division times.