Making sense of snapshot data: ergodic principle for clonal cell populations
Abstract
Population growth is often ignored when quantifying gene expression levels across clonal cell populations. We develop a framework for obtaining the molecule number distributions in an exponentially growing cell population taking into account its age structure. In the presence of generation time variability, the average acquired across a population snapshot does not obey the average of a dividing cell over time, apparently contradicting ergodicity between single cells and the population. Instead, we show that the variation observed across snapshots with known cell age is captured by cell histories, a single-cell measure obtained from tracking an arbitrary cell of the population back to the ancestor from which it originated. The correspondence between cells of known age in a population with their histories represents an ergodic principle that provides a new interpretation of population snapshot data. We illustrate the principle using analytical solutions of stochastic gene expression models in cell populations with arbitrary generation time distributions. We further elucidate that the principle breaks down for biochemical reactions that are under selection, such as the expression of genes conveying antibiotic resistance, which gives rise to an experimental criterion with which to probe selection on gene expression fluctuations.
1. Introduction
Exploring the consequences of cell-to-cell variability is crucial to understand the functioning of endogenous and synthetic circuitry in living cells [1–4]. In clonal cell populations, differences between cells arise from several dynamical effects. An important source of heterogeneity is the intrinsic stochasticity in biochemical reactions [5–7]. Equally important, but often overlooked, is the substantial variability of individual cell division timings. In fact, two sister cells do not divide at the same time [8–10], which leads to more heterogeneous cell ages in clonal populations. Although both of these factors have been studied independently, their contributions cannot easily be separated when cells are growing and dividing [11,12]. Approaches that allow to investigate the interplay of these effects remain widely unexplored.
Stochasticity in the levels of molecules per cell is commonly modelled using the stochastic simulation algorithm [5,13]. While this approach fares well for non-growing cells, it does not take into account the fact that molecule levels per cell need to double over the cell division cycle, at least on average. A number of studies therefore considered lineages that track the biochemical dynamics inside a single dividing cell over many generations [14–19]. Such approaches are well equipped to model the statistics observed in mother machines [20,21], for example, or using cell tracking [22] that allow monitoring intracellular biochemistry in isolated cells over time. Modelling approaches that also account for the substantial variations observed in cell division timing are only currently being developed [23,24]. Generation times of single Escherichia coli [10] and budding yeast cells [25], for example, vary up to 40% and 30% from their respective means, and similar values have been observed in mammalian cells [26].
On the other hand, population snapshots are commonly used to quantify heterogeneity across clonal cell populations. Such data are obtained from flow cytometry [27] or smFISH [28], for instance. An important source of heterogeneity in these datasets stems from the unknown cell-cycle positions [29]. Sorting cells by physiological features—such as using cell-cycle markers, DNA content or cell size as a proxy for cell-cycle stage—are used to reduce this uncertainty [27,30,31]. It has also been suggested that simultaneous measurements of cell age, i.e. the time interval since the last division, could allow monitoring the progression of cells through the cell cycle from fixed images [30–33]. Presently, however, there exists no theoretical framework that addresses both cell-cycle variability and biochemical fluctuations measured across a growing cell population, and thus we lack the principles that allow us to establish such a correspondence.
In applications, it is often assumed that the statistics observed over successive cell divisions of a single cell equals the average over a population with marked cell-cycle stages at a single point in time [34]. In statistical physics, such an assumption is referred to as an ergodic hypothesis, which once it is verified leads to an ergodic principle. Such principles certainly fare well for non-dividing cell populations, but it is less clear whether they also apply to growing populations, in particular, in the presence of fluctuating division times of single cells. While this relationship can be tested experimentally [35,36], we demonstrate that it is also amenable to theoretical investigation.
In this article, we develop a framework to analyse the distribution of stochastic biochemical reactions across a growing cell population. We first note that the molecule distribution across a population snapshot sorted by cell ages disagrees with the statistics of single cells observed in isolation, similarly to what has been described for the statistics of cell-cycle durations [8,37,38]. We go on to show that a cell history, a single cell measure obtained from tree data describing typical lineages in a population [39–43], agrees exactly with age-sorted snapshots of molecule numbers. The correspondence between histories and population snapshots thus reveals an ergodic principle relating the cell-cycle progression of single cells to the population. The principle gives important biological insights because it provides a new interpretation to population snapshot data.
In the results, we investigate the differences of the statistics of isolated cell lineages and population snapshots. Section 2.1 develops a novel approach to model the stochastic biochemical dynamics in a growing cell population. We derive the governing equations for an age-sorted population and formulate the ergodic principle. In §2.2, we demonstrate this principle using explicit analytical solutions for stochastic gene expression in forward lineages and populations of growing and dividing cells. Our results are compared with stochastic simulations directly sampling the histories of cells in the population. Finally, in §2.3, we elucidate using experimental fluorescence data of an antibiotic-resistance gene that testing the principle allows us to discriminate whether a biochemical process is under selection.
2. Results
Several statistical measures can be used to quantify the levels of gene expression in single cells and populations. Distributions obtained across a cell population, such as those taken from static images, represent the final state of a growing population (figure 1a, shaded green cells). Cells in isolation, by contrast, can be modelled as random paths in the lineage tree, denoted as forward lineages (figure 1a, black line), that follow either of the two daughter cells with equal probability. Stochastic simulations show that the distributions of these two statistics disagree (figure 1b), and thus, apparently, the population dynamics violates the ergodic hypothesis. In the following analysis, we provide a novel ergodic principle relating how single cells progress through the cell cycle to the distribution of a growing cell population. To formulate the principle, we introduce histories (figure 1a, red line), a single-cell statistic that has the same distribution as a population that is sorted by cell ages (figure 1c,d). Such histories are obtained by choosing an arbitrary cell in the population and tracing its evolution back to the ancestor from which the whole population originated.
Figure 1. Ergodic principle between single cells and the population. (a) Lineage tree of a clonal population leading to heterogeneous distribution of molecule numbers per cell (green colour). Two types of lineage statistics characterize the tree: (i) forward lineages (black line) originate from a common ancestor, end at an arbitrary cell in the population, and (ii) histories start from an arbitrarily chosen cell in the population and end at a common ancestor (red line). The conceptual difference between these measures are the probabilities with which these lineages are selected. (b) Statistics of simulated forward lineages do not agree with snapshot distributions (see box 1 and caption of figure 5 for simulation details). (c) Statistics observed across the population and the statistics after sorting cells by age, which allows inferring the cell-cycle progression. (d) The ergodic principle states that the statistics of age-sorted cells equals the distribution observed along cell histories but not along forward lineages.
2.1. Biochemical dynamics in cell populations
We model stochastic biochemical reactions in a population of growing and dividing cells. We assume that each cell contains a pool of interacting biochemical species X1, X2, …, XNS reacting via a network of R intracellular reactions of the form
While the forward-lineage approach is summarized in appendix A, we here focus on the population dynamics. To this end, we associate with the state of the population the density of cells n(τ, x, t) with molecule count x and age between τ and τ + dτ at time t. This quantity represents the outcome of repeated snapshots of the population growth process, which averages out variations in the total number of cells (see appendix B for a detailed derivation). The mean number of cells in the population is then given by

A simple algorithm that enables simulating the biochemical dynamics in the population exactly, which we will refer to as the First Division Algorithm, is given in box 1. Step 2 simulates the transitions due to biochemical reactions, equation (2.3b), while step 3 implements the boundary condition (2.3c) for cell divisions. The density of cells with molecule count x and age τ obtained from several snapshots then obeys equations (2.3).
Box 1: First Division Algorithm to simulate a cell population up to time tf.
-
1. Initialization. At time t = 0, initialize the cell population by assigning to each cell an age τi, a division time τd,i and molecule count xi.
-
2. Biochemical reactions. Determine the next dividing cell from j = argmini (τd,i − τi) and Δt = mini (τd,i − τi). Advance the molecule numbers of each cell independently from age τi to τi + Δt using the Gillespie algorithm and advance time from t to t + Δt.
-
3. Cell division. Replace the dividing cell by two newborn daughter cells of zero age. Assign to one of these a molecule number distributed according to B1(x | xj), depending on the mother's molecule count xj, and assign the remaining molecules to the other daughter. Assign to each daughter independently a division time distributed according to φ(τd).
-
4. Repeat. Repeat from 2 until t = tf.
While stochastic simulations are relatively easy to carry out, equations (2.3) are generally difficult to solve because they represent an integro-partial differential equation. To allow for analytical progress, we consider the long-term behaviour in which the population grows exponentially with rate λ,
2.1.1. Age distribution in a population
The age distribution measures the fraction of cells in the population that reach a given age. It is obtained from marginalizing the population distribution over the molecule numbers
The population growth rate λ needs to be determined from the boundary condition (appendix C), which leads to the characteristic equation
2.1.2. Molecule-number distribution in an age-sorted population
Next, we consider the probability of observing x molecules in a cell of age τ. This conditional probability is given by the number of cells with age τ and molecule count x divided by the number of cells at that age
Since histories contain only ancestral cells, their division times are distributed according to the ancestral division-time distribution ρ, as has been pointed by Wakamoto et al. [40]. In figure 2a, we compare the division-time distributions in forward lineages and histories for variable division times. We observe that the distribution tails of φ are suppressed in cell histories due to exponential dependence on the growth rate λ. Specifically, we highlight the fact that cells with division times shorter than ln 2/λ are over-represented in histories compared with forward lineages, while cells with longer division times are over-represented.
Figure 2. Generation time variability in forward lineages and histories. (a) Division times in forward lineages (dashed lines) following a gamma distribution with unit mean and coefficients of variation of 10% (red, scaled by factor ), 50% (blue) and 100% (yellow dashed line). The corresponding division time distributions in cell histories are shown by solid lines. (b) The corresponding age distributions in forward lineages (dashed), histories (solid lines) and the population (light dashed). Note that the age-distribution in a history is the same as the population for the exponential distribution (CV 100%, yellow line), and the distributions in histories and forward lineages are very similar for small cell-cycle variability (CV 10%, red lines).
2.1.3. Ergodic principle for cell populations
We now explain how these quantities can be computed from population data. To this end, we use the approach presented by Nozoe et al. [42] enumerating all lineages of a given population tree. We denote the molecule numbers in lineage j by (x1,j(τ1,j), …, xD,j(τd, j)) with Dj completed cell divisions, where xi,j(τ) is the time-course from birth to division of molecule numbers of the ith cell in that lineage. We then evaluate the average of a function f(x) over a lineage of cells at a given age τ,

Because forward lineages track either one of the two daughter cells with equal probability, the probability of choosing lineage j from the tree is 2−Dj, which decreases with the number of cell divisions [42]. Because the division times in forward lineages follow the distribution φ, we have
Next, we consider the distribution along histories that track an arbitrary cell from the population and trace back its evolution. The probability of choosing such a history is the same for every lineage and therefore equals the inverse number of cells N(t) in the population. The statistics of histories can thus be interpreted as a typical lineage. We note that equations (2.11) can be understood as describing the distribution along a lineage, similar to the forward-lineage approach develop in appendix A (cf. equations (A 1)), but by replacing the division-time distribution φ with the one of the ancestral population ρ. Since histories are composed entirely of ancestral cells, their division times follow the ancestral division-time distribution ρ. It hence follows that histories posses the same statistics as in the age-sorted population, that is

Before proceeding, we investigate the age distribution that yields the fraction of cells reaching age τ in a history. Because division times in histories are distributed according to ρ and the probability that a cell has not divided before age τ is , we have
It thus follows that the average over a history of molecule numbers is not generally the same as the population average irrespective of age. The only exception to the rule are memoryless division times, i.e. constant division rates resulting in exponentially distributed division timings. In this case, equation (2.8) and (2.15) agree, but interestingly, the population distribution differs from the forward-lineage distribution.
2.2. Analytical solutions for forward lineages, histories and snapshot distributions
To obtain analytical solutions we make use of the generating function approach. For this purpose, we restrict ourselves to binomial partitioning of molecules

2.2.1. Molecule synthesis and degradation
As a first application, we consider a birth–death process. Molecules are synthesized at a constant rate k0 and are degraded via a first-order reaction with rate k1,

Figure 3. Histories represent typical lineages of finite population trees. (a) Representative forward lineages (blue) and histories (red lines) from a single population for the birth–death process (2.18) with rates k0 = 5 and k1 = 0.1 and division times are exponentially distributed with unit rate. Simulated time-courses show molecule synthesis by biochemical reactions (step 1 in box 1) interrupted by sudden decreases in molecule number due to binomial partitioning at cell division (step 2 in box 1). (b) Smoothed frequency distributions of the mean number of inherited molecules computed for each lineage of 10 population trees with N = 100 (teal), 1000 (blue) and 10 000 cells (magenta) at the final time and rates k0 = 50 and k1 = 1. The distribution mode is centred about the mean of histories (solid grey line) but not about the means of forward lineages (dashed grey line). (c) The corresponding second moment E[x2 | 0] is shown. The result indicates that the statistics of most lineages is close to the one of histories.
To quantify whether histories represent typical lineages in a finite population, we simulate 40 lineage trees up to a population size of N = 100, 1000 and 10 000. We then calculate mean and second moments for each individual lineage in the tree and evaluate their distributions across all lineages (figure 3b,c). As the population size increases the distribution of first and second moments becomes centred about the moments of the history distribution which are smaller than the corresponding ones in forward lineages. In particular, the moments of histories correspond to the distribution modes verifying that histories are typical lineages, even for finite populations.
Next, we investigate how to analytically characterize the distributions of both processes. The generating function of the corresponding master equation obeys the evolution equation
The solution to equation (2.19) can be written as a product of generating functions of the aforementioned contributions



The solution turns out to be more involved in the presence of division-time variability. Certainly, equation (2.20) could be solved for particular choices of ρ(τd). Since, however, the division-time variability is difficult to characterize, in general, we aim to solve this equation for arbitrary distributions. To this end, we expand

In figure 4a, we show the resulting distributions of inherited molecules (τ = 0) for different levels of cell-cycle variability (figure 4a,i) for the cases of an age-sorted population (solid lines) and for the lineage statistics (dashed lines). Division times are assumed to be gamma-distributed, which fits cell-cycle variability observed in many bacteria [8,10]. The molecule number distributions are obtained using either ρ or φ in equation (2.23) and truncating equation (2.24) after the first 150 terms. In agreement with the theoretical predictions, lineage and histories distributions are similar for small division time variability (red lines) but become significantly different as variability increases (yellow lines). Also shown are the corresponding distribution for varying degradation rates (figure 4a,ii). We observe that forward lineages and history statistics are comparable for unstable molecules (fast degradation) but not for stable ones (slow degradation). This finding is in line with the intuition that rapid degradation averages out timing fluctuations.
Figure 4. Analytical solutions for a birth–death process. (a)(i) Analytical distributions of inherited molecules in a population and in forward lineages are depicted for gamma-distributed division times with 10% (red), 50% (blue) and 100% (yellow line) variations about the mean (as shown in figure 2). Simulated distributions of inherited molecules in histories are shown as shaded areas. Lineages and histories agree only for small division times variability. Parameters are k0 = 50 and k1 = 1. (a)(ii) Analytical solutions for different degradation rates, k1 = 0.1 (teal), k1 = 1 (blue) and k1 = 10 (magenta). The distribution of inherited molecules in a population (τ = 0, solid lines) coincides with the simulated histories (shaded ares), which verifies the ergodic principle, but not with the lineage distribution (dashed lines). Histories and forward lineages agree well only for large degradation rates. Parameters are k0 = 50k1 and exponentially distributed division times with unit mean (CV[τ] = 1). (b) The corresponding distributions are shown for the mid-cell cycle τ = 0.5. (c) Simulated distributions estimated irrespectively of cell age across the population (shaded areas) agree well with analytical solutions (solid lines). Also shown are the theoretical distributions across forward lineages (dashed lines).
To support the numerical results, we evaluate the mean number of molecules at birth
. For fast degradation, the expression reduces to
in both forward lineages and histories because it is independent of the division times. For slow degradation, however, we obtain E[x | 0] ∼ k0Eρ(τd) in histories and Efw[x | 0] ∼ k0Eφ(τd) in forward lineages. Since Eρ(τd) ≤ Eφ(τd) with equality for deterministic division times [10], we conclude that cells in histories and the population contain fewer molecules than in forward lineages. One can further show that the difference between population and forward lineages E[x | 0] − Efw[x | 0] = − k0ln 2 CV2φ[τ] + O(CV4φ) increases with the coefficient of variation of timing fluctuations.
In figure 4b, we illustrate the corresponding distributions for τ = 0.5 corresponding to half of the mean generation time. We observe that both cell-cycle variation (figure 4b,i) and degradation (figure 4b,ii) mitigate the discrepancies between lineage and history statistics. Intuitively, this could also be concluded from the fact that equation (2.19), and similarly the distribution of molecules equation (2.11a), approaches a steady state independent of the number of inherited molecules for large τ. For all cases, we verified the ergodic principle by stochastic simulations using the First Division Algorithm (box 1) of 40 population trees from which we chose histories at random. The corresponding distributions are shown as shaded areas in figure 4a,b, which are in excellent agreement with our analytical solutions (solid lines).
To investigate the effect of the overall distribution in the population irrespectively of cell age, we numerically integrate the analytical solution (2.25) over the age-distribution Π(τ) of the population given by equation (2.8), ; the result is shown in figure 4c. The corresponding age distribution in a forward lineage is given in appendix D. We find that these predictions (solid lines) are in excellent agreement with the simulated distributions across populations (shaded areas) but not with the distributions across forward lineages (dashed lines).
2.2.2. Expression of a stable protein
Characteristic mRNA half-lives in E. coli are of the order of 5 min, while doubling times range from 20 min to several hours. Given the findings of the previous section, this suggests that generation time variability has little effect on mRNA distributions. Protein half-lives in living cells, however, occur on timescales much longer than the doubling time, meaning that many proteins are essentially stable and diluted mainly through cell divisions.
To investigate protein variability in growing cell populations, we model the expression of such a protein including mRNA transcription via the reactions

Figure 5. Lineage and history characterization of the expression of a stable protein. (a) Representative time courses of forward lineages (blue) and histories (red) of a stable protein collected from a single population tree. (b) Simulated distributions of inherited molecules in forward lineages (blue) and histories (red) are shown as shaded areas. The analytical solution for a population (solid red line) agrees with the simulated histories but not with the lineage verifying the ergodic principle. For comparison, we also show the analytical solution of a lineage (solid blue line). (c) Distributions obtained irrespectively of cell age from forward lineages (shaded blue) and across the population (shaded blue), both of which are in excellent agreement with the simulations. In all panels, the distribution of division time is lognormal with unit mean and s.d. 3 resulting in a growth rate of approximately 1.25. Parameters for the simulation of the full gene expression model (2.26) are k0 = 20, k1 = 20 and k2 = 100.
Deriving analytical expressions for the distribution of proteins is more involved because the amount of produced protein generally depends on the amount of inherited mRNA. We use the simplifying assumption that the mRNA half-life is much shorter than the cell-cycle time, for which the reactions (2.26) reduce to a single reaction [47,48]
The rate of change for the generating function of this process satisfies




The distributions corresponding to the generating function, equation (2.29), are obtained numerically via the inverse transform
2.3. Breakdown of the ergodic principle under selection
So far, we considered traits that do not affect the cell division rate. To test the validity of the principle in the opposite case, we simulate the production of a biomolecule whose levels increase division rate. We find that the distributions in histories do not coincide with the age-sorted population (figure 6a). The history distribution exhibits significantly larger tails resulting in a higher mean number of molecules (21.6 in histories versus 12.4 in population). This breakdown suggests that one could in principle discriminate the factors that affect cell fitness.
Figure 6. The ergodic principle breaks down in the presence of selection. (a) Simulated distributions of inherited molecules for a biomolecule that increases division rate. It is seen that histories and age-sorted distributions are different and thus the ergodic principle is violated. The ratio of these probabilities yields the reproductive value, ν(x | 0), which increases with the number of molecules in the presence of selection (inset). In the simulation molecules are produced with rate k0 and divisions occur after a period τd = 50K10/(K10 + x100), where k0 = 100, K = 10 and x0 is the molecule count of that cell at birth. (b) Statistics of tree data of an antibiotic-resistance gene in the absence of antibiotic taken from Nozoe et al. [42]. The history distribution agrees very well with the population distribution. The ergodic principle is obeyed and the reproductive value of cells is independent of the gene expression (inset). (c) By contrast, in the presence of the drug, the distributions of histories and the age-sorted population disagree. The increase of the reproductive value with fluorescence indicates selection on high protein levels (inset).
To investigate this effect, we consider Fisher's reproductive value [49], denoted by ν, which counts the relative number of future offspring for a cell in a given phenotypic state. Interestingly, it can be defined as the ratio of the distribution of histories and the age-sorted population statistics [50]. For a given trait x observed in cells of age τ, we here use their conditional distributions that provide the following factorization
We investigate this dependence using recently acquired tree data of the antibiotic-resistance gene smR conveying resistance to the antibiotic streptomycin [42]. The protein is fused to a fluorescent reporter allowing direct observation. We obtain age-sorted distributions using measured total fluorescence after cell division (τ = 0), while lineage-weighted distributions are computed using equations (2.13) and (2.14). In the absence of antibiotic treatment, the distributions of new-born cells in the age-sorted, histories and forward lineages are essentially the same (figure 6b). In agreement with the ergodic principle, the reproductive value due to the expression of the protein is constant (figure 6b, inset). In the presence of sub-inhibitory antibiotic doses, the distribution in histories differ from the one acquired across the population (figure 6c) and, moreover, the reproductive value increases with fluorescence (inset), which indicates selection on higher expression levels. These results are in agreement with the known function of the gene and highlight that testing the proposed ergodic principle can be used to probe selection in cell populations.
3. Discussion
We presented a modelling approach for stochastic biochemical reactions in dividing cell populations. The theory enabled us to identify an ergodic principle between single cells and the population, which can be stated as the average of the histories of single cells in a population obtained over many cell divisions equals the average over an age-sorted cell population at any single time point. Cell histories capture the statistics of typical lineages in a population, which are obtained by choosing an arbitrary individual of a clonal population and tracking back its evolution to the ancestor from which the population originated.
This principle provides an interpretation of population snapshot data because it identifies the sample paths that are associated with age-sorted snapshot data with typical cell histories. Histories thus allow us to characterize the progression of single cells in the population through the cell cycle. We emphasized that the statistical difference between forward lineages and histories arises from variable division times, cells dividing faster than the population growth rate being over-represented in histories as compared to forward lineages. These deviations are expected to be significant since generation times of single cells are highly variable in microbial cell populations.
It is important to point out that the principle requires the cell age to be known. The exception to this rule is the case of exponential division times for which the time-average of histories equals the one across snapshots. In the general case, gating techniques as used in flow cytometry can be used to narrow the range of physiological differences and thus correspond to age-sorted distributions, which according to the principle agree with the histories of single cells. Mother machines, on the other hand, provide statistics of individual forward lineages assuming that cells divide symmetrically. Thus our findings suggest potential differences when studying the dynamics of gene expression in different experimental devices, even when cell-cycle positions are explicitly known.
Different notions of ergodicity have been discussed in the literature. Rocco et al. [17], for instance, challenged the ergodic hypothesis between an ensemble of individual lineages and their time-averages. They found that the hypothesis only holds true for fast degrading molecules. We found a similar dependence for the ergodicity between snapshots of a growing population and individual histories, which is justified when studying the dynamics of short-lived molecules such as mRNAs. The majority of proteins in E. coli and yeast, however, have half-lives much longer than typical population doubling times [51,52], for which the present ergodic principle accounts for. We demonstrated this dependence using analytical solutions to lineage, history and population distributions of simple stochastic models of gene expression.
Although we developed the approach for biochemical reactions, the general principle applies to heritable non-genetic traits, such as plasmid numbers or chromatin states. Specifically, it applies to any trait of interest that does not affect the cell division rate. A counterexample of such a trait is cell size to which the history analysis has recently been applied [43]. Here, we found the that the likelihood to find an ancestral cell with a positively selected trait value is higher than for cells of the present population. Thus testing the ergodic principle could be used to experimentally probe this effect on cell fitness for any trait of interest.
Data accessibility
This article has no additional data.
Competing interests
I declare I have no competing interests.
Funding
P.T. acknowledges a fellowship by The Royal Commission for the Exhibition of 1851 and support by the EPSRC Centre for Mathematics of Precision Healthcare (EP/N014529/1).
Acknowledgements
It is a pleasure to thank Diego Oyarzún and Vahid Shahrezaei for valuable feedback.
Appendix A. Biochemical dynamics in forward lineages
The lineage description tracks a single cell over many cell division cycles but retains information about only one daughter at each division. We consider the lineage-probability pD(x|τ) of observing x molecules in a cell of age τ after Dth division. In between divisions the distribution satisfies the chemical master equation
Apendix B. Master equation description of snapshot probabilities and snapshot densities
We here derive a stochastic description for the outcome of snapshots in a dividing cell population. The state of the system may be described by observations of molecule content and cellular age {τi, xi}i=N, where N is the present number of cells. Note that τi and xi but also N are random variables for a single snapshot [9]. When cells are indistinguishable, a snapshot is equivalently characterized by the empirical distribution function
Generally, more than a single snapshot is available, for instance by collecting snapshots from different microcolonies. Aggregating snapshots averages out the dependence on the individual stochastic realizations of {τi, xi}i=N, and yields the snapshot density
We consider two types of transitions: (i) cell division in which molecules are partitioned between the two daughter cells and (ii) biochemical reactions. The rate of cell division for a single cell of age τ′ is γ(τ′) and hence the propensity of cell divisions in the population is
When a biochemical reaction with stoichiometry νr occurs, we replace the mother cell with molecule count x′ by a cell of the same age but x′ + νr molecules. The snapshot therefore changes as follows:
For brevity of notation, we define the step-operator, which acts as on any functional f of the snapshot function s(x, τ). Combining the changes from divisions and R possible reactions, we obtain the change in the snapshot probability P[s, t], describing the probability of all possible snapshots, which obeys
The snapshot density n(x, τ, t) from repeated experiments is obtained as follows:



Integrating the above equation over a small interval containing the newborn cells and using n(x, ε) → n(x, 0), n(x, − ε) → 0 and definition (2.4) of the main text, we find that the second and third term on the right-hand side of the above equation give the boundary condition (2.3c) of the main text. The remaining terms describe the rate of change in between divisions and are equal to equation (2.3a) with (2.3b) of the main text.
Appendix C. Extension to heritable interdivision times
We here discuss an extension of the ergodic principle. We consider a general trait x and division times that are statistically dependent between mother daughter cells. Such correlations could arise, for instance, due to heritable factors such as cell size. To proceed we characterize the state of each cell by a cell age τ and a interdivision time τd. The latter is a random quantity that yields the age at which a cell will eventually divide. The distribution of cell ages at division must then satisfy
C.1. Forward lineages
At stationarity, the division time distribution along a forward lineage can then be described by
C.2. Distribution of age and division times
For the cell population, we describe the density of cells with age τ, future division time τd and the trait x, which follows the evolution equation
Summing equation (C 5a) over all possible trait values x and solving for Π(τ, τd), we find that the age-division-time distribution is
Specifically, marginalizing equation (C 6) over the division times τd, the age distribution becomes
C.3. Distribution of a trait along histories
Finally, we turn our attention to the distribution of molecule numbers for cells of a given age. To this end, it is important to note that the molecule numbers are non-anticipating in that they are statistically independent of division time τd but they only depend on cell age
Appendix D. Properties of the age distributions
Consider an age distribution of the form



