Modeling microbial communities using biochemical resource allocation analysis

To understand the functioning and dynamics of microbial communities remains a fundamental challenge at the forefront of current biology. To tackle this challenge, the construction of computational models of interacting microbes is an indispensable tool. Currently, however, there is a large chasm between ecologically-motivated descriptions of microbial growth used in ecosystems simulations, and detailed metabolic pathway and genome-based descriptions developed within systems and synthetic biology. Here, we seek to demonstrate how current biochemical resource allocation models of microbial growth offer the potential to advance ecosystem simulations and their parameterization. In particular, recent work on quantitative microbial growth and cellular resource allocation allow us to formulate mechanistic models of microbial growth that are physiologically meaningful while remaining computationally tractable. Biochemical resource allocation models go beyond Michaelis-Menten and Monod-type growth models, and allow to account for emergent properties that underlie the remarkable plasticity of microbial growth. We exemplify our approach using a coarse-grained model of cyanobacterial phototrophic growth, and demonstrate how the model allows us to represent physiological acclimation to different environments, co-limitation of growth by several nutrients, as well as emergent switches between alternative nutrient sources. Our approach has implications for building models of microbial communities to understand their interactions, dynamics and response to environmental changes.

genome sequencing and quantitative high-throughout methods, the complexity of mechanistic ecosystem models, and in particular the description of microbial growth within these, have not changed substantially where ρ denotes the concentration of cells (in units of cells per volume), µ denotes the specific growth rate and D is the dilution or death rate. The specific growth rate µ is a function of the respective environment, and depends on the concentrations of one or more nutrients. Typically, a limiting nutrient n with concentration [n] is considered that is supplied with the inflow of fresh medium at a concentration [n x ]. The respective nutrient dynamics are described by where Y denotes the yield coefficient, defined as the number of microbial cells (or units biomass) per unit 95 of nutrient. The model can be readily extended to multiple microbial strains with concentrations ρ i and 96 several nutrients n k . The dynamics of a chemostat have been studied extensively [58] and equations of 97 the form (1) and (2), as well as the respective extension to multiple strains and nutrients, are commonly 98 utilized in ecosystem simulations [16,15,58,57]. 99 To evaluate the dynamics of the system requires knowledge of the specific growth rate µ as a function of the concentration of the limiting nutrient n. To this end, the most widely used approach is still to make use of the hyperbolic dependency proposed by Jacques Monod in 1949 [39], where µ max denotes the maximum specific growth rate of the microorganism in this environment and K A denotes the half-saturation constant. The Monod equation is identical to the Michaelis-Menten equation of enzyme kinetics and represents an empirical description of microbial growth. Its constant parameters are typically estimated for specific environmental conditions and reflect a particular strain or species and its functional traits related to nutrient uptake and growth [5,30]. Over the past decades, there have been several advances and alternative formulations of growth models, such as the Droop model [9] that introduces internal nutrient quotas. For phototrophic microorganisms, modifications are typically required to account for the effects of photoinhibition-the decrease of the specific growth rate for high light intensities [11]. A widely equation for phototrophic growth in dependence of the light intensity I is the Haldane equation, µ(I) = µ max · I K A + I + (I/K I ) 2 , where K I denotes the impact of photoinhibition. In the absence of photoinhibition (K I → ∞) the model is review on empirical growth models and their parameterization for different microalgae. The use of the (metabolites) within a cell or organism, and therefore allow to accurately estimate the stoichiometric 120 and energetic synthesis costs of cellular constituents. GMRs have been highly successful to predict 121 maximal growth yields of microbial organisms and other proerties of biotechnological relevance [45]. 122 More recently, large-scale constraint-based resource allocation models [18,42,19,49] were introduced 123 that allow to predict protein expression and cell compositions of microbes in specified (albeit, with the 124 exception of [50,49], constant) environments. These models are based on the insight that the (maximal) 125 flux of an enzyme-catalyzed biochemical reaction is typically constraint by the amount of the respective 126 enzyme. Since enzymes are itself the products of metabolism, incorporating enzyme-dependent flux 127 constraints gives rise to a self-consistent description of microbial growth: for any given growth rate 128 µ the set of cellular enzymes must be sufficient to sustain the synthesis of the required precursors to 129 allow for the translation of the set of catalyzing enzymes itself, as well as for the synthesis of all other 130 (non-enzymatic) compounds within a cell.

131
More formally, the synthesis rate of a cellular protein P k can be described by the equation where γ k denotes the translation rate of the protein that is required to match the dilution term µ · [P k ] of 132 cellular growth (protein degradation can be readily included but is neglected in the following). The sum 133 of all translation rates is constrained by the available ribosomal capacity and hence by the number of 134 ribosomes.

135
To account for the synthesis of metabolic precursors and other cellular components, the interconversion of internal metabolites m is described by a stoichiometric matrix N and a vector ν that denotes the rates of (spontaneous or enzyme-catalyzed) interconversion rates, Typically, intracellular metabolism is assumed to be at steady state and the dilution terms for intracellular 136 metabolites are neglected due to the high turnover of metabolites compared to their dilution by growth. In 137 this case, the mass-balance constraint on intracellular reaction fluxes simplifies to N · ν = 0.

138
To account for biochemical resource allocation, the rates of those reactions that are catalyzed by proteins are constrained by the amount of the respective catalyzing proteins where k cat,k denotes the specific activity of the enzyme or protein. The maximal uptake rate ν T of an external nutrient n x can be further constrained by the concentration of the respective nutrient and the amount of the respective transporter complex P T .
The uptake constraints can be modified to, for example, also account for diffusion limitations of nutrient   details. In the following, we refer to the type of models outlined above as biochemical resource allocation 162 models (BRAMs).

164
To exemplify the utility of biochemical resource allocation models for microbial ecology, we consider  proteins by ribosomes (R), which are itself protein complexes. The fraction of non-enzymatic proteins is represented by a (quota) protein component P Q . The remaining biomass is lumped into a metabolic 180 component Q that is synthesized from the cellular precursor C 3 by the protein complex M Q . All proteins 181 complexes represent aggregates of individual proteins. The model assumes a constant cellular density.

182
The specific growth rate is not dependent on cell size, but cell size may constraints parameters, such as 183 the surface to volume ratio. The full set of equations is provided in the Materials and Methods.

184
All enzyme-catalyzed reactions are constrained by the amount of the respective enzymes. For example, carbon uptake is constrained by the equation where [T C ] denotes the amount of uptake transporter (in molecules per cell), k cat,TC denotes the specific catalytic activity of the transport, and K C denotes the half-saturation constant of the uptake complex. We note that Equation (9) provides an upper limit only, the actual flux can be less (for example by inactivating a fraction of the uptake transporter). Likewise, additional constraints can be included, such as an upper limit on the uptake flux induced by diffusion limitations [5]. Other intracellular reactions are constrained by the upper limits induced by the amount of the respective enzymes, e.g., for the carbon assimilation reaction The model is parameterized using information about the individual enzymatic and biochemical

209
Biochemical resource allocation models go beyond describing nutrient uptake and the specific growth 210 rate, and allow us to obtain insights into acclimation, co-limitation and cellular trade-offs. In particular, 211 concomitant to the cellular growth curves, we obtain the distribution of protein resources within the cell 212 as a function of environmental parameters and growth rate.
213 Figure 3 and Table 1 show the relative protein fractions invested into the different biochemical 214 processes dependent on environmental conditions. The resource allocation framework allows the model 215 to acclimate to the respective environmental condition and invest cellular resources into processes that 216 would otherwise limit growth. As a consequence, the maximal uptake rate of the nitrogen transporter Table 1. Cellular protein allocation in dependence of environmental conditions. Values denote the relative abundance (% relative to total proteome) of protein complexes under low and high nutrient conditions. The symbols hold same meaning as Figure 1.  strongly depend on the light intensity, at the expense of investments in other metabolic processes (Table 1).

224
Closely related to trade-offs in resource allocation, it is an important challenge for empirical growth models to describe the dependence of growth on several potentially limiting nutrients, see Saito et al. [51] for a discussion on the concept and types of co-limitation. The most common ways to implement multiple limitation scenarios relies on either Liebig's law of the minimum, or the multiplicative form without problems and there is no clear empirical evidence to assess the merits of either representation.

227
Given its simplicity, the multiplicative form is commonly employed in multi-nutrient models [16,57].

228
For biochemical resource allocation models, however, the description of growth limitations as a 229 function of two or more nutrients emerges without further assumptions about the functional form of 230 growth equations. In particular, the coarse-grained model described above is not consistent with Liebig's 231 law of the minimum, as growth on a single nutrient, as shown in Figure 2, does not exhibit any hard 232 threshold. The absence of such a threshold is due to the fact that, for scarce nutrients, resources are 233 increasingly invested into the respective uptake reactions.

234
More relevant, however, the emergent growth curve is also not consistent with a multiplicative 235 functional form. Figure 4B shows a Lineweaver-Burk plot of the growth rate as a function of nitrogen complexes (as would be expected, for example, for cyanobacterial Prochlorococcus strains).

249
More importantly, however, cellular diversity also arises due to genetically-encoded differences 250 between organisms. Firstly, diversity may arise due to differences in enzyme-kinetic parameters. The in the genome, we consider phototrophic growth with two alternative sources of extracellular nitrogen. 262 We assume that, in addition to the nitrogen source N x considered above, there is a second source of 263 extracellular nitrogen N y , whose uptake and conversion to the intracellular nitrogen precurser N is  In the following, we assume that the extracellular nitrogen source N y is constantly available (analogous 274 to, e.g., atmospheric dinitrogen), whereas the availability of the nitrogen source N x varies.

286
Hence, we expect that the (T N +T Y )-strain will be outcompeted in any constant environment, but will have 287 a competitive advantage in (some) environments with variable nitrogen availability.

288
For our purposes, the example serves to illustrate the following points: (i) biochemical resource 289 allocation models build upon the genome of an organism and hence allow us to represent genetic diversity 290 within strains, including genomes that encode several potential metabolic strategies and differences is of carbon uptake, as well as nitrogen metabolism and uptake, differs between strains-reflecting the diversity of strains. As noted above, our premise is that enzyme-kinetic parameters are subject to physico-310 chemical trade-offs, for example trade-offs between the half-saturation constant and the maximal catalytic 311 rate of an enzyme. We emphasize that such trade-offs are not an outcome of our modelling approach but 312 need to be specified independently, for example based on detailed biochemical surveys and analysis [14]. 313 We consider two strains of phytoplankton. The differences between both strains (detailed in Materials 314 and Methods and Table 2) reflect trade-offs in the half-saturation constants and catalytic activities of the 315 uptake mechanisms of nitrogen and carbon, and result in two functional groups of phytoplankton, gleaners 316 and opportunists. The respective growth curves are shown in Figure 6. Gleaners are characterised by a 317 higher affinity towards extracellular nitrogen, and an overall lower maximal growth rate. Opportunists are 318 characterised by a high overall specific growth rate, but a lower affinity for extracellular nitrogen.
319 Figure 6. The growth curves of two competing strains of phytoplankton, opportunists and gleaners, in dependence of external nitrogen availability. The strains differ in the enzyme-kinetic parameters of their constituent enzyme complexes. Gleaners (K-strategists) have a growth advantage during phases of low external nitrogen availability, whereas opportunists (r-strategists) have a growth advantage at high concentrations of external nitrogen.
Following Tsakalakis et al.
[57], we simulate the growth of both strains in two different environments: a constant light environment (control) and a light environment with seasonal variations in average light intensity. Extracellular inorganic carbon is assumed to be constant, a (single) source of extracellular nitrogen is is supplied via a constant influx. The dynamics of the abundances of gleaners (ρ G ) and opportunists (ρ O ) are described by the following ODEs the dynamics of external nitrogen is described by where V N denotes a constant influx, and ν n,O and ν n,G denote the specific cellular uptake rates (as emerging 320 from the respective models) of external nitrogen by the gleaners and opportunists, respectively. The their synthesis is enforced using fixed quotas (except otherwise noted, the quota protein component P Q is 426 assumed to be 20% of total protein, the non-protein biomass Q is assumed to be 50% of total biomass).

427
The biochemical resource allocation model of phototrophic growth 428 The biochemical resource allocation model shown in Figure 1 is assembled following the stoichiometry and data described by Faizi et al. [11]. We note that the model of Faizi et al.
[11] is a nonlinear kinetic ODE model, hence computationally different from the model described here. Growth is facilitated by 8 protein complexes: 6 enzyme and transport complexes, ribosomes R and a non-catalyzing quota protein compenent P Q . The enzyme and transporter catalyze the following reactions: Protein translation is described by the equation where n p denotes the size of the respective protein in amino acids.

429
Capacity constraints of catalytic enzymes 430 All enzyme-catalyzed reactions are constrained by the amount of the respective enzyme, according to equation (7). The constraints for uptake and light harvesting reactions also depend on the availability of the respective substrates. In particular, for (i) the uptake of inorganic carbon, (ii) for uptake of extracellular nitrogen, For any given set of external parameters C x , N x , I and specific growth rate µ, the model implemented as a linear program LP(µ). The problem is described by three matrices N N N, B B B, and C C C, the vector of reaction rates v v v = (v i , γ k ) T (including metabolic and translation rates), and the vector P P P of macromolecules. The constraints are

15/19
v v v ≥ 0 0 0 (22) C C C · v v v ≤ P P P with P P P = (R, T N , T C , PSU, M C , M Q ,CB, P Q , Q) T (23) P P P ≥ P P P lb with P P P lb = R lb , T lb N , T lb C , PSU lb , M lb C , M lb Q ,CB lb , P lb Q , Q lb T (24) ω ω ω · P P P = d. amino acids per cell).

444
The above described LP is solved as a feasibility problem for a given µ. To obtain a solution for the 445 maximal specific growth rate in a given environment, the global optimum of µ is found using bisection, To simulate the growth on two alternative sources of external nitrogen, we used the set of additional 455 parameters given in Table 3. 456 Table 3. Specific parameters used to model growth on two alternative sources of external nitrogen. The remaining parameters are same as described in Table 2 . Symbol Definition (T N )-strain (T Y )-strain (T N + T Y )-strain e N Energy units per uptake reaction T N