Modelling microbial communities using biochemical resource allocation analysis

To understand the functioning and dynamics of microbial communities is a fundamental challenge in current biology. To tackle this challenge, the construction of computational models of interacting microbes is an indispensable tool. There is, however, a large chasm between ecologically motivated descriptions of microbial growth used in many current ecosystems simulations, and detailed metabolic pathway and genome-based descriptions developed in the context of systems and synthetic biology. Here, we seek to demonstrate how resource allocation models of microbial growth offer the potential to advance ecosystem simulations and their parametrization. In particular, recent work on quantitative resource allocation allow us to formulate mechanistic models of microbial growth that are physiologically meaningful while remaining computationally tractable. These models go beyond Michaelis–Menten and Monod-type growth models, and are capable of accounting for emergent properties that underlie the remarkable plasticity of microbial growth. We outline the utility and advantages of using biochemical resource allocation models by considering a coarse-grained model of cyanobacterial growth and demonstrate how the model allows us to address specific questions of relevance for the simulation of marine microbial ecosystems, including the physiological acclimation of protein expression to different environments, the description of co-limitation by several nutrients and the differential use of alternative nutrient sources, as well as the description of metabolic diversity based on our increasing knowledge about quantitative cell physiology.


Introduction
Microbial organisms are integral parts of the Earth's biogeochemical cycles and play key roles in almost all environments and ecosystems. Microbes typically form complex, interacting, and dynamically changing communities, with examples ranging from marine plankton communities to the human microbiome. To understand the organizing principles and the functioning of such communities is of paramount importance for a vast number of basic and applied research questions, including questions pertinent to biotechnology, climate change and human health [1][2][3][4]. Despite the significant advances in our ability to observe and characterize biological systems, however, understanding the interactions and the emergent dynamics of microbial communities remains a fundamental, and truly transdisciplinary, challenge [5][6][7][8][9].
Traditionally, ecosystem dynamics and microbial communities are the realm of microbial ecology, with a long history and a wealth of results concerning the organization, stability and functioning of (microbial) ecosystems [1,10]. In the past century, a variety of modelling approaches have been developed to address fundamental ecological questions, ranging from understanding patterns of biodiversity to predicting the response of ecosystems to changing environmental conditions [5,7,[10][11][12]. In this context, descriptions of microbial growth range from phenomenological 'black box' models to more elaborate trait-based models of growth [5,7]. It has been noted, though, that current theoretical approaches to microbial growth are still dominated by the classic Monod or Michaelis-Menten functional form [6,13]. While undoubtedly highly successful, Monod-type models of growth exhibit a number of limitations. For example, it has been argued that the constant parameters used in the Monod equation cannot account for the observed plasticity of microbial physiology [14]. Likewise, it has been emphasized that, despite the significant advances in genome sequencing and quantitative high-throughout methods, the complexity of mechanistic ecosystem models, and in particular the description of microbial growth within these, have not changed substantially since they were developed in the 1970s [13].
Parallel to progress in theoretical and experimental microbial ecology, the past two decades have seen an unprecedented advance in our understanding of microbial molecular physiology-mainly driven by advances in our ability to monitor, measure and modify the inner workings of cells. Theoretical and computational descriptions of microbial metabolism, facilitated by large-scale metabolic network reconstruction and constraint-based analysis, have become established tools in molecular systems biology [15][16][17]. Curated genome-scale reconstructions (GSRs) of metabolic networks are available for an increasing number of microbial organisms [18][19][20], and are increasingly recognized as a promising resource for studying metabolic interactions within microbial communities [2][3][4]8,9,21,22], More recently, also the quantitative growth physiology of bacteria has gained renewed attention, with numerous studies providing novel insights into the principles of microbial growth and resource allocation [23][24][25][26][27]. Key observations concern the 'laws' and trade-offs of microbial growth. In continuation of the classic studies of bacterial growth physiology [23], a number of studies have recently addressed the covariation between the cellular composition and the growth rate of microorganisms [24,25]. Theoretical descriptions of microbial resource allocation include coarse-grained models that describe the fundamental processes of cellular growth by partitioning the proteome into few essential classes [26,[28][29][30][31], as well as large-scale constraint-based models that take into account the costs and benefits of each individual gene [32][33][34][35]. In particular, the concept of resource balance analysis [32,34,36] and related methods, such as metabolism and macromolecular expression models [33], metabolic modelling with enzyme kinetics [37], and conditional flux-balance analysis (FBA) [31], show that quantitative models that predict quantitative protein expression are feasible on the genome-scale-and can be extended to timevarying environments [35,38]. As yet, however, quantitative modelling of microbial resource allocation is mostly restricted to well characterized model organisms in typical laboratory or biotechnological environments.
The purpose of this work is therefore to outline a bridge between these recent studies on microbial resource allocation and current models of microbial ecology. We argue that biochemical resource allocation models offer significant potential to advance ecosystem simulations beyond current applications of constraint-based analysis of microbial metabolism. In particular, we seek to demonstrate that biochemical resource allocation models, as defined below, can be constructed and parametrized for large classes of microbial organisms based on available biochemical and physiological data; and are, unlike Monod-type models, capable of exhibiting emergent properties of growth, such as photoacclimation or a preferential hierarchy of nutrient sources. Our study is motivated by recent calls for a new generation of plankton models to better capture the emergent properties of marine ecosystems [6,13]. As will be demonstrated below, biochemical resource allocation models follow the rationale described by Allen & Polimene [6] to design a generic cell model that captures the essence of key physiological activities and that is based on a robust physiological formulation of competing physiological activities-and therefore is able to reproduce biogeochemical and ecological dynamics as emergent properties.
The paper is organized as follows: in the first section, we briefly recapitulate computational models of microbial growth. In the second section, we provide an overview of metabolic network reconstruction and recent biochemical models of cellular resource allocation. In the following section, we consider a coarse-grained model of phototrophic (cyanobacterial) growth and describe its parametrization. In the subsequent section, we discuss properties of relevance for the simulation of microbial ecosystems, such as metabolic plasticity, cellular growth laws and co-limitation by several nutrients, as well as the representation of microbial diversity and the emergence of a preferred hierarchy of nutrient uptake. In the sixth and seventh sections, we briefly present a case study: the co-existence of two phytoplankton species with a gleaner-opportunist trade-off. In the final section, we provide a discussion and outlook.

Models of microbial growth
Assuming a chemostat-like setting, the growth dynamics of a population of (genetically homogeneous and well mixed) cells can be described by an ordinary differential equation, where ρ denotes the concentration of cells (described here in units of cells per volume), μ denotes the specific growth rate and D is the dilution or death rate. The specific growth rate μ is a function of the respective environment, and depends on the concentrations of one or more nutrients. Typically, a limiting nutrient n with concentration [n] is considered that is supplied with the inflow of fresh medium at a concentration [n x ]. The respective nutrient dynamics are then described by where Y denotes the yield coefficient, defined as the number of microbial cells (or units biomass) per unit of nutrient.
To evaluate the dynamics of the system requires knowledge of the specific growth rate μ as a function of the concentration of the limiting nutrient n. To this end, among the most widely used approaches is to make use of the hyperbolic dependency proposed by Monod in 1949 [39], description of microbial growth. Its constant parameters are typically estimated for specific environmental conditions and reflect a particular strain or species, and its respective functional traits relate to nutrient uptake and growth [7,14]. Over the past decades, there have been several advances and alternative formulations of growth models, such as the Droop model [40] that introduces internal nutrient quotas. For phototrophic microorganisms, modifications are typically required to account for the effects of photoinhibition-the decrease of the specific growth rate for high light intensities [26]. A widely used equation to describe phototrophic growth in dependence of the light intensity I is the Haldane equation [16],  .2) have been studied extensively [10] and are commonly used in ecosystem simulations [5,10,11,42]. The description can be readily extended to multiple microbial strains with concentrations ρ i and several nutrients n k . We also note that term 'nutrient' refers here to any compound or variable that may affect the specific growth rate, including temperature or modulation of growth by quorum sensing. In this case, knowledge of the functional forms of all specific growth rates μ i in dependence of the variables is required (typical examples for multi-nutrient rate equations are discussed below in the section 'Acclimation, trade-offs and co-limitation').
The use of such multi-variable phenomenological rate equations remains ubiquitous in current models of ecosystems [7,11,13,14,42,43]. It has been emphasized recently [6,13], however, that these empirical growth models do not necessarily reflect our vast recent increase in knowledge about the quantitative physiology of microbial growth. The challenge before us is therefore to combine the conceptual simplicity of empirical growth models with molecular properties of microbial growth.

Metabolic reconstructions and beyond
Models of microbial growth that incorporate internal structure and aspects of physiology are not new. Examples include the (still empirical) model of Droop [40] as well as other 'internalquota' models-each representing a cell with one or more internal variables, and typically allowing for adjustments in the composition of cellular biomass [5]. Likewise, models that incorporate cost-benefit consideration have been proposed, most notably by JA Raven [44] and RJ Geider [45]. In the following, we build on these ideas and incorporate recent approaches to biochemical models of microbial growth [16,17].
In particular, over the past two decades, GSRs of microbial metabolism have reached maturity and are available for a rapidly increasing number of (sequenced) microbial organisms [18][19][20]. GSRs provide a comprehensive account of biochemical interconversions between small molecules (metabolites) within a cell or organism. Constraint-based methods, such as FBA, then allow to efficiently interrogate GSRs, and thereby enable accurate estimation of the stoichiometric and energetic synthesis costs of cellular constituents.
The predictive success of FBA is based on the fact that the analysis does not require extensive knowledge about kinetic parameters and regulatory interactions. Rather, the implementation of FBA and related methods are based only on knowledge of the stoichiometric matrix (i.e. the GSRs itself) and typically involve the assumption of evolutionary optimality: unknown regulatory interactions are replaced by the assumption that the resulting flux solutions satisfy a given optimality criterion, such as the maximization of a biomass objective function [15]. Based on the combination of GSRs with linear programming, FBA and related methods have been highly successful to predict maximal growth yields of microbial organisms and other properties of biotechnological relevance [15].
Despite their predictive success, however, the conventional formulation of FBA also exhibits several shortcomings. In particular, traditional analyses of GSRs fail to account for the enzymatic costs required for cellular metabolism, and hence growth. To this end, among other improvements detailed elsewhere [46], a new generation of constraintbased models were recently developed that offer significant potential to advance our understanding of ecosystems.

Models of cellular resource allocation
Building upon GSRs, a new generation of constraint-based microbial growth models has been developed recently, based on the concept of cellular resource allocation [32][33][34][35][36]. Different from FBA, these models aim to predict protein expression and cell compositions of microbes in specified (albeit, with the exception of [31,35,38], constant) environments. Resource allocation models are based on the insight that the (maximal) flux of an enzyme-catalysed biochemical reaction is typically constrained by the amount of the respective enzyme. Since enzymes are themselves the products of metabolism, incorporating enzyme-dependent flux constraints gives rise to a self-consistent description of microbial growth: for any given growth rate μ the set of cellular enzymes must be sufficient to sustain the synthesis of the required precursors to allow for the translation of the set of catalysing enzymes itself, as well as for the synthesis of all other (non-enzymatic) compounds within a cell.
In the following, while many of our arguments hold also for a more general class of resource allocation models [26,28], we will focus on a particular implementation of resource balance models based on linear constraints [31,32,34,35]. These models are formulated as linear programs (LPs) and are based (only) on linear constraints between reaction rates and intracellular macromolecules, and hence can be solved efficiently.
More formally, the models are based on a description of the intracellular protein allocation of growing cells. The concentration of a protein P k can be described by the equation, where γ k denotes the respective translation rate. In steady state, the translation rate has to match the dilution term μ · [P k ] of cellular growth (additional protein degradation can be readily included). The sum of all translation rates is constrained by the available ribosomal capacity and hence by the number of ribosomes.
To account for the synthesis of metabolic precursors and other cellular components, the interconversion of internal metabolites m is described by a stoichiometric matrix N and a vector ν that denotes the rates of (spontaneous or enzyme-catalysed) interconversion rates, Typically, intracellular metabolism is assumed to be at steady state and the dilution terms for intracellular metabolites are neglected due to the high turnover rate of metabolites compared to their dilution rate by growth. Hence, the massbalance constraint on intracellular reaction fluxes simplifies to N · ν = 0.
To account for biochemical resource allocation, the rates of those reactions that are catalysed by proteins are constrained by the amount of the respective catalysing proteins where k cat,k denotes the specific activity of the enzyme or protein.
The maximal uptake rate ν T of an external nutrient n x can be further constrained by the concentration of the respective nutrient and the amount of the respective transporter complex P T , The uptake constraints can be modified, for example, to also account for diffusion limitations of nutrient uptake as described by Bonachela et al. [14] by adding an additional constraint.
The constraints and equations summarized above, together with the assumption of a constant cell density, provide a quantitative description of microbial growth. To obtain a prediction of the growth rate in a specific environment, the model is solved using the assumption of parsimonious resource allocation. That is, the assumption that metabolic fluxes and the corresponding protein levels are organized such that they give rise to a maximal growth rate in the respective environment (assumption of evolutionary optimality). Hence, similar to FBA [15] and other constraint-based analysis, the assumption of optimality replaces unknown regulatory mechanisms.
The required parameters for model construction are: (i) the metabolic network (as encoded in the stoichiometric matrix N and the associated enzyme-reaction relationships). These data are available as part of a metabolic network reconstruction; (ii) the composition of the catalysing enzymes (in terms of amino acids and possible co-factors). For most enzymes this information is readily available and part of reaction databases; as well (iii) as the specific activity k cat of each catalysing enzyme and, if required, the half-saturation constants for transporter reactions. While quantitative data are still scarce, in particular for non-model organisms, specific activities for a wide range of enzymes can be sourced from suitable databases, such as BRENDA [47], and are therefore (at least approximately) available. As we have argued previously [31,35], reasonable estimates for all required parameters exist.
We note that the respective models may involve different levels of complexity: the biochemical resource allocation model can be constructed on a genome-scale. That is, the models include all known individual enzymatic reaction steps of the respective organisms, see for example Goelzer et al. [34] or Reimers et al. [35]. Alternatively, to reduce the computational burden, simplified models can be constructed by defining coarse-grained enzyme complexes that represent classes of reactions or pathways, see for example Rügen et al. [31]. The definition of coarse-grained enzyme complexes, however, entails an approximation that has to reflect the specific research question-and general rules are currently not established. For future work, we envision automated stoichiometric reduction algorithms (i.e. the merging of several individual enzymatic steps into coarse-grained metabolic processes) that preserve desired functionalities of the network. Such algorithms were recently developed for genome-scale metabolic reconstructions [48], but have yet to be adapted for resource allocation models.
In the following, irrespective of their size, we refer to the class of models outlined above as biochemical resource allocation models (BRAMs). Computationally, for any given growth rate, the resource allocation model gives rise to a LP and hence can be solved efficiently. The maximal growth rate is then identified using bisection, see Material and methods for computational details.

A model of phototrophic growth
To exemplify the utility of BRAMs for microbial ecology, we consider the construction and analysis of a coarse-grained model of cyanobacterial growth. Model reduction was based on previous works [26,31,35,49] to reflect the core limitations of phototrophic growth by light, nitrogen and inorganic carbon. The model is depicted in figure 1 and consists of a light harvesting reaction, five metabolic reactions involving six internal metabolites, as well as eight catalysing protein complexes and their respective translation reactions. In brief, inorganic carbon (C x ) is taken up using an energydependent transporter (T C ). Intracellular inorganic carbon is The biosynthesis of amino acids requires a source of nitrogen (N) that is taken up from the environment using an energy-dependent transporter and associated nitrogen metabolism (T N ). For amino acid synthesis, we assume a N : C ratio of ∼1/3 (the cellular N : C ratio is lower due to the remaining non-protein biomass component Q). Light harvesting and the photosynthetic electron transport chain are represented by a coarse-grained photosynthetic unit (PSU). The PSU protein complex regenerates cellular energy units e (combining ATP and the reductant NADPH into a single energy unit). Amino acids are translated into proteins by ribosomes (R), which are themselves protein complexes. The fraction of non-enzymatic proteins is represented by a (quota) protein component P Q . The remaining biomass is lumped into a metabolic component Q that is synthesized from the cellular precursor C 3 by the protein complex M Q . All proteins complexes represent aggregates of individual proteins. The model assumes a constant cellular density. The specific growth rate does not directly depend on cell size, but cell size may constrain certain parameters, such as the surface to volume ratio. The full set of equations is provided in the Material and methods.
All enzyme-catalysed reactions are constrained by the amount of the respective enzymes. For example, carbon uptake is constrained by the equation where [T C ] denotes the amount of uptake transporter (in molecules per cell), k cat,T C denotes the specific catalytic activity of the transport and K C denotes the half-saturation constant of the uptake complex. We note that equation (5.1) provides an upper limit only, the actual flux can be less (for example by inactivating a fraction of the uptake transporter). Likewise, additional constraints can be included, such as an upper limit on the uptake flux induced by diffusion limitations [14]. Intracellular reactions are similarly constrained by the maximal catalytic capacities of the respective enzymes and their amounts, e.g. for the carbon assimilation reaction The model is parametrized using information about the individual enzymatic and biochemical processes. Using data from Faizi et al. [26], the effective size of the (coarse-grained) protein complexes can be approximated by the number of enzymes involved in amino acid synthesis multiplied by the average size (in units of amino acids) per enzyme. Catalytic turnover numbers k cat are assigned according to typical values for the respective reactions. For example, the rate of translation per ribosome is approximately 20 amino acids per second, the photosynthetic unit (with photosystems II as rate limiting complex) is assumed to give rise to approximately 250 interconversion per second, the k cat,T C of the carbon transporter is set to 20 s −1 , the catalytic activity of central metabolism (protein complex M C ) is set to k cat,MC = 10 s −1 . Reasonable parameter ranges for many enzymatic processes (for a generic cell) can be obtained, for example, from Milo & Phillips [50]. The full set of parameters used in the following is provided in the Material and methods.
Given the stoichiometric constraints and the assigned parameters, the model gives rise to a computational optimization problem. The optimization problem can be solved as a series of LPs to identify the maximal specific growth rate μ in dependence of the availability of extracellular nitrogen and carbon and light intensity I (assumption of evolutionary optimality). Figure 2 shows the resulting growth curves as a function of environmental parameters. Similar to previous models [26], the resulting growth curves with respect to external nitrogen (N x ) and carbon (C x ) concentrations are consistent with Monod kinetics, the dependence of the specific growth rate on the light intensity is consistent with the Haldane equation. Figure 2 shows the respective comparison of the growth rate with the phenomenological rate equations (2.3) and (2.4).
We note that the growth curves shown in figure 2 are emergent properties of the underlying constraints and parameters, i.e. they result as a consequence of the constraints and parameters that characterize the individual biochemical process.  While the dependencies on single nutrients are (in this case) similar to the respective phenomenological rate equations, changes in model constraints and parameters entail (sometimes complex) changes in overall growth properties. For example, the apparent half-saturation constants of the organismal growth curves are markedly different from the half-saturation constants of the respective transporter complexes, due to the fact that cells can acclimate to low nutrient conditions by changing the expression of the respective protein complex. More importantly, biochemical resource allocation models allow us to further address trade-offs in protein allocation, limitations by multiple nutrients, including hierarchies of nutrient uptake, as well as other properties relevant for computational models in microbial ecology.

Acclimation, trade-offs and co-limitation
Biochemical resource allocation models go beyond phenomenological rate equations, and provide insights into acclimation, co-limitation and cellular trade-offs. In particular, concomitant to the overall cellular growth rate, we obtain the distribution of protein resources within the cell as a function of the environmental parameters. Figure 3 and table 1 show the relative protein fractions invested into the different biochemical processes in dependence on the environmental conditions. The resource allocation framework allows cells to acclimate to the respective environmental condition and to invest cellular resources into processes that would otherwise limit growth. For example, the maximal uptake rate of the nitrogen transporter complex (V max ¼ k cat,TN Á [T N ]) and hence the affinity A = V max /K N for the extracellular nitrogen source is not constant, but increases with decreasing concentrations of the extracellular nitrogen source. Figure 4a shows the maximal uptake rate V max , as well as the actual nitrogen uptake flux, as a function of available extracellular nitrogen. Similar to the analysis by Bonachela et al. [14], and unlike descriptions using the Monod equation, the model accounts for the acclimation of the cell to low nutrient availability-with important   Figure 3. Cellular protein allocation in dependence of environmental conditions. Shown are the relative abundances of ribosomes and coarse-grained protein complexes (relative to total protein) under different growth conditions. The superscripts (*, y and z) indicate the parameter values used to specify environmental conditions. (Online version in colour.) Table 1. Cellular protein allocation in dependence of environmental conditions. Values denote the relative abundance (% relative to total proteome) of protein complexes under low and high nutrient conditions. The abbreviations are as in figure 1.  (figure 3  and table 1). Importantly, several recent experimental studies on cyanobacterial growth physiology are in good agreement with the predictions of biochemical resource allocation model [27,51], with changes in protein allocation in vivo approximately matching in silico predictions. We therefore consider the predictions from biochemical resource allocation models to be a reasonable starting point also for ecological simulations.
Another challenge for phenomenological growth models is to describe the dependence of growth on several potentially limiting nutrients, see Saito et al. [52] for a discussion on the concept and types of co-limitation. The most common approaches to implement multiple limitation scenarios rely on either Liebig's law of the minimum, , ( 6 :1) or the multiplicative form where [n 1 ] and [n 2 ] denote the concentrations of two potentially limiting nutrients and K m1 K m2 the respective half-saturation constants, respectively. As discussed by Saito et al. [52] both functional forms are not without problems and there is no clear empirical evidence to assess the merits of either representation. Given its simplicity, the multiplicative form is commonly employed in multi-nutrient models [11,42]. For biochemical resource allocation models the description of growth limitations as a function of two or more nutrients emerges without further assumptions about the functional form of growth equations. As expected, the coarse-grained model described above is not consistent with Liebig's law of the minimum: growth on a single nutrient, as shown in figure 2, does not exhibit any hard threshold.
The absence of such a threshold is due to the fact that, for scarce nutrients, resources are increasingly invested into the respective uptake reactions-hence the limitation between two limiting nutrients is not independent.
More importantly, however, the interdependence of growth limitations by two nutrients implies that the emergent growth curve is also not consistent with the multiplicative functional form: figure 4b shows a Lineweaver-Burk plot of the growth rate as a function of nitrogen availability for different values of the external carbon concentration. Parallel lines in a Lineweaver-Burk plot correspond to non-competitive inhibition, whereas the multiplicative functional form of equation (6.2) would result in lines with an identical x-intercept. Hence, the absence of carbon acts analogous to uncompetitive inhibition, and affects both the apparent organismal half-saturation constant of growth as well as the maximal growth rate of the cell-again with important consequences when modelling, for example, growth limitations and nutrient dynamics in coupled ecosystem models. While we are not aware of dedicated growth studies to quantitatively investigate the functional dependencies of co-limitation in cyanobacteria, the plasticity of metabolism and re-allocation of proteins makes the strict multiplicative dependence of equation (6.2), as assumed in most current models, highly implausible.

Metabolic diversity and the cost of regulation
Recent studies have emphasized the representation of microbial community diversity as a fundamental requirement of model ecosystems [5]. Several principles and mechanisms can be used to describe microbial diversity in biochemical resource allocation models. A shown above, cells may acclimate to different environmental conditions, resulting in an inhomogeneous population. If required, the possibility for physiological acclimation can also be restricted within simulations. For example, depending on its evolutionary history, a strain might not be able to access the full range of possible proteins allocated (such a restriction would, for example, be expected for cyanobacterial Prochlorococcus strains). The model definition may  Beyond different acclimation states, however, cellular diversity also arises due to genetically encoded differences between organisms. Firstly, microbial organisms exhibit metabolic diversity with respect to the encoded metabolic functionality within their genomes. As shown by recent studies of the cyanobacterial pan-genome and pan-metabolism [53][54][55], genome sizes differ significantly-reflecting the different adaptations and lifestyles of organisms. Differences in the set of encoded proteins correspond to different metabolic strategies that are accessible to an organism. For example the accessible modes of energy generation depends on the genetic repertoire of strains [56]. Likewise the accessibility of different nutrient sources may be restricted by the lack of the respective transporter proteins. Secondly, diversity may arise due to differences in enzyme-kinetic parameters. The evolution of enzyme-kinetic parameters is constrained by physico-chemical limits that result in trade-offs between parameters. The protein complex ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO) is a prominent example [57]. As will be shown below, such differences and tradeoffs between parameter values may give rise to different cellular growth curves.
To illustrate how BRAMs may represent such genetic diversity, and to illustrate the functional consequences of different metabolic strategies in model simulations, we consider the uptake of two alternative sources of extracellular nitrogen. We assume that, in addition to the nitrogen source N x considered above, there is a second source of extracellular nitrogen N y that is permanently available in the environment (analogous to, e.g. atmospheric dinitrogen). The uptake of the permanently available nitrogen source N y and its conversion to the intracellular nitrogen precurser N is facilitated by a set of proteins that is represented in our models by a coarsegrained protein complex T Y . The synthesis of the protein complex T Y , however, requires more amino acids and its catalytic turnover number is lower, as compared to the complex T N . Within their genome, the cyanobacterial strains may either encode one of the two (coarse-grained) uptake protein complexes, T N or T Y , or both. The respective strains are denoted as (T N )-strain, (T Y )-strain and (T N + T Y )-strain. The inclusion of both protein complexes within a single genome entails additional cellular costs: a larger genome corresponds to a (slightly) higher fraction of the non-protein biomass Q. In addition, further protein machinery is required to facilitate cellular decisions that control the expression of both enzyme complexes. The increased protein machinery is represented by an increased fraction of non-catalysing (quota) proteins P Q . The exemplary parametrization of the three strains is provided in the Material and methods. Figure 5a shows the growth curves of all three strains as a function of N x in the presence of a constant basal availability of N y (figure 5a). Figure 5b shows the expression of the respective uptake complexes for the (T N + T Y )-strain harbouring both uptake complexes. As expected, the (T Y )-strain exhibits a constant growth rate, due to the constant basal availability of N Y . The (T N )-strain exhibits a Monod-type dependence on the availability of N x , as already shown in figure 2. The combined (T N + T Y )-strain, however, exhibits a switch between two growth regimes: for low availability of external N x , the strain expresses the protein complex T Y and uses the nitrogen source N Y . In this regime, the (constant) specific growth rate is slightly below the rate observed for the (T Y )-strain due to the increased burden of non-catalytic biomass. If the availability of N x exceeds a certain threshold, the (T N + T Y )-strain switches its preferred nitrogen source and expresses the protein complex T N . The growth rate then increases with increasing availability of N x , but always remains below the growth rate of the (T N )-strain (again due to the increased burden of non-catalytic biomass). Hence, we expect that the (T N + T Y )-strain will be outcompeted in any constant environment, but will have a competitive advantage in (some) environments with variable nitrogen availability.
For our purposes, the example serves to illustrate the following points: (i) biochemical resource allocation models allow us to represent genetic diversity within strains, including differences in genome size and genomes that encode several potential royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 16: 20190474 metabolic strategies; (ii) the associated costs of larger genomes, including the costs for additional expression of regulatory proteins can be incorporated into the parametrization of the model (see also Lynch & Marinov [58] for a discussion on the bioenergetic costs of a gene); (iii) the optimal metabolic strategy for any given environment does not have to be specified in advance but is an emergent outcome of model simulation. Strains may switch between different strategies depending on the environment-with important implications for ecosystem models; (iv) within simulations, cells typically exhibit a hierarchy of preferred nutrients. That is, if two nutrient sources are available in the environment, these might either be sequentially consumed (with a preferred nutrient first) or they are simultaneously consumed. Such behaviour is non-trivial to represent within phenomenological Monod-type growth models but emerges naturally for resource allocation models, as recently shown by Wortel et al. [59] and Müller et al. [60]. A recent study highlights this fact [61] and argues that arguments based on optimal protein allocation indeed enable prediction of microbial nutrient uptake hierarchies in good agreement with experimental observations. The fact that BRAMs can represent and predict such hierarchies in nutrient uptake has again implications for the simulation of ecosystem models.

A case study: seasonal variation and co-existence
To exemplify the use of BRAMs within ecosystem simulations, we consider a model of phytoplankton diversity recently proposed by Tsakalakis et al. [42]. In the following, we do not aim to recapitulate the full study of Tsakalakis et al. [42], but focus only on the competition outcomes between different strains of phytoplankton in a constant versus a time-varying environment. As shown above, the growth physiology of BRAMs is an emergent property of the underlying biochemical parameters. We therefore assume that the biochemical parameters of carbon uptake, as well as nitrogen uptake and metabolism, differ between strains-reflecting microbial diversity of strains. As noted above, our premise is that enzyme-kinetic parameters are subject to physico-chemical trade-offs, for example trade-offs between the half-saturation constant and the maximal catalytic rate of an enzyme. We emphasize that such trade-offs are not an outcome of our modelling approach but need to be specified independently, for example based on detailed biochemical surveys and analysis [57].
In the following, we consider two strains of cyanobacteria (or other phytoplankton). The differences between both strains reflect trade-offs between the maximal specific catalytic activities and the substrate affinities in the protein complexes involved in carbon and nitrogen uptake. The parametrization of both strains is provided in Material and methods and table 2. Differences in parameters give rise to two functional groups of phytoplankton, gleaners (K-strategists) and opportunists (r-strategists). The respective growth curves are shown in figure 6. Gleaners are characterized by a higher affinity towards extracellular nitrogen, and an overall lower maximal growth rate. Opportunists are characterized by a high overall specific growth rate, but a lower affinity for extracellular nitrogen.
Following Tsakalakis et al. [42], we simulate the growth of both strains in two different environments: a constant light environment (control) and an environment with seasonal variations in average light intensity. Extracellular inorganic carbon is assumed to be constant, a (single) source of extracellular nitrogen is supplied via a constant influx. The dynamics of the abundances of gleaners (ρ G ) and opportunists (ρ O ) are described by the following ODEs ) (8:1) and the dynamics of external nitrogen is described by where V N denotes a constant influx, and ν n,O and ν n,G denote the specific cellular uptake rates (as emergent properties of the respective models) of external nitrogen by the gleaners and opportunists, respectively. The population dynamics of both strains in constant and time-varying environments are shown in figure 7. Simulations were performed using a Python ODE solver, the growth models are implemented a (series of) LP problems and solved at each time step. The procedure is computationally similar to dynamic FBA (dFBA), an established method for constraint-based analysis [66]. See Material and methods for details. As shown in figure 7, gleaners outcompete opportunists in a constant light environment, consistent with the competitive exclusion principle. Seasonally changing light intensities, however, induce changes in strain abundances, and hence nitrogen availability. Temporal changes in nitrogen availability then result in the co-existence of both strains. During periods of low light availability, overall strain abundance decreases and the availability of extracellular nitrogen increases. With increasing light intensities, opportunists have a competitive advantage and quickly increase in abundance, thereby decreasing the availability of extracellular nitrogen and shifting the competitive advantage to gleaners until a decreasing light intensity restart the cycle. The simulation results are consistent the corresponding simulations of Tsakalakis et al. [42] and demonstrate the feasibility of using biochemical resource allocation models to generate diverse microbial populations for ecosystem simulations.

Discussion and outlook
As noted by Follows & Dutkiewicz [5], there is currently a vast chasm between the ecologically and biogeochemically oriented parametrizations of growth used in ecological modelling and the metabolic-pathway perspective of microbial growth originating from systems biology and modern genomics. The purpose of this study was therefore to outline the manifold connections between both fields and to show how recent biochemical models of microbial growth may contribute to close the chasm. To this end, we described recent computational models of microbial growth that are based on a description of cellular resource allocation [17,26,30,32,35]. These models allow us to obtain a quantitative account of protein expression that constrain biochemical processes and growth.
The aim of our study was to heed the call of Allen & Polimene [6] to provide growth models based on a robust physiological formulation that allow for trade-offs between royalsocietypublishing.org/journal/rsif J. R. Soc. Interface 16: 20190474 resource allocation of competing physiological activities. We propose that biochemical resource allocation models, such as the ones described here, fulfil this paradigm towards a new generation of plankton models. While mechanistic growth models [67], resource-allocation and cost-benefit analysis [5,7,44,45], as well as models based on optimality [68,69], are well established in ecological modelling, biochemical resource allocation models directly build upon metabolic network reconstruction and constraint-based analysis-and therefore reflect the advances in quantitative growth physiology. The predictions from biochemical resource allocation models are often in excellent agreement with detailed physiological and biochemical studies of model strains [27,34,51]. The premise of our study is therefore that these model are a suitable and reasonable starting point for the description of microbial growth also within ecosystems simulations.
Biochemical resource allocation models can be formulated for almost all microbial strains for which a reference genome is available. Supported by recent analysis of the cyanobacterial pan-genome [53][54][55], and the diversity of energy metabolism in microbes [56], we hypothesize that such models will Table 2. Parameters of the model. The parameter values follow the data used in Faizi et al. [26]. If no data were available in the literature, the remaining parameters are estimated S based on generic values. The column for 'Gleaner' represents the default values.  follow a modular paradigm: there is only a limited number of fundamentally different metabolic strategies available and microbial organisms are a mix-and-match conglomerate of these strategies (with many combinations excluded for biophysical or energetic reasons). The enormous diversity of microbial metabolism then arises from further variations and adaptations of biochemical parameters (including physicochemical trade-offs), as well as from differences in cellular resource allocation. For example, recent studies indicate that the observed significant differences in the maximal specific growth rates between genetically similar cyanobacterial strains are linked to differences in resource allocation strategies (such as the amount of storage compounds or differences in the PSII/PSI ratio), see Zavrěl et al. [27] for a brief discussion.
In this study, our aim was not a comprehensive review of computational resource allocation models, but rather to highlight the specific aspects and properties that are of particular importance in simulations of ecosystems, specifically acclimation to different growth environments, co-limitation, and the representation of metabolic diversity. In this respect, the merits of biochemical resource allocation models are as follows: -BRAMs can be formulated using different levels of complexity, ranging from GSRs that take into account the comprehensive set of cellular proteins and enzymes [34,35], to intermediate representations [31] and smaller coarse-grained models that describe protein complexes that correspond to (agglomerated) cellular processes (an example of the latter was provided in this work). While systematic rules for model reduction are still lacking, we envision algorithmic approaches, similar to the method of Erdrich et al. [48], that facilitate the construction of meaningful coarse-grained models is a semi-automated way. -Model parametrization can be based on available biochemical knowledge provided in databases, such as BRENDA [47], as well as on our increasing knowledge about quantitative cell physiology [50]. The models therefore provide a link between the physico-chemical constraints of enzyme-kinetic parameters and observed growth kinetics. Key quantities for model parametrization are enzyme costs (in terms of amino acids and cofactor requirements) and enzymatic catalytic activities. Information about regulatory mechanisms is not required, but may be added as an additional constraints. -BRAMs allow us to represent metabolic diversity by taking distributions of parameters (and possible physico-chemical trade-offs between them) into account. Biochemical resource allocation models therefore enable implementation of selection-based approaches-following the Baas-Becking paradigm 'everything is everywhere but environment selects' (cited after Follows & Dutkiewicz [5]). -The analysis of biochemical resource allocation models reveals complex metabolic behaviour, such as switches between different metabolic strategies. Most microbes are capable of more than one metabolic strategy and phenomenological Monod-type models face difficulties to describe transitions between different metabolic strategies. For biochemical resource allocation models the modes of energy generation or nutrient uptake strategies (and their respective hierarchies) emerge as part of the optimization procedure without further specification. and hence are computationally tractable. While it is (currently) not possible to formulate kinetic models at the genome-scale, the implementation of biochemical resource allocation models is computationally feasible even for large models [34,35]. Coarse-grained models, such as the one discussed above, can typically be solved fast and efficiently and hence are suitable for ecosystems simulations. In case computational capacity is limiting, it is possible to devise approximate computational schemes (such as lookup tables and interpolation). We note, however, that not all possible constraints may be formulated as a LP problem.
Despite their merits, however, current biochemical resource allocation models are not (yet) the panacea for ecological simulations. We expect that diverse approaches are still needed, along with further improvements of biochemical resource allocation models and other whole-cell systems biology models. In particular, current biochemical resource allocation models may be extended and improved along the following lines: (i) current simulations typically focus on steady state analysis. While it has been shown that biochemical resource allocation models can be solved for time-varying environments [31,35], the computational burden is still significant. It is also of paramount importance to be able to represent phenomena such as storage, bet-hedging or luxury uptake of scarce nutrients (i.e, the uptake of nutrient beyond what is currently required in anticipation of possible future limitations) for ecosystems simulations. These phenomena are, in principle, aspects of resource allocation strategies and hence can be represented by appropriate models; (ii) currently models are based on a metabolic perspective of growth. However, also trade-offs between growth and other cellular properties can be considered, such as the resilience against stress or predation (and the energetic expenses associated with it). Likewise, effects of parameters like temperature and pH currently cannot be adequately described by resource allocation models; (iii) a better understanding of physico-chemical trade-offs in enzyme-kinetic parameters is required, as well as further quantitative growth studies, similar to Zavrěl et al. [27]. In particular, as yet, laboratory studies concerning resource allocation are often motivated by biotechnological applications and typically do not consider limitations by nutrients other than (inorganic) carbon and light-whereas applications in ecology require that the focus be shifted on limitations typically encountered in natural environments.
Overall, we are confident that biochemical resource allocation models, whose construction is based on reference genomes and increasingly automated [36], offer significant potential for the the computational study of ecosystems and microbial communities-and go beyond current growth models based either on Monod-type equations or application of FBA. Biochemical resource allocation models will allow us to represent the microbial diversity observed in almost all environments and will open up new avenues to interface biogeochemical and ecological questions with recent knowledge obtained from quantitative microbial growth physiology.

Biochemical resource allocation models
We consider a particular variant of biochemical resource allocation models following the methods and algorithms described in [31,32,35]. A model consists of two types of components: steadystate metabolites and cellular macromolecules (which include catalytic protein complexes and quota components). We assume that the internal metabolites are at a quasi-steady state, i.e. metabolites adjust faster than any changes in the environment. Thus, the concentrations of internal metabolites are not explicitly represented in the model, and the metabolic network is assumed to be balanced at all times. We also neglect dilution by growth of internal metabolites. Catalytic components (including enzymes and ribosomes) are synthesized from the precursors provided by cellular metabolism. The amounts of catalytic components provide an upper bound to the respective rates of the reaction catalysed by the components. The quota components (protein P Q and remaining biomass Q) fulfil no explicit functional role within our model and their synthesis is enforced using fixed quotas (except otherwise noted, the quota protein component P Q is assumed to be 20% of total protein, the non-protein biomass Q is assumed to be 50% of total biomass).

The biochemical resource allocation model of phototrophic growth
The biochemical resource allocation model shown in figure 1 is assembled using the stoichiometry and data described by Faizi et al. [26] (with modifications described below). We note that the model of Faizi et al. [26] is a nonlinear kinetic ODE model, hence computationally different from the model described here. Growth is facilitated by eight protein complexes: six enzyme and transport complexes, ribosome R and a non-catalysing quota protein component P Q . The enzyme and transporter complexes catalyse the following reactions: where n p denotes the size of the respective protein in amino acids. Protein decay is neglected except for the protein complex PSU that is damaged by light ( photoinhibition), such that its abundance follows the equation with v d : PSU À ! n PSU Á AA: (10:4) The dynamics of all other protein complexes are modelled according to equation (4.1).

Capacity constraints of catalytic enzymes
All enzyme-catalysed reactions are constrained by the amount of the respective catalysing enzyme, according to equation (4.3). The constraints for nutrient uptake and light harvesting also depend on the availability of the respective external substrates. In particular, for the uptake of inorganic carbon, The equations for light harvesting, photosynthesis and photodamage, are derived from a simple two-state model of photosynthesis, see [26] for details. We note that, different from most other constraints, the rate of photodamage is subject to an equality constraint. That is, the reaction is enforced to proceed according to the specified rate. Along similar lines, also other external parameters or chemical compounds may modulate or enforce specific reactions if required for a specific model. The constraints on the ribosomal capacity are X p g p Á n p g max Á R, (10:8) where n p denotes the protein size (in amino acids per molecule), γ p its translation rate and γ max denotes the maximal translation rate of ribosomes. We note that all capacity constraints are implemented as linear constraints.

Solving the resource allocation model as a LP
For any given set of external parameters C x , N x , I and specific growth rate μ, the model implemented as a LP(μ). The LP problem is described by three matrices N, W ≤ , and W = , as well as the vector of reaction rates v = (v i , γ k ) T (combining metabolic and translation rates), the vector P of (the concentrations of ) macromolecules, and a vector ω that describes the sizes of the macromolecules (in units of amino acids per molecule). For each macromolecule, we define a lower bound that ensures the synthesis of quota components, P lb ¼ (R lb , T lb N , T lb C , PSU lb , M lb C , M lb Q , CB lb , P lb Q , Q lb ) T : (10:10) The matrix N describes the stoichiometry of the system, including metabolic and translation reactions, as well as the stoichiometry of the decay of the PSU as a result of photodamage (v d ). The matrices W ≤ and W = describe the relationship between reaction fluxes and P. The entries of both matrices may depend on the external parameters C x , N x and I.
The full set of constraints is The above described LP is solved as a feasibility problem for a given μ. To obtain a solution for the maximal specific growth rate in a given environment, the global optimum of μ is found using bisection as described in [31].

Model parametrization
A complete list of model parameters is provided in table 2. Parametrization follows the data used in Faizi et al. [26]. The size of macromolecules is estimated using the size of an average enzyme times the approximate number of steps used in the pathway. The size of the protein complex P Q and the biomass component Q is arbitrary. Turnover rates are chosen according to average values described in [50]. The description of the photosystem is adopted from Faizi et al. [26], with σ denoting the effective absorption cross-section per photosystems, and k d the rate of photodamage.
To simulate the growth on two alternative sources of external nitrogen, we used the set of additional parameters given in table 3.

The computational modelling framework
A computational model is developed using Python as a programming language. The framework uses functionality from the following packages: numpy [70], scipy [71], matplotlib [72], pandas [73], sundials [74] and Gurobi [75]. In particular, we use Gurobi for solving the LP-based optimization and CVODE Table 3. Specific parameters used to model growth on two alternative sources of external nitrogen. The remaining parameters are same as described in