Enzyme activities predicted by metabolite concentrations and solvent capacity in the cell
Abstract
Experimental measurements or computational model predictions of the post-translational regulation of enzymes needed in a metabolic pathway is a difficult problem. Consequently, regulation is mostly known only for well-studied reactions of central metabolism in various model organisms. In this study, we use two approaches to predict enzyme regulation policies and investigate the hypothesis that regulation is driven by the need to maintain the solvent capacity in the cell. The first predictive method uses a statistical thermodynamics and metabolic control theory framework while the second method is performed using a hybrid optimization–reinforcement learning approach. Efficient regulation schemes were learned from experimental data that either agree with theoretical calculations or result in a higher cell fitness using maximum useful work as a metric. As previously hypothesized, regulation is herein shown to control the concentrations of both immediate and downstream product concentrations at physiological levels. Model predictions provide the following two novel general principles: (1) the regulation itself causes the reactions to be much further from equilibrium instead of the common assumption that highly non-equilibrium reactions are the targets for regulation; and (2) the minimal regulation needed to maintain metabolite levels at physiological concentrations maximizes the free energy dissipation rate instead of preserving a specific energy charge. The resulting energy dissipation rate is an emergent property of regulation which may be represented by a high value of the adenylate energy charge. In addition, the predictions demonstrate that the amount of regulation needed can be minimized if it is applied at the beginning or branch point of a pathway, in agreement with common notions. The approach is demonstrated for three pathways in the central metabolism of E. coli (gluconeogenesis, glycolysis-tricarboxylic acid (TCA) and pentose phosphate-TCA) that each require different regulation schemes. It is shown quantitatively that hexokinase, glucose 6-phosphate dehydrogenase and glyceraldehyde phosphate dehydrogenase, all branch points of pathways, play the largest roles in regulating central metabolism.
1. Introduction
While our understanding of regulation of transcription and post-transcriptional processes has blossomed in the past 25 years due to advances in high-throughput experimental technologies such as RNA expression, ChIP-Seq and mass spectrometry-based proteomics, our understanding of post-translational regulation has advanced [1–4], but not as rapidly or as far [5].
Fifty years ago, it was postulated that the purpose of post-translational regulation in metabolism is to either maintain a balance of the energy charge of the adenylate pool [6], or to control solvent properties [7]. Solvent properties have long been recognized as important determinants of cellular activity and function. Atkinson recognized that the maintenance of physiological concentrations of metabolites may well be the most pressing problem of metabolic control [7]. Metabolite concentrations are both a function of the reaction kinetics and a molecule’s standard chemical potential, of which the latter varies only over a small range for each individual metabolite because solution conditions inside a cell also vary over a small range. Interestingly, the set of enzymes which are post-translationally regulated is relatively well-conserved across species as well [3], despite the fact that the rate constants for the same enzymes can vary dramatically [8].
In addition to metabolite concentrations per se, solvent capacity in the cell has recently focused on molecular crowding [9,10] and the impairment of diffusion [11]. As a cell approaches equilibrium, the cell’s cytoplasm can become glassy such that diffusion is limited. At the same time, control of metabolites through regulation of enzyme activities is no longer effective near equilibrium [12]. The equilibrium constant, K, for a reaction is the ratio of the exponent of the standard chemical potentials. Consequently, metabolite concentrations may potentially approach values determined by their standard chemical potentials in solution, which can be quite large for highly charged metabolites like fructose 1,6-bisphosphate and acetyl-coenzyme A. Not only will metabolite levels rise, but also less water will be produced by metabolism inside the cell. In E. coli, up to 50% of the bulk water is produced by metabolism [13]. Even away from equilibrium, cells clearly must regulate metabolite levels to prevent high concentrations that would be detrimental to diffusional processes necessary for life.
The degree to which the concentration of intermediate metabolites are homeostatically maintained can be rationalized by analogy to economic supply and demand concepts formalized in metabolic control analysis (MCA), a powerful tool for providing insight into metabolic regulation [14]. For example, when flux is controlled by demand, the supply determines the degree to which the concentrations of intermediate metabolites are homeostatically maintained [15]. It should be noted that supply and demand concepts in economics are closely related to thermodynamics [16].
More recent breakthroughs in the study of metabolic regulation include work in which mass spectrometry and NMR measured metabolite and protein levels, along with fluxes modelled from 13C isotope labelling were used with Michaelis–Menten kinetics to determine whether the predicted reaction fluxes matched fluxes modelled from isotope labelling data [2]. The correlation between predicted fluxes were evaluated with and without regulation. If the match was better with regulation, then regulation was assumed. The work was a tour de force in that chemostat studies were used to carefully measure both absolute and relative metabolomics data while at the same time cover as much of the proteome as possible. In addition, Michaelis–Menten kinetic models addressed multiple levels of regulation. The pay-off was not only predictions of which enzymes might be regulated, but also inferences about the regulating molecule.
In addressing possible scalability (or at least cost of experimentation) in the previously mentioned study, a similarly sophisticated informatics approach was used to develop a model of small molecule regulatory networks from curated databases of enzymes, integrate the regulatory network with a metabolic model of E. coli, and distill information on how substrates and inhibitors contribute to metabolic flux regulation [3]. Interestingly, this work did not find support for the common notion that reactions which are furthest from equilibrium are those that are most likely regulated.
Here, we investigate the hypothesis that the post-translational regulation of enzymes is at least in part driven by the need to maintain the solvent capacity in the cell. We evaluate this hypothesis by comparing experimental metabolomics data with steady-state concentrations predicted computationally from equations for reformulated mass action kinetics. Using quantitative metabolomics data as well as physical and biological principles, MCA and alternatively reinforcement learning (RL) are used to predict the control of activity required to bring metabolite levels down to observed values. Consequently, the machine learning results confirm that an optimal control policy can be formulated which directly achieves minimal regulation by efficiently reducing excessive metabolite concentrations.
The predictions agree with known regulation of central metabolism in model organisms. Moreover, these results show that regulated reactions are further away from equilibrium precisely because of the regulation, turning common wisdom about enzyme regulation upside-down. Instead of highly non-equilibrium reactions being the targets for regulation in metabolic pathways [17,18], regulation results in reactions being much further from equilibrium than non-regulated reactions. Being further away from equilibrium than other reactions is an effect, not a cause, of regulation.
2. Results
We solve the prediction problem of which enzyme to regulate by a novel combination of methods from statistical thermodynamics, control theory and RL. The initial step is to determine steady-state concentrations without applying regulation by using numerical optimization of the respective ordinary differential equations such that each reaction in the pathway is the same distance from equilibrium. That is, a maximum path entropy solution [19,20]. This is done using the Marcelin dynamical force equation for mass action kinetics [21] by assuming that the time dependence is the same for all reactions. Marcelin proposed in 1910 that the rate of a reaction is proportional by a factor, c, to the exponent of the thermodynamic driving force on the reaction [22]. In modern terms, the driving force on a reaction j is the reaction affinity Aj = dG/dξj with G being the system free energy, ξj being the extent of the reaction, kB is Boltzmann’s constant and T is the temperature. For the reverse reaction, the driving force is A−j = dG/dξ−j, giving a net rate of,

Figure 1. Initial steady-state properties before any regulation is applied in the form of reduced activity coefficients for glycolysis–pentose phosphate pathway (PPP)–tricarboxylic acid (TCA) cycle with high NAD/NADH and low NADP/NADPH conditions (a,b) as well as high NAD/NADH and high NADP/NADPH conditions (c,d). The steady state is determined by maximizing the reaction path entropy such that the net thermodynamic driving force on each reaction is proportioned according to the governing equation for metabolite kinetics, equation (4.10). (a) Unregulated reaction fluxes under conditions of high NAD/NADH and low NADP/NADPH. Reactions in upper glycolysis (HEX1 to FBA on the x-axis) have a flux that is twice that of lower glycolysis (GAPD to PDH) and the TCA cycle (CSM to MDH) because one molecule of fructose 1,6-bisphosphate in upper glycolysis becomes two molecules of glyceraldehyde 3-phosphate at the beginning of lower glycolysis (FBA and TPI reactions); reactions of the PPP (PGL to TKT2) have no significant flux under these conditions. (b) Unregulated reaction free energies under conditions of high NAD/NADH and low NADP/NADPH. Reactions of upper glycolysis have a free energy change of approximately −RgTlog KQ−1 = −3.8 kJ mol−1, while reactions of lower glycolysis and the TCA cycle have reaction free energies of −RgTlog (2 · KQ−1) = −3.8 − RgTlog (2), where the factor of two is due to the stoichiometry difference between upper glycolysis and lower glycolysis/TCA cycle. Under identical nutrient (boundary) conditions, reduction of the activity coefficients to values less than 1.0 reduces both the steady-state fluxes and the reaction free energies as shown in figure 3. (c) Unregulated reaction fluxes under conditions of high NAD/NADH and high NADP/NADPH. Reaction fluxes are similar to (a) with the exception of the PPP which has a flux similar to upper glycolysis. PGI is the branch point for flux into PPP from upper glycolysis. (d) Unregulated reaction free energies under conditions of high NAD/NADH and high NADP/NADPH. The reaction free energies of PPP now are further removed from equilibrium and all have roughly similar values, as do the sets of reactions for upper glycolysis, lower glycolysis and the TCA cycle. Under identical nutrient (boundary) conditions, reduction of the activity coefficients to values less than 1.0 reduces both the steady-state fluxes and the reaction free energies as shown in figure 4. Upper glycolysis: HEX1, hexokinase; PGI, phosphoglucose isomerase; PFK, phosphofructokinase; FBA, fructose bisphosphatase. Lower glycolysis: TPI, triosephosphate isomerase; GAPD, glyceraldehyde 3-phosphate dehydrogenase; PGK, phosphoglycerate kinase; PGM, phosphoglycerate mutase; ENO, enolase; PYK, pyruvate kinase; PYRt2m, pyruvate transporter; PDH, pyruvate dehydrogenase. PPP: G6PDH, glucose 6-phosphate dehydrogenase; PGL, phosphogluconolactonase; GND, phosphogluconate dehydrogenase; RPI, ribose 5-phosphate isomerase; RPE, ribose 5-phosphate epimerase; TKT1, transketolase 1; TALA, transaldolase; TKT2, transketolase 2. TCA cycle: CSM, citrate synthase; ACONT, aconitase; ICDH, isocitrate dehydrogenase; AKDG, a-ketoglutarate dehydrogenase; SUCOAS, succinyl-CoA synthetase; SUCD, succinate dehydrogenase; FUM, fumarase; MDH, malate dehydrogenase; GOGAT, glutamine oxoglutarate aminotransferase.
If there are no constraints, the configuration also results in a maximal entropy distribution of metabolites. However, the metabolites will be constrained to be away from the equilibrium distribution if there are non-equilibrium boundary conditions. Since the initially predicted concentrations will then be proportional to their Boltzmann probabilities, the initially predicted concentrations may be exceedingly high [7] compared to experimentally observed values from isotope-labelled, mass spectrometry measurements [24,25].
These high concentrations allow for highly effective inference of regulation to control the concentrations. The predicted concentrations, , are brought into alignment with experimental observations, ni, by applying regulation. Regulation is determined using either an MCA approach, or a hybrid optimization–RL approach (Methods). In both cases, regulation is applied in the form of an activity coefficient, αj, that scales the thermodynamic driving force for reaction j, where αj = 1.0 indicates no regulation while αj = 0.0 indicates complete regulation.
In the two MCA-based methods that were developed, reactions are regulated based on the sensitivity of the predicted concentrations to the activity coefficient that modulates each reaction, which is carried out by a specific enzyme. The sensitivity of the ith metabolite with concentration ni (observed or predicted) to the activity, αj, of enzyme j, is described by the concentration control coefficient, ,
A reaction j selected for regulation if it is the one whose change in activity results in the largest change in the loss functions of all metabolites whose predicted concentrations exceed the experimentally observed concentrations, as determined by . Regulation is considered complete when predicted metabolite concentrations are brought into agreement with experimental measurements.
Two approaches were taken with MCA: an unrestricted control approach (unrestricted MCA) in which any enzyme could be a regulator for any metabolite, and a restricted approach in which only enzymes whose immediate products exceeded the target values could be considered as a regulator. While the unrestricted approach optimizes the system as a whole, the restricted approach is consistent with the concept of modularity in biological systems. We refer to the latter as a local-control approach (local MCA) since an enzyme’s immediate products (and possibly other metabolites) are being controlled.
The RL method (figure 2) formulates the problem of regulation in terms of a Markov decision process [26], which is commonly represented as a tuple {S, A, P, R}, where S represents the set of possible states (enzyme activities for each reaction), A represents the set of possible actions (reactions to regulate), P represents the transitional probabilities between states, and R represents the reward function. RL is used to obtain an optimal regulation scheme by learning from delayed environmental feedback [27,28]. Figure 2 illustrates how reactions are chosen using a policy function which returns the reaction to be regulated (action) given the current enzyme activities (state). Learning is performed by iteratively updating the state value function using environmental feedback (rewards) from solving the optimization routine. Specifically, we use a temporal difference bootstrapping technique [29] called n-step SARSA [30,31], an on-policy version of the recently popularized Q-learning method [32]. In the Methods and electronic supplementary material, we provide in-depth descriptions of the theories and approaches behind the steady-state optimization, the MCA methods and the RL method.
Figure 2. Schematic of in silico framework for learning regulation (grey box) with coupled simulation or optimization routine controlling environmental feedback. Initial framework input (green box) consists of target metabolite concentrations from experimental data. The output (red box) consists of a learned optimal enzyme regulation scheme necessary to reach the target concentrations. Learning is performed by repeatedly testing different regulation schemes and updating the value function, V, that returns a scalar value for a given set of enzyme activities. Enzyme activities, represented as states, are chosen for regulation by performing actions that are determined by a policy function. A given policy is determined by V. The new steady-state metabolite concentrations resulting from applied regulation are determined by an optimization routine. Alterations in metabolite concentrations are a direct result of moving into a state s′ from a state s after taking action a, i.e. performing regulation. These dynamic changes are used to define a reward function, R, that determines environmental feedback. Rewards are used to direct the agent as it explores and learns a policy that predicts optimal enzyme regulation.
We compare the three different regulation approaches (unrestricted MCA, local MCA and RL) by statistically characterizing the rate of energy flow across the reactions. The rate that energy is produced in metabolism has long been known to be one of the most significant factors in metabolic regulation [6]. We define the sum of the rate of free energy generated across all reactions as the free energy dissipation rate. The free energy of the jth reaction at steady state is , where Rg is the gas constant, and T is the temperature, Kj is the equilibrium constant and Qj is the reaction quotient. Defining the free energy dissipation rate as time-dependent free energy [33,34],
We evaluated three different versions of E. coli central metabolism under four different nutrient conditions. The three different versions of metabolism were (1) gluconeogenesis, (2) glycolysis and the TCA cycle, and (3) glycolysis, the PPP and the TCA cycle (glycolysis–PPP–TCA). Metabolite concentration data used in the analysis were from E. coli in exponential growth with glucose as the carbon source [24,25]. As an alternative to experimentally measured metabolite concentrations, rough estimates of concentrations can be used as well that give qualitatively similar results (see Methods and electronic supplementary material). In all cases, the predicted regulation matched known regulation points in central metabolism or were adjacent to known regulation points.
Below, we discuss the largest network, glycolysis–PPP–TCA, under two identical nutrient conditions except for the NADP/NADPH ratio, which is held fixed but at different values throughout each analysis. In condition 1, the NAD/NADH ratio is high (31.3) and the NADP/NADPH ratio is low (0.02), which favours flux through upper glycolysis rather than PPP. In condition 2, the NADP/NADPH ratio is also high such that NADP/NADPH = NAD/NADH = 31.3 [24]. The latter condition favours increased flux through PPP. Analyses of gluconeogenesis and glycolysis and the TCA cycle are included in the electronic supplementary material, figures S2 and S3. In all conditions, we compare regulation that is found by the RL method with that found by deterministic methods using only MCA.
2.1. High NAD/NADH requires regulation of metabolite levels in glycolysis
Prediction of enzyme activities using MCA methods are deterministic. Given the conditions for fixed metabolites in which the NAD/NADH ratio is high and the NADP/NADPH ratio is low, flux is favoured through upper glycolysis over PPP, and the local MCA method predicts (figure 3a, red ‘plus’) that five reactions in glycolysis are regulated due to the enzymes hexose kinase (HEX1), phosphofructokinase (PFK), glyceraldehyde-3-phosphate dehydrogenase (GAPD), phosphoglycerate kinase (PGK) and pyruvate dehydrogenase (PDH), while one enzyme in PPP is regulated, phosphogluconolactonase (PGL), near the beginning of the pathway. It is known that regulation of PPP occurs one enzyme up from PGL at glucose 6-phosphate dehydrogenase (G6PDH) instead. But the metabolite that is over produced and is predicted to have high concentration without regulation is phosphogluconate, the product of the PGL reaction. In practice, PGL may be a hard reaction to allosterically regulate since it is a unimolecular ring opening reaction that may be catalysed significantly by binding alone [36].
Figure 3. Glycolysis–PPP–TCA cycle predictions with high NAD/NADH and low NADP/NADPH conditions. (a) Predicted enzyme activities at terminal states are calculated using MCA, shown as red ‘plus’s and green ‘X’s, respectively. Results are compared to those found using an RL approach (black square). Reactions that are known to be post-translationally regulated from experimental studies are shown in bold. (b) Reaction free energy changes are no longer equally distributed across subpathways (figure 1, upper glycolysis, PPP, lower glycolysis, TCA cycle) but instead free energies are further from equilibrium at reactions where regulation is applied. (c) Free energy and energy dissipation rates, calculated using equations (2.4) and (2.5), respectively. Grey dots represent the population of terminal states found while training the RL agent. A terminal state is found when the agent has learned to reduce concentrations to values consistent with experimental observations by adjusting enzyme activities. Under identical nutrient (boundary) conditions, reduction of the activity coefficients to values less than 1.0 reduces both the steady-state fluxes and the reaction free energies compared to unregulated reactions shown in figure 1. Abbreviations are defined in figure 1.
The RL and unrestricted MCA methods both predict the same minimal regulation at HEX1 and GAPD to achieve the same goal of maintaining the predicted concentrations at or below the experimentally observed values. The RL method, however, additionally regulates PGK, pyruvate kinase (PYK), the pyruvate mitochondrial transporter (PYRt2m) and PDH to obtain a similar energy dissipation rate. As shown in figure 3a, four of these enzymes were also regulated in the local MCA method. The difference is that HEX1 and GAPD are more extensively regulated in the RL and the unrestricted MCA methods. Despite these differences in regulation, each regulated enzyme with the exception of the pyruvate transporter are known sites of regulation (known sites of regulation are highlighted in bold). Regulation of the pyruvate transporter was only predicted in the stochastic RL approach. It is likely that this regulation should be assigned to PYK or PDH as it was in the deterministic MCA approach.
As shown in figure 3b, whenever regulation is applied in the form of reducing the activity coefficient, the free energy of the reaction becomes more favourable compared to reactions in the same pathway (e.g. compare to the consistency of free energy changes in upper glycolysis, PPP, lower glycolysis and TCA cycle in figure 1). Reducing the activity of an enzyme in a non-equilibrium setting will cause the reactants to increase in concentration and the products to decrease in concentration, resulting in reaction free energies being further away from equilibrium. Despite the different sites of regulation and the difference in reaction free energies for the three methods, the free energy and energy dissipation rates are similar and are the most favourable rates found (figure 3c).
2.2. High NAD/NADH and high NADP/NADPH require additional regulation in PPP
In the second set of conditions, the NADP/NADPH ratio is also high, which in principle favours more flux through PPP. The resulting regulation is similar to the first conditions in which NADP/NADPH is low with a few exceptions (figure 4a,b). The local MCA method additionally regulated G6PDH, the entry point into the PPP as well as transketolase (TKT), while the RL method no longer regulated PYK and regulated the pyruvate mitochondrial transporter (PYRt2m) rather than PDH. The latter is likely incorrect, but the fact that the method was trying to regulate pyruvate concentrations suggests that PYK might be the true target of regulation. Like the local MCA method, the RL method also regulated HEX1, GAPD and PGK.
Figure 4. Glycolysis–PPP–TCA cycle predictions with high NAD/NADH and high NADP/NADPH conditions. (a) Predicted enzyme activities at terminal states are calculated using MCA, shown as red ‘plus’s and green ‘X’s, respectively. Results are compared to those found using an RL approach (black square). Reactions that are known to be post-translationally regulated from experimental studies are shown in bold. (b) Reaction free energies. (c) Free energy and energy dissipation rates, calculated using equations (2.4) and (2.5), respectively. Grey dots represent the population of terminal states found while training the RL agent.
By contrast, the unrestricted MCA method regulated the same reactions as in the low NADP/NADPH conditions, HEX1 and GAPD. The regulation under a high NADP/NADPH ratio is similar to the conditions in which NADP/NADPH is low primarily because increasing the NADP/NADPH ratio alone is insufficient to drive enough flux through PPP to drive metabolite concentrations high enough for the need for additional regulation specific to the PPP. Because of less total regulation compared to the local MCA, the unrestricted MCA and RL methods result in significantly higher energy dissipation rates than the local MCA method and are thus likely to be more optimal regulation schemes.
2.3. Regulation of PFK maximizes flux through PPP
Increased flux can be channelled through the PPP if PFK activity is regulated to a greater extent or is turned off completely. Then significant flux flows through PPP instead of upper glycolysis and does so in a cyclical manner, just as observed in early studies of carbon flow in pentose metabolism [37]. In Neurospora crassa, glycolysis and the PPP are circadianly regulated, with the PPP being regulated 180° out of phase with upper glycolysis [38]. In the extreme case when PFK activity is turned off in the model, then the cyclical operation of the PPP is such that three carbons are lost from each glucose molecule as CO2 before all the carbon reaches lower glycolysis as glyceraldehyde 3-phosphate, which also matches early isotope labelling studies of carbon flow through the PPP [37].
In the case when PFK activity is set to zero, all methods apply regulation to HEX1. This is enough for the unrestricted MCA and RL methods to bring concentrations to within the observed experimental range, and both methods result in maximal energy dissipation rates (figure 5). By contrast, the local MCA method additionally requires regulation in PPP at G6PDH, PGL and TKT. But even in this case, the local MCA method fails to completely bring sedoheptulose 7-phosphate into the range of the experimental observations. In attempting to control sedoheptulose 7-phosphate, the applied regulation is extensive enough such that the net flux through glycolysis, the PPP and the TCA cycle approaches zero. Thus, the local MCA method fails to obtain control. In several cases involving the local MCA method, the concentration of sedoheptulose 7-phosphate and sometimes 6-phospho d-gluconate become uncontrollable resulting in concentrations higher than what is observed experimentally. The reason for this is that the respective reactions producing these compounds approach equilibrium; it is known that when a reaction approaches equilibrium, the concentrations of the products are no longer controllable [12].
Figure 5. Glycolysis–PPP–TCA cycle predictions with high NAD/NADH and high NADP/NADPH conditions and PFK activity set to zero. (a) Predicted enzyme activities at terminal states are calculated using MCA, shown as red ‘plus’s and green ‘X’s, respectively. Results are compared to those found using an RL approach (black square). Reactions that are known to be post-translationally regulated from experimental studies are shown in bold. (b) Reaction free energies. (c) Free energy and energy dissipation rates, calculated using equations (2.4) and (2.5), respectively. Grey dots represent the population of terminal states found while training the RL agent. The local MCA method results in zero flux (electronic supplementary material, table S1) and is therefore not shown.
In these cases, the reactions and their metabolites are effectively uncoupled from the non-equilibrium reactions. Lack of control may result in the respective metabolites reaching high concentrations in the cytoplasm, and the cytoplasm consequently becoming glassy and diffusion limited. Experiments support this principle. Recent reports provide evidence that active metabolism promotes cytoplasmic fluidization while inactive metabolism results in a glass-like cytoplasm with limited diffusion in both bacteria [11,13] and eukaryotes [39].
However, it is not clear that the failure to maintain control when using the local MCA method reflects poorly on the concept of modularity whereby enzymes use local control. The failure to obtain control of sedoheptulose 7-phosphate can also be due to the incomplete nature of the model of metabolism used here. It may be that in a more extensive model of metabolism, such as the inclusion of purine and pyrimidine biosynthesis pathways branching off of d-ribose 5-phosphate, control of sedoheptulose 7-phosphate by the local MCA method may be possible. We present this possibility because TKT, the enzyme producing sedoheptulose 7-phosphate is a key post-translational regulation point into purine synthesis [40].
2.4. Regulation increases reaction free energies
In the MCA approaches, optimal solutions were selected based only on their ability to reduce metabolite concentrations to physiological levels, as quantified by the concentration control coefficients (equation (2.2)). In RL method, the optimal solutions were selected based on both reducing metabolite concentrations to physiological levels and maximizing the free energy dissipation rate. It is important to note that the RL used here is not a black box approach—cause and effect can be clearly attributed. In neither case of RL nor the MCA approaches was information regarding the regarding the relative distance of any reaction from equilibrium used in the selection criteria, and in fact, initially all reactions are equally distant from equilibrium. Yet, in all cases and in all solutions, when the reaction free energy changes are compared with the regulation scheme, the regulated reactions are further from equilibrium than the unregulated reactions, as shown in figures 3b, 4b and 5b. Furthermore, when the regulation is relaxed, the previously regulated reactions move closer to equilibrium, and in fact back to the starting maximum path entropy solution (figure 1). In the models, it is the act of regulating each reaction that causes the respective reactants to build up and products to become depleted, thereby causing the free energy change of the reaction to increase in magnitude. This observation has wide support in the literature [17,18], but the cause in many cases has been misinterpreted as being such that reactions are selected for regulation because they are far from equilibrium, rather than reactions being far from equilibrium because they are regulated.
At the heart of the issue of whether regulation causes reactions to be further from equilibrium is the question of what constitutes the regulation of enzyme activity. In the approaches presented here, the vector of reaction activities, α, directly modulates the thermodynamic force on all reactions. Consequently, the flux across any reaction is also scaled by α. For the jth reaction, as αj decreases from 1.0 toward 0.0, the flux decreases and the effective free energy of reaction, decreases in magnitude. Consequently, favourable reactions become less favourable and the ratio of reactants-to-products increases with the decreases in the effective driving force. This change in the reactant-to-product ratio for favourable reactions causes the actual reaction free energy, to increase. This approach to regulating activity is analogous to adjusting a rheostat up or down. As the rheostat (α) is turned down, the current decreases and the potential across the junction increases. Adjusting α downward is a direct cause of significant increases in reactant concentrations, decreases in product concentrations and movement of reaction free energy further away from equilibrium.
By contrast, the role of the enzyme activity could naively be assumed to be equivalent to gating both forward and reverse flux through the active site. We tested this interpretation by inferring rate constants from the maximum path entropy steady-state solution and modulating the forward and reverse rates linearly by the scalar α. All attempts at regulating concentrations in this manner failed to bring concentrations down to physiological levels. While modulating forward and reverse flux through the enzyme, this approach has little impact on the reaction thermodynamics as can be seen from the structure of the differential equations. For a reaction scheme,
The hypothesis regarding the cause of regulation found in textbooks and generally accepted in the literature is that highly non-equilibrium reactions are selected by nature for regulation. The reasoning that led to the hypothesis has to do with the established principle that biological systems activate metabolites for reactivity by covalently attaching high potential groups such as coenzyme A and phosphates. These reactions will then have much higher standard free energies of reaction than they would otherwise. Hence, modulating these reactions through regulation would seem a plausible explanation for the observation that regulated reactions tend to be far away from equilibrium. As far as we know, there is no direct evidence to support (test) this hypothesis.
However, the use of activators such as phosphoryl groups and coenzyme A to drive a reaction will not just result in the respective reaction being further from equilibrium, but in all reactions in the pathway being further from equilibrium. Increased product formation of the activated reaction will result in increased reactant concentration for the next reaction, and so forth, as the effect propagates down the pathway until a steady state is reached.
3. Discussion
We have shown that, as a result of the highly favourable chemical potentials of metabolites in a pathway, many reaction products may be produced in biologically unreasonable concentrations, as suggested by Atkinson [42]. We have also shown that this problem is solved, again as suggested by Atkinson, by reducing the activity of either the enzyme catalysing the reaction or upstream enzymes. While this role of regulation has recognized by some in the MCA research community and elsewhere [15], the wider biological literature predominantly discusses the notion that enzymes are post-translationally regulated to specifically control flux or maintain an energy charge within a narrow range [43–45], despite a number of exceptions regarding the latter [46,47]. While modulating the enzyme thermodynamics directly impacts concentrations as well fluxes, modulating fluxes directly does not necessarily impact concentrations.
If enzyme activities are taken to regulate the reaction thermodynamics, the reactions that have the most control of concentrations can be determined using concentration control coefficients. All predicted schemes discussed above enforce regulation on enzymes that are known to be regulation sites. Nine of these 11 enzymes are known to be sites of post-translational regulation in glycolysis and the PPP, either allosterically or through chemical modification (table 1): hexokinase, phospho-fructokinase, glyceraldehyde phosphate dehydrogenase, phosphoglycerate kinase, pyruvate kinase, PDH, G6PDH, TKT and pyruvate carboxylase (electronic supplementary material). Only the pyruvate mitochondrial transporter (PYRt2m) and PGL are not known to be regulated. The regulation assigned to the pyruvate transporter was done stochastically by the RL and likely should be assigned to PYK or PDH, as it was the deterministic MCA approaches. PGL presumably would be hard to control since it catalyses a highly favourable ring opening which may only require desolvation in the enzyme active site. It is worth noting the enzymes that are known to be regulated but were not indicated as being regulated in this study. Foremost among these is fructose bisphosphatase (FBA), an enzyme that is well-known to be regulated in gluconeogenesis. Under the limited number of conditions used in the study of gluconeogenesis herein, levels of fructose 6-phosphate or other downstream products never rose high enough to require regulation. Likewise, the products of enolase, phosphoglucose isomerase, PEP carboxykinase (electronic supplementary material, figure S2), glucose 6-phosphatase never rose to the level that these needed to be regulated, but it would be reasonable to expect that the respective enzymes may need to be controlled under conditions that were not tested here.
enzyme | pathway | prediction method | evidence | |
---|---|---|---|---|
HEX1 | upper glycolysis | 12.4 | RL, L-MCA, MCA | [17] |
PFK | upper glycolysis | 4.6 | L-MCA | [17,18] |
GAPD | lower glycolysis | 4.6 | RL, L-MCA, MCA | [18,48] |
PGK | lower glycolysis | 3.8 | RL, L-MCA | [18,49] |
PYK | lower glycolysis | 1.7 | RL | [50–52] |
PYRt2m | lower glycolysis | 1.1 | RL | — |
PDH | lower glycolysis | 0.6 | RL, L-MCA | [53] |
G6PDH | pentose phosphate | 16.8 | RL, L-MCA | [54] |
PGL | pentose phosphate | 16.0 | L-MCA | — |
TKT | pentose phosphate | 5.0 | L-MCA | [40] |
PC | gluconeogenesis | 3.7 | RL, L-MCA, MCA | [55] |
modelled and known to be regulated but not observed | ||||
PGM | lower glycolysis | 3.0 | — | [56] |
FBP | gluconeogenesis | 0.1 | — | [17,18] |
Of the 11 enzymes predicted to be regulated, outsized roles were played by the branch points of each of the pathways, as quantified by the influence of the enzyme activity coefficients, , on the respective reactants or products (table 1). The summary concentration control coefficient reports the total influence of the activity of the enzyme on all metabolites exceeding the experimentally observed values. The values reported in table 1 are consistent with recent experimental measurements on the effect of changes in expression levels of glycolytic enzymes on the concentrations of the same metabolites [5].
Hexokinase, the entry point into the model and entry point into upper glycolysis and the PPP, had the largest role with , meaning that hexokinase effectively had 100% control over 12.4 reactions. It is worth noting that both the RL and unrestricted MCA methods achieved successful control by regulating hexokinase, which is again consistent with recent experimental observations of glycolysis [5]. In the experimental studies, increased expression of hexokinase lead to increases in downstream phosphorylated compounds, including fructose 1,6-bisphosphate, sedoheptulose 7-phosphate, sedoheptulose 1,7-bisphosphate and 6-phosphogluconate, just as predicted here. Not surprisingly, increased levels of these metabolites due to increased hexokinase expression were correlated with a decrease of glycolytic rate, as one would expect if cytoplasmic solubility were adversely affected.
Likewise, G6PDH, the entry point into the PPP, had effectively 100% influence over 16.8 reactions, although this value is only seen this high when the PFK activity is set to 0.0 such that the PPP acts cyclically and three circuits around the cycle are made for each glucose metabolized. Likewise, for lower glycolysis the main control point, GAPD, is the entry into the pathway which is also where upper glycolysis and PPP converge. No regulation was needed for the TCA cycle under the conditions studied.
While the predictions align well with known sites of post-translational regulation, the predictions offer no information on whether the regulation would be due to allosteric interactions or chemical modification as might be inferred from more complex and expensive approaches that utilize (and require) absolute metabolite concentrations, fluxes inferred from isotope labelling studies, MS proteomics analyses and detailed kinetic models that include explicit enzyme binding, catalysis and product release [2]. The regulation predictions provided here, however, were done purely in silico with the optional use of absolute metabolite concentrations, if available. Although the regulatory effector cannot yet be inferred from this approach, it would seem reasonable to assume that control of metabolite concentrations would be due to allosteric regulation since allosteric interactions work on a faster timescale than post-translational modifications. It is likely that post-translational modifications act to redirect flux when either degradation of enzyme would be too slow, or when degradation and later resynthesis of the enzyme would be too costly [57], which is not the scenario addressed here.
Both MCA approaches were based only on adjusting the activities of enzymes that would have the most influence on reducing concentrations to physiological values. Only the RL approach rewards regulation schemes for maximizing the entropy production rate (equation (4.22)). Even though the RL and MCA methods have different aims, both maximized the energy dissipation rate, dE/dt, a principle alluded to by Lotka [58]. Furthermore, while the unrestricted MCA approach and the RL performed similarly, the local MCA approach did not always find a solution, which could reflect the incompleteness of the metabolic network that is modelled, or may simply indicate that modular regulation to this degree is insufficient. In addition, in at least one case the local MCA approach did not produce solutions with the highest energy dissipation rates. However, the set of enzymes predicted by the local MCA approach covers many more of the enzymes known to be classically regulated, as shown in table 1. The local MCA approach selects enzymes whose immediate products exceed physiological values. Since these same products will likely exceed physiological values in many other conditions as well, the local MCA approach may simply cover more of the set of enzymes requiring control in many different scenarios. It would be reasonable to expect, however, that the unrestricted MCA approach would be more accurate under any specific condition since it results in a more favourable free energy dissipation rate.
Consequently, we have shown how post-translational regulation results in the emergence of the general principle of maximal, entropy production rate for metabolism, and we can now also include the principle of maximization of the energy production rate, dE/dt, for pseudo-steady-state phenotypes. When these principles are used to make predictions, each prediction must also take into account the physico-chemical constraints on the system, such as the inherent constraints on the maximal rates of enzymes and thermodynamic costs and benefits, not simply metabolite solubilities [57]. These additional physicochemical constraints can explain the observed upper limit to free energy dissipation in microbial systems [59].
The observation of an upper limit to free energy dissipation is related to the concept of maintaining the adenylate energy charge ratio. The adenylate energy charge rule widely found in textbooks was defined in terms of concentrations as [(ATP) + 0.5 (ADP)]/ [(ATP) + (ADP) + (AMP)]. It was proposed that the role of regulation of enzyme activities is to maintain values of the energy charge between 0.75 and 0.90. There are many known exceptions where the adenylate charge falls below 0.75 yet the cells remain viable [46,47]. This proposed rule can no longer be regarded as a rule but as an emergent property, just as the maximization of energy production rates is an emergent property due to natural selection.
There is a nuanced but very important distinction between the concepts of whether the purpose of regulation is to literally maintain an energy charge or whether the energy charge measurement is an emergent property. The former concept associates a free will to the cell in deciding its fate—if the cell finds that the energy charge is getting to low for demand, it cranks up the supply from energy generating pathways to alleviate this. The role of thermodynamics is replaced by choice. The consequence of assigning free will to individual cells has led to the application of social theory and game theory to cells. The terms bet hedging strategies, cooperators, cheaters, non-cooperators, altruistic cells, etc., are commonly used in the literature. An unintended consequence of the application of these terms to cells has been that the anthropomorphic language frames and limits our thinking.
On the other hand, viewing the energy charge as an emergent property assigns no such choice to the cell. Instead, the energy state of the cell is a function of the external nutrients and environment. If an isolated cell is in an energy-rich environment, the energy charge will be high. Otherwise, the energy charge drops and turning up the energy generating pathways has no beneficial effect (unless, of course, there are alternate pathways that process alternate nutrients that are in abundance in the environment). For tissues and other communities of cells, the energy state of the community is the important criterion, and specific individual cells may be programmed off or on despite favourable thermodynamics at the individual level.
The increased application of physical concepts is transforming biology into a quantitative, physical science based less on pure observation and increasingly on fundamental principles. The model-based predictions of enzyme activities presented in this paper advance both the practice and theory of biology. The ability to predict from simulation or infer the free energy changes and control coefficients (in addition to fluxes) for each reaction allows the use of control theory and machine learning to analyse and explore the operations of the cell. In synthetic biology, the development of cell lines often requires additional circuits and can result in unforeseen consequences or lower cell growth rates. Simulation of cells with engineered or deleted circuits will allow prediction of the effects in place of difficult trial and error in experiments.
Finally, it is important to understand the principles behind post-translational regulation because regulation of metabolism is precisely what controls a cell’s energetic behaviour. From bacterial growth and reproduction, to developing cells or even halting the growth of cancer cells, regulation plays the central role. Learning how cells regulate and control themselves is essential for designing new organisms that have an intended purpose (synthetic biology), developing new strategies to target and control microbial and metabolic diseases (medicine) and understanding design principles of biology (fundamental science). Currently, no other experimental or computational approach has been shown to identify points of regulation in metabolism in a rapid manner.
4. Methods
4.1. Convex optimization approach for obtaining metabolic steady state
For a reversible chemical reaction, the reaction is described by the chemical equation
The law of mass action may be formulated in terms of chemical kinetics or thermodynamics. With respect to chemical kinetics, the law of mass action is expressed by the rate or net flux, Jnet,1, of the reaction where the forward and reverse rates are proportional to the respective reactants
Given a metabolic model with Z reactions, M metabolic species, and N total particles, we formulate the flux through each reaction using equation (4.5). In this work, the largest values of Z and M in a pathway are 29 and 47, respectively. If we assume the rate of change of the odds are equal and independent of concentrations, then the coupled reactions occur on the same timescale. Under these assumptions, the resulting equation for the jth reaction is the Marcelin equation [22]
The predicted concentrations from the optimization follow the multinomial Boltzmann distribution in which the concentration of each species is proportional to its standard chemical potential, , adjusted for aqueous solution at pH 7.0,
4.2. Metabolic regulation: a metabolic control theory approach
Regulation is applied to reactions by changing the scalar valued activity of the jth enzyme, αj ∈ [0.0, 1.0], where activity values of 0.0 and 1.0 represent complete reaction regulation and no enzyme regulation, respectively. The activity for each reaction j is represented by a multiplier to the net reaction flux Jj such that,
In MCA, the sensitivity of a concentration ni to the activity αj of enzyme j is defined as the concentration control coefficient (equation (2.2)). Concentration control coefficients can be used to determine how much to reduce the activities of an enzyme to bring the predicted concentrations into alignment with physiological values observed from experimental metabolomics assays. The detailed calculation is described in the electronic supplementary material. If concentrations ni for a metabolite i have not been measured, then target values are assumed to be 1.0 mM, which ensures that concentrations stay reasonable even for metabolites whose concentrations have not been measured. When predicted values exceed the measured or target values, regulation is applied to reactions by changing the scalar valued activity of the jth enzyme, αj.
Which reaction to regulate is determined from examining the concentration control coefficients with regard to the metabolites whose concentrations are higher than is observed in experiment. We denote the set of such metabolites as . An activity is then selected to be reduced based on the influence that the activity has on these concentrations
In practice, the activity that reduces the cost function, L, the greatest amount is chosen for regulation and is again determined using MCA. In MCA, the concentration control coefficient for metabolite i due to control by reaction j is defined by equation (2.2). Consequently, the change in concentration or counts due to a change in activity of reaction j is
The code implementing the MCA framework is available in the electronic supplementary material.
4.3. Exploring regulation: a reinforcement learning approach
The MCA method for bringing the predicted concentrations in alignment with observed concentrations is a deterministic approach based on an assumption that metabolite concentrations depend linearly on the enzyme activities. It is feasible that the assumption of linearity used in the MCA analysis (electronic supplementary material) results in sub-optimal regulation. Optimal regulation has been hypothesized, based on empirical data, as regulation that maintains a high energy charge, defined in terms of ATP, ADP and AMP [6]. A less ad hoc definition of optimal regulation would be the maximization of the entropy production rate, which has also long been hypothesized as an objective of biological systems [58,65]. Neither of these concepts are addressed in the MCA approaches discussed above. For steady-state systems, the entropy production rate (EPR) is the negative of dG/dt defined in equation (2.4) [33,34],
Specifically, we use an RL framework which can address decision problems that are otherwise combinatorially intractable. Even a small metabolic network may have of the order of 20–50 reactions. To explore the state space fully using the deterministic MCA approach, of the order of 100–500 decisions need to be made as to which reaction to regulate depending on the state of the system. The search space is then approximately between 20100 and 50500, a number much too large to tackle by an exhaustive search or Monte Carlo approach.
In our framework, optimal regulation of a metabolic network requires that the EPR be maximized while satisfying a stopping criteria: L = 0.0. A diverse set of reaction regulation schemes represented by enzyme activity values, {α1, …, αZ}, satisfy the stopping criteria, but each scheme results in a different EPR (figures 3c–5c, grey dots). Thus, we use a hybrid optimization-RL approach to iteratively search for the best regulation scheme. (A hybrid simulation-RL approach can also be used.) In this framework, the agent iteratively learns how to navigate a state space, S, by using different possible actions from an action space, A. States correspond to the value of enzyme activities while actions correspond to regulating a specified reaction. We therefore define the state space S as the subset of using range of each enzyme activity, [0.0, 1.0]Z, and the action space as the set of reactions, A = {1, 2, …, Z}. We also define a subset of S where learning terminates, ST = {s ∈ S|L(s) = 0.0}.
Because regulating the jth enzyme results in a deterministic step-size, Δαj, the resulting state is given by the following set of enzyme activities: {α1, …, αj − Δαj, …, αZ}. The goal of RL is to learn an optimal policy, π* : S → A, which results in a regulation scheme that maximize some defined notion of rewards, . In other words, learning the optimal policy corresponds to learning the regulation scheme for the chemical reaction network that results in the largest reward.
Each reaction that is regulated results in a scalar valued reward, or feedback, from the environment based on an action/regulation (figure 2) that indirectly defines optimal regulation schemes. Each regulation decision alters the steady-state metabolite concentrations, which are obtained from optimization or simulation of equation (4.14), and used to calculate rewards using a loss function, Λ, specified by
We define the environmental feedback, or reward function R as:
Learning is conducted by iteratively updating the current policy function, π : S → A, that determines the agent behaviour. The policy function determines which reaction j ∈ Z should be regulated based on the current enzyme activities, {α1, …, αZ} ∈ S. Here, we use an n-step SARSA algorithm [31] to perform fitted value function iteration. An optimal policy is therefore learned by iteratively updating the value function, , which is defined as the expected rewards to be received by following a fixed policy from a specified state, Vπ(st) = Eπ[rt:t+n|st]. In an n-step algorithm, the value function is meant to predict the discounted reward, rt:t+n, for n future steps. The n-step reward experienced by the agent is defined as rt : t+n = rt + γrt+1 + · · · + γn−1rt+n−1 + γnV(st+n), where γ ∈ [0.0, 1.0] is the discount factor. Each reward, rt = R(st−1, a, st), represents the feedback from moving into state st from st−1 after taking some action a. The first n steps represent the rewards experienced, while the term V(st+n) represents the future rewards. Once a terminal state is less than n steps away, the n-step reward is truncated to the appropriate length.
Learning the value function implicitly improves the policy. The relationship between the value of a state and the policy is given by an ε-greedy policy, which is defined as:
Finally, the state value function is estimated by using a neural network implemented in PyTorch [66] with a single hidden layer and hyperbolic tangent activation functions. Updates to the value function are performed by optimizing the neural network using stochastic gradient descent. This is done by backpropagating the squared loss between the predicted value and the n-step reward, [V(st) − rt : t+n]2. The code implementing this framework is available in the electronic supplementary material.
4.4. Model training
Prediction of network regulation was performed using a trained neural network to estimate the value function. Network weights were adjusted using stochastic gradient descent with a learning rate, lr ∈ {10−4, 10−5, 10−6}. Each algorithm learned and generated data using an ε-greedy policy with initial ε = 0.5 or 0.2 depending on the size of the pathway. ε was annealed by dividing by a factor of two every 25 learning episodes.
For each pathway, 10 agents are trained for each different value of n ∈ {2, 4, …, 12} and each learning rate. The resulting average of 10 RL runs for the glycolysis–PPP–TCA pathway (electronic supplementary material, figure S1) show the mean reward for the 350 training episodes. Optimal regulation is prescribed by analysing the agent with the largest cumulative reward averaged over the last 50 terminal states.
4.5. Metabolic models
The three metabolic models consisted of either the 10 reactions in gluconeogenesis or the 28 reactions of glycolysis, the PPP, the TCA cycle and the NADH-dependent l-glutamine:2-oxoglutarate amidotransferase (GOGAT) reaction. The largest model consisted of 47 metabolites, of which 20 were fixed boundary species and 27 were allowed to vary. The number of variable metabolites (27) was less than the number of reactions (29) due to one cycle of eight reactions but only seven intermediate species, and also because the GOGAT reaction, which maintains a defined number of species in the TCA cycle, had two fixed species (glutamate and glutamine) and shared a variable species with the TCA cycle (2-oxoglutarate). Without the GOGAT reaction, the concentrations of metabolites in the TCA cycle could have a range of values as long as they maintain consistent values in forward and backward thermodynamic odds KQ−1 solutions of equation (4.14). The fixed boundary metabolites were such that the entire system was non-equilibrium: β−d, l, l, ADP = 5.6 × 10−01 mM, ATP = 9.6 mM, orthophosphate = 20 mM, NADH = 8.3 × 10−02 mM, NAD+ = 2.6 mM, , CO2 = 0.1 mM, and coenzyme A = 1.4 mM. The complete model and submodels are available as computational notebooks in Jupyter and Python (electronic supplementary material).
Standard free energies of formation of metabolites in aqueous solution as well as equilibrium constants were determined using the eQuilibrator software, version 0.1.8, [67] using pH = 7.0 and ionic strength i = 0.15.
4.6. Data
The metabolomics data used in this study were from E. coli studies by Bennett et al. [24] and Park et al. [25]. Briefly, E. coli cells were grown in isotope-labelled media and then extracted in organic solvent containing unlabelled internal standards in known concentrations. Metabolites were extracted in cold solvent and analysed using chromatography-MS, and concentrations relative to the known standard concentrations were obtained using peak ratios of the labelled samples to unlabelled standards.
If no experimental data are available, the analysis is carried out using estimates of metabolite concentrations (electronic supplementary material). For this purpose, we use an estimate of 1.0 mM for each metabolite that is variable. For fixed metabolites that form the boundary conditions, specific values are required that induce appropriate non-equilibrium boundary conditions.
Standard free energies of reaction were calculated using eQuilibrator and the eQuilibrator API [67]. eQuilibrator uses well-curated gold standard data on the thermodynamics of reactions from the National Institute of Standards and Technology [68], which is the basis for subsequently adjusting reference free energies for pH and ionic strength. For reactions for which experimental data are not available, free energies are estimated using reliable reaction comparison methods [69] or electronic structure calculations [70].
Data accessibility
Links to the code repository and computer notebooks containing the code used in the study are provided in the electronic supplementary material.
Authors' contributions
S.B. developed and implemented the reinforcement learning and control theory code and co-wrote the manuscript. M.A. helped develop strategy, oversaw development work, and contributed to writing the manuscript. W.R.C. assisted in the reinforcement learning strategy, co-developed control theory approach, developed the optimization procedures and co-wrote the manuscript.
Competing interests
The authors declare no competing interests.
Funding
S.B. was supported by a US Department of Energy (DOE), Office of Science Graduate Student Research award and by funding from the DOE Office of Biological and Environmental Research, through project 74860. W.R.C. was supported by joint funding from the National Institute of Biomedical Imaging and Bioengineering through award U01EB022546 and by the DOE Office of Biological and Environmental Research, through project 69513. PNNL is operated by Battelle for the US DOE under Contract DE-AC06-76RLO. M.A. was supported by funding from the DOE Office of Biological and Environmental Research, through project 74860, the National Science Foundation, grant no. DMS-1762063, through the joint NSF DMS/NIH NIGMS Initiative to Support Research at the Interface of the Biological and Mathematical Sciences.