Maximizing average throughput in oscillatory biochemical synthesis systems: an optimal control approach

A dynamical system entrains to a periodic input if its state converges globally to an attractor with the same period. In particular, for a constant input, the state converges to a unique equilibrium point for any initial condition. We consider the problem of maximizing a weighted average of the system’s output along the periodic attractor. The gain of entrainment is the benefit achieved by using a non-constant periodic input relative to a constant input with the same time average. Such a problem amounts to optimal allocation of resources in a periodic manner. We formulate this problem as a periodic optimal control problem, which can be analysed by means of the Pontryagin maximum principle or solved numerically via powerful software packages. We then apply our framework to a class of nonlinear occupancy models that appear frequently in biological synthesis systems and other applications. We show that, perhaps surprisingly, constant inputs are optimal for various architectures. This suggests that the presence of non-constant periodic signals, which frequently appear in biological occupancy systems, is a signature of an underlying time-varying objective functional being optimized.

MM, 0000-0001-8319-8996; EDS, 0000-0001-8020-5783 A dynamical system entrains to a periodic input if its state converges globally to an attractor with the same period. In particular, for a constant input, the state converges to a unique equilibrium point for any initial condition. We consider the problem of maximizing a weighted average of the system's output along the periodic attractor. The gain of entrainment is the benefit achieved by using a non-constant periodic input relative to a constant input with the same time average. Such a problem amounts to optimal allocation of resources in a periodic manner. We formulate this problem as a periodic optimal control problem, which can be analysed by means of the Pontryagin maximum principle or solved numerically via powerful software packages. We then apply our framework to a class of nonlinear occupancy models that appear frequently in biological synthesis systems and other applications. We show that, perhaps surprisingly, constant inputs are optimal for various architectures. This suggests that the presence of non-constant periodic signals, which frequently appear in biological occupancy systems, is a signature of an underlying time-varying objective functional being optimized. intracellular and extracellular interactions [1,2]. In the presence of such excitations, proper functioning of biological systems often requires their internal states to synchronize with the periodic input signal. In the parlance of systems theory, this is known as entrainment, which means that the response of a system subject to a periodic input with period T will converge to a periodic trajectory of the same period T. There has been great recent interest in the study of this phenomenon [3][4][5][6][7]. Examples of external periodic influences include operation under the influence of sunlight, which requires the internal clocks of biological organisms to entrain to the 24-hour solar day. For instance, it has been shown that the plant Arabidopsis uses its circadian clock to anticipate times with an increased susceptibility to fungal pathogens and regulates its immune system resources accordingly [8]. Entrainment is also essential in many synthetic biological systems. For instance, synthetic oscillators can be used to emulate natural hormone release rhythms in the treatment of certain diseases [9]. More generally, robust and optimal synthetic oscillators constitute an important module in larger systems [10,11].
At the intracellular level, the cell cycle is a periodic routine that regulates DNA replication and cell division. This requires precise regulation of many interacting proteins and also appropriate resource allocation at different stages of the cell cycle. Deviations from the programme can lead to cell death or cancer.
An important underlying process is translation, which is a major component in the central dogma of molecular biology, and requires sophisticated coordination among ribosomes, mRNA and tRNA molecules, and various proteins. Two of the key underlying steps are initiation in which the ribosome attaches to an mRNA molecule, and elongation, in which the ribosome scans along the mRNA to produce a chain of amino acids. Regulation of initiation and elongation are an effective way to control protein concentrations [12,13].
One biological mechanism for cell cycle-regulated genes is based on codons whose corresponding tRNAs have low abundances (known as non-optimal codons) [14,15]. In particular, periodic variations in the level of these specific tRNAs can generate cell cycle-dependent oscillations in the corresponding protein levels [14]. In other words, the protein levels entrain to the periodic excitation provided by the tRNA levels. Similar oscillation-inducing regulation mechanisms during DNA damage response have also been reported [16]. Other works have indicated that the speed of translation is sensitive to fluctuating tRNA availability [17], that cells use tRNA to control protein abundance in stress conditions [18], and that tRNA dysregulation is a contributing factor in cancer progression [19]. In addition to tRNA regulation, many other intracellular oscillators have been identified as regulators of the cell cycle [20,21].
Here, we analytically investigate the hypothesis that periodic rates in the cellular environment are used to maximize gene expression throughput. To that end, we first describe, as a motivation, a class of mathematical models that are useful in modelling various processes involved in gene translation. Our focus is to analyse these models in the presence of periodic excitations modelled as periodic inputs that attempt to maximize a certain reward function that is proportional to the throughput of the system. We pose these problems in the rigorous language of optimal control theory and analyse them under a variety of assumptions.

Motivation: occupancy models
In many important biological models, state variables describe the occupancy in a certain site or a compartment. For example, in physiology, compartmental models describe drug absorption distribution and elimination in various body fluids or tissues [22].

A one-dimensional model
Many biological processes involve 'biological machines' that move along a one-dimensional lattice of ordered 'sites'. Examples include ribosomes that scan the mRNA molecule during translation, molecular motors that carry cargoes along a filamentous network in the cytoskeleton, and phosphotransferases that transfer the phosphoryl group from the sensor kinases to some ultimate target. To be concrete, we focus on ribosomes and mRNA translation, but the same ideas apply to other models.
We now derive such a one-dimensional occupancy model using several alternative modelling approaches. Let X (Z) be the species denoting bound (unbound) ribosomes. The free ribosomes bind to mRNA. Bound ribosomes need tRNAs to translate the information in the mRNA into proteins (P). A phenomenological one-step model written in chemical reaction network (CRN) formalism [23] gives royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 We assume that tRNA and mRNA are abundant, so that their dynamics are not affected by the aforementioned reactions. Note that the species tRNA represents all possible variants of transfer RNA. Let x(t) be the concentration of occupied (bound) ribosomes in the cell at time t, and let z(t) be the concentration of free ribosomes. The occupancy of ribosomes is determined by mRNA transcript abundance u 0 (t) and tRNA abundance u 1 (t). The CRN gives the following system of bilinear ODEs: _ xðtÞ ¼ u 0 ðtÞzðtÞ À u 1 ðtÞxðtÞ, _ zðtÞ ¼ u 1 ðtÞxðtÞ À u 0 ðtÞzðtÞ: Assuming a fixed total concentration of ribosomes M, we have x(t) + z(t) ≡ M. The total concentration can be normalized to M = 1. Then the two-dimensional dynamics can be reduced to a one-dimensional ODE _ xðtÞ ¼ u 0 ðtÞð1 À xðtÞÞ À u 1 ðtÞxðtÞ: ð1:2Þ This implies that x(t) evolves on the unit interval, and it can be interpreted as a normalized occupancy of some site at time t. More generally, x(t) can be interpreted as the probability that a certain site is occupied by some 'biological machine', like a ribosome or a molecular motor, at time t. This occupancy model has been termed a 'bottleneck' module in [24].
Note that the occupancy model (1.1) can also be used to model binding and unbinding of a substrate to an enzyme.

Multisite models: the ribosome flow model
The Totally Asymmetric Simple Exclusion Process (TASEP) [25] is a fundamental stochastic model from non-equilibrium statistical physics. In TASEP, particles move forward at random times along a onedimensional chain of sites. A site can be either free or contain a single particle. Totally asymmetric means that the flow is unidirectional, and simple exclusion means that a particle can only hop into a free site. This models the fact that two particles cannot be in the same place at the same time. The simple exclusion paradigm generates an indirect coupling between the particles and also allows modelling the evolution of 'traffic jams': if a particle remains at site i for a long time, then particles will accumulate 'behind' it, i.e. in site i − 1, then site i − 2 and so on. TASEP has been used extensively to model and analyse ribosome flow [26] and many more natural and artificial processes including molecular motors, traffic flow, evacuation dynamics and more [27].
The ribosome flow model (RFM) [28] is the dynamic mean-field approximation of TASEP. In the RFM, the state variables x 1 ðtÞ, . . ., x n ðtÞ describe the occupancy in n sites along the mRNA molecule. The RFM dynamics is described by a system of n first-order ODEs where we define x 0 (t) ≡ 1 and x n+1 (t) ≡ 0. Here, x i (t) describes the occupancy at site i at time t, normalized such that x i (t) = 0 [x i (t) = 1] means that site i is completely empty [full] at time t. In the context of translation, λ i (t) > 0 describes the transition rate from site i to site i + 1 at time t. This rate depends on various biomechanical properties, for example, the abundance of tRNA molecules delivering the amino acids to the ribosomes. Equation (1.3) can be explained as follows. The change in the density in site k is the flow from site k − 1 into site k minus the flow from site k to site k + 1. The first term, λ k−1 x k−1 (1 − x k ), is proportional to the transition rate from site k − 1 to k, the occupancy at site k − 1, and the amount of 'free space' (1 − x k ) at site k. Note that this is a 'soft' version of simple exclusion: as site k fills up, the flow into it decreases. The second term is similar. Note that λ n (t)x n (t) describes the flow of ribosomes out of the last site at time t, i.e. the protein production rate. If the whole mRNA strand is considered as one site, that is, n = 1, then the RFM model will be identical with the occupancy model (1.2). The state space of the RFM is the n-dimensional unit cube [0, 1] n . It was shown in [29] that the RFM (with constant λ i 's) admits a unique equilibrium x e ¼ x e ðl 0 , . . ., l n Þ [ ð0, 1Þ n and that for any a ∈ [0, 1] n , the corresponding solution of (1.3) satisfies lim t→∞ x(t; a) = x e . In other words, the transition rates determine a unique globally asymptotically stable (GAS) equilibrium. More generally, [6] showed that if the rates are time varying, and jointly periodic with a period T, then (1.3) admits a unique solution g T : R þ ! ð0, 1Þ n , i.e. T-periodic, and x(t, a) converges to γ T for all a ∈ [0, 1] n . In other words, the RFM entrains. Note that a constant rate is T-periodic for any T, so entrainment also holds if a single rate is T-periodic and all the other rates are constant. In the biological context, entrainment can be interpreted as follows: if, say, variations in tRNA abundances generate T-periodic initiation and/or elongation rates, then the protein production rate will also converge to a periodic pattern with period T.
Just like TASEP, the RFM (and in particular the model (1.1)) is a phenomenological model that can be applied to study various processes like vehicular or pedestrian traffic [24]. In this case, the occupancy is interpreted as the ratio between the number of vehicles (or pedestrians) at a certain junction at time t and the total number of possible vehicles.

Generalized occupancy models
Recall that the linear single-input single-output (SISO) linear system is called positive if every entry of b, c, and every off-diagonal entry of A is non-negative (i.e. A is a Metzler matrix). This implies that for any x(0) ≥ 0 and any control u with u(t) ≥ 0 for all t we have x(t) ≥ 0 and y(t) ≥ 0 for all t ≥ 0 [36]. This is useful when the state variables and the output represent physical quantities that can never attain negative values, e.g. population sizes or concentrations of molecules.
Generalized occupancy models (GOMs) are a cascade of occupancy models and SISO positive linear systems. These models are useful when the output of an occupancy model is the input to another biological system that, in the vicinity of its equilibrium point, can be approximated as a positive linear dynamical system. Similar to the multi-site RFM model introduced before, it can be shown that GOMs entrain to periodic inputs.
For example, figure 1a depicts a time-varying bottleneck module feeding a positive linear system. In this module, u 0 , u 1 are entrance rates, and w 1 is the exit rate. The effective inflow is proportional to the vacancy 1 − x(t), while the outflow is proportional to the occupancy x(t). This cascade models an occupancy model driving a downstream linear system. As another example, figure 1b depicts a linear system 'sandwiched' between a two-site RFM and one-site RFM. This can model a situation where the production rate of one protein affects, via another biological process, the promoter (and thus the initiation rate) of some other mRNA.
A GOM can also be used to model the RFM with time-varying rates under the condition l i ðtÞ ) l 0 ðtÞ for all i ! 1 and all t ! 0: ð1:4Þ Then, we can expect that the initiation rate becomes the bottleneck rate and thus x i (t), i ¼ 2, . . ., n, converge to values that are close to zero, suggesting that (1.3) can be simplified to _ x 1 ðtÞ ¼ l 0 ðtÞð1 À x 1 ðtÞÞ À l 1 ðtÞx 1 ðtÞ and _ which has the same form as the cascade in figure 1a.
bottleneck module u 0 (t) Figure 1. Two examples of generalized occupancy models. The controls are u 0 (t), u 1 (t) which are scalar functions. We have x 1 , þ . The linear system block is assumed to be positive and Hurwitz.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 4 After these motivating examples, we next formulate the abstract questions to be studied in this article.

Gain of entrainment
To analyse the effect of periodic signals on the performance of an occupancy system, we focus on a class of systems whose internal variables entrain, i.e. follow the rhythms of an external driving signal. Entrainment can be studied in the framework of systems and control theory. The periodic excitation is modelled as the control input u(t) of a dynamical system, and the system entrains if in response to a T-periodic excitation it admits a globally attractive T-periodic solution γ T . In other words, every solution of the system converges to the attractor γ T . Assuming that a system entrains, the next question is: what is the quantitative potential advantage of entrainment? In other words, is entrainment to periodic inputs more advantageous when optimizing a certain objective function? If the answer is affirmative, then we say that the system exhibits a positive gain of entrainment. To explain this, consider a control system that, for any T ≥ 0 and any T-periodic control u T , entrains to a unique T-periodic solution γ T . Note that, in particular, this implies that for any constant control u(t) ≡ u 0 the trajectory converges to a unique equilibrium γ 0 for any initial condition. Suppose also that the system admits a scalar output y(t) = h(t, x(t), u(t)) (i.e. a function of time, state and the input), and that h is T-periodic in t, so that the output also entrains. The output represents a quantity that we would like to maximize, e.g. traffic flow or protein production rate.
Since the system entrains, we ignore the transients and consider the problem of maximizing the average of the periodic output, that is, the average over a period of h(t, γ T , u T ). The gain of entrainment is the benefit (if any) in the maximization for a (non-trivial) periodic control over a constant control. A natural example is to analyse the gain in traffic flow for periodically varying traffic signals over constant signals. However, to make this meaningful, we must add another assumption, namely, that the total time of green lights in both alternatives is equal. Mathematically, this means that we compare the average output for a time-periodic control u T and a constant control u such that the average value of u T over a period is equal to u. If the gain of entrainment is positive, then entrainment does not only assist in producing an internal clock that can follow an external periodic excitation but also yields higher production rates than those obtained by equivalent constant excitations.
The possible advantages of periodic forcing of various production processes are well known. For example, [37] states that: '…theoretical and experimental studies have shown that the performance (for instance micro-algae or bio-gas production) of some optimal steady-state continuous bioreactors can be improved by a periodic modulation of an input such as dilution rate or air flow'. Reference [38] studies a partial differential equation (PDE) model for harvesting a biological resource and demonstrates the advantages of periodic harvesting over a constant one.
The gain of entrainment was recently introduced in [24]. Entrainment in nonlinear systems is nontrivial to prove. A typical proof is based on contraction theory [6,7], yet this type of proof provides no information on the attractive periodic solution, except for its period (see [39] for some related considerations). Nevertheless, we show here that determining the gain of entrainment can be cast as an optimal control problem. This allows using powerful theoretical tools, like Pontryagin's maximum principle [40][41][42], as well as numerical methods in studying the gain of entrainment. We demonstrate this by analysing the gain of entrainment in several examples of occupancy models.
For instance, consider the gain of entrainment for (1.5). It is natural to speculate that using timeperiodic rates λ i (t), which are properly synchronized, yields a positive gain of entrainment with respect to using constant rates (with the same average values). In the context of traffic flow, this is equivalent to the conjecture that properly synchronized periodic traffic lights can improve the overall flow. However, we show that, perhaps surprisingly, for a subclass of these systems, the gain of entrainment is in fact zero.
We also consider a problem formalism that allows for time-varying costs of resources, like tRNAs, along the period. These may be produced at different unit costs at different times of the cycle. This modified formulation allows the allocation of resources differently at different times along the cycle. Also, instead of average throughput, a weighted average of the product may be more relevant, in the sense that we may need certain enzymes at different times of the day or at different points in the cell cycle. This corresponds to 'just-in-time production' [43]. In such cases, we show, not so surprisingly, that time-varying periodic inputs may indeed offer an advantage over constant inputs. This suggests royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 that the presence of non-constant periodic signals, which frequently appear in biological occupancy systems, implies that the system is optimizing an underlying time-varying objective functional.
Our work is related to results from the field of optimal periodic control (OPC) (e.g. [44]). As noted by Gilbert [45], OPC was motivated by the following question: does time-dependent periodic control yield better process performance than optimal steady-state control? In particular, the recent paper [46] defines a notion called over-yielding that is closely related to the gain of entrainment. However, our setting is different, as in OPC periodicity was enforced by restricting attention to controls u guaranteeing that x(T ) = x(0). This implies in particular that the initial value x(0) (and thus also general transient behaviours) may have a strong effect on the results. Also, in the typical OPC formulation, there is in general no requirement that the averages of the periodic and constant controls are equal.
We study systems that entrain, and thus for a T-periodic control, the state of the system converges to a unique T-periodic trajectory for any initial condition x(0). In other words, we consider the behaviour of attractors.
The remainder of this article is organized as follows. The next section defines the gain of entrainment for a general mathematical model. Section 3 shows how the analysis can be cast as an optimal control problem. Section 4 demonstrates the theory for the two-input bottleneck module. Section 5 proves that for several GOMs, including the ones depicted in figure 1, the gain of entrainment is zero. Finally, conclusions and future directions are presented in §6. Appendix A contains proofs of the results including a detailed analysis characterizing extremals via the Pontryagin maximum principle (PMP).

Gain of entrainment
We consider a general nonlinear control system We allow h to be time varying to include the cases in which different weights can be used at different times in the cycle. The set of admissible controls consists of measurable functions taking values in some compact set U , R m . Let x(t, p, u) denote the solution of (2.1) at time t ≥ 0 for the initial condition x(0) = p and the control u. We assume throughout that for any x(0) in the state space and any admissible control, (2.1) admits a unique solution for all t ≥ 0. We say that system (2.1) entrains if in response to any admissible and T-periodic control u T (t) the system admits a unique T-periodic solution γ T (t) (that depends on u T ), and for any initial condition p, the solution x(t, p, u T ) converges to γ T . This implies in particular that the system 'forgets' its initial condition.
To explain the mathematical formulation of the gain of entrainment, fix q [ R m with q i > 0 for all i. We would like to consider only inputs whose average over a period is q and compare their effect to the effect of the constant control u(t) ≡ q. However, we allow a slightly more general scenario by fixing a weighting function α(t) > 0 such that 1 T Ð T 0 aðtÞ dt ¼ 1. We then restrict attention to T-periodic controls satisfying the weighted integral constraint that is, the α-weighted average of u is q. This can be further generalized by allowing a general measure μ on the interval [0, T ] and imposing Ð ½0,T uðtÞ dm ¼ q. However, we keep the presentation simple by adhering to (2.2). Let that is, the average value of the output along the globally attractive T-periodic solution (recall that we assume that h is T-periodic in its first variable). If the convergence to γ T is relatively fast, then after a short transient the average output over a period of length T is very close to z(u). In applications in royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 6 fields like biotechnology and traffic control, the average value of the output, and not its specific values at all times, is often the relevant quantity. The constant control u(t) ≡ q, which we simply denote by q, is also T-periodic (for any T ≥ 0) and satisfies (2.2). Hence, the corresponding solution converges to a fixed point e = e(q) and hðt, e, qÞ dt ¼ hðe, qÞ: The gain of entrainment of (2.1) is defined as follows: where the sup is over all admissible, T-periodic controls that satisfy the constraint (2.2). Thus, we are always comparing the effect of controls with the same average value. Note that c T (q) ≥ 0 for all q. If c T (q) > 0 for some q, then there exists a non-trivial periodic control that yields a higher average output than that obtained for a constant control. If c T (q) = 0, then non-trivial T-periodic controls are 'no better' than the simple constant control equal to q.
To gain a wider perspective, consider the case of a SISO asymptotically stable linear time-invariant (LTI) system with input [output] u(t) [y(t)] and transfer function G(s). Fix T > 0, and consider the Tperiodic control Let ω := 2π/T. It is well known that the output converges to the T-periodic function y T ðtÞ : On the other hand, for the constant control u(t) ≡ a, the output converges to G(0)a, which is the same value. Thus, for this input, the gain of entrainment is zero. Any T-periodic, measurable, and bounded input can be expressed as a Fourier series in terms of sinusoidal functions, and this implies that for LTI systems, the gain of entrainment is always zero.
However, for nonlinear system, the gain of entrainment may be positive. The next two examples demonstrate this.
Example 2.1. Consider the scalar system: _ xðtÞ ¼ 1 À xðtÞuðtÞ and yðtÞ ¼ xðtÞ: ) ð2:5Þ For the control u(t) ≡ q any solution of (2.5) converges to the equilibrium q −1 . Consider a T-periodic and positive control u T (t) satisfying u T ¼ q, and assume there exists some α > 0 such that u T (t) ≥ α for almost all t ∈ [0, T ]. Then any matrix measure of the Jacobian of (2.5) is uniformly less than or equal to −α < 0. Therefore, the system is contractive and any solution of (2.5) converges to a unique T-periodic solution x T (t).
Let ω := 2π/T. Consider now the specific T-periodic control u T ðtÞ : ¼ 1 þ 1 2 cosðvtÞ: ð2:6Þ For this input, the corresponding solution of (2.5) is where fðtÞ : ¼ Ð t 0 expðs þ ðsinðvsÞ=2vÞÞ ds. In particular, The initial condition royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 On the other hand, for the control u 0 (t) ≡ 1, the solution of (2.5) converges to the steady-state 1. Figure 2 depicts the value as a function of ω = 2π/T. It may be seen that this is always positive and is maximal as T → ∞. We conclude that for q = 1 the gain of entrainment of (2.5) is positive for any T > 0. Note that for large values of ω, the gain of entrainment goes to zero. This is expected due to averaging [47]. Roughly speaking, for large values of ω, the system cannot track the fast changes in the input and thus responds to the average of the input. More rigorously, for a system affine in the control, the map from controls on an interval [0, T ] to trajectories on [0, T ] is continuous with respect to the weak Ã topology in L 1 and the uniform topology on continuous functions, respectively (e.g. [48, theorem 1]), and for a periodic input u(t), the input u(ωt) converges weakly to the average of u. An alternative proof is given for example in the textbook [47] (Section 10.2) (changing time scale in the statement of theorem 10.4, by x(t) = x(t/ε)). Hence,  royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 It follows that x 2 converges to the steady-state solution On the other hand, for the average input u T ¼ 1, x 1 (t) converges to one, and x 2 (t) to a, so the average of the output is a. The difference between the two averaged outputs is thus This is maximized for ω = 0, so the gain of entrainment is at least c T (1) = a/2. Observe that examples with arbitrarily large gain of entrainment can be obtained by taking the constant a in (2.8) large enough.
In the next section, we cast the problem of determining the gain of entrainment as an optimal control problem.

Optimal control formulation
Consider the control system (2.1) with n state variables and m inputs. We assume that the system entrains. Pick any T > 0 and any q [ R m . We restrict attention to T-periodic controls satisfying the individual weighted average constraints where J : ½0, T Â R m þ ! R m þ is an integrable vector positive function that satisfies 1 T Ð T 0 Jðt, qÞ dt ¼ q. The n-dimensional control system with the integral constraint on the controls (3.1) can be lifted to an (n + m)-dimensional nonlinear control system by adding m-equations to (2.1): wherex : ¼ ½x T j T T . We impose the boundary conditions xð0Þ ¼ xðTÞ, jð0Þ ¼ 0, jðTÞ ¼ Tq: ð3:3Þ Since we consider systems that entrain, for any T-periodic control, there corresponds a unique GAS Tperiodic solution γ T (t). The condition x(0) = x(T ) guarantees that the maximization is performed over this solution. The other two conditions are equivalent to (3.1).
To make the problem well posed, we will assume that controls take values in a hypercube [ℓ, L] m , where 0 < ℓ < L. We then formulate an optimal control problem as follows: Note that J(u) is the average value of the output along the globally attractive T-periodic solution (recall that we assume that h is T-periodic in its first variable).
In what follows, we always consider systems affine in the control. Then, the fact that [ℓ, L] m is compact and convex implies, by Filippov's theorem (e.g. [49]), that the reachable set at any time t ≥ 0 is compact. Since h is locally Lipschitz, an optimal control exists.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 The optimal control formulation allows to apply powerful theoretical tools for solving optimal control problems as well as use software packages for numerical solutions (e.g. [50,51]). In the next section, we demonstrate how to determine the gain of entrainment using this formulation for both time-invariant and time-varying cost functions.
Remark 3.2. The control signals can be assumed to belong to different intervals. In other words, we can have u j (t) ∈ [ℓ j , L j ], j ¼ 1, . . ., m: Nevertheless, we have simplified the aforementioned formulation by re-scaling the controls, so that they all satisfy the same bounds.
The following result is immediate: In other words, the problem of choosing the best periodic input signal has been transformed into a standard optimal control problem. Hence, to find how much improvement we get with periodic inputs, i.e. the gain of entrainment, we must find a control u that solves problem 3.1 and then compute J(u) − q.

Pontryagin's maximum principle
Problem 3.1 can be studied in the framework of the PMP [40][41][42]. The Hamiltonian associated with our problem is where pðtÞ [ R nþm is the co-state, and the abnormal multiplier p 0 ≥ 0 is a constant.
Proposition 3.4 (PMP). Let u Ã ðtÞ [ R m , t ! 0 be an optimal control for problem 3.1, and let x Ã : ½0, T ! R nþm be the corresponding optimal trajectory. There exist p Ã 0 ! 0 and p Ã : ½0, T ! R nþm n f0g, such that: 1. The optimal statex Ã ðtÞ and corresponding adjoint p Ã (t) satisfy: The proof is given in appendix A. Use of the PMP to deduce the structure of the optimal control is a difficult problem in general. We will show in the next section how it can be used in certain cases.
A trajectory X : ¼ ðuðtÞ,xðtÞ, pðtÞÞ is said to be feasible if it satisfies the ODEs (3.5) and the boundary conditions (3.3) and (3.7). A feasible trajectory X is an extremal trajectory if it satisfies the PMP, i.e. if it also satisfies proposition 3.4. Observe that any optimal trajectory must be an extremal by proposition 3.4.

ð4:1Þ
This is an n-dimensional RFM with initiation and exit rates that are non-negative control inputs. Suppose that both u 0 (t), u 1 (t) are periodic with period T ≥ 0. It was proved in [6] that the RFM with T-periodic rates entrains, so in particular, (4.1) admits a unique solution γ T (t), with γ T (0) = γ T (T ), and x → γ T for any initial condition x(0) ∈ [0, 1] n .
To study the cost of entrainment in this system, fix 0 < ℓ < L. We will assume that the rates u 0 (t), u 1 (t) are two measurable and essentially locally bounded functions taking values in the interval [ℓ, L].
To allow a 'fair' comparison between T-periodic controls and constant controls, fix two values We now use theorem 3.3 to formulate an optimal control problem that allows finding the gain of entrainment. Introduce the two-dimensional state ξ := [x n+1 x n+2 ] T and let u := [u 0 u 1 ] T . Then, the extended system is an (n + 2)-dimensional nonlinear control system  and the boundary conditions are as follows: The following is the optimal control problem. In general, this seems to be a non-trivial problem. Nevertheless, this formulation allows the use of both theoretical and numerical optimal control tools, and we will provide exact results in special cases.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 are called the switching functions. These functions play a central role in the determination of the structure of the extremal solutions of the optimal control problem. In particular, the analysis based on the PMP is separated to two cases: first, the case when the switching functions are non-zero. In this case, each subtrajectory of an extremal trajectory is called a regular arc. Second, the switching functions are 'identically' zero in a sense that will be precisely defined. In such case, a subtrajectory of an extremal trajectory is called a singular arc.
If the Hamiltonian is linear in the control inputs, as in (4.7), then the regular arcs take a very simple form. The control inputs assume either the minimum or maximum values that are allowed for them. Such controllers are known as bang-bang controllers. This is a well-known result in optimal control. We state it for the sake of completeness, and the proof is provided in the appendix A.
Before stating the result, we need to define regular arcs rigorously. Let X be a feasible trajectory. Define the open set: A regular arc is a restriction Xj V for some open subset V , E r . The next result analyses regular arcs. Therefore, unless either of the switching functions vanish on a non-zero measure set, the optimal control is bang-bang, meaning that it has values in {ℓ, L} 2 for almost all t.
The next section considers the unweighted problem 4.1 and shows that a singular arc satisfies the PMP on [0, T ]. Furthermore, for the one-dimensional problem, any extremal trajectory cannot contain any regular arcs. In other words, it must be fully singular.

The unweighted optimal control problem
In this section, we consider the unweighted version of problem 4.1, that is, the case where α 0 (t) = α 1 (t) = β(t) ≡ 1 for all t ∈ [0, T ].

Constant controls satisfy the Pontryagin maximum principle
We first show that constant controls satisfy the necessary conditions for optimality. Note that this does not guarantee that such solutions are optimal. where J is the Jacobian of the RFM (4.1) with respect to x, and b := [0 … 0 p 0 /T ] T . Also, _ p nþ1 ðtÞ ¼ _ p nþ2 ðtÞ ; 0. It has been shown in [29] that the RFM with constant rates admits a unique GAS steady state in (0, 1) n . Hence, every solution of (4.1) with u 0 ðtÞ ; u 0 , u 1 ðtÞ ; u 1 , converges to a point x ¼ ½ x 1 x 2 . . . x n T [ ð0, 1Þ n . It was shown in [6] that if M is any compact subset of (0, 1) n , then there exists a matrix measure m : R nÂn ! R such that μ(J(x, u)) < 0 for all x ∈ M, u ≥ 0. This implies in particular that all the eigenvalues of J(x, u) have a negative real part [52], so J(x, u) is non-singular for each x ∈ (0, 1) n , u ≥ 0. Hence, so is J T ð x, uÞ. Let z :¼ ÀðJ T ð x, uÞÞ À1 b u 1 , and let u :¼ ½ u 0 u 1 T . We now show that for uðtÞ ; u, xðtÞ ; x, p 0 ¼ T, pðtÞ ; z À p 1 ð1 À x 1 Þ= u 0 À x n ð1 À p n Þ= u 1 Â Ã T , all the conditions in the PMP hold. First note that the boundary conditions (4.5) all hold. Equation (4.10) holds by the definition of p. The switching functions (4.9) satisfy w 0 (t) ≡ w 1 (t) ≡ 0. Equation (4.8) implies that H ; 0 and that (3.6) trivially holds. ▪ We have shown that constant controls satisfy the necessary conditions of the PMP. In other words, constant controls are always extremal solutions.
In the following section, we show that for n = 1, constant controls are the only controls that satisfy the PMP.

Extremal analysis of the one-dimensional unweighted problem
In this section, we study the following system: The controls u 0 [u 1 ] represent time-varying initiation [exit] rates in an RFM with n = 1. Even though the PMP provides a general approach for addressing optimal control problems, it seldom leads to a full characterization of extremal solutions, especially in the case of multiple inputs. We will show that this is possible for the system (4.11): a detailed analysis using the PMP shows that any extremal trajectory corresponds to a constant x 1 (t). Since each control input takes values in a compact and convex set, the optimal control problem always has a solution. Thus, there is no gain of entrainment. This shows that the PMP is a viable approach for handling such problems and lays the ground for future generalization to higher dimensional cases.
Theorem 4.4. Let X be an extremal trajectory for problem 4.1 with the system (4.11). Then The proof is given in appendix A. The PMP immediately yields the following result, which implies that periodic inputs do not confer any advantage over constant counterparts, and hence, the presence of periodic signals in an occupancy system cannot be a result of optimizing the unweighted throughput of the system. Theorem 4.5. Fix 0 < ℓ < L, and let the admissible controls u 0 , u 1 take values in [ℓ, L], with given averages u 0 , u 1 [ ð', LÞ, the optimal objective for problem 4.1 and system (4.11) is The optimal trajectory is x Ã 1 ðtÞ ; u 0 =ð u 0 þ u 1 Þ, a constant, and it can be achieved by the constant inputs: Remark 4.6. The control inputs u 0 (t), u 1 (t) that achieve the optimal cost are not unique. Indeed, it is clear that the optimal solution in (4.12) depends only on the ratio u 1 = u 0 . For instance, u ÃÃ 0 ðtÞ ; u 0 rðtÞ, u ÃÃ 1 ðtÞ ; u 1 rðtÞ is also an optimal solution for any function ρ such that u 0 rðtÞ , ½', L and Remark 4.7. Theorem 4.5 can be also proven via a more direct approach, motivated by an idea from [53]. This alternative proof is given in appendix A.

The unweighted problem for the ribosome flow model with n = 2 and a single input
We now study problem 4.1 for an RFM with dimension n = 2 and a single control u 0 (t) as the initiation rate, i.e.
subject to the boundary conditions (4.5). Here, our goal is to maximize the average value of the output rate λ 2 x 2 . More formally, we aim at solving the following problem: The next result shows that here as well a constant control is optimal, that is, there is no gain of entrainment. Proof. The first equation in (4.15) is the first equation (4.11) for u 1 (t) = λ 1 (1 − x 2 ), so u 1 ¼ l 1 ð1 À x 2 Þ, and the output rate is λ 1 (1 − x 2 )x 1 . (Note that here x 2 is the second state-variable in (4.15), and not the integral of u 0 as in (4.11)). By theorem 4.5, we have l 1 u 1 x 1 ðl 1 u 1 u 0 Þ=ð u 0 þ l 1 u 1 Þ. Hence, Integrating (4.15), we get Hence, l 1 ð1 À x 2 Þx 1 ¼ l 2 x 2 . Substituting in (4.16), we get The left-hand side here is the quantity that we are trying to maximize. Rearranging gives: where fðsÞ :¼ s 2 À ð1 þ u 0 ð1=l 1 þ ð1=l 2 ÞÞÞs þ ð u 0 =l 2 Þ. Let p, q denote the roots of f(s). Then, Recall that the RFM (4.15) with the constant control u 0 ðtÞ ; u 0 admits a unique steady state e [ ð0, 1Þ 2 [29]. It is straightforward to show that fð e 2 Þ ¼ 0. We may assume that p ¼ e 2 , so p ∈ (0, 1). Then, (4.19) implies that q is real and q > 1. The quadratic inequality (4.18) implies that either x 2 p , 1 or x 2 ! q . 1. Since x 2 (t) ∈ [0, 1] for all t, the second inequality can be ignored, and we have that the maximal (feasible) value of Obviously, this is attained for the constant control u 0 ðtÞ ; u 0 . ▪ royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 14

Gain of entrainment with time-varying weight functions
Analysing the case with time-varying weights is challenging, but it is highly relevant to applications since resources may be allocated differently during the period. In this section, we show that, even for the above examples, once the weighting functions become time varying, constant inputs may no longer be optimal. We consider the special case of (4.3) with n = 1 and a single input u 0 (t) as the initiation rate, i.e. _ x 1 ðtÞ ¼ u 0 ðtÞð1 À x 1 ðtÞÞ À l 1 x 1 ðtÞ and _ x 2 ðtÞ ¼ u 0 ðtÞ: Our goal now is to maximize Jðu 0 Þ : ¼ ðl 1 =TÞ Ð T 0 bðtÞx 1 ðtÞ dt, subject to the boundary conditions (4.5), and where the weight function β is differentiable and satisfies β(t) > 0 for all t ∈ [0, T ]. Without loss of generality, we assume that T = 1. Note that if u 0 [ ð', LÞ, then this implies that the constant control u 0 ðtÞ ; u 0 cannot be optimal. If (4.21) does not hold for any t (e.g. when the right-hand side of (4.21) takes values that are not in [ℓ, L]), then the optimal control is bang-bang.
As a specific example take ' ¼ 0:001, L ¼ 10, u 0 ¼ 2, l 1 ¼ 1, and the weight function bðtÞ ¼ e Àr(tÀðT=2Þ) 2 , with ρ = 100 and T = 1. In the context of the RFM, this would represent the case where it is required to highly express a specific protein near the middle of every cycle, rather than having a uniform level of production along the entire period. The constant input u 0 (t) ≡ 2 yields a steady-state trajectory x(t) ≡ 2/3. The corresponding value of the objective function is expressed as follows: Our optimal control formulation can be used to solve the problem numerically using optimal control packages like shown in [50]. The result is a three-arc bang-bang control where D : ¼ ð u 0 À 'Þ=ðL À 'Þ ¼ 0:19992 and t 1 = 0.27335. The corresponding periodic solution satisfies x(1) = x(0) = 0.50251. This achieves a cost J(u Ã ) = 0.141183, which is roughly 20% better than (4.23). Figure 3 depicts the approximate optimal bang-bang control (computed using [50] and a bisection procedure) and the resulting periodic solution x Ã 1 ðtÞ. The weight function β(t) is also shown. The maximal value of β is achieved at t = 1/2. The optimal control switches to the maximal value L = 10 before the peak time of β. This makes sense as it guarantees that x Ã 1 ðtÞ will have large values when the weighting of x Ã 1 ðtÞ in the cost function is large.

Gain of entrainment in generalized occupancy models
In this section, we consider general cascades like those in figure 1. Recall that under the conditions (1.4), the n-dimensional RFM can be approximated by (1.5), which has the form in figure 1a. We consider here this approximated system.

Unweighted objective functional
We first state the following result which is well known in the theory of linear time-invariant systems (see also [24]). We include the proof for completeness.
Proposition 5.1. Consider a single-input-single-output linear system: _ z ¼ Az þ bw, y = c T z, where z [ R n , and A is Hurwitz. Let w be a bounded measurable input which is T-periodic. Then, y converges to a steady-state T-periodic solution y w , and Proof. Since w is measurable and bounded, w ∈ L 2 ([0, T ]). Hence, it can be written as a Fourier series The output of the linear system converges to the steady-state periodic solution Each sinusoid in the expansion has period T, so Ð T 0 y w ðtÞ dt ¼ THð0Þ w: ▪ Combining proposition 5.1 with our results on the gain of entrainment in certain bottleneck models yields the following result. Proof. Consider the system depicted in figure 1a. Fix admissible T-periodic controls u 0 (t), u 1 (t). Denote the corresponding steady-state average values of w 1 (t) and y(t) by w 1 and y. Obviously, w 2 ¼ y. By proposition 5.1,

Weighted objective functional
We extend the results of §4.3 to the GOM in figure 1a. We show that constant rates are no longer optimal. As before, let β be a differentiable and positive weight function defined over [0, T ]. Proof. The proof follows immediately from theorem 4.11 and proposition 5.1. ▪

Discussion
Entrainment is an interesting and important property of dynamical systems. It allows systems to develop an 'internal clock' that synchronizes to periodic excitations like the 24 h solar day. Such clocks are important in biology, as they allow organisms to adequately respond to periodic processes like the solar day and the cell cycle division process. They are also essential for synthetic biology, as a common clock is an important ingredient in building synthetic biology circuits that include several modules working in synchrony.
Here, we considered an additional qualitative property called the gain of entrainment. This measures the advantage, if any, of using a periodic control vs. an 'equivalent' constant control for maximizing the average throughput. We showed how this problem can be cast as an optimal control problem. This allows using the sophisticated analytical and numerical tools developed for solving optimal control problems to determine the gain of entrainment.
We have shown that, perhaps surprisingly, there is no gain of entrainment in a class of systems relevant to biology and traffic applications. In other words, in these systems, non-constant periodic controls are no better than constant controls. The optimality of constant controls fails to hold if we allow time-varying objective functionals. This result is not particularly surprising and has been recapitulated in other models [37], where it has been shown that that gain of entrainment is positive when the system exhibits two phases such as day and night. Hence, this suggests the possibility that the observation of non-constant periodic signals in biological systems is correlated with the bottleneck module bottleneck module x 1 = u 0 (t) (1x 1 ) -l 1 x 1 (1x 2 ) u 0 (t) u 1 (t) Figure 4. A generalized occupancy model with a two-dimensional bottleneck entrance. The controls are the scalar functions u 0 (t), u 1 (t). We have x 1 , þ . The linear system block is assumed to be positive and Hurwitz.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 maximization of the throughput with varying weights along each period or cycle. The other possibility is that periodic signals are driven by processes external to the occupancy system. Distinguishing between these two scenarios is a subject of future research. Future work includes identifying general classes of systems where the gain of entrainment is positive and trying to understand the structure of the optimal periodic controls. The literature on optimal fishing [38] has revealed that optimal solutions are periodic when the fishermen are allowed to target fish in a specific age group only. This may be interpreted as a kind of periodic constraint on the control. It may be of interest to understand if such constraints also appear in the context of cellular systems.
Another interesting research direction is to generalize these results to models such as the nonlinear n site RFM with (n + 1) time-periodic control inputs. Recall that we consider (4.11) , so Proof. We first show that we can take By the transversality condition, we know that p Ã 1 ð0Þ ¼ p Ã 1 ðTÞ, which implies that u 0 þ u 1 ¼ 0. This is a contradiction, so we conclude that p Ã 0 . 0, and by scaling the objective function we may take p Ã 0 ¼ T. Now (A1) follows from calculating the partial derivatives in (3.5). ▪

A.3.1. Analysis of the switching functions
Using lemma A.1, the switching functions in our case are as follows: Given an extremal trajectory X, let

A.3.2. Characterization of singular arcs
In this section, we are interested in the case where mðE i 0 Þ . 0 for either i = 0 or i = 1. Let Let X be an extremal trajectory. We call any restriction of X to any non-zero-measure subset of E s a singular arc.
The following lemmas characterize the behaviour on singular arcs.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 Lemma A.3. Let X be an extremal trajectory, and assume that mðE i 0 Þ . 0 for some i ∈ {0, 1}. Then, there exists c i ∈ (0, 1) such that Furthermore, _ x Ã 1 ðtÞ ¼ 0 for almost all t [ E i 0 and the two inputs satisfy 1 is the set of isolated points of E i 0 , which is countable and hence has measure zero. Let Since t is an accumulation point, We consider the case i = 0 (the proof when i = 1 is very similar). Using (A 2) and (A 4), the equations p . Let F 0 0 be the set of accumulation points of F 0 . Then, mðF We conclude that μ(E s ) > 0. Pick τ ∈ E s . Then, ϕ 0 (τ) = ϕ 1 (τ) = 0 and (A 9) implies that ϕ 0 (t) = ϕ 1 (t) = 0 for all t, so E s = [0, T ]. ▪

A.3.3. Inadmissibility of regular arcs
In the previous section, we decomposed an extremal trajectory into regular and singular arcs. On the regular arcs, the control is bang-bang. On the singular arcs, the controls satisfy (A 6) and the state must be constant almost everywhere, so _ x Ã 1 ðtÞ ¼ 0 a.e. In general, an extremal trajectory can consist of an arbitrary patching of regular and singular arcs. In this section, we consider the admissibility of regular arcs.
To simplify presentation, we write x and p instead of x 1 and p 1 from here on. Furthermore, we denote extremal trajectories by x Ã , p Ã . Figure 5  Proof. If x(t) is identically constant then the proof follows from lemma A.4. Hence, we assume that x(t) is not constant. This implies that we can restrict attention to the four regular cases depicted in figure 5.
royalsocietypublishing.org/journal/rsos R. Soc. Open Sci. 8: 210878 20 Suppose that x(0) = L/(L + ℓ). Then considering the regular cases depicted in figure 5, we see that again x(T ) < L/(L + ℓ), as x(t) can increase towards L/(L + ℓ) only in the fourth case depicted in figure 5, yet it can never reach L/(L + ℓ), as this is an equilibrium (and thus an invariant set) of this dynamics.
Assume now that inf T ¼ 0. Since X is periodic, we can study it on the interval [0, 2T ]. Let E 0 À : Hence, defineT as the maximal open neighbourhood containing τ = T inẼ 0 À >Ẽ 1 þ . The setsT 1 ,T 2 are defined similarly. Replicating the previous arguments to the setsT ,T 1 ,T 2 , we see that supT ¼ 2T. Hence, by periodicity, sup T ¼ T. ▪ We can strengthen lemma A.6 to exclude mixed-sign arcs.
Lemma A.8. Let X be an extremal trajectory. If x(τ) ≠ 1/2 for some τ ∈ [0, T ], then x(t) ≠ 1/2 for all t ∈ [0, T ] The next lemma shows that an extremal trajectory must consist of a single singular arc.