Detection of persistent signals and its relation to coherent feed-forward loops

Many studies have shown that cells use the temporal dynamics of signalling molecules to encode information. One particular class of temporal dynamics is persistent and transient signals, i.e. signals of long and short duration, respectively. It has been shown that the coherent type-1 feed-forward loop with an AND logic at the output (or C1-FFL for short) can be used to discriminate a persistent input signal from a transient one. This has been done by modelling the C1-FFL, and then using the model to show that persistent and transient input signals give, respectively, a non-zero and zero output. The aim of this paper is to make a connection between the statistical detection of persistent signals and the C1-FFL. We begin by formulating a statistical detection problem of distinguishing persistent signals from transient ones. The solution of the detection problem is to compute the log-likelihood ratio of observing a persistent signal to a transient signal. We show that, if this log-likelihood ratio is positive, which happens when the signal is likely to be persistent, then it can be approximately computed by a C1-FFL. Although the capability of the C1-FFL to discriminate persistent signals is known, this paper adds an information processing interpretation of how a C1-FFL works as a detector of persistent signals.


Introduction
By analysing the graph of the transcription networks of the bacterium Escherichia coli and the yeast Saccharomyces cerevisiae, the authors in [1-3] discovered sub-graphs that appear much more frequently in these transcription networks than in randomly generated networks. These frequently occurring sub-graphs are called network motifs. A particular example of a network motif is the coherent type-1 feed-forward loop with an AND logic at the output (or C1-FFL for short). The C1-FFL is the most abundant type of coherent feed-forward loop in the transcription networks of E. coli and S. cerevisiae [4].
© 2018 The Authors. Published by the Royal Society.
An example of a C1-FFL in E. coli is the L-arabinose utilization system, which activates the transcription of the araBAD operon when glucose is absent and L-arabinose is present [5]. By modelling the C1-FFL with ordinary differential equations (ODEs), the authors in [2,4] show that the C1-FFL can act as a persistence detector to differentiate persistent input signals (i.e. signals of long duration) from transient signals (i.e. signals of short duration). The aim of this paper is to present a new perspective on the persistence detection property of the C1-FFL from an information processing point of view.
In information processing, the problem of distinguishing signals which have some specific features from those which do not has been studied under the theory of statistical detection [6]. An approach to detection is to formulate a hypothesis testing problem where the alternative hypothesis (resp. null hypothesis) is that the observed signal does have (does not have) the specific features. The next step is to use the observed signal to compute the likelihood ratio to determine which hypothesis is more likely to hold. Since a C1-FFL can detect persistent signals, the question is whether the C1-FFL can be interpreted as a statistical detector. We show in this paper that the C1-FFL is related to a detection problem whose aim is to distinguish a long rectangular pulse (a prototype persistent signal) from a short rectangular pulse (a prototype transient signal). In particular, we show that, for persistent input signals, the output of the C1-FFL can be interpreted as the log-likelihood ratio of this detection problem. This result therefore provides an information processing interpretation of the computation being carried out by a C1-FFL.

C1-FFL
The properties of coherent feed-forward loops have been studied in [2,4,7] using ODE models and in [5] experimentally. Here, we focus on the ability of the C1-FFL with AND logic to detect persistent signals. We do that by using an idealized model of the C1-FFL adapted from the text [7]. The model retains the important features of the C1-FFL and is useful in understanding the derivation in this paper. Figure 1a depicts the structure of the C1-FFL. One can consider both X and Y as transcription factors (TFs) which regulate the transcription of Z. The TF X is activated by the input signal $s(t)$, which acts as an inducer. We denote the active form of X by $X^*$. Following [4], we assume that the activation of X (resp. the deactivation of $X^*$) is instantaneous when the input signal is present (absent). The active form $X^*$ can be used to produce Y if its concentration exceeds a threshold $K_{xy}$. We use $[Y]$ to denote the concentration of Y. We write the reaction rate equation for Y as

$$\frac{d[Y]}{dt} = b_y \, u([X^*] > K_{xy}) - a_y [Y], \tag{2.1}$$

where $b_y$ and $a_y$ are reaction rate constants, and $u(c)$ is 1 if the Boolean condition $c$ within the parentheses is true, and is 0 otherwise.
The transcription of Z requires the concentration of $X^*$ to be greater than $K_{xz}$ and the concentration of Y to be greater than $K_{yz}$, which corresponds to the AND gate in figure 1a. The reaction rate equation for the output Z is

$$\frac{d[Z]}{dt} = b_z \, u([X^*] > K_{xz}) \, u([Y] > K_{yz}) - a_z [Z], \tag{2.2}$$

where $b_z$ and $a_z$ are reaction rate constants. We now present a numerical example to demonstrate how the C1-FFL can be used to detect persistent signals. We assume the input signal $s(t)$ consists of a short pulse of duration 3 (the transient signal) followed by a long pulse of duration 40 (the persistent signal). We also assume that $s(t)$ has an amplitude of 1 when it is ON. The other parameter values are $a_y = b$ […]. Since the activation of X or deactivation of $X^*$ is instantaneous, we assume $[X^*](t) = s(t)$ for simplicity. The time profile of $s(t) = [X^*](t)$ is shown in the top plot in figure 2.
The middle plot of figure 2 shows $[Y](t)$. Since $[X^*](t) > K_{xy}$ when the input $s(t)$ is ON, the production of Y occurs during this period. When the pulse is short, the amount of Y being produced is limited and the maximum $[Y]$ is below $K_{yz}$, which is indicated by the dashed red horizontal line in the middle plot. The production of Z requires both $[X^*] > K_{xz}$ and $[Y] > K_{yz}$ (i.e. the AND gate); since the latter condition is not satisfied, no Z is produced when the pulse is short. The bottom plot shows that $[Z](t)$ is zero when a short pulse is applied. However, when the pulse is long, $[Y]$ has enough time to increase beyond the threshold $K_{yz}$ and as a result we see the production of Z, as shown in the bottom plot. Note that when the pulse is long, the production of Z only starts after a delay; this is because the AND condition for the production of Z in equation (2.2) does not hold initially. This example shows that, for an ideal C1-FFL, a transient input will produce a zero output and a persistent input will give a non-zero output.
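The behaviour described in this example can be reproduced with a short simulation. The following is a minimal sketch that integrates the rate equations for Y and Z quoted in the comments with a forward-Euler scheme. All numerical parameter values (`b_y`, `a_y`, `b_z`, `a_z` and the three thresholds) are illustrative assumptions and are not the values used for figure 2; the function name `simulate_ideal_c1ffl` is likewise hypothetical.

```python
def simulate_ideal_c1ffl(pulse_on, pulse_off, t_end, dt=0.01,
                         b_y=0.2, a_y=0.2, b_z=1.0, a_z=0.2,
                         k_xy=0.5, k_xz=0.5, k_yz=0.8):
    """Forward-Euler integration of an ideal C1-FFL with AND logic.

    The input is a unit-amplitude pulse that is ON for pulse_on <= t < pulse_off.
    All default parameter values are assumptions chosen for illustration.
    Returns the maximum of [Y] and the maximum of [Z] over the run.
    """
    y = z = 0.0
    y_max = z_max = 0.0
    for n in range(int(round(t_end / dt))):
        t = n * dt
        x_star = 1.0 if pulse_on <= t < pulse_off else 0.0  # [X*](t) = s(t)
        # d[Y]/dt = b_y * u([X*] > K_xy) - a_y * [Y]
        dy = b_y * (x_star > k_xy) - a_y * y
        # d[Z]/dt = b_z * u([X*] > K_xz) * u([Y] > K_yz) - a_z * [Z]
        dz = b_z * (x_star > k_xz) * (y > k_yz) - a_z * z
        y += dt * dy
        z += dt * dz
        y_max = max(y_max, y)
        z_max = max(z_max, z)
    return y_max, z_max


short = simulate_ideal_c1ffl(pulse_on=5.0, pulse_off=8.0, t_end=30.0)   # duration 3
long_ = simulate_ideal_c1ffl(pulse_on=5.0, pulse_off=45.0, t_end=60.0)  # duration 40
print("short pulse: max [Y] = %.3f, max [Z] = %.3f" % short)
print("long pulse:  max [Y] = %.3f, max [Z] = %.3f" % long_)
```

With these assumed values, the short pulse ends before $[Y]$ reaches $K_{yz}$, so the AND condition never holds and $[Z]$ stays at zero, whereas the long pulse gives $[Y]$ enough time to cross the threshold.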

Detection theory
Detection theory is a branch of statistical signal processing. Its aim is to use the measured data to decide whether an event of interest has occurred. For example, detection theory is used in radar signal processing to determine whether a target is present or not. In the context of this paper, the events are whether the signal is transient or persistent. A detection problem is often formulated as a hypothesis testing problem, where each hypothesis corresponds to a possible event. Let us consider a detection problem with two hypotheses, denoted by $H_0$ and $H_1$, which correspond to, respectively, the events of transient and persistent signals. Our aim is to decide which hypothesis is more likely to hold. We define the log-likelihood ratio $R$:

$$R = \log \frac{P[\text{measured data} \mid H_1]}{P[\text{measured data} \mid H_0]}, \tag{2.3}$$

where $P[\text{measured data} \mid H_i]$ is the conditional probability that the measured data are generated by the signal specified in hypothesis $H_i$. Note that we have chosen to use the log-likelihood ratio, rather than the likelihood ratio, because it will enable us to build a connection with the C1-FFL later on. Intuitively, if the log-likelihood ratio $R$ is positive, then the measured data are more likely to have been generated by a persistent signal, i.e. hypothesis $H_1$, and vice versa. Therefore, the key idea of detection theory is to use the measured data to compute the log-likelihood ratio and then use it to make a decision.
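To make the decision rule concrete, the following toy sketch computes a log-likelihood ratio for a single measurement. It assumes, purely for illustration, that the measured datum is an event count which is Poisson distributed with a different mean under each hypothesis; the means `MEAN_H0` and `MEAN_H1` and the function name are hypothetical and do not come from the detection problem studied in this paper.

```python
import math


def poisson_log_likelihood_ratio(n_events, mean_h0, mean_h1):
    """Return log(P[n | H1] / P[n | H0]) for Poisson-distributed counts.

    The n! terms cancel, leaving n * log(m1 / m0) - (m1 - m0).
    """
    return n_events * math.log(mean_h1 / mean_h0) - (mean_h1 - mean_h0)


# Hypothetical expected counts under the transient (H0) and persistent (H1) hypotheses.
MEAN_H0, MEAN_H1 = 2.0, 10.0

for n in (0, 3, 8):
    r = poisson_log_likelihood_ratio(n, MEAN_H0, MEAN_H1)
    decision = "H1 (persistent)" if r > 0 else "H0 (transient)"
    print("n = %d: R = %+.3f -> decide %s" % (n, r, decision))
```

A small count gives a negative ratio and favours $H_0$, while a large count gives a positive ratio and favours $H_1$, which is the sign convention used throughout the paper.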

Connecting detection theory with C1-FFL
We now give an overview of how we connect detection theory with the C1-FFL. The signal $x^*(t)$ in figure 1a is the output signal of Node X in the C1-FFL. We can view the C1-FFL as a two-stage signal processing engine. In the first stage, the input signal $s(t)$ is processed by Node X to obtain $x^*(t)$; this is the part within the dashed box in figure 1a. In the second stage, the signal $x^*(t)$ is processed by the rest of the C1-FFL to produce the output signal $z(t)$. Our plan is to apply detection theory to the dashed box in figure 1a. We consider $x^*(t)$ as the measured data and use them to determine whether the input signal is transient or persistent. Detection theory tells us that we should use $x^*(t)$ to compute the log-likelihood ratio. This means that we can consider the two-stage signal processing depicted in figure 1b, where the input signal $s(t)$ generates $x^*(t)$ and the measured data $x^*(t)$ are used to calculate the log-likelihood ratio.
If we can identify the log-likelihood ratio calculation in figure 1b with the processing by the part of C1-FFL outside of the dashed box, then we can identify the signal z(t) with the log-likelihood ratio.

Defining the detection problem
We first define the problem of detecting a persistent signal using detection theory. Our first step is to specify the signalling pathway in Node X, which consists of three chemical species: the signalling molecule S, the molecular type X in an inactive form and its active form $X^*$. The activation and inactivation reactions are

$$S + X \xrightarrow{k_+} S + X^*, \qquad X^* \xrightarrow{k_-} X, \tag{4.1}$$

where $k_+$ and $k_-$ are reaction rate constants. Let $x(t)$ and $x^*(t)$ denote, respectively, the number of X and $X^*$ molecules at time $t$. Note that both $x(t)$ and $x^*(t)$ are piecewise constant because they are molecular counts. We assume that $x(t) + x^*(t)$ is a constant for all $t$ and we denote this constant by $M$.
We assume that the input signal s(t), which is the concentration of the signalling molecules S at time t, is a deterministic signal. We also assume that the signal s(t) cannot be observed, so any characteristics of s(t) can only be inferred.
We model the dynamics of the chemical reactions by using the chemical master equation [8]. This means that $x^*(t)$ is a realization of a continuous-time Markov chain. This also means that the same input signal $s(t)$ can result in different $x^*(t)$.
The measured datum at time $t$ is $x^*(t)$. However, in the formulation of the detection problem, we will assume that at time $t$, the data available to the detection problem are $x^*(\tau)$ for all $\tau \in [0, t]$; in other words, the data are continuous in time and are the history of the counts of $X^*$ up to time $t$ inclusively. We will use $\mathcal{X}^*(t)$ to denote the continuous-time history of $x^*(t)$ up to time $t$ inclusively. Note that even though we assume that the entire history $\mathcal{X}^*(t)$ is available for detection, we will see later on that the calculation of the log-likelihood ratio at time $t$ does not require the storage of the past history. The last step in defining the detection problem is to specify the hypotheses $H_i$ ($i = 0, 1$). Later on, we will identify $H_0$ and $H_1$ with, respectively, transient and persistent signals. However, at this stage, we want to solve the detection problem in a general way. We assume that the hypothesis $H_0$ (resp. $H_1$) is that the input signal $s(t)$ is the signal $c_0(t)$ (resp. $c_1(t)$), where $c_0(t)$ and $c_1(t)$ are two different deterministic signals. Intuitively, the aim of the detection problem is to use the history $\mathcal{X}^*(t)$ to decide which of the two signals, $c_0(t)$ or $c_1(t)$, is more likely to have produced the observed history.

Solution to the detection problem
Based on the definition of the detection problem, the log-likelihood ratio $L(t)$ at time $t$ is given by

$$L(t) = \log \frac{P[\mathcal{X}^*(t) \mid H_1]}{P[\mathcal{X}^*(t) \mid H_0]}, \tag{4.2}$$

where $P[\mathcal{X}^*(t) \mid H_i]$ is the conditional probability of observing the history $\mathcal{X}^*(t)$ given hypothesis $H_i$. We show in appendix A 1 that $L(t)$ obeys the following ODE:

$$\frac{dL(t)}{dt} = \left[\frac{dx^*(t)}{dt}\right]_+ \log\frac{c_1(t)}{c_0(t)} - k_+ \,(M - x^*(t))\,(c_1(t) - c_0(t)), \tag{4.3}$$

where the operator $[\,\cdot\,]_+$ retains only the positive part of its argument. We also assume that the two hypotheses are a priori equally likely, so $L(0) = 0$. Since $x^*(t)$ is a piecewise constant function counting the number of $X^*$ molecules, its derivative is a sequence of Dirac deltas at the time instants at which X is activated or $X^*$ is deactivated. Note that the Dirac deltas corresponding to the activation of X carry a positive sign and the $[\,\cdot\,]_+$ operator keeps only these. Figure 3 shows an example $x^*(t)$ and its corresponding $[dx^*(t)/dt]_+$. We remark that the derivation of equation (4.3) requires that both $c_0(t)$ and $c_1(t)$ are strictly positive for all $t$; otherwise (4.3) is not well defined.
Note that a special case of equation (4.3) with constant $c_i(t)$ and $M = 1$ appeared in [9]. An equation of the same form as equation (4.3) is used in [10] to understand how cells can distinguish between the presence and absence of a stimulus. A more general form of equation (4.3), which includes the diffusion of signalling molecules, can be found in [11].
The importance of equation (4.3) is that, given the measured data $x^*(t)$, we can use it together with $c_i(t)$ to compute the log-likelihood ratio $L(t)$. We will use an example to illustrate how equation (4.3) can be used to distinguish between two signals of different durations. This example will also be used to illustrate what information is useful to distinguish such signals. In this example, we consider using equation (4.3) to distinguish between two possible input signals $s_0(t)$ and $s_1(t)$. Both $s_0(t)$ and $s_1(t)$ are rectangular pulses, where $s_1(t)$ has a longer duration than $s_0(t)$. For simplicity, we assume that the reference signals are $c_0(t) = s_0(t)$ and $c_1(t) = s_1(t)$.
In order to perform the numerical computation, we assume $k_+ = 0.02$, $k_- = 0.5$ and $M = 100$. The time profiles of $s_0(t)$ and $s_1(t)$ are shown in figure 4a. The durations of $s_0(t)$ and $s_1(t)$ are, respectively, 10 and 40 time units. The amplitude of the pulses is 10.7 when they are ON and 0.25 when they are OFF.
We use simulation to produce the measured data $x^*(t)$. We first use the input $s_0(t)$ together with the Stochastic Simulation Algorithm [12] to simulate the reactions (4.1). This produces the simulated $x^*(t)$ in the top plot of figure 4b. After that, we do the same with $s_1(t)$ as the input and this produces the simulated $x^*(t)$ in the bottom plot of figure 4b. It is important to point out that although we have plotted $s_0(t)$, $s_1(t)$ and the two time series of $x^*(t)$ in figures 4a,b using the same time interval, we are performing two separate numerical experiments: one with $s_0(t)$ as the input and the other with $s_1(t)$ as the input.
The log-likelihood ratio calculation in equation (4.3) uses the reference signals $c_0(t)$ and $c_1(t)$. We see from equation (4.3) that these two reference signals are used to form two weighting functions, $\log(c_1(t)/c_0(t))$ and $(c_1(t) - c_0(t))$. By using the assumed time profiles of $c_0(t)$ and $c_1(t)$, we can compute these two weighting functions and we have plotted them in figure 4c. It can be seen that both weighting functions are non-zero in the time interval $[10, 40)$ but are otherwise zero. This means that the computation of $L(t)$ only uses the measured data in the time interval $[10, 40)$ to determine whether the input signal is $c_0(t)$ or $c_1(t)$. This is because, outside of the time interval $[10, 40)$, the two data series $x^*(t)$ generated by $s_0(t)$ and $s_1(t)$ have the same statistical behaviour and therefore there is no information outside of $[10, 40)$ to say whether the input is long or short. Hence, a lesson we have learnt from this example is that the informative part of the data is when the long pulse is expected to be ON and the short pulse is expected to be OFF.
rsos.royalsocietypublishing.org R. Soc. open sci. 5: 181641
We first use the $x^*(t)$ generated by $s_0(t)$, together with the time profiles of $c_0(t)$ and $c_1(t)$, to compute the log-likelihood ratio $L(t)$ by numerically integrating equation (4.3). The resulting $L(t)$ is the red curve in figure 4d. Similarly, the blue curve in figure 4d shows the $L(t)$ corresponding to the input $s_1(t)$. We can see distinct behaviours in the two $L(t)$ curves in the time intervals $[0, 10)$, $[10, 40)$ and $t \ge 40$. The behaviour in the time intervals $[0, 10)$ and $t \ge 40$ is simple to explain because $dL/dt = 0$ in these time intervals.
We next focus on the time interval $[10, 40)$. We first consider $s_1(t)$ as the input. In this time interval, a large $s_1(t)$ means that the activation of X continues to happen: see the bottom plot of figure 4b. The activation of X contributes to an increase in $L(t)$ due to the first term on the right-hand side (RHS) of equation (4.3). Although the second term of equation (4.3) contributes to a decrease in $L(t)$ via $(M - x^*(t))$, which is the number of inactive X, the contribution is comparatively small. Therefore, we see that the log-likelihood ratio $L(t)$, which is the blue curve in figure 4d, becomes more positive. Since a positive log-likelihood ratio means that the input signal is more likely to be similar to the reference signal $c_1(t)$, this is a correct detection. In a similar way, we can explain the behaviour of the red curve in figure 4d when $s_0(t)$ is applied.
A lesson that we can learn from the last paragraph is that, if our aim is to distinguish a persistent signal from a transient one accurately, then we want the persistent signal to produce a large positive $L(t)$. Since the positive contribution to $L(t)$ comes from the first term on the RHS of equation (4.3), we can get a large positive $L(t)$ by making sure that a persistent signal will produce many activations. This occurs when a persistent signal has a duration which is long compared with the time-scale of the activation and deactivation reactions (4.1); we will make use of this condition later.
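The numerical experiment of this example can be sketched in a few lines of code. The sketch below uses the values of $k_+$, $k_-$, $M$ and the pulse levels quoted above. It assumes that activation occurs at rate $k_+ s(t)(M - x^*(t))$ and deactivation at rate $k_- x^*(t)$, and that the log-likelihood ratio jumps by $\log(c_1(t)/c_0(t))$ at each activation and otherwise decreases at rate $k_+ (M - x^*(t))(c_1(t) - c_0(t))$. The function names, the random seed and the handling of the pulse edges are our own choices for illustration and are not taken from the simulation code used for figure 4.

```python
import math
import random

K_PLUS, K_MINUS, M = 0.02, 0.5, 100   # values quoted in the text
ON, OFF = 10.7, 0.25                   # pulse levels quoted in the text


def pulse(duration):
    """Rectangular pulse that is ON for 0 <= t < duration and OFF afterwards."""
    return lambda t: ON if t < duration else OFF


def simulate_llr(s, c0, c1, breakpoints, t_end, rng):
    """Gillespie simulation of x*(t) driven by s(t), accumulating L(t).

    s, c0 and c1 must be constant between consecutive breakpoints.
    Returns the log-likelihood ratio at time t_end.
    """
    t, x_star, llr = 0.0, 0, 0.0
    edges = sorted(b for b in breakpoints if 0.0 < b < t_end) + [t_end]
    for edge in edges:
        while True:
            rate_act = K_PLUS * s(t) * (M - x_star)   # activation of X
            rate_deact = K_MINUS * x_star             # deactivation of X*
            total = rate_act + rate_deact
            tau = rng.expovariate(total) if total > 0 else math.inf
            dt = min(tau, edge - t)
            # Continuous part of dL/dt; exact because x* and the reference
            # signals are constant between events and breakpoints.
            llr -= K_PLUS * (M - x_star) * (c1(t) - c0(t)) * dt
            if t + tau >= edge:
                t = edge   # no reaction before the signals change; redraw
                break
            t += tau
            if rng.random() * total < rate_act:
                llr += math.log(c1(t) / c0(t))   # jump at each activation
                x_star += 1
            else:
                x_star -= 1
    return llr


rng = random.Random(1)   # fixed seed so that the run is repeatable
s0, s1 = pulse(10.0), pulse(40.0)
results = {}
for name, s in (("s0", s0), ("s1", s1)):
    results[name] = simulate_llr(s, c0=s0, c1=s1,
                                 breakpoints=[10.0, 40.0], t_end=80.0, rng=rng)
    print("input %s: L(80) = %.1f" % (name, results[name]))
```

Restarting the exponential waiting time at each pulse edge is valid because of the memoryless property of the exponential distribution. Under these assumptions the long input should drive the ratio strongly positive and the short input strongly negative, because only the interval $[10, 40)$ contributes.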

Choosing detection problem parameters to match the behaviour of C1-FFL
The detection problem defined in §4 is general and can be applied to any two chosen reference signals $c_0(t)$ and $c_1(t)$. In order to connect the detection problem in §4 to the fact that the C1-FFL is a persistence detector, we need to make specific choices for $c_0(t)$ and $c_1(t)$. In this paper, we choose the reference signals $c_0(t)$ and $c_1(t)$ to be rectangular (or ON/OFF) pulses. Furthermore, we assume that when the reference signal is ON, its concentration level is $a_1$; and when it is OFF, its concentration level is at the basal level $a_0$, with $a_1 > a_0 > 0$. The temporal profile of $c_i(t)$ (where $i = 0, 1$) is:

$$c_i(t) = \begin{cases} a_1 & \text{for } 0 \le t < d_i, \\ a_0 & \text{otherwise}, \end{cases} \tag{5.1}$$
where $d_i$ is the duration of the pulse $c_i(t)$. In particular, we assume that the duration of $c_1(t)$ is longer than that of $c_0(t)$, i.e. $d_1 > d_0$. We can therefore identify $c_0(t)$ and $c_1(t)$ as the reference signals for, respectively, the transient and persistent signals. We remark that there may be other choices of reference signals that can connect the detection problem in §4 to the one solved by the C1-FFL; we leave that for future work.
Remark 5.1. In this paper, we have chosen to formulate the detection problem by assuming that each hypothesis $H_0$ and $H_1$ consists of one reference signal. Such hypotheses, which consist of only one possibility per hypothesis, are known as simple hypotheses in the statistical hypothesis testing literature [6]. We know from [6] that if both hypotheses are simple, then the solution of the detection problem is to compute the log-likelihood ratio (2.3). We have chosen to use simple hypotheses for $H_0$ and $H_1$ so as to make the problem tractable. In order to understand that, let us explore an alternative detection problem formulation.
An alternative formulation would be to assume that $H_0$ (resp. $H_1$) consists of all rectangular pulses with duration less than (greater than or equal to) a pre-defined threshold $d_0$. In this case, both $H_0$ and $H_1$ are known as composite hypotheses. To the best of our knowledge, there are currently no standard solutions to the hypothesis testing problem with composite hypotheses. Although the text [6] presents two methods to deal with composite hypotheses, neither of them appears to be tractable, because the Bayesian approach requires the evaluation of an integral and the generalized likelihood ratio test requires the solution to two optimization problems. Therefore, we have not considered them in this paper.

Computing an intermediate approximation
Our ultimate goal is to connect the computation of the log-likelihood ratio $L(t)$ in equation (4.3) to the computation carried out by the C1-FFL. We will first derive an intermediate approximation for equation (4.3). In order to motivate why this intermediate approximation is necessary, one first needs to know that the C1-FFL realizes computation by using chemical reactions, and research on molecular computation in synthetic biology has taught us that some computations are difficult to carry out by chemical reactions [13]. For equation (4.3), the difficulties are: (i) the log-likelihood ratio can take any real value but a chemical concentration can only be non-negative; and (ii) it is difficult to calculate derivatives using chemical reactions. The aim of the intermediate approximation is to remove these difficulties. In addition, we want the computation to make use of $x^*(t)$ (the number of active species $X^*$) instead of $M - x^*(t)$ (the number of inactive species X), because signalling pathways typically use the active species to propagate information.
In order to analytically derive the intermediate approximation, we need to assume that the input signal $s(t)$ has a certain form. Our derivation assumes that the input $s(t)$ is a rectangular pulse with the following temporal profile:

$$s(t) = \begin{cases} a & \text{for } 0 \le t < d, \\ a_0 & \text{otherwise}, \end{cases} \tag{5.2}$$

where $d$ is the pulse duration and $a$ is the pulse amplitude when it is ON, with $a > a_0$. Note that the parameters $a$ and $d$ are not fixed; we will show that the intermediate approximation holds for a range of $a$ and $d$.

In appendix A 2, we start from equation (4.3) and use a time-scale separation argument to derive the intermediate approximation $\hat{L}(t)$. The intermediate approximation $\hat{L}(t)$ has the following properties: (i) if the input signal $s(t)$ is persistent, then $\hat{L}(t)$ approximates the log-likelihood ratio $L(t)$; (ii) if the input signal $s(t)$ is transient, then $\hat{L}(t)$ is zero.
Note that the latter property is consistent with the behaviour of the ideal C1-FFL, which gives a zero output for transient signals. The time evolution of $\hat{L}(t)$ is given by the ODE (5.3), which is driven by $x^*(t)$ and involves the rectified term $[f(s(t))]_+$. The behaviour of the intermediate approximation $\hat{L}(t)$ depends on the duration $d$ of the input signal $s(t)$. Two important properties of $\hat{L}(t)$, which are discussed in further detail in appendix A 2, are: (i) if $d < d_0$, then $\hat{L}(t)$ is zero; (ii) if $d \ge d_0$ and the conditions stated below hold, then $\hat{L}(t) \approx L(t)$ for $0 \le t < \min\{d, d_1\}$. We can consider those input signals $s(t)$ whose duration $d$ is less than $d_0$ as transient signals. The first property says that these signals will give a zero $\hat{L}(t)$. Note that for the ideal C1-FFL considered in §2.1, a transient signal gives a zero output. Those signals whose duration $d$ is greater than or equal to $d_0$ are considered to be persistent. The second property concerns persistent signals whose duration $d$ and amplitude $a$ are such that $d - d_0$ is long compared with $1/(k_+ a) + 1/k_-$, which is the mean time between two consecutive activations of an X molecule. The physical effect of these signals is to produce a large number of activations and deactivations when the input signal $s(t)$ is ON. We argue in appendix A 2 that, if these conditions hold, then it is possible to use $\hat{L}(t)$ in (5.3) to approximate the log-likelihood ratio $L(t)$ in the time interval $0 \le t < \min\{d, d_1\}$.
We discussed in §4.2.1 that the detection of a persistent signal is best if there are many activations and deactivations when the persistent signal is ON. Fortunately, this is exactly the condition required for the second property to hold. Note that in the analysis of the ideal C1-FFL in [2,4,7] and in §2.1, both the activation and deactivation reactions (4.1) are assumed to be instantaneous, which can be viewed as $k_+$ and $k_-$ being very large. This assumption can be justified by the fact that, for the C1-FFL, the molecular species S and X can be considered to be, respectively, an inducer and a transcription factor. It is known that the activation and deactivation dynamics of transcription factors are fast; see [7, Table 2.1]. Hence this assumption is not stringent and we will assume that reactions (4.1) are fast for the rest of this paper.
We remark that the second property does not cover all the persistent signals. For example, signals with a small amplitude a which do not produce a large enough number of activations and inactivations are not covered. These signals are persistent but are hard to detect.
At the beginning of this section, we mentioned some difficulties in realizing the computation of $L(t)$ in equation (4.3) using chemical reactions. We note that those difficulties are no longer present in the computation of $\hat{L}(t)$ using (5.3). In particular, we note that $\hat{L}(t)$ is always non-negative and can be interpreted as a log-likelihood ratio when the input is persistent.
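The time-scale separation idea can be illustrated numerically. The sketch below is our own reading of the argument and is not a transcription of equation (5.3): it assumes that, when the reactions are fast, the activation flux $k_+ s(t)(M - x^*(t))$ is balanced by the deactivation flux $k_- x^*(t)$, so that the rate of change of the log-likelihood ratio can be rewritten as $k_- x^*(t)\,f(s(t))$ with the assumed function $f(a) = \log(a_1/a_0) - (a_1 - a_0)/a$, rectified and restricted to the window $d_0 \le t < d_1$. For repeatability it integrates the mean of $x^*(t)$ with a deterministic ODE instead of simulating individual reactions. The values of `A0`, `A1`, `D0` and `D1` are assumptions chosen for illustration.

```python
import math

K_PLUS, K_MINUS, M = 0.02, 0.5, 100   # values quoted in the text
A0, A1 = 0.25, 10.7                    # assumed OFF and ON reference levels
D0, D1 = 10.0, 80.0                    # assumed reference durations


def f(a):
    """Assumed factor from the flux-balance argument: log(a1/a0) - (a1 - a0)/a."""
    return math.log(A1 / A0) - (A1 - A0) / a


def compare(d, a, t_end=None, dt=0.001):
    """Euler-integrate the mean of x*, the mean of L and the assumed approximation.

    The input is a pulse of amplitude a and duration d. By default the
    integration stops at min(d, D1). Returns (mean L, approximation).
    """
    t_end = min(d, D1) if t_end is None else t_end
    x = mean_l = l_hat = 0.0
    for n in range(int(round(t_end / dt))):
        t = n * dt
        s = a if t < d else A0
        c0 = A1 if t < D0 else A0
        c1 = A1 if t < D1 else A0
        act = K_PLUS * s * (M - x)                      # mean activation flux
        d_mean_l = act * math.log(c1 / c0) - K_PLUS * (M - x) * (c1 - c0)
        window = 1.0 if D0 <= t < D1 else 0.0
        d_l_hat = K_MINUS * x * window * max(f(s), 0.0)  # non-negative by construction
        x += dt * (act - K_MINUS * x)
        mean_l += dt * d_mean_l
        l_hat += dt * d_l_hat
    return mean_l, l_hat


persistent = compare(d=70.0, a=A1)
transient = compare(d=5.0, a=A1, t_end=80.0)
print("persistent input: mean L = %.1f, approximation = %.1f" % persistent)
print("transient input:  approximation = %.1f" % transient[1])
```

Under these assumptions the approximation tracks the mean log-likelihood ratio for the persistent input and stays at zero for the transient input, and it never requires a derivative of $x^*(t)$ or a negative quantity.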

Numerical illustration
We will now use a few numerical examples to illustrate that the intermediate approximation $\hat{L}(t)$ is approximately equal to the log-likelihood ratio $L(t)$ for persistent signals. For all these examples, we […]. For the first example, we choose $d = 70$ and $a = a_1$ for the input signal $s(t)$. We use the Stochastic Simulation Algorithm to obtain a realization of $x^*(t)$. We then use $x^*(t)$ to compute $L(t)$ and $\hat{L}(t)$. The results are shown in figure 5a. We can see that the approximation is good. We next generate 100 different realizations of $x^*(t)$ and use them to compute $L(t)$ and $\hat{L}(t)$. Figure 5b shows the mean of $|L(t) - \hat{L}(t)|$ over 100 realizations, as well as one realization of $L(t)$ and $\hat{L}(t)$. It can be seen that the approximation error is small. In figure 5b, we have also plotted the mean of $\hat{L}(t)$ obtained by solving the system of ODEs (5.7) for $\bar{x}^*(t)$ and $\bar{L}(t)$, which are, respectively, the means of $x^*(t)$ and $\hat{L}(t)$. It can be seen that a realization of $\hat{L}(t)$ is comparable to its mean.
We repeat the numerical experiment for $d = 40$ and $a = a_1$. Figure 5c shows a realization of $L(t)$, a realization of $\hat{L}(t)$, the mean of $|L(t) - \hat{L}(t)|$ over 100 realizations, as well as the mean of $\hat{L}(t)$. We can see that the approximation holds up to time $t = 40$, which is $\min\{d, d_1\}$. The purpose of this example is to illustrate why we need to include the condition $t < \min\{d, d_1\}$. This is because $L(t)$ and $\hat{L}(t)$ behave differently for $t > \min\{d, d_1\}$ if $d < d_1$. The log-likelihood ratio $L(t)$ falls after $t = 40$ because, from this time onwards, the input signal $s(t)$ is small; this leads to a small number of activations and consequently a negative RHS for equation (4.3). However, for $\hat{L}(t)$, the RHS of equation (5.3) is zero because a small $s(t)$ makes $[f(s(t))]_+$ zero.
We have so far used $a = a_1$ and two different durations $d$. We now illustrate that the approximation holds for a different amplitude $a$. These examples demonstrate that, for persistent signals, the approximation $\hat{L}(t) \approx L(t)$ holds for different values of input duration $d$ and amplitude $a$.
We also want to point out that the behaviour of $\hat{L}(t)$ for transient and persistent signals is consistent with that of the ideal C1-FFL discussed in §2.1. We have already pointed out that this is true for transient signals. For persistent signals, $\hat{L}(t)$ is zero initially and is then followed by a non-zero output, i.e. there is a delay before $\hat{L}(t)$ becomes positive, and this also holds for the ideal C1-FFL: see the bottom plot in figure 2. We will now map the intermediate approximation equation (5.3) to the reaction rate equations of a C1-FFL.
Remark 5.2. In the above formulation and numerical examples, the input signal $s(t)$ is allowed to differ from the two reference signals $c_0(t)$ and $c_1(t)$. Since the decision of the detection problem is based on the log-likelihood ratio in equation (4.2), we can interpret the detection problem as using the history $\mathcal{X}^*(t)$ (which is generated by $s(t)$) to decide which of the two signals, $c_0(t)$ or $c_1(t)$, is more likely to have produced the observed history. Furthermore, if $s(t)$ is parameterized by positive parameters $a$ and $d$ as in (5.2), then it can be shown that a small change in $a$ or $d$ will produce a small change in the mean of $L(t)$ and $\hat{L}(t)$.

Using C1-FFL to approximately compute $\hat{L}(t)$
The aim of this section is to show that the C1-FFL can be used to approximately compute the intermediate approximation $\hat{L}(t)$ in equation (5.3). Recall that the C1-FFL in figure 1a transforms the signal $x^*(t)$ into the output signal $z(t)$ using the following components: Nodes Y and Z, and the AND logic. We model these components using the following chemical reaction system:

$$\frac{dy(t)}{dt} = H_y(x^*(t)) - d_y\, y(t), \tag{5.9a}$$

$$\frac{dz(t)}{dt} = x^*(t)\, H_z(y(t)), \tag{5.9b}$$

where $H_y$ and $H_z$ are Hill functions and $h_y$, $n_y$, $K_y$, etc. are their coefficients. We assume that the initial conditions are $y(0) = z(0) = 0$. Note that these two equations are comparable to the ideal C1-FFL model in §2.1. In particular, if we replace the $u$-function in (2.1) by a Hill function, then it becomes (5.9a). Also, if we choose $K_{xz} = 0$ and […].
A major argument made in appendix A 3 is to match $h(t)$ and $H_z(y(t))$ in the time interval $[d_0, \min\{d, d_1\})$ for persistent signals. We show in appendix A 3 that this matching problem can be reduced to choosing the parameters in (5.9) so that the following two functions of $a$, namely $k_- [f(a)]_+$ and $H_z\big((1/d_y)\, H_y(M k_+ a/(k_+ a + k_-))\big)$, are approximately equal for a large range of $a$, where $a$, as defined in §5.2, is the amplitude of the input $s(t)$ when it is ON. We explain in appendix A 3 why these two functions of $a$ can be fitted to each other.

Numerical examples
We now present numerical examples to show that the C1-FFL can be used to compute $\hat{L}(t)$. We use the same $k_+$, $k_-$, $M$, $a_0$ and $a_1$ values as in §5.2.1. We choose $d_0 = 10$ and $d_1 = 80$. We use parameter estimation to determine the parameters in equation (5.9) so that the C1-FFL output $z(t)$ matches $\hat{L}(t)$ for a range of $a$. The estimated parameters for the C1-FFL are: $h_y = 1.01$, $K_y = 8.04$, $n_y = 2.26$, $d_y = 0.24$, $h_z = 10.6$, $n_z = 5.84$ and $K_z = 5.43$. In this section, we compare $\hat{L}(t)$ from (5.3) with $z(t)$ from (5.9), assuming the $x^*(t)$ in these two equations is given by $\bar{x}^*(t)$ in (5.7). We have demonstrated that $z(t)$ matches $\hat{L}(t)$ for two different values of $a$. We can show that the match is good for a large range of $a$. We fix the duration $d$ to be 70 but vary the amplitude $a$ from 2.7 to 85.7. Figure 6c compares $\hat{L}(t)$ and $z(t)$ at $t = 70$. It can be seen that the C1-FFL approximation works for a wide range of $a$.
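The quality of the match can be checked directly on the two functions of $a$ that are being fitted, $H_z\big((1/d_y) H_y(M k_+ a/(k_+ a + k_-))\big)$ and $k_- [f(a)]_+$. The sketch below evaluates both using the estimated C1-FFL parameters quoted above. Two ingredients are our assumptions rather than statements from the text: the standard activating Hill form $h\,x^n/(K^n + x^n)$, and the function $f(a) = \log(a_1/a_0) - (a_1 - a_0)/a$ with $a_0 = 0.25$ and $a_1 = 10.7$ taken from the pulse levels of the earlier example.

```python
import math

K_PLUS, K_MINUS, M = 0.02, 0.5, 100          # values quoted in the text
A0, A1 = 0.25, 10.7                           # assumed reference levels
H_Y, K_Y, N_Y, D_Y = 1.01, 8.04, 2.26, 0.24   # estimated parameters quoted in the text
H_Z, K_Z, N_Z = 10.6, 5.43, 5.84              # estimated parameters quoted in the text


def hill(x, h, k, n):
    """Activating Hill function h * x^n / (k^n + x^n) (assumed standard form)."""
    return h * x ** n / (k ** n + x ** n)


def c1ffl_rate(a):
    """H_z((1/d_y) H_y(x_ss)) with x_ss = M k_+ a / (k_+ a + k_-)."""
    x_ss = M * K_PLUS * a / (K_PLUS * a + K_MINUS)   # steady-state mean of x*
    y_ss = hill(x_ss, H_Y, K_Y, N_Y) / D_Y           # steady state of dy/dt = H_y - d_y y
    return hill(y_ss, H_Z, K_Z, N_Z)


def target_rate(a):
    """k_- [f(a)]_+ with the assumed f(a) = log(a1/a0) - (a1 - a0)/a."""
    return K_MINUS * max(math.log(A1 / A0) - (A1 - A0) / a, 0.0)


for a in (2.7, 10.7, 42.8, 85.7):
    print("a = %5.1f: C1-FFL rate = %.3f, target rate = %.3f"
          % (a, c1ffl_rate(a), target_rate(a)))
```

Comparing the two columns over the amplitude range 2.7 to 85.7 gives a quick check that a saturating composition of Hill functions can follow the rectified target, which is the property the parameter estimation relies on.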
The previous examples show that we can match the C1-FFL output $z(t)$ to the intermediate approximation $\hat{L}(t)$ for pulse inputs $s(t)$ of different durations and amplitudes. We can also show that the match extends to slowly varying inputs. In this example, we assume $s(t)$ is a triangular pulse with $s(0) = 0$ which rises linearly to $s(40) = 42.8$ and then decreases linearly to $s(80) = 0$. Figure 6d shows the time responses $z(t)$ and $\hat{L}(t)$, and they are comparable.
Remark 5.3. We finish this section by making a number of remarks.
- Note that we have not included the degradation of Z in (5.9b) so that we can match it to $\hat{L}(t)$, which does not decay. It can be shown that if we add a linear degradation term for $z(t)$ to the RHS of (5.9b) and a decay term for $\hat{L}(t)$ with the same rate constant to the RHS of equation (5.3), the resulting $z(t)$ will still be matched to $\hat{L}(t)$.
- Equation (5.9b) is not the most general form of C1-FFL. In the general form of equation (5.9b), which is presented in [4], the factor $x^*(t)$ is replaced by a Hill function of $x^*(t)$. We conjecture that it is possible to generalize the methodology in this paper to cover the general case and we leave it as future work.
- […] This can be seen from the fact that the C1-FFL model in (5.9) has seven parameters while the log-likelihood ratio calculation in equation (5.3) has only four parameters. A research question is whether any C1-FFL that can detect persistent signals has a corresponding log-likelihood ratio detector of the form of equation (5.3). We can answer this question by first characterizing the C1-FFLs that can detect persistent signals and then checking whether such a correspondence exists. This is an open research problem.
- We have so far assumed that $c_0(t)$ and $c_1(t)$ are strictly positive for all $t$ by assuming that $a_0 > 0$. If $a_0 = 0$, then the log-likelihood ratio is no longer well defined because both (4.2) and (5.3) diverge. However, we can compute a shifted and scaled version of the log-likelihood ratio and derive its intermediate approximation for persistent signals. It is still possible to use this intermediate approximation to detect persistent signals. This intermediate approximation can also be approximated by a C1-FFL. Details are omitted and will be studied in future work.

Conclusion and discussion
In this paper, we study the persistence detection property of C1-FFL from an information processing point of view. We formulate a detection problem on a chemical reaction cycle to understand how an input signal of a long duration can be distinguished from one of short duration. We solve this detection problem and derive an ODE which describes the time evolution of the log-likelihood ratio. An issue with this ODE is that it is difficult to realize using chemical reactions. We then use time-scale separation to derive an ODE which can approximately compute the log-likelihood ratio when the input signal is persistent. We further show that this approximate ODE can be realized by a C1-FFL. This result also provides an interpretation of the persistence detection property of C1-FFL as an approximate computation of the log-likelihood ratio.
The concept of the log-likelihood ratio (or a similar quantity) has been used to understand how cells make a decision in [9,10]. The paper [10] considers the problem of distinguishing between two environment states, which are the presence and absence of a stimulus. It derives an ODE for the log-odds ratio and uses the ODE to deduce a biochemical network implementation in the form of a phosphorylation-dephosphorylation cycle. In this cycle, the fraction of phosphorylated substrate is the posterior probability of the presence of the stimulus. The paper [9] considers the problem of distinguishing between two different levels of concentration using the likelihood ratio. It also presents a molecular implementation that computes the likelihood ratio. This paper differs from [9,10] in one major way. We make a crucial approximation by considering only the positive log-likelihood ratio and ignoring the negative log-likelihood ratio. We are then able to connect the computation of the positive log-likelihood ratio with the computation carried out by a C1-FFL.
This work therefore provides a connection between detection theory and C1-FFL using the positive log-likelihood ratio as the connecting point. The computation of the positive log-likelihood ratio by C1-FFL, which is the key finding of this paper, is an example of using biochemical networks to perform analog computation. There are a few other examples. The incoherent type-1 feed-forward loop, which is another network motif, is found to be able to compute fold change [14]. Allosteric proteins are found to be able to compute logarithms approximately [15]. In addition, there is also work on using synthetic biochemical circuits to do analog computation [16,17].
In this paper, we use a methodology which is based on three key ingredients (statistical detection theory, time-scale separation and analog molecular computation) to derive a molecular network that can be used to discriminate persistent signals from transient ones. A possible application of the methodology of this paper in molecular biology is to derive the molecular networks that can decode temporal signals. According to the review paper on temporal signals in cell signalling [18], only some of the molecular networks for decoding temporal signals have been identified. In fact, the authors of [18] went further to state that 'Identifying the mechanisms that decode dynamics remains one of the most challenging goals for the field.' In [19], we used a methodology, which is similar to the one used in this paper and is based on the same three key ingredients, to derive a molecular network to decode concentration-modulated signals. The derived molecular network was found to be consistent with the Saccharomyces cerevisiae DCS2 promoter data in [20], which were obtained by exciting the promoter with various transcription factor dynamics, e.g. concentration modulation, duration modulation and others. Another possible application of the methodology of this paper is in synthetic biology. For example, in [21] we used a methodology, which is similar to the one used in this paper and in [19], to derive a de novo molecular network for decoding concentration-modulated signals. We remark that the molecular networks in [19,21] can be interpreted as approximate log-likelihood detectors of concentration-modulated signals.
A recent report [22] considers the problem of determining the biochemical circuits that can be used to distinguish between a persistent and a transient signal. By searching over all biochemical circuits with a limited complexity, the authors find that there are five different circuits that can be used. One of these is C1-FFL. An open question is whether one can use the framework in this paper to deduce all circuits that can detect persistent signals. If this is possible, then it presents an alternative method to find the biochemical circuits that can realize a function.
Data accessibility. The source code for producing the results for this paper is available at GitHub, which is an open online code repository. The source code is at https://github.com/ctchou-unsw/c1ffl-journal and https://doi.org/10.5061/dryad.20md774.