Journal of The Royal Society Interface
Research article

Community dynamics and sensitivity to model structure: towards a probabilistic view of process-based model predictions

Clement Aldebert


Mediterranean Institute of Oceanography, Aix-Marseille University, Toulon University, CNRS/INSU, IRD, MIO, UM 110, 13288 Cedex 09, Marseille, France

[email protected]


and
Daniel B. Stouffer


School of Biological Sciences, University of Canterbury, Christchurch 8140, New Zealand


Published: https://doi.org/10.1098/rsif.2018.0741

    Abstract

    Statistical inference and mechanistic, process-based modelling represent two philosophically different streams of research whose primary goal is to make predictions. Here, we merge elements from both approaches to keep the theoretical power of process-based models while also considering their predictive uncertainty using Bayesian statistics. In environmental and biological sciences, the predictive uncertainty of process-based models is usually reduced to parametric uncertainty. Here, we propose a practical approach to tackle the added issue of structural sensitivity, the sensitivity of predictions to the choice between quantitatively close and biologically plausible models. In contrast to earlier studies that presented alternative predictions based on alternative models, we propose a probabilistic view of these predictions that includes the uncertainty in model construction and the parametric uncertainty of each model. As a proof of concept, we apply this approach to a predator–prey system described by the classical Rosenzweig–MacArthur model, and we observe that parametric sensitivity is regularly overcome by structural sensitivity. In addition to tackling theoretical questions about model sensitivity, the proposed approach can also be extended to make probabilistic predictions based on more complex models in an operational context. Both perspectives represent important steps towards providing better model predictions in biology, and beyond.

    1. Introduction

    With the need for more accurate predictions in biology and environmental sciences [1–3], two philosophically different streams of research have been growing: statistical inference and mechanistic modelling. While the former aims to make predictions by uncovering statistical relationships in large datasets, mechanistic modelling aims to make predictions based on causal mechanisms that explain observed patterns. In practice, the pros of one approach are the cons of the other, so a promising way forward would be to combine them in a ‘symbiotic relationship’ [4]. Here, we provide an example of such cross-fertilization. Specifically, we use Bayesian statistics to present probabilistic predictions of a deterministic mechanistic model (built around empirical data) in a way that takes into account uncertainty both in model construction and in model parameterization. We therefore improve the model’s predictive capability by including prediction uncertainty while maintaining the explanatory power of a mechanistic model. As our focal model, we use the Rosenzweig & MacArthur [5] predator–prey model, which is known to be ‘structurally sensitive’ (as defined by Cordoleani et al. [6]); that is, apparently minor changes in model formulation can lead to a dramatic change in both quantitative (predicted biomasses over time) and qualitative predictions, such as prey–predator oscillations or the coexistence of alternative stable states [7–9].

    Structural sensitivity is a common phenomenon that emerges in mechanistic biological models, which usually aim to summarize multi-level processes into equations after adopting simple assumptions regarding the complexity of the biological system of interest. As the entire complexity can rarely, if ever, be taken into account, available empirical data may be insufficient to statistically discriminate between alternative models [6,10]. Alternative models are known to make different predictions when the uncertain process is the infection in a host–pathogen system [11], the co-limited uptake of nutrients [12], or predation in predator–prey and food-web models [6–8,13–20]. Some studies also indicate that major ocean-scale predictions, such as the dominance of phytoplankton groups, primary production and export to the deep ocean, can be deeply affected by this form of sensitivity [21–23]. In a way, structural sensitivity can be considered an extension of classical parameter sensitivity; moreover, Cordoleani et al. [6] and Adamson & Morozov [14] have provided examples where a change in model formulation has a greater effect on model predictions than an equivalent change in parameter values. However, as far as we know, such an analysis has never been statistically performed simultaneously across alternative models and all plausible parameter values.

    Here, we ask whether the predictions made by a biological model are more sensitive to the uncertainty in its parameterization or to the uncertainty in its mathematical formulation. To answer this question, we present a probabilistic view of predictions made by a deterministic predator–prey model, the Rosenzweig & MacArthur [5] model (§2). In particular, we explore the consequences of alternative predator functional responses for the predicted behaviours. We achieve the shift from determinism (§3) to probabilism (§4) by merging bifurcation theory for model analysis [24–26] and Bayesian statistics [27–29], thus harnessing the latter’s ability to include uncertainty in a model and to propagate it forward into predictions. Intriguingly, our probabilistic approach, applied to three example datasets, indicates that parametric uncertainty is regularly overcome by model uncertainty (§5), an observation that has broad implications across various other challenges faced in biology.

    2. A simple predator–prey model

    We study the Rosenzweig–MacArthur model [5]:

    \[
    \frac{\mathrm{d}N}{\mathrm{d}t} = rN\Bigl(1 - \frac{N}{K}\Bigr) - f(N)\,Y,
    \qquad
    \frac{\mathrm{d}Y}{\mathrm{d}t} = e\,f(N)\,Y - \mu Y.
    \tag{2.1}
    \]
    Here, the prey population N follows logistic growth with per capita growth rate r and environmental carrying capacity K in the absence of predators. The predator population Y has a linear per capita mortality rate μ, conversion efficiency e, and functional response f(N). All parameters and variables are strictly positive to ensure their biological meaning.

    In line with the biology, a well-behaved functional response is expected to be strictly increasing, concave and saturating, and to vanish only at 0. Here, we consider three possible alternatives f ∈ F = {f(H), f(I), f(t)}, all of which depend upon only two parameters:

    \[
    f^{(H)}(N) = \frac{a^{(H)}N}{1 + a^{(H)}h^{(H)}N},
    \qquad
    f^{(I)}(N) = \frac{1}{h^{(I)}}\bigl(1 - \mathrm{e}^{-a^{(I)}h^{(I)}N}\bigr),
    \qquad
    f^{(t)}(N) = \frac{1}{h^{(t)}}\tanh\bigl(a^{(t)}h^{(t)}N\bigr),
    \tag{2.2}
    \]
    which are the Holling type II [30,31] (later denoted Holling), the Ivlev [32] and the hyperbolic tangent [10] functional responses, respectively. The parameters in each have the same mathematical meaning: 1/h(f) corresponds to the maximum uptake rate and a(f) corresponds to the function’s slope in the limit of no prey, but they originate from different perspectives. For Holling, a(H) and h(H) are the predator attack rate and handling time. For Ivlev, 1/h(I) and a(I)h(I) are the predator maximum digestion rate and satiation coefficient. The hyperbolic tangent is actually a phenomenological model without underlying biological assumptions, which nevertheless can at times provide a more accurate description of data than the others [10].
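    For concreteness, the three alternatives can be written as short functions. The following is a minimal sketch assuming the parameterization described above (initial slope a(f), maximum uptake rate 1/h(f)); it is illustrative code, not the implementation used in the study.

```python
import numpy as np

def holling(N, a, h):
    """Holling type II response: initial slope a, plateau at 1/h."""
    return a * N / (1.0 + a * h * N)

def ivlev(N, a, h):
    """Ivlev response: same initial slope and plateau as the Holling form."""
    return (1.0 / h) * (1.0 - np.exp(-a * h * N))

def hyperbolic_tangent(N, a, h):
    """Phenomenological hyperbolic-tangent response with the same slope and plateau."""
    return (1.0 / h) * np.tanh(a * h * N)
```

    With equal a and h, all three curves share the same behaviour near N = 0 and the same saturation level, which is precisely why available data can struggle to discriminate between them.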

    To drop one parameter and ease the analysis without loss of information, we re-scale model (2.1): y = Y/r, τ = rt, m = μ/e, ɛ = e/r, and write x = N. The re-scaled model reads:

    \[
    \frac{\mathrm{d}x}{\mathrm{d}\tau} = x\Bigl(1 - \frac{x}{K}\Bigr) - f(x)\,y,
    \qquad
    \frac{\mathrm{d}y}{\mathrm{d}\tau} = \varepsilon\,y\,\bigl(f(x) - m\bigr).
    \tag{2.3}
    \]
    Upon re-scaling, the remaining parameters are (i) the prey carrying capacity K, (ii) the scaled predator mortality m, (iii) the timescale factor ɛ and (iv) the two parameters a(f) and h(f) of the functional response, which must be estimated from data. The first two are of biological interest as they might be affected by external factors (e.g. environmental degradation, additional predator mortality due to harvesting).
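    To make the dynamical system concrete, the re-scaled model (2.3) can be integrated numerically for any chosen functional response. The sketch below uses scipy and a Holling response with arbitrary, purely illustrative parameter values (they are not estimates from the datasets used here).

```python
import numpy as np
from scipy.integrate import solve_ivp

def rm_rescaled(t, state, K, m, eps, f):
    """Right-hand side of the re-scaled Rosenzweig-MacArthur model (2.3)."""
    x, y = state
    return [x * (1.0 - x / K) - f(x) * y,
            eps * y * (f(x) - m)]

# Holling type II response with arbitrary (illustrative) parameter values.
a, h = 0.05, 0.3
holling = lambda x: a * x / (1.0 + a * h * x)

# Integrate from an arbitrary initial state over re-scaled time.
sol = solve_ivp(rm_rescaled, (0.0, 500.0), [1.0, 0.5],
                args=(50.0, 1.0, 1.0, holling), rtol=1e-8, dense_output=True)
x_t, y_t = sol.sol(np.linspace(0.0, 500.0, 2000))
```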

    3. Probabilistic predictions

    3.1. Fitting functional responses on data

    Empirical data were not considered in previous structural sensitivity analyses of the Rosenzweig–MacArthur model [8,9]. As our baseline, we therefore use three examples of functional-response data to provide an overview of our approach and its practical use. To introduce the approach, we first focus on one of these datasets (until §5). Specifically, we use data from an experiment in which two individuals of Gazella thomsoni were eating hand-assembled grass swards, and their grazing rate as a function of grass biomass (i.e. a functional response) was estimated [33, fig. 1]. This consumer–resource system can be modelled with generic assumptions on population dynamics: (i) grass biomass follows logistic growth limited, for instance, by space and nutrient availability; (ii) the growth rate of the Gazella thomsoni population is proportional to its grass intake, which is a strictly increasing, concave, saturating function of grass biomass that vanishes in the absence of grass; (iii) Gazella thomsoni has a linear per capita mortality rate (e.g. ageing, harvesting). All these assumptions define the Rosenzweig–MacArthur model, which is often referred to as a predator–prey model but is generic enough to also describe a variety of consumer–resource systems.

    Previous studies on structural sensitivity compared deterministic analyses of model predictions based on best-fitted functions ([8,9,11,18,20], among others). Fitting a function to data implies uncertainty in the parameters being inferred, which means that there is additional insight to be gained by considering both parameter and model uncertainty. To perform a probabilistic analysis of model predictions, we first estimate the parametric uncertainty while fitting each functional response to data. This is achieved by a Hamiltonian Markov chain Monte Carlo (HMCMC) algorithm computing the probability P(θf) that the set of functional-response parameters θf = (a(f), h(f)) of function f predicts the observed data (figure 1a–c), assuming Gaussian noise around f ([29], details in electronic supplementary material, appendix). The probability P(θf) is proportional to the likelihood of θf, and the set of probabilities corresponding to the inferred parameter values is called the posterior distribution in Bayesian statistics. This posterior distribution of θf is estimated by the HMCMC algorithm by learning from the available data (figure 1a–c). The maximum of this distribution occurs at the maximum-likelihood parameters, which thus give the best fit to data based on this criterion. Therefore, the HMCMC approach gives the same best-fitted functions as a maximum-likelihood estimation, but it adds information about parametric uncertainty for each function by considering the whole posterior probability distribution (figure 1d–f).
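    The fitting step can be illustrated with a deliberately simple sampler. The sketch below uses a random-walk Metropolis algorithm as a stand-in for the HMCMC algorithm of the study (whose details are in the electronic supplementary material), with the Gaussian noise assumption stated above; the flat priors on positive parameters, step sizes and the treatment of the noise parameter are illustrative assumptions.

```python
import numpy as np

def log_posterior(theta, N_obs, rate_obs, f):
    """Log-posterior of (a, h, sigma), assuming Gaussian noise around f and
    flat priors restricted to positive values (illustrative choice)."""
    a, h, sigma = theta
    if a <= 0.0 or h <= 0.0 or sigma <= 0.0:
        return -np.inf
    resid = rate_obs - f(N_obs, a, h)
    return -0.5 * np.sum((resid / sigma) ** 2) - resid.size * np.log(sigma)

def metropolis(log_post, theta0, n_steps=50_000, step=None, seed=0):
    """Random-walk Metropolis sampler (simplified stand-in for HMCMC)."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    step = 0.05 * np.abs(theta) if step is None else step
    lp = log_post(theta)
    chain = np.empty((n_steps, theta.size))
    for i in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.size)
        lp_new = log_post(proposal)
        if np.log(rng.uniform()) < lp_new - lp:  # accept with probability min(1, ratio)
            theta, lp = proposal, lp_new
        chain[i] = theta
    return chain

# Usage (N_obs, rate_obs and the response function are placeholders for real data):
# chain = metropolis(lambda th: log_posterior(th, N_obs, rate_obs, holling), [0.05, 0.3, 1.0])
```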


    Figure 1. Results of the Markov chain Monte Carlo estimation of parameter probability densities (posterior distributions) for each function. (a–c) Bivariate posterior probability distributions (grey levels) of each functional response’s parameters, i.e. the likelihood of these parameter values based on the available data. Points ‘+’ correspond to parameter values sampled to perform the probabilistic model analysis. Black bullets (one per panel) indicate the parameter values with the maximum likelihood, i.e. the parameter values giving the best fit to data. (d–f) Functional responses fit to the experimental data (points ‘+’). The parametric uncertainty from the HMCMC estimation gives a confidence interval (95%, shaded area) around the best-fitted functions (curves). Model uncertainty is derived through the relative likelihood that one function fits new data better than the others, knowing their respective parametric uncertainty. (Online version in colour.)

    Based on the posterior distribution for each function, we use the widely applicable information criterion (WAIC) to derive Akaike weights, which correspond to the relative predictive accuracy of each function, i.e. the probability P(f) that function f ∈ F gives the best fit to new data, conditional on the set of alternative functions that we consider [29]. This relative accuracy could be estimated using any other suitable method than WAIC without altering our generic framework. For the first dataset, the hyperbolic tangent has the highest probability of 0.572, compared with 0.350 for Ivlev and 0.079 for Holling (exact values may change due to algorithm stochasticity, table 1; a minimal computation of these weights is sketched after the table). Based on that, one could decide to focus only on the hyperbolic tangent, as it seems to be the best function. However, doing so neglects the fact that the alternative functions can be a better description of new data with a significant probability (0.428). Our approach hence keeps all the alternative functions and merges their predictions according to their respective likelihoods through model averaging.

    Table 1. Comparison of the three functional responses fit to data on Gazella thomsoni feeding on grass swards, based on WAIC (widely applicable information criterion). WAIC estimates are given plus or minus standard error.

    functional response    WAIC          difference with best   WAIC weight P(f)   ranking
    Holling                −31.5 ± 9.2   4.0 ± 2.6              0.079              3
    Ivlev                  −30.5 ± 9.1   1.0 ± 0.9              0.350              2
    hyperbolic tangent     −27.5 ± 9.1   0.0                    0.572              1
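    As a check on how these weights arise, the standard conversion from WAIC differences to Akaike-style weights takes only a few lines; using the ‘difference with best’ column of table 1, it approximately reproduces the reported weights (the standard weight formula is assumed here).

```python
import numpy as np

# Differences with the best WAIC from table 1 (Holling, Ivlev, hyperbolic tangent).
delta_waic = np.array([4.0, 1.0, 0.0])

# Akaike-style weights: w_f proportional to exp(-delta_f / 2).
relative = np.exp(-0.5 * delta_waic)
weights = relative / relative.sum()
print(weights)  # roughly [0.078, 0.348, 0.574], close to the P(f) column of table 1
```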

    3.2. Overview of model predictions

    Let us first present an overview of model predictions for a given functional response f and its plausible parameter values θf. We performed a bifurcation analysis based on analytical and numerical results (mathematical details in electronic supplementary material, appendix) that reveals all the qualitative asymptotic dynamics predicted by the model for any environmental condition (K, m). We limited the range of these conditions to K ∈ ]0, xmax], where xmax ≈ 197 is the maximal prey abundance in the functional-response data, and m ∈ ]0, mmax], where mmax is the average mortality rate (among all functional responses) that allows predator survival for the considered range of prey abundance. The choice of mmax restricts our analysis to the range of parameter values that gives interesting results: any m > mmax leads to predator extinction in most cases, a trivial outcome of no interest for our study since all models then necessarily give identical predictions.

    Figure 2a–c presents model predictions based on each functional response, including parametric uncertainty. The effect of parametric uncertainty will be introduced in the next subsection and is shown by the blurred colours in the figure. For the moment, the reader can get an overview of model predictions for given parameter values θf by looking at the areas with sharp colours. In this model, there is a trivial extinction equilibrium E(0) = (0, 0) without any species, which can be reached only if the prey is initially absent. If both species are initially present, different asymptotic dynamics are predicted depending on the parameters and the form of the functional response. An extinction equilibrium E(1) = (K, 0) without the predator always exists, but it is stable only in the white area of the figure. In the other areas, the prey carrying capacity is high enough to sustain the predator population and a coexistence equilibrium E(2) = (x(2), y(2)) exists. It appears at higher carrying capacity values when predator mortality (i.e. the loss that must be overcome to ensure predator survival) is higher. This equilibrium is stable only in the blue and red areas. In the green area found at high carrying capacity, the paradox of enrichment [34] has destabilized the coexistence equilibrium and there are stable prey–predator oscillations. Stable oscillations and the stable coexistence equilibrium are both possible in the red area, which is only predicted with the hyperbolic tangent at low predator mortality and high carrying capacity. In this bistability area, the model predicts that the system will converge to one of the two alternatives depending on its initial state. The presence of alternative stable states is of particular interest for studying ecosystem resilience in the face of external disturbances [18,35]. This other form of the paradox of enrichment has been found in predator–prey models incorporating a more detailed description of organisms’ biology [19]. All of the above qualitative predictions are independent of the timescale ɛ, except for the bistability area, which we computed numerically for ɛ = 1; this area starts at lower (higher) K for higher (lower) ɛ, which does not affect our conclusions.
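    Part of this deterministic classification can be reproduced with a local stability analysis of model (2.3). In the sketch below, the coexistence equilibrium is located by solving f(x*) = m, and the sign of the Jacobian’s trace separates stable coexistence from oscillations; this is a simplified sketch under those assumptions, and it does not detect the bistability region, which requires the global (continuation) analysis described in the electronic supplementary material.

```python
import numpy as np
from scipy.optimize import brentq

def holling(x, a, h):
    return a * x / (1.0 + a * h * x)

def holling_prime(x, a, h):
    return a / (1.0 + a * h * x) ** 2

def classify_dynamics(K, m, f, fprime, a, h):
    """Classify the local asymptotic dynamics of model (2.3) for a given (K, m).

    Distinguishes 'extinction', 'equilibrium' and 'oscillations' only; bistability
    is beyond this local analysis."""
    if f(K, a, h) <= m:
        return 'extinction'  # predator cannot persist, E(1) = (K, 0) is stable
    x_eq = brentq(lambda x: f(x, a, h) - m, 1e-12, K)  # coexistence: f(x*) = m
    y_eq = x_eq * (1.0 - x_eq / K) / m                 # from the prey nullcline
    # The Jacobian determinant is positive at E(2), so stability follows the trace;
    # the trace does not involve eps, consistent with the eps-independence noted above.
    trace = 1.0 - 2.0 * x_eq / K - fprime(x_eq, a, h) * y_eq
    return 'equilibrium' if trace < 0.0 else 'oscillations'

# Example call with illustrative parameter values.
print(classify_dynamics(K=50.0, m=1.0, f=holling, fprime=holling_prime, a=0.05, h=0.3))
```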


    Figure 2. Probabilistic predictions of the Rosenzweig–MacArthur model, made with each alternative functional response (a–c) and averaged (d). Probabilistic predictions include both parametric uncertainty for each function (a–c) and model uncertainty for the average prediction (d). Each panel presents a probabilistic bifurcation diagram, where the colours indicate the qualitative system dynamics depending on the predator mortality rate and the prey carrying capacity: predator extinction (white), prey–predator coexistence at equilibrium (blue), prey–predator oscillations (green) and bistability with equilibrium coexistence or oscillations depending on initial population sizes (red). Colour gradients indicate the probability (red-green-blue levels) of each model prediction. Thus, blurred areas (e.g. top left of panel (d)) indicate uncertain predictions. Calculations at point ‘+’ are detailed in table 2. (Online version in colour.)

    Table 2. Example detailing the probabilities related to uncertainty, for each functional response (knowing f) and then averaged together (f-independent) based on the WAIC weights from table 1. This example corresponds to α = (K* = 42.6, m* = 1.08), the ‘+’ point in figures 2–4.

    description                        probability           Holling   Ivlev    hyp. tangent   average
    function weight                    P(f)                  0.258     0.326    0.416
    predicting qualitative dynamics    P('extinction'|f)     0.000     0.000    0.000          0.000
                                       P('equilibrium'|f)    0.004     0.502    1.000          0.748
                                       P('oscillations'|f)   0.996     0.498    0.000          0.252
                                       P('bistability'|f)    0.000     0.000    0.000          0.000
    source of predictive uncertainty:
    - parametric uncertainty           Uparam(f)             0.001     0.175    0.000          0.061
    - model uncertainty                Umodel(f)             0.373     0.162    0.126          0.158
    - total                            Utot(f)               0.373     0.337    0.126          0.219
    sensitivity index                  S(f)                  0.997     −0.038   1.000          0.637

    As a general conclusion, using the Holling and Ivlev functional responses leads to similar patterns of predictions and always a single stable state, whereas using the hyperbolic tangent allows the model to predict bistability. Notably, the transitions between different qualitative predictions occur in different regions of parameter space for different models and correspond to the following biological phenomena: predator invasion (transcritical bifurcation: extinction – coexistence), onset of oscillations through the enrichment paradox (supercritical Hopf bifurcation: coexistence – oscillations), and potential catastrophic shifts (i.e. tipping points) with a change in the number of alternative stable states (subcritical Hopf bifurcation: bistability – oscillations; and limit point of cycles bifurcation: bistability – coexistence). Note that if the number of alternative stable states is affected by structural sensitivity, it becomes hard to estimate ecosystem resilience under disturbances [18].

    3.3. Introducing parametric uncertainty into predictions

    To introduce parametric uncertainty in our analysis, we now look at the probability that model (2.3) with function f predicts the different qualitative dynamics X, which can be any of ‘extinction’, ‘equilibrium’, ‘oscillations’, or ‘bistability’. For fixed values of the model parameters α = (K, m, ε) that do not relate to f, the probability that functional response f leads to the prediction X is:

    \[
    P(X \mid f) = \int M(X, \alpha, f, \theta_f)\, P(\theta_f)\, \mathrm{d}\theta_f,
    \tag{3.1}
    \]
    where M(X, α, f, θf) = 1 if the model based on function f and with parameter values (θf, α) predicts dynamics X and 0 otherwise, and the integral is over the range of the posterior distribution of the two parameters in θf. In practice, this probability (3.1) is estimated by performing the model analysis for a finite sample of θf values (here 1000 parameter sets) drawn randomly from the posterior distribution P(θf) [29]. Computing the probability (3.1) for different model parameters α therefore provides a probabilistic bifurcation analysis of model predictions based on one functional response. Translating the probabilities of each prediction into colour gradients gives the ‘fuzzy’ transitions between qualitative dynamics (figure 2a–c), showing how parametric uncertainty propagates into predictions.
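    In code, the Monte Carlo estimate of (3.1) simply classifies the dynamics for each posterior draw and averages the indicator. A minimal sketch, assuming posterior samples of (a(f), h(f)) from the fitting step and a classification routine such as the local one sketched in §3.2 (which stands in for the full bifurcation analysis):

```python
import numpy as np

def prediction_probabilities(K, m, posterior_samples, classify):
    """Monte Carlo estimate of P(X | f) at fixed (K, m), i.e. equation (3.1).

    posterior_samples: array of shape (n_draws, 2) with columns (a, h) drawn from P(theta_f).
    classify: function mapping (K, m, a, h) to one of the qualitative dynamics."""
    outcomes = ('extinction', 'equilibrium', 'oscillations', 'bistability')
    labels = [classify(K, m, a, h) for a, h in posterior_samples]
    return {X: labels.count(X) / len(labels) for X in outcomes}
```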

    3.4. Introducing model uncertainty into predictions

    To go further, model uncertainty can be introduced through a weighted mean over the alternative functions. This allows us to compute the probability P(X) that the model generally predicts the different qualitative dynamics X:

    \[
    P(X) = \sum_{f \in F} P(f)\, P(X \mid f).
    \tag{3.2}
    \]
    The resulting probabilistic predictions are shown in figure 2d. Not surprisingly, the average prediction looks most similar to that made with the hyperbolic tangent, since that function has a probability of 0.572. For low mortality values, the figure looks greener, with red-green and blue-green areas. In these areas, the hyperbolic tangent leads to bistability or a stable equilibrium, respectively, whereas the two other functions lead to oscillations and have a cumulative probability of 0.428. A representative example to illustrate the utility of these calculations occurs at K = 42.6 and m = 1.08 (point ‘+’ in figure 2), where all functional responses predict completely different qualitative dynamics: (i) the Holling model predicts oscillations, (ii) the hyperbolic tangent predicts equilibrium coexistence, and (iii) the Ivlev predicts both of these dynamics with roughly equal probability (table 2, upper half).
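    The averaging in (3.2) is a weighted sum of the conditional probabilities. Using the WAIC weights of table 1 and the conditional probabilities reported in table 2, the following sketch recovers the ‘average’ column of table 2 up to rounding:

```python
import numpy as np

# WAIC weights P(f) from table 1: Holling, Ivlev, hyperbolic tangent.
P_f = np.array([0.079, 0.350, 0.572])

# P(X | f) at the '+' point from table 2; columns are the dynamics
# (extinction, equilibrium, oscillations, bistability), rows the functions.
P_X_given_f = np.array([[0.000, 0.004, 0.996, 0.000],
                        [0.000, 0.502, 0.498, 0.000],
                        [0.000, 1.000, 0.000, 0.000]])

P_X = P_f @ P_X_given_f  # equation (3.2)
print(P_X)  # roughly [0.00, 0.75, 0.25, 0.00], matching table 2 up to rounding
```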

    4. Identifying the sources of uncertainty

    The probabilistic predictions presented in figure 2 reflect a form of predictive uncertainty, in the sense that predictions can change owing both to the uncertainty in the choice of a function and to parametric uncertainty. As is commonly done in other studies, let us assume that we have chosen one function f among the possible ones. We will now explore the consequences of this choice, both in terms of parametric uncertainty and model uncertainty.

    For the parametric uncertainty, consider two parameter sets θf for function f drawn independently from their posterior distribution. We know from (3.1) the probability P(X|f) that one parameter set predicts the dynamics X. Moreover, predicting dynamics X or not is a binary outcome, so the probability that one set of parameters predicts the dynamics X while the other does not is P(X|f)[1 − P(X|f)]. This probability is maximal (0.25) when f is equally likely to predict dynamics X or not (P(X|f) = 1 − P(X|f) = 0.5), which corresponds to the highest uncertainty about predicting dynamics X. Conversely, the probability is null if there is no uncertainty in predicting dynamics X, i.e. P(X|f) = 0 (f never predicts X) or P(X|f) = 1 (f always predicts X). Thus, the total probability that these two parameter sets predict different dynamics is:

    \[
    D_{\mathrm{param}}(f) = \sum_{X} P(X \mid f)\,\bigl[1 - P(X \mid f)\bigr],
    \tag{4.1}
    \]
    where the sum is across possible qualitative dynamics. Similarly, we can define the probability that different dynamics are predicted by one parameter set θf of function f and one parameter set of any of the other alternative functions:
    \[
    D_{\mathrm{model}}(f) = \sum_{g \in F,\, g \neq f} \frac{P(g)}{\sum_{g' \neq f} P(g')}\,
    \sum_{X} P(X \mid f)\,\bigl[1 - P(X \mid g)\bigr].
    \tag{4.2}
    \]
    The sum over alternative functions g is weighted to account for the relative weight of each alternative to function f. The probabilities (4.1) and (4.2) quantify the chance of making different predictions when considering, respectively, parametric or model uncertainty alone. This implies that we can define:
    \[
    U_{\mathrm{param}}(f) = P(f)\, D_{\mathrm{param}}(f),
    \tag{4.3}
    \]
    which is the parametric uncertainty, proportional to the probability that f is the best function to describe new data, and:
    \[
    U_{\mathrm{model}}(f) = \frac{1 - P(f)}{|F| - 1}\, D_{\mathrm{model}}(f),
    \tag{4.4}
    \]
    which is the model uncertainty, proportional to the probability that a function other than f is the best description of new data. Note that this second probability is averaged over the |F| − 1 functions other than f, so that parametric and model uncertainty receive equal weight when all the alternative functions are equally plausible, independent of their number. In other words, we compare f against one of its alternatives, not against all of them at once; otherwise, the approach would always indicate a high model uncertainty whenever the number of alternative functions is sufficiently large. If one chooses the Holling function in our example, the parametric uncertainty defined in (4.3) is small compared with the model uncertainty defined in (4.4), because the Holling function is very unlikely to be the best function (probability of 0.079) relative to the two alternatives (figure 3a,b). According to this analysis, the total uncertainty associated with the choice of function f is:
    \[
    U_{\mathrm{tot}}(f) = U_{\mathrm{param}}(f) + U_{\mathrm{model}}(f),
    \tag{4.5}
    \]
    with the example for the Holling function shown in figure 3c.

    Figure 3. Uncertainty in predictions made with the Rosenzweig–MacArthur model. Details of the calculations if one chooses to use the Holling functional response: uncertainty due to the functional response’s parameter values (a), due to the choice of the functional response (b), and their sum (c). (d) Total uncertainty (as in (c)) averaged over the three alternative functional responses, weighted by their respective likelihood. Calculations at point ‘+’ are detailed in table 2. (Online version in colour.)

    To have a global overview, one can look at the average uncertainty across all the alternative functions that are considered (figure 3d):

    \[
    U_{\mathrm{tot}} = \sum_{f \in F} P(f)\, U_{\mathrm{tot}}(f).
    \tag{4.6}
    \]
    To go deeper, we propose to quantify the respective importance of parametric and model uncertainty in the predictions made with a function f with:
    \[
    S(f) = \frac{U_{\mathrm{model}}(f) - U_{\mathrm{param}}(f)}{U_{\mathrm{tot}}(f)}.
    \tag{4.7}
    \]
    Note that this index can be computed only if there is uncertainty in the predictions (Utot(f) > 0). It is positive if model uncertainty is greater than parametric uncertainty and negative otherwise. In addition, S(f) = 1 if the function is not a suitable candidate (P(f) = 0), or if its parametric uncertainty does not affect the predicted dynamics while model uncertainty does (to any extent). Conversely, S(f) = −1 if predictions are affected by parametric uncertainty and there is no model uncertainty (i.e. P(f) = 1). Finally, S(f) = 0 can occur if (i) all alternative functions have an equal likelihood and they all lead to the same uncertain predictions (i.e. P(X|f) = P(X|g) for all dynamics X and alternative functions g); or, more generally, (ii) the function f is more likely than the others but the others lead to predictions that are different enough to give Uparam(f) = Umodel(f).
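    The decomposition (4.1)–(4.7) is straightforward to compute once P(f) and P(X|f) are known. Applied to the ‘+’ point of table 2 with the WAIC weights of table 1, the sketch below recovers the reported uncertainty values and sensitivity indices up to rounding (assuming the normalized weighting over alternatives g ≠ f written in (4.2)).

```python
import numpy as np

P_f = np.array([0.079, 0.350, 0.572])            # WAIC weights (table 1)
P_X = np.array([[0.000, 0.004, 0.996, 0.000],    # P(X | f) at the '+' point (table 2)
                [0.000, 0.502, 0.498, 0.000],
                [0.000, 1.000, 0.000, 0.000]])
n_f = len(P_f)

U_param = np.zeros(n_f)
U_model = np.zeros(n_f)
for i in range(n_f):
    # (4.1), (4.3): disagreement between two posterior draws of the same function.
    D_param = np.sum(P_X[i] * (1.0 - P_X[i]))
    U_param[i] = P_f[i] * D_param
    # (4.2), (4.4): disagreement with a draw from one of the alternative functions.
    others = [j for j in range(n_f) if j != i]
    w = P_f[others] / P_f[others].sum()
    D_model = sum(w[k] * np.sum(P_X[i] * (1.0 - P_X[j])) for k, j in enumerate(others))
    U_model[i] = (1.0 - P_f[i]) / (n_f - 1) * D_model

U_tot = U_param + U_model                        # (4.5)
S = (U_model - U_param) / U_tot                  # (4.7), defined since U_tot > 0 here
print(U_param, U_model, S)   # roughly [0.001, 0.175, 0.000], [0.372, 0.162, 0.126], [0.997, -0.038, 1.000]
print(P_f @ U_tot, P_f @ S)  # (4.6) and (4.8): roughly 0.219 and 0.637, as in table 2
```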

    Based on the index (4.7) computed for each function, one can look at its average value across the alternative functions to get a global overview of the source of uncertainty in the model predictions:

    \[
    S = \sum_{f \in F} P(f)\, S(f).
    \tag{4.8}
    \]
    Figure 4a–c shows the index for the three alternative functional responses and our first dataset. If Holling is chosen, model uncertainty is always higher than parametric uncertainty, as the alternative equations are more plausible and lead to different predictions. Conversely, if the hyperbolic tangent is chosen, parametric uncertainty is higher on average than model uncertainty (S(f(t)) < 0 over 66.6% of the parameter space). This equation has the highest likelihood, which explains why its own parametric uncertainty appears to be more important. The Ivlev represents an intermediate case where model uncertainty is higher on average than parametric uncertainty (S(f(I)) > 0 over 69.0% of the parameter space). On average, choosing one equation creates a higher predictive uncertainty due to model uncertainty than due to parametric uncertainty in 66.3% of the parameter space. It is worth noting that parametric uncertainty is higher for high mortality rates, which are close to the predator maximum growth rate, itself proportional to the maximum uptake rate 1/h(f). Thus, this parameter value has a strong impact on the predicted dynamics, explaining why parametric uncertainty is higher there. As a representative example, we show detailed calculations of these probabilities for (K, m) = (42.6, 1.08) in table 2.

    Figure 4. Source of uncertainty in predictions made with the Rosenzweig–MacArthur model. Relative importance of parametric (negative value, blue) and model (positive value, red) uncertainty in the resulting total predictive uncertainty (grey area: total prediction uncertainty lower than 0.01). (ac) Source of uncertainty if one of the alternative functional responses is chosen. (d) Average over the three alternative functions, weighted by their respective likelihood. Calculations at point ‘+’ are detailed in table 2. (Online version in colour.)

    5. Overview of the method’s possible outcomes and use

    The underlying idea behind our approach is that the cost of choosing one model to make a prediction increases if (i) there is a high probability that another model also fits the available data well and (ii) the two models make different predictions. Thus, this approach extends earlier studies on structural sensitivity ([8,9], among others) by considering the fact that, even if alternative models predict different dynamics, one model may outperform the others in fitting available data. Again with the Rosenzweig–MacArthur model, we illustrate this possibility in figure 5 by using data on the ingestion rate (i.e. a functional response) of copepods (Calanus pacificus) feeding on diatoms (centric sp.) to parameterize the alternative functional responses [36, fig. 4]. One functional response, the hyperbolic tangent, fits the data better than the others, with a high probability (0.853) of being the best description of new data (table 3, upper rows). Thus, the predictions made with the hyperbolic tangent functional response strongly drive the average predictions across the alternative functional responses (figure 5b). Also, the areas of highest predictive uncertainty in figure 5c are due to the parametric uncertainty of the dominant function, as indicated in blue in figure 5d. Conversely, predictive uncertainty is low in areas where it is mostly due to model uncertainty (in red in figure 5d). This example illustrates the idea that, if a change in equations leads to different predictions, what matters most is whether the alternative equations are actually likely to be chosen.


    Figure 5. Example where one equation is almost certainly the best one among the candidates. Overview of the analysis with data on copepods (Calanus pacificus) feeding on diatoms (centric sp.). All panels are drawn as in earlier figures: (a) functional responses are fitted to data by a HMCMC algorithm; (b) average qualitative predictions (probabilistic bifurcation diagram); (c) average total uncertainty in predictions; (d) source of uncertainty in predictions.

    Table 3. Comparison of the three functional responses fit to data on copepods (Calanus pacificus) feeding on diatoms (centric sp. and Thalassiosira fluviatilis), based on WAIC (widely applicable information criterion). WAIC estimates are given plus or minus standard error.

    prey                        functional response   WAIC          difference with best   WAIC weight   ranking
    centric sp.                 Holling               −39.4 ± 7.7   12.4 ± 2.9             0.002         3
                                Ivlev                 −48.2 ± 7.7   3.5 ± 1.2              0.145         2
                                hyperbolic tangent    −51.7 ± 8.2   0.0                    0.853         1
    Thalassiosira fluviatilis   Holling               −22.6 ± 4.8   0.4 ± 2.2              0.299         3
                                Ivlev                 −22.8 ± 4.5   2.0 ± 1.6              0.332         2
                                hyperbolic tangent    −23.0 ± 3.5   0.0                    0.370         1

    As a third and last example, we fit the functional responses to data on the ingestion rate of starved copepods Calanus pacificus feeding on diatoms (Thalassiosira fluviatilis) [36, fig. 2]. Here, in contrast to the earlier examples, all three alternative functional responses have roughly the same likelihood (table 3, lower rows). As a result, the predictive uncertainty of the population model is mostly due to model uncertainty (figure 6). This last example shares similarities with the one we used to introduce our approach. In figures 1–4, two alternative equations had a high likelihood in comparison to the third. Thus, one may imagine removing the most unlikely function from the analysis and keeping only the equations giving an almost equally good fit. Doing so ends up in one of two cases. First, something like figure 6, where several functions remain equally likely; or second, something like figure 5, where one of the alternative functions has a high likelihood in comparison to all others, and we can imagine considering only this best-fit function. In this last case, one may remove all functions except the one giving the best fit. By doing so, our approach simplifies to a parameter sensitivity analysis, as the scientist decides that the uncertainty in model construction (defined for the set of alternative equations that we choose to consider) can be neglected. Of course, deciding to neglect the uncertainty in model construction and choosing the best function (according to available data) can only be done after considering alternative models to fit to data, and having determined that one seems better than the others.


    Figure 6. Example where all alternative equations are equally likely. Overview of the analysis with data on starved copepods (Calanus pacificus) feeding on diatoms (Thalassiosira fluviatilis). All panels are drawn as in earlier figures: (a) functional responses are fitted to data by a HMCMC algorithm; (b) average qualitative predictions (probabilistic bifurcation diagram); (c) average total uncertainty in predictions; (d) source of uncertainty in predictions. Note that here we took Kmax as five times the maximum prey density in the data, in order to show all the possible qualitative dynamics predicted by the model, without altering our conclusions. (Online version in colour.)

    6. Discussion

    Here, we present probabilistic predictions of system dynamics in the classical Rosenzweig–MacArthur model that take into account both model and parametric uncertainty. Overall, we show that uncertainty in model formulation regularly leads to a larger predictive uncertainty than does the usual parametric uncertainty. Moving forward, it is worth noting that uncertainty likely also involves other parameter values and processes. For predator mortality and prey carrying capacity, the resulting prediction uncertainty can be estimated directly from figure 3. Conversely, a known uncertainty, based on data, in the timescale parameter ɛ or in the intrinsic dynamics of the prey (here specified as logistic growth) can be taken into account by also sampling their posterior distributions simultaneously with those of the functional response and its parameters. Knowing the respective contribution of different biological processes to the resulting predictive uncertainty would be a major step forward. Going in this direction, sampling a higher-dimensional parameter space is feasible with existing algorithms [29], but performing each model analysis might be computationally prohibitive. Nevertheless, the cost of conducting such an analysis may ultimately be lower than the cost of making the wrong biological/environmental decision due to an unreasonable faith in model predictions.

    Decreasing predictive uncertainty can be achieved by decreasing the uncertainty in model construction, in model parameterization, or in both. To do this, the seemingly naive idea of improving experimental data collection might help to greatly decrease model uncertainty. In our three examples, we study a process that is assumed to be a saturating function of prey abundance. However, if this saturation is not present in the collected data (figure 6a), it necessarily becomes harder to identify the best model. Indeed, the model parameter defining the plateau is less constrained by the data (in comparison to figures 1d–f and 5a), increasing the parametric uncertainty of each alternative equation. Thus, alternative equations tend to have highly overlapping confidence intervals, subsequently increasing the uncertainty in model selection. Therefore, our study highlights the importance of designing experiments in a way that maximizes the constraints on parameter values of alternative models, in order to decrease the uncertainty in selecting the ‘best’ model.

    Finding the ‘best’ model in a given situation may involve arguments beyond the simple fit of models to data, such as the fact that one of the alternative equations is a well-established model in the literature that is based on valuable theoretical arguments (e.g. underlying mechanisms). Though we did not do so here, such non-quantitative arguments can be taken into account in our quantitative analysis thanks to the concept of prior probabilities. Prior probabilities are used in Bayesian statistics to give an a priori weight to some parameter values, or here to alternative models, before estimating their likelihood from data [29]. With prior probabilities, one could thus give more weight to the well-established Holling functional response instead of the phenomenological hyperbolic tangent. As a consequence, this function might become the best model in the example in table 3 (lower rows), where all functions are equally likely based solely on the data. Conversely, the Holling functional response might remain the worst model in the example in table 3 (upper rows), where the hyperbolic tangent is about 400-fold more likely according to the data.
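    One simple, heuristic way to fold such prior knowledge into the analysis is to multiply the data-driven weights by prior model probabilities and renormalize; the prior values below are purely illustrative, not taken from the study.

```python
import numpy as np

# Data-driven weights from table 3, lower rows (Holling, Ivlev, hyperbolic tangent).
data_weights = np.array([0.299, 0.332, 0.370])

# Hypothetical prior model probabilities favouring the mechanistic Holling response.
prior = np.array([0.6, 0.2, 0.2])

posterior_weights = prior * data_weights
posterior_weights /= posterior_weights.sum()
print(posterior_weights)  # Holling now carries the largest weight in this illustration
```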

    Our framework can also be easily extended to consider quantitative differences in predictions as well as qualitative ones (e.g. equilibrium versus oscillations). Quantitative differences are important for topics such as resource management, disease and pest control, or species invasions. Here, we combined Bayesian statistics with a qualitative analysis of model predictions (bifurcation analysis). Depending on model complexity and the question at stake, one can combine Bayesian statistics with quantitative predictions (e.g. numerical simulations) or include quantitative aspects in the bifurcation analysis. Quantitative aspects of structural sensitivity have already been considered in earlier studies [6,37]. However, including quantitative aspects in the bifurcation analysis might become computationally prohibitive. Thus, obtaining the quantitative predictions of the model of interest is the main challenge in extending our framework, which is otherwise well suited to considering uncertainty in both quantitative and qualitative predictions.

    The predictions made with the predator–prey model we used as an example could not be compared to data on the temporal dynamics of the studied system. Indeed, we extracted functional-response data from studies on the grazing/ingestion rates of a few predator individuals (the functional response sensu stricto), but these studies did not follow the temporal dynamics of a prey–predator system over many generations (i.e. the scale of the population model). Conversely, some experiments on population dynamics are not combined with a functional-response experiment, and functional-response parameters are instead optimized so that the predicted system dynamics fit the temporal data (e.g. [38,39]). However, if one has access to both types of data, the additional information on temporal dynamics can be used to constrain the probabilistic analysis. One way to do so is to remove candidate models that never predict the observed qualitative dynamics. Another, complementary, way is to perform the parameter estimation with constraints coming from fitting both the functional-response data and the data on temporal dynamics. This approach might be the best way to solve the issue of structural sensitivity. However, it is worth noting that this is an idealized case. Indeed, for organisms with a long lifetime (months, years), collecting temporal data on their population dynamics would require long-term monitoring (many years at least). Therefore, making probabilistic predictions prior to such an experiment would still be of interest, given the time needed to acquire the data.

    Though a similar idea has been used to improve parameter inference [40], as far as we know this is the first time the full approach proposed here has been used. An intriguing application of our probabilistic approach would be to conduct structural-sensitivity analyses for additional datasets corresponding to other prey–predator species. This might allow the classification of organisms with population dynamics that are inherently more or less predictable, either because of parameter values or because of the uncertainty around their functional-response data. This approach can also be extended to different models, for example to test the recent hypothesis that mass-balanced prey–predator models with maintenance are less structurally sensitive [19]. More generally, our proposed approach may benefit from further cross-fertilization with the approach of partially specified models [14–16]. That approach provides generic results but cannot currently make probabilistic predictions of the sort we make here. Another way to include uncertainty would be to use stochastic differential equations in which the uncertain process is randomly drawn, so as to include all data variability in model predictions.

    To conclude, we have conducted a proof-of-concept study outlining a novel approach that considers both parametric uncertainty and uncertainty in biological model construction when presenting model predictions. We achieved this by bringing elements of Bayesian statistics into the analysis of deterministic dynamical systems. Here, the prey–predator system considered is small enough to obtain analytical results on bifurcations, which provide a comprehensive overview of qualitative model predictions. More complex models can also be studied with the proposed approach, by adapting the automatic model analysis to model complexity. For instance, a full overview of qualitative model predictions can be obtained if a numerical bifurcation analysis can be performed, or a sample of numerical simulations of different scenarios can be used for models as large as global ecosystem models. Some of these models are known to make different large-scale predictions, such as the global dominance of phytoplankton groups or ocean primary production [21–23]. Those results were based on comparisons between alternative models, and the incorporation of our approach would allow probabilistic predictions to be made based on different plausible biological models. Indeed, such small changes in model construction affect predictions at the ocean scale, but also the coexistence of alternative stable states and the predicted resilience at the scale of a few interacting populations. Therefore, extending our approach to complex operational models, together with theoretical analyses to rank processes and species according to the level of predictive uncertainty they produce at the ecosystem level, would be a critical advance towards a better knowledge of the uncertainty and forecast horizon (as defined in [2]) of model predictions in environmental sciences.

    Data accessibility

    Experimental data can be found in the references cited in the manuscript. The codes used to perform the analysis are available upon request to C.A.

    Authors' contributions

    Original idea by C.A. and D.B.S. C.A. performed the research and wrote a first version of the article. Both authors contributed to discussion, wrote the final manuscript and contributed to revisions.

    Competing interests

    We have no competing interests.

    Funding

    C.A. received funding from European FEDER Fund under project 1166-39417. D.B.S. received support from a Rutherford Discovery Fellowship and the Marsden Fund Council from New Zealand Government funding, both of which are managed by the Royal Society Te Apārangi (RDF-13-UOC-003 and 16-UOC-008).

    Acknowledgements

    We acknowledge David Nerini, Jean-Christophe Poggiale, Mathias Gauduchon, Claude Manté, Gerald Gregori and Andrew Letten for stimulating discussions. We also thank the anonymous referees for constructive comments that helped to improve the manuscript.

    Footnotes

    Electronic supplementary material is available online at https://dx.doi.org/10.6084/m9.figshare.c.4303088.

    Published by the Royal Society. All rights reserved.

    References
