Numerical method for parameter inference of systems of nonlinear ordinary differential equations with partial observations

Parameter inference of dynamical systems is a challenging task faced by many researchers and practitioners across various fields. In many applications, only a limited subset of the variables is observable. In this paper, we propose a method for parameter inference of a system of nonlinear coupled ordinary differential equations with partial observations. Our method combines fast Gaussian process-based gradient matching and deterministic optimization algorithms. By using initial values obtained by Bayesian steps with low sampling numbers, our deterministic optimization algorithm is accurate, robust and efficient with partial observations and large noise.

Decision letter (RSOS-200932.R0)
We hope you are keeping well at this difficult and unusual time. We continue to value your support of the journal in these challenging circumstances. If Royal Society Open Science can assist you at all, please don't hesitate to let us know at the email address below.

Dear Dr Xu
The Editors assigned to your paper RSOS-200932 "Numerical Method for Parameter Inference of Nonlinear ODEs with Partial Observations" have made a decision based on their reading of the paper and any comments received from reviewers.
Regrettably, in view of the reports received, the manuscript has been rejected in its current form. However, a new manuscript may be submitted which takes into consideration these comments.
We invite you to respond to the comments supplied below and prepare a resubmission of your manuscript. Below the referees' and Editors' comments (where applicable) we provide additional requirements. We provide guidance below to help you prepare your revision.
Please note that resubmitting your manuscript does not guarantee eventual acceptance, and we do not generally allow multiple rounds of revision and resubmission, so we urge you to make every effort to fully address all of the comments at this stage. If deemed necessary by the Editors, your manuscript will be sent back to one or more of the original reviewers for assessment. If the original reviewers are not available, we may invite new reviewers.
Please resubmit your revised manuscript and required files (see below) no later than 14-Feb-2021. Note: the ScholarOne system will 'lock' if resubmission is attempted on or after this deadline. If you do not think you will be able to meet this deadline, please contact the editorial office immediately.
Please note article processing charges apply to papers accepted for publication in Royal Society Open Science (https://royalsocietypublishing.org/rsos/charges). Charges will also apply to papers transferred to the journal from other Royal Society Publishing journals, as well as papers submitted as part of our collaboration with the Royal Society of Chemistry (https://royalsocietypublishing.org/rsos/chemistry). Fee waivers are available but must be requested when you submit your manuscript (https://royalsocietypublishing.org/rsos/waivers).
Comments to the Author:
Thank you for submitting to RSOS. I have now received three reviews, two of which make substantive criticisms of the manuscript. In the light of this, and my own reading of the paper, I am recommending rejection at this stage, but with the invitation to re-submit if the criticisms can be addressed. In particular, Reviewer 2 points out that the proposed approach, requiring numerical integration, will be computationally expensive, thereby negating the main advantage of gradient matching; they also suggest that the combination of MCMC plus optimization loses the attractive feature of MCMC that uncertainty can be readily quantified. Reviewer 2 also points out that imputation has already been proposed by Calderhead et al. (2008), and Reviewer 3 suggests there is little novel relative to Wenk et al. (2019).
If you do decide to revise and resubmit, I encourage you to deal with all of the points raised by the reviewers (not just those mentioned above). Reviewer 3 also points out that source code to reproduce the examples is required for acceptance at RSOS -and indeed it is required for further review, in line with our editorial policy. I believe the journal staff have contacted you about this already.

With best wishes, Len Thomas
Reviewer comments to Author:
Reviewer: 1
Comments to the Author(s)
This paper proposes an algorithm for parameter inference of coupled ODE systems with partially observable data. The algorithm combines Gaussian process-based gradient matching and a least squares optimisation. This is a good paper, well written and very clear in its findings. I believe that the Journal of the Royal Society Open Science is a good location for its publication.
Reviewer: 2
Comments to the Author(s)
The ScholarOne online submission system seems to remove all line breaks from standard text; hence my apologies for the following unformatted text.

SUMMARY:
The topic of the manuscript is "fast" parameter estimation in ODEs using gradient matching with Gaussian processes (GPs). The authors' new algorithm adds two modifications to related existing algorithms from the literature: 1) it combines an MCMC-based sampling scheme with deterministic optimization; 2) it can deal with partial observations by imputing missing values based on numerical integration of the ODEs. The proposed algorithm is evaluated on three benchmark systems widely used in the related literature.

EVALUATION:
The authors demonstrate comprehensive knowledge of the relevant literature, and the mathematical derivations of their algorithm are sound.
However, there are four fundamental problems with the paper. 1) Imputation of missing values with GP-based gradient matching isn't new; it has, for instance, already been proposed in Reference [1] (see Sections 4 and 5.3). As opposed to the method in [1], the approach proposed in the submitted manuscript requires repeated numerical integrations of the ODEs. This is NOT an innovation, it is a clear disadvantage! The whole idea of gradient matching is to bypass the computationally expensive numerical integration of the ODEs. That won't become apparent in the applications chosen by the authors, because their toy problems are relatively simple and there is no need for gradient matching (and hence the algorithm proposed by the authors) in the first place. However, when dealing with complex systems and ODEs that are computationally expensive to solve numerically, the authors' imputation step will be a serious disadvantage over existing proper gradient-matching imputation methods like [1]. To explain this differently, it is difficult to see where the algorithm proposed by the authors would become relevant. If the system of ODEs is so simple that repeated numerical integrations are computationally feasible, then there is no need for any approximation based on gradient matching, and the authors' method becomes obsolete. If, on the other hand, the system of ODEs is so complex that repeated numerical integrations are practically not feasible (which is the very motivation for gradient matching), then the authors' imputation step is practically not feasible either.
2) The combination of MCMC with a follow-up optimization step can hardly be regarded as innovative. Approximately sampling parameters from an approximate posterior distribution properly quantifies uncertainty; parameter optimization loses this attractive feature. The advantage of optimization over MCMC-based sampling is the lower computational cost. However, to first invest computational resources to run MCMC simulations, and then ditch their main asset (uncertainty quantification) to reduce the result to a point estimate based on a follow-up optimization is counter-intuitive, counter-productive, and methodologically absurd.
3) The authors have tested their method on simple toy problems. One could argue that the computational costs of numerically integrating the ODEs are so low here that no gradient matching scheme, like the one proposed by the authors, is needed. However, in fairness to the authors, one has to acknowledge that most other publications on this topic use the same toy problems. This is okay, as long as computational complexity is properly quantified in terms of forward simulations from the model (which can be generalized to other more complex ODE systems). The problem, as pointed out above, is that the authors seem to assume that because their toy problems are computationally so cheap, repeated numerical integrations of the ODEs, as required for their imputation steps, are not an issue. This is totally misleading, in that their algorithm won't be applicable to more complex systems.
4) An advantage of standard benchmark toy problems, like the ones used by the authors, is that they have been widely used by other authors, enabling a comparison with related methods from the literature. It is peculiar that the authors of the submitted manuscript, despite demonstrating sound knowledge of the relevant literature, have not attempted a comparison of their method with any other existing method. In particular, the comparison with the method from Reference [1] is conspicuous by its absence.
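The cost argument in points 1) and 3) can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions only, and is neither the authors' algorithm nor the method of Reference [1]: the one-parameter logistic ODE, the noise-free synthetic data, the grid of candidate values, and all function names are hypothetical, and central finite differences stand in for the derivative estimate that a Gaussian process would provide. The sketch contrasts a gradient-matching estimate, which never calls an ODE solver, with a trajectory-matching estimate, which needs one numerical integration per candidate parameter value.

```python
import math

def logistic_rhs(x, theta):
    """Right-hand side of the hypothetical ODE dx/dt = theta * x * (1 - x)."""
    return theta * x * (1.0 - x)

def rk4_solve(theta, x0, times):
    """Integrate the ODE on the given time grid with classical fixed-step RK4."""
    xs = [x0]
    for t0, t1 in zip(times[:-1], times[1:]):
        h = t1 - t0
        x = xs[-1]
        k1 = logistic_rhs(x, theta)
        k2 = logistic_rhs(x + 0.5 * h * k1, theta)
        k3 = logistic_rhs(x + 0.5 * h * k2, theta)
        k4 = logistic_rhs(x + h * k3, theta)
        xs.append(x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0)
    return xs

def gradient_matching_estimate(times, xs):
    """Fit theta by least squares on derivative residuals; no ODE solve needed.

    Assumption: central differences replace the GP derivative estimate.
    The ODE is linear in theta, so the least-squares solution is closed form.
    """
    num = 0.0
    den = 0.0
    for i in range(1, len(xs) - 1):
        dxdt = (xs[i + 1] - xs[i - 1]) / (times[i + 1] - times[i - 1])
        g = xs[i] * (1.0 - xs[i])
        num += dxdt * g
        den += g * g
    return num / den

def trajectory_matching_estimate(times, xs, candidates):
    """Fit theta by comparing simulated and observed trajectories.

    Every candidate costs one full numerical integration; the count is returned.
    """
    n_solves = 0
    best_theta, best_loss = None, math.inf
    for theta in candidates:
        sim = rk4_solve(theta, xs[0], times)
        n_solves += 1
        loss = sum((a - b) ** 2 for a, b in zip(sim, xs))
        if loss < best_loss:
            best_theta, best_loss = theta, loss
    return best_theta, n_solves

TRUE_THETA = 1.5
times = [0.05 * i for i in range(101)]
data = rk4_solve(TRUE_THETA, 0.1, times)  # noise-free synthetic observations

theta_gm = gradient_matching_estimate(times, data)
candidates = [0.5 + 0.01 * i for i in range(201)]
theta_tm, n_solves = trajectory_matching_estimate(times, data, candidates)

print(f"gradient matching:   theta = {theta_gm:.4f}, ODE solves = 0")
print(f"trajectory matching: theta = {theta_tm:.4f}, ODE solves = {n_solves}")
```

On this toy problem each integration is cheap, so the solver count hardly matters; the count is what would dominate for an ODE system that is expensive to integrate, which is the situation the reviewer has in mind.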

CONCLUSION:
Related methods in the literature, like [1], can be applied to computationally complex ODE systems and provide a natural way of uncertainty quantification. The method proposed in the submitted manuscript loses both of these important features, and it has not even been compared with other state-of-the-art methods.
With no relevant methodological innovation and no comparative evaluation with alternative state-of-the-art methods, I don't think that the submitted manuscript has sufficient merit for a journal publication.
MINOR COMMENTS:
Page 5, Figure 1: This is not a probabilistic graphical model (which by definition is a directed acyclic graph), but a chain graph (because the edge between FM and FarM is undirected). See reference [2] for details.
Page 5, equation (8): This is not an equality, but a proportionality relationship.
[2] D. Barber, Y. Wang. Gaussian processes for Bayesian estimation in ordinary differential equations. International Conference on Machine Learning, 2014.
Reviewer: 3
Comments to the Author(s)
In their manuscript entitled "Numerical Method for Parameter Inference of Nonlinear ODEs with Partial Observations" the authors propose an extension to the known FGPGM framework by adding a regularization term.
The provided version of the manuscript needs major improvement to bring out the novelty and the benefit of the introduced regularization term. Currently it looks like a summary of the paper from Wenk et al.
It is not clear to me if the method can be applied to all ODE systems or just to biological systems. Given the introduction and the title, it can be applied to all types of ODE systems. In that case the authors should add one or two non-biological examples. Otherwise the Title, Abstract and Introduction have to be changed. To clarify further, the title should be changed to ".. Parameter Inference of SYSTEMS OF Nonlinear ODEs... ". Moreover, a typical major challenge with biological systems is their incompleteness, the fact that the states cannot be observed directly, and the fact that initial parameters are obtained from different experimental settings.
Page 3 Line 54: As the authors may know, there are different types of parameter identifiability. These should be distinguished and explained in detail. Can the approach deal with all types of identifiability?
Page 3 Line 43 Word 2: Typo. The authors should carefully review their manuscript.
Page 4 Line 12: What do the authors mean with this sentence?
The following section describes the algorithm in a way similar to Wenk et al. This is fine, but the authors should mention this at the beginning of Section 2.
Section 2: For the benefit of the reader the authors should either include an index for the timepoints or explain how the algorithm translates to a dynamic system. In particular, Figure 1, which is reproduced from Wenk et al., should be translated into a dynamic framework and/or the original authors have to be mentioned here.
Section 2: What about more complex systems where only combinations of states can be estimated: y=h(x)+e? This is the usual case in biological systems. Please explain and clarify.
Section 2: Again how does the framework and the algorithm on Page 6 translate to a time dynamic system in detail? This is crucial for the understanding of the framework.
Section 2 Algorithm 1: How are the initial parameters derived, in detail? What is GP? Gaussian Process? Why do you set tau equal to empty in the second FOR loop? It looks like this is not correct.
There are too many tables with confusing numbers. The authors should consider generating heat maps instead (for example, for Table 1 and the sensitivity tables). I miss some comments regarding the running time and robustness, and the impact of the link function h (y=h(x)+e). Also, Page 13 line 33: "Thus a least square optimization after doing FGPGM may well reduce this effect (of smoother trajectories)". Isn't the opposite the case after applying the L2 norm?
In the current version the discussion is very weak and needs improvement; also, Page 18 Line 53 is not supported by the results. How does the dimension of the ODE system translate into computational costs and accuracy? In practice, systems have more than 10 equations and species.

===PREPARING YOUR MANUSCRIPT===
Your revised paper should include the changes requested by the referees and Editors of your manuscript. You should provide two versions of this manuscript and both versions must be provided in an editable format: one version identifying all the changes that have been made (for instance, in coloured highlight, in bold text, or tracked changes); a 'clean' version of the new manuscript that incorporates the changes made, but does not highlight them. This version will be used for typesetting if your manuscript is accepted. Please ensure that any equations included in the paper are editable text and not embedded images.
Please ensure that you include an 'acknowledgements' section before your reference list/bibliography. This should acknowledge anyone who assisted with your work, but does not qualify as an author per the guidelines at https://royalsociety.org/journals/ethicspolicies/openness/.
While not essential, it will speed up the preparation of your manuscript proof if accepted if you format your references/bibliography in Vancouver style (please see https://royalsociety.org/journals/authors/author-guidelines/#formatting). You should include DOIs for as many of the references as possible.
If you have been asked to revise the written English in your submission as a condition of publication, you must do so, and you are expected to provide evidence that you have received language editing support. The journal would prefer that you use a professional language editing service and provide a certificate of editing, but a signed letter from a colleague who is a native speaker of English is acceptable. Note the journal has arranged a number of discounts for authors using professional language editing services (https://royalsociety.org/journals/authors/benefits/language-editing/).

===PREPARING YOUR REVISION IN SCHOLARONE===
To revise your manuscript, log into https://mc.manuscriptcentral.com/rsos and enter your Author Centre, which may be accessed by clicking on "Author" in the dark toolbar at the top of the page (just below the journal name). You will find your manuscript listed under "Manuscripts with Decisions". Under "Actions", click on "Create a Revision".
Attach your point-by-point response to referees and Editors at Step 1 'View and respond to decision letter'. This document should be uploaded in an editable file type (.doc or .docx are preferred). This is essential.
Please ensure that you include a summary of your paper at Step 2 'Type, Title, & Abstract'. This should be no more than 100 words to explain to a non-scientific audience the key findings of your research. This will be included in a weekly highlights email circulated by the Royal Society press office to national UK, international, and scientific news outlets to promote your work.

At Step 3 'File upload' you should include the following files:
--Your revised manuscript in editable file format (.doc, .docx, or .tex preferred). You should upload two versions: 1) One version identifying all the changes that have been made (for instance, in coloured highlight, in bold text, or tracked changes); 2) A 'clean' version of the new manuscript that incorporates the changes made, but does not highlight them.
--An individual file of each figure (EPS or print-quality PDF preferred [either format should be produced directly from original creation package], or original software format).
--An editable file of each table (.doc, .docx, .xls, .xlsx, or .csv).
--If you are requesting a discretionary waiver for the article processing charge, the waiver form must be included at this step.
--If you are providing image files for potential cover images, please upload these at this step, and inform the editorial office you have done so. You must hold the copyright to any image provided.
--A copy of your point-by-point response to referees and Editors. This will expedite the preparation of your proof.

At Step 6 'Details & comments', you should review and respond to the queries on the electronic submission form. In particular, we would ask that you do the following:
--Ensure that your data access statement meets the requirements at https://royalsociety.org/journals/authors/author-guidelines/#data. You should ensure that you cite the dataset in your reference list. If you have deposited data etc in the Dryad repository, please include both the 'For publication' link and 'For review' link at this stage.
--If you are requesting an article processing charge waiver, you must select the relevant waiver option (if requesting a discretionary waiver, the form should have been uploaded at Step 3 'File upload' above).
--If you have uploaded ESM files, please ensure you follow the guidance at https://royalsociety.org/journals/authors/author-guidelines/#supplementary-material to include a suitable title and informative caption. An example of appropriate titling and captioning may be found at https://figshare.com/articles/Table_S2_from_Is_there_a_trade-off_between_peak_performance_and_performance_breadth_across_temperatures_for_aerobic_scope_in_teleost_fishes_/3843624.

At Step 7 'Review & submit', you must view the PDF proof of the manuscript before you will be able to submit the revision. Note: if any parts of the electronic submission form have not been completed, these will be noted by red message boxes.

Author's Response to Decision Letter for (RSOS-200932.R0)
See Appendix A.

Recommendation?
Major revision is needed (please make suggestions in comments)

Comments to the Author(s)
The current version of the manuscript reads well, and the concept is much clearer now. Nevertheless, the authors should proofread their manuscript twice (e.g., page 6 line 23).
I really appreciate the accessible code to support this work.
In lines 46 and 53 on page 1 the authors make the point that they can deal with "partial observations" and "large noise". I highly appreciate section 3.4 and other edits to detail this further. But still, I suggest including a better benchmark analysis. I suggest performing a proper analysis for example 3.3. I would like to see a table or AUC plot which demonstrates how the parameter error evolves with increasing noise and with exclusion of observations (sample size and complete dynamics/variables). This should be a straightforward exercise. Also, I recommend including the run-time. I am wondering how the run-time is related to the noise level.
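The benchmark requested in the comment above could be organised along the following lines. This is a hedged Python sketch of the experimental loop only: the exponential-decay model, the grid-search least-squares estimator, the noise levels, and the fractions of retained observations are all hypothetical stand-ins, not the authors' method or example 3.3. Observations are excluded here by subsampling time points of a single variable; dropping whole variables would need a multi-variable system.

```python
import math
import random
import time

def simulate(theta, times, noise_sd, rng):
    """Noisy observations of the hypothetical model x(t) = exp(-theta * t)."""
    return [math.exp(-theta * t) + rng.gauss(0.0, noise_sd) for t in times]

def fit_theta(times, ys, grid):
    """Stand-in estimator: grid-search least squares (not the paper's algorithm)."""
    def loss(th):
        return sum((math.exp(-th * t) - y) ** 2 for t, y in zip(times, ys))
    return min(grid, key=loss)

def benchmark(true_theta, noise_levels, keep_fractions, n_repeats=20, seed=0):
    """Return rows of (noise_sd, kept_fraction, mean_abs_error, seconds_per_fit)."""
    rng = random.Random(seed)  # fixed seed so the table is reproducible
    full_times = [0.1 * i for i in range(51)]
    grid = [0.01 * i for i in range(1, 301)]
    rows = []
    for sd in noise_levels:
        for frac in keep_fractions:
            # Keep every k-th time point to mimic excluded observations.
            step = max(1, round(1.0 / frac))
            times = full_times[::step]
            errors = []
            start = time.perf_counter()
            for _ in range(n_repeats):
                ys = simulate(true_theta, times, sd, rng)
                errors.append(abs(fit_theta(times, ys, grid) - true_theta))
            elapsed = time.perf_counter() - start
            rows.append((sd, frac, sum(errors) / n_repeats, elapsed / n_repeats))
    return rows

rows = benchmark(1.0, noise_levels=[0.0, 0.05, 0.2], keep_fractions=[1.0, 0.5, 0.25])
for sd, frac, err, sec in rows:
    print(f"noise_sd={sd:<5} kept={frac:<5} mean_abs_error={err:.4f} sec_per_fit={sec:.5f}")
```

Each row pairs one noise level with one fraction of retained observations, so the same rows could feed either a table or a plot of error against noise, and the timing column addresses the question of how run-time relates to the noise level.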
Related to the comment above, I would like to see how the algorithm performs on real data, or alternatively in cases where the noise is not Gaussian. I feel the algorithm performs well on toy examples but not on real data. In addition, the authors should investigate the case of uncertain/unknown initial state conditions as well. If this is not possible, it has to be pointed out in the discussion that the performance with real data has to be further investigated.
The sentence on page 6 line 16 "This approach does not depend on the observed variables." is hard to understand. The authors should consider rephrasing the sentence. It is a little bit confusing.
I recommend another revision to address remaining concerns, but the current manuscript shows high potential and fits very well to the journal.

Decision letter (RSOS-210171.R0)
We hope you are keeping well at this difficult and unusual time. We continue to value your support of the journal in these challenging circumstances. If Royal Society Open Science can assist you at all, please don't hesitate to let us know at the email address below.

Dear Dr Xu
On behalf of the Editors, we are pleased to inform you that your Manuscript RSOS-210171 "Numerical Method for Parameter Inference of Nonlinear ODEs with Partial Observations" has been accepted for publication in Royal Society Open Science subject to minor revision in accordance with the referees' reports. Please find the referees' comments along with any feedback from the Editors below my signature.
We invite you to respond to the comments and revise your manuscript. Below the referees' and Editors' comments (where applicable) we provide additional requirements. Final acceptance of your manuscript is dependent on these requirements being met. We provide guidance below to help you prepare your revision.
Please submit your revised manuscript and required files (see below) no later than 7 days from today's (ie 07-Jun-2021) date. Note: the ScholarOne system will 'lock' if submission of the revision is attempted 7 or more days after the deadline. If you do not think you will be able to meet this deadline please contact the editorial office immediately.
Please note article processing charges apply to papers accepted for publication in Royal Society Open Science (https://royalsocietypublishing.org/rsos/charges). Charges will also apply to papers transferred to the journal from other Royal Society Publishing journals, as well as papers submitted as part of our collaboration with the Royal Society of Chemistry (https://royalsocietypublishing.org/rsos/chemistry). Fee waivers are available but must be requested when you submit your revision (https://royalsocietypublishing.org/rsos/waivers).
Comments to the Author:
Thank you for your work on the manuscript to date. I feel it is almost ready for acceptance; however, a reviewer is not happy with how well it may perform on real-world data. They write: "I would like to see how the algorithm performs on real data. Or alternatively in cases where the noise is not Gaussian. I feel the algorithm performs well on toy examples but not on real data. In addition, the authors should investigate the case of uncertain/unknown initial state conditions, as well. If this is not possible it has to be pointed out in the discussion that the performance with real data has to be further investigated." Please could you either include a real-world example, or state explicitly that this needs further investigation.
The reviewer has made other minor comments that you should please address. I do not anticipate your re-submission will need to go out for review again, and so we will be able to move forward quickly.
Reviewer comments to Author:
Reviewer: 3
Comments to the Author(s)
The current version of the manuscript reads well, and the concept is much clearer now. Nevertheless, the authors should proofread their manuscript twice (e.g., page 6 line 23).
I really appreciate the accessible code to support this work.
In lines 46 and 53 on page 1 the authors make the point that they can deal with "partial observations" and "large noise". I highly appreciate section 3.4 and other edits to detail this further. But still, I suggest including a better benchmark analysis. I suggest performing a proper analysis for example 3.3. I would like to see a table or AUC plot which demonstrates how the parameter error evolves with increasing noise and with exclusion of observations (sample size and complete dynamics/variables). This should be a straightforward exercise. Also, I recommend including the run-time. I am wondering how the run-time is related to the noise level.
Related to the comment above, I would like to see how the algorithm performs on real data, or alternatively in cases where the noise is not Gaussian. I feel the algorithm performs well on toy examples but not on real data. In addition, the authors should investigate the case of uncertain/unknown initial state conditions as well. If this is not possible, it has to be pointed out in the discussion that the performance with real data has to be further investigated.
The sentence on page 6 line 16 "This approach does not depend on the observed variables." is hard to understand. The authors should consider rephrasing the sentence. It is a little bit confusing.
I recommend another revision to address remaining concerns, but the current manuscript shows high potential and fits very well to the journal.

===PREPARING YOUR MANUSCRIPT===
Your revised paper should include the changes requested by the referees and Editors of your manuscript. You should provide two versions of this manuscript and both versions must be provided in an editable format: one version identifying all the changes that have been made (for instance, in coloured highlight, in bold text, or tracked changes); a 'clean' version of the new manuscript that incorporates the changes made, but does not highlight them. This version will be used for typesetting. Please ensure that any equations included in the paper are editable text and not embedded images.
Please ensure that you include an 'acknowledgements' section before your reference list/bibliography. This should acknowledge anyone who assisted with your work, but does not qualify as an author per the guidelines at https://royalsociety.org/journals/ethicspolicies/openness/.
While not essential, it will speed up the preparation of your manuscript proof if you format your references/bibliography in Vancouver style (please see https://royalsociety.org/journals/authors/author-guidelines/#formatting). You should include DOIs for as many of the references as possible.
If you have been asked to revise the written English in your submission as a condition of publication, you must do so, and you are expected to provide evidence that you have received language editing support. The journal would prefer that you use a professional language editing service and provide a certificate of editing, but a signed letter from a colleague who is a native speaker of English is acceptable. Note the journal has arranged a number of discounts for authors using professional language editing services (https://royalsociety.org/journals/authors/benefits/language-editing/).

===PREPARING YOUR REVISION IN SCHOLARONE===
To revise your manuscript, log into https://mc.manuscriptcentral.com/rsos and enter your Author Centre, which may be accessed by clicking on "Author" in the dark toolbar at the top of the page (just below the journal name). You will find your manuscript listed under "Manuscripts with Decisions". Under "Actions", click on "Create a Revision".
Attach your point-by-point response to referees and Editors at Step 1 'View and respond to decision letter'. This document should be uploaded in an editable file type (.doc or .docx are preferred). This is essential.
Please ensure that you include a summary of your paper at Step 2 'Type, Title, & Abstract'. This should be no more than 100 words to explain to a non-scientific audience the key findings of your research. This will be included in a weekly highlights email circulated by the Royal Society press office to national UK, international, and scientific news outlets to promote your work.

At Step 3 'File upload' you should include the following files:
--Your revised manuscript in editable file format (.doc, .docx, or .tex preferred). You should upload two versions: 1) One version identifying all the changes that have been made (for instance, in coloured highlight, in bold text, or tracked changes); 2) A 'clean' version of the new manuscript that incorporates the changes made, but does not highlight them.
--If you are requesting a discretionary waiver for the article processing charge, the waiver form must be included at this step.
--If you are providing image files for potential cover images, please upload these at this step, and inform the editorial office you have done so. You must hold the copyright to any image provided.
--A copy of your point-by-point response to referees and Editors. This will expedite the preparation of your proof.

At Step 6 'Details & comments', you should review and respond to the queries on the electronic submission form. In particular, we would ask that you do the following:
--Ensure that your data access statement meets the requirements at https://royalsociety.org/journals/authors/author-guidelines/#data. You should ensure that you cite the dataset in your reference list. If you have deposited data etc in the Dryad repository, please only include the 'For publication' link at this stage. You should remove the 'For review' link.
--If you are requesting an article processing charge waiver, you must select the relevant waiver option (if requesting a discretionary waiver, the form should have been uploaded at Step 3 'File upload' above).
--If you have uploaded ESM files, please ensure you follow the guidance at https://royalsociety.org/journals/authors/author-guidelines/#supplementary-material to include a suitable title and informative caption. An example of appropriate titling and captioning may be found at https://figshare.com/articles/Table_S2_from_Is_there_a_trade-off_between_peak_performance_and_performance_breadth_across_temperatures_for_aerobic_scope_in_teleost_fishes_/3843624.

At Step 7 'Review & submit', you must view the PDF proof of the manuscript before you will be able to submit the revision. Note: if any parts of the electronic submission form have not been completed, these will be noted by red message boxes.
Author's Response to Decision Letter for (RSOS-210171.R0)

Decision letter (RSOS-210171.R1)
We hope you are keeping well at this difficult and unusual time. We continue to value your support of the journal in these challenging circumstances. If Royal Society Open Science can assist you at all, please don't hesitate to let us know at the email address below.
Dear Dr Xu, I am pleased to inform you that your manuscript entitled "Numerical Method for Parameter Inference of Nonlinear ODEs with Partial Observations" is now accepted for publication in Royal Society Open Science.
If you have not already done so, please remember to make any data sets or code libraries 'live' prior to publication, and update any links as needed when you receive a proof to check - for instance, from a private 'for review' URL to a publicly accessible 'for publication' URL. It is good practice to also add data sets, code and other digital materials to your reference list.
You can expect to receive a proof of your article in the near future. Please contact the editorial office (openscience@royalsociety.org) and the production office (openscience_proofs@royalsociety.org) to let us know if you are likely to be away from e-mail contact --if you are going to be away, please nominate a co-author (if available) to manage the proofing process, and ensure they are copied into your email to the journal. Due to rapid publication and an extremely tight schedule, if comments are not received, your paper may experience a delay in publication.
Please see the Royal Society Publishing guidance on how you may share your accepted author manuscript at https://royalsociety.org/journals/ethics-policies/media-embargo/. After publication, some additional ways to effectively promote your article can also be found here https://royalsociety.org/blog/2020/07/promoting-your-latest-paper-and-tracking-yourresults/.
On behalf of the Editors of Royal Society Open Science, thank you for your support of the journal and we look forward to your continued contributions to Royal Society Open Science.

We would like to thank all referees for their very helpful comments and suggestions. We extensively revised our paper to address all of them in detail.
In what follows, we provide our responses to all individual comments and describe in detail specific changes in the revised manuscript made in response to the specific referees' comments and suggestions.
___________________________________________________

Responses
____________________________________________________
Reviewer: 1
Comment #1
This paper proposed an algorithm for parameter inference of coupled ODE systems with partially observable data. This algorithm combined a Gaussian process based gradient matching and a least square optimization. This is a good paper, well written and very clear in its findings. I believe that the Journal of the Royal Society Open Science is a good location for its publication.
Response: We thank the reviewer for the positive evaluation of our paper.

Reviewer: 2
Comments to the Author(s)
Comment #1
The Scholar One online submission system seems to remove all line breaks from standard text; hence my apologies for the following unformatted text.
SUMMARY: The topic of the manuscript is "fast" parameter estimation in ODEs using gradient matching with Gaussian processes (GPs). The authors' new algorithm adds two modifications to related existing algorithms from the literature: 1) it combines an MCMC-based sampling scheme with deterministic optimization; 2) it can deal with partial observations by imputing missing values based on numerical integration of the ODEs. The proposed algorithm is evaluated on three benchmark systems widely used in the related literature.
2) EVALUATION: The authors demonstrate comprehensive knowledge of the relevant literature, and the mathematical derivations of their algorithm are sound. However, there are four fundamental problems with the paper.
1) Imputation of missing values with GP-based gradient matching isn't new; it has, for instance, already been proposed in Reference [1] (see Sections 4 and 5.3). As opposed to the method in [1], the approach proposed in the submitted manuscript requires repeated numerical integrations of the ODEs. This is NOT an innovation, it is a clear disadvantage! The whole idea of gradient matching is to bypass the computationally expensive numerical integration of the ODEs. That won't become apparent in the applications chosen by the authors, because their toy problems are relatively simple and there is no need for gradient matching (and hence the algorithm proposed by the authors) in the first place. However, when dealing with complex systems and ODEs that are computationally expensive to solve numerically, the authors' imputation step will be a serious disadvantage over existing proper gradient-matching imputation methods like [1]. To explain this differently, it is difficult to see where the algorithm proposed by the authors would become relevant. If the system of ODEs is so simple that repeated numerical integrations are computationally feasible, then there is no need for any approximation based on gradient matching, and the authors' method becomes obsolete. If, on the other hand, the system of ODEs is so complex that repeated numerical integrations are practically not feasible - which is the very motivation for gradient matching - then the authors' imputation step is practically not feasible either.
Response: Thank you for your comments. In the revised manuscript, we compare three treatments of the unobserved variables and discuss their pros and cons.
1. Integrate the whole system. This approach does not depend on the data for the observed variables but, as the referee pointed out, it would be time-consuming for large systems.
2. Partially integrate the system, solving only for the unobservable variables. This approach saves time when only a few variables are unobservable, but may incur extra cost when interpolation and smoothing of the observed variables are needed.
3. Sample the unobserved variables with a Gaussian process and apply gradient matching to them. The computational cost of each sampling cycle is relatively low for large systems, but the result and the convergence speed depend on the initial guess.
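As an illustration of approach 2 only, the following is a minimal sketch under stated assumptions, not the implementation used in the manuscript. The two-variable toy system (dx/dt = -x observed, dz/dt = x - z unobserved), the function name `integrate_unobserved`, the linear interpolation of the observations and the fixed-step Runge-Kutta scheme are all hypothetical choices made for this example.

```python
import numpy as np

def integrate_unobserved(t_obs, x_obs, z0, rhs_z, dt=0.01):
    """Integrate only the unobserved variable z, feeding in interpolated x.

    rhs_z(x, z) is the right-hand side of dz/dt (hypothetical signature).
    """
    # Linear interpolation of the observed variable between sample times.
    x_of_t = lambda t: np.interp(t, t_obs, x_obs)
    n_steps = int(round((t_obs[-1] - t_obs[0]) / dt))
    t_grid = t_obs[0] + dt * np.arange(n_steps + 1)
    z = np.empty(n_steps + 1)
    z[0] = z0
    for i in range(n_steps):
        t, zi = t_grid[i], z[i]
        # Classical fourth-order Runge-Kutta step for z alone.
        k1 = rhs_z(x_of_t(t), zi)
        k2 = rhs_z(x_of_t(t + dt / 2), zi + dt / 2 * k1)
        k3 = rhs_z(x_of_t(t + dt / 2), zi + dt / 2 * k2)
        k4 = rhs_z(x_of_t(t + dt), zi + dt * k3)
        z[i + 1] = zi + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return t_grid, z

# Assumed toy system: dx/dt = -x (observed), dz/dt = x - z (unobserved), z(0) = 0.
t_obs = np.linspace(0.0, 5.0, 51)
x_obs = np.exp(-t_obs)  # noise-free observations, for illustration only
t_grid, z = integrate_unobserved(t_obs, x_obs, z0=0.0, rhs_z=lambda x, z: x - z)
```

For this toy system the exact solution is z(t) = t exp(-t), so the sketch can be checked directly. With noisy observations, the interpolation step would need to be replaced by a smoother, which is the extra cost mentioned in item 2.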
According to our numerical results and discussion, we suggest that if the system is small enough that solving the ODEs is cheap, one can adopt the integration approaches, since they do not depend on prior information about the unobserved variables. If the integration requires a fine time step and most of the variables in the system are unobservable, then full integration is preferred, since it avoids denoising and interpolating the observable variables that appear in the equations for the unobservable ones. For large-scale problems with reliable prior information, the full sampling method may have advantages in both accuracy and efficiency.
We could not find the related details and conclusions in Section 4 or Section 5.3 of reference [1]. From the statement in [1], smoothing of the observed species and discretized equations are needed, which are involved only in the partial integration approach mentioned above and not in the sampling approach.
We hope our supplemented results provide a more comprehensive understanding of the methods and of the choice among them in practice.
In addition, we would like to clarify that gradient matching may help avoid local minima, because it minimizes both the data difference and the time-derivative difference. This is an advantage even for simple systems. As shown in Figure 9, the conventional least-squares method is trapped in a local minimum, which does not happen with gradient matching.
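The idea of minimizing both the data difference and the time-derivative difference can be sketched as follows. This is a hedged illustration, not the objective used in the manuscript: the function name `gradient_matching_objective`, the weight `lam`, the toy decay model dx/dt = -theta * x and the grid search are assumptions made for this example, and the smoothed states and derivatives are taken as exact rather than produced by a Gaussian process.

```python
import numpy as np

def gradient_matching_objective(theta, x_hat, dx_hat, y, f, lam=1.0):
    """Data misfit plus weighted derivative misfit (lam is an assumed weight)."""
    data_misfit = np.sum((x_hat - y) ** 2)                       # states vs. observations
    derivative_misfit = np.sum((dx_hat - f(x_hat, theta)) ** 2)  # slopes vs. ODE right-hand side
    return data_misfit + lam * derivative_misfit

# Assumed toy model dx/dt = -theta * x with true theta = 0.5.
t = np.linspace(0.0, 4.0, 41)
y = np.exp(-0.5 * t)   # noise-free observations, for illustration only
x_hat = y.copy()       # stand-in for smoothed states
dx_hat = -0.5 * y      # stand-in for smoothed time derivatives
f = lambda x, theta: -theta * x

# Coarse grid search over theta; no ODE is integrated at any point.
thetas = np.linspace(0.1, 1.0, 91)
losses = [gradient_matching_objective(th, x_hat, dx_hat, y, f) for th in thetas]
best_theta = thetas[int(np.argmin(losses))]
```

In this toy case the derivative term is quadratic in theta, so the objective has a single minimum at the true value. That property is specific to the example and is not claimed for the benchmark systems.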

Comment #2
The combination of MCMC with a follow-up optimization step can hardly be regarded as innovative. Approximately sampling parameters from an approximate posterior distribution properly quantifies uncertainty; parameter optimization loses this attractive feature. The advantage of optimization over MCMC-based sampling is the lower computational cost. However, to first invest computational resources to run MCMC simulations, and then ditch their main asset (uncertainty quantification) to reduce the result to a point estimate based on a follow-up optimization is counter-intuitive, counterproductive, and methodologically absurd.
Response: Thank you very much for the comments. As discussed above, deterministic least-squares optimization may easily become trapped in local minima. One remedy is to incorporate time-derivative matching into the objective function. However, numerical differentiation is usually unstable and difficult, especially for noisy data. A Gaussian process provides a good way to handle the time derivative and realize gradient matching, because Gaussian processes are closed under differentiation. The combination therefore takes advantage of both the robustness of the Gaussian process in handling time derivatives and the low cost of optimization. In practice, we do not need many MCMC samples to obtain a good initial state for the optimization step.
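The closure of Gaussian processes under differentiation can be illustrated with a short sketch: differentiating the kernel with respect to the test input gives the posterior mean of the derivative without any finite differencing. The squared-exponential kernel, the hyperparameter values and the function names `rbf` and `gp_mean_and_derivative` are assumptions made for this example and are not taken from the manuscript.

```python
import numpy as np

def rbf(a, b, ell, sf2):
    """Squared-exponential kernel matrix and the pairwise differences a_i - b_j."""
    d = a[:, None] - b[None, :]
    return sf2 * np.exp(-0.5 * d ** 2 / ell ** 2), d

def gp_mean_and_derivative(t_train, y_train, t_test, ell=1.0, sf2=1.0, noise=1e-6):
    """Posterior mean of a GP and of its time derivative at t_test."""
    K, _ = rbf(t_train, t_train, ell, sf2)
    alpha = np.linalg.solve(K + noise * np.eye(len(t_train)), y_train)
    Ks, d = rbf(t_test, t_train, ell, sf2)
    mean = Ks @ alpha
    # d/dt of k(t, t') with respect to the test input t is -(t - t') / ell^2 * k(t, t').
    dKs = -(d / ell ** 2) * Ks
    dmean = dKs @ alpha
    return mean, dmean

# Illustrative data: samples of sin(t); the derivative should be close to cos(t).
t_train = np.linspace(0.0, 2.0 * np.pi, 30)
y_train = np.sin(t_train)
t_test = np.linspace(1.0, 5.0, 20)
mean, dmean = gp_mean_and_derivative(t_train, y_train, t_test)
```

The same linear solve serves both the state and its derivative, which is why the derivative estimate comes at little extra cost once the Gaussian process has been fitted.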
Comment #3
The authors have tested their method on simple toy problems. One could argue that the computational costs of numerically integrating the ODEs are so low here that no gradient matching scheme, like the one proposed by the authors, is needed. However, in fairness to the authors, one has to acknowledge that most other publications on this topic use the same toy problems. This is okay, as long as computational complexity is properly quantified in terms of forward simulations from the model (which can be generalized to other more complex ODE systems). The problem, as pointed out above, is that the authors seem to assume that because their toy problems are computationally so cheap, repeated numerical integrations of the ODEs, as required for their imputation steps, are not an issue. This is totally misleading, in that their algorithm won't be applicable to more complex systems.
Response: Thank you for your comments. We agree that it is important to take into account the computational cost of integrating the unobserved variables for complex systems. As described above, in the revised manuscript we compare three approaches for inferring the unobservable variables (sampling, partial integration and full integration), and discuss their pros and cons and the choice among them in various situations based on the numerical results.
Comment #4
An advantage of standard benchmark toy problems, like the ones used by the authors, is that they have been widely used by other authors, enabling a comparison with related methods from the literature. It is peculiar that the authors of the submitted manuscript, despite demonstrating sound knowledge of the relevant literature, have not attempted a comparison of their method with any other existing method. In particular, the comparison with the method from Reference [1] is conspicuous by its absence.
Response: Thank you for the comments. We looked up the example involving unobserved variables used in Section 5.3 of [1]. However, the unobserved variable there appears as a time-dependent function, and there is no differential equation for it. This example is therefore not suitable for the approach that involves gradient matching for the unobserved variable (approach 3 mentioned above). For the known variables, both our method and that in [1] apply gradient matching.