Early warning of some notifiable infectious diseases in China by the artificial neural network

In order to accurately grasp the timing for the prevention and control of diseases, we established an artificial neural network model to issue early warning signals. The real-time recurrent learning (RTRL) and extended Kalman filter (EKF) methods were performed to analyse four types of respiratory infectious diseases and four types of digestive tract infectious diseases in China to comprehensively determine the epidemic intensities and whether to issue early warning signals. The numbers of new confirmed cases per month between January 2004 and December 2017 were used as the training set; the data from 2018 were used as the test set. The results of RTRL showed that the number of new confirmed cases of respiratory infectious diseases in September 2018 increased abnormally. The results of the EKF showed that the number of new confirmed cases of respiratory infectious diseases increased abnormally in January and February of 2018. The results of these two algorithms showed that the number of new confirmed cases of digestive tract infectious diseases in the test set did not have any abnormal increases. The neural network and machine learning can further enrich and develop the early warning theory.
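The train/test split described in the abstract can be sketched as follows. This is only an illustration of the stated date ranges, not the authors' code: the helper `month_range` and the placeholder `cases` list are hypothetical, and the zero counts stand in for the reported monthly numbers of new confirmed cases.

```python
from datetime import date

def month_range(start_year, end_year):
    """Return the first day of every month from January start_year to December end_year."""
    return [date(y, m, 1) for y in range(start_year, end_year + 1) for m in range(1, 13)]

months = month_range(2004, 2018)

# Placeholder counts (hypothetical); replace with the reported monthly new confirmed cases.
cases = [0] * len(months)

# Training set: January 2004 to December 2017. Test set: the 12 months of 2018.
train = [(m, c) for m, c in zip(months, cases) if m.year <= 2017]
test = [(m, c) for m, c in zip(months, cases) if m.year == 2018]

print(len(train), len(test))  # 168 12
```

Fourteen years of monthly data give 168 training points, with 12 points held out for 2018.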


Comments to the Author(s)
Paper overview
This paper summarizes the use of artificial neural networks to provide early warning in the event of a potential increase in incidence or outbreak of an infectious disease. The manuscript discusses leveraging national databases which update dynamically to drive the ultimate implementation of the modeling approach as a surveillance and monitoring tool. The group uses a recurrent neural network that leverages real-time recurrent learning and an extended Kalman filter to estimate and adapt model weights dynamically. Implementation of this approach will help provide early warning in the event of an outbreak and allow local and government resources to respond in a timely manner.
General Comments:
The paper would benefit significantly from a more organized description of the approach and results. There are some instances in the paper where methods and results are extremely difficult to follow, especially with the figures and graphical results provided. The overall readability of this manuscript should be significantly improved such that the methodology and results are more easily followed and understood. Furthermore, there needs to be sufficient justification as to why this approach is better suited than other predictive modeling approaches. Other more specific comments are referenced below that may significantly improve the readability of the manuscript:
1.) Make synaptic weights plural in the last sentence of the first paragraph in the section of Realtime recurrent learning.
2.) Under the Methods section, the description of the model inputs should be enhanced to be more understandable by the target audience. In its current state it takes a lot of time to interpret the text to determine what the model inputs are. The figures are too generalized. Recommend restructuring Figures 1A and 1B to be more consistent with the text and adding information on external model inputs and data flow through the model as described in the text as it relates to specific model data features/vectors (e.g. the number of historical and newly confirmed cases of a specific infectious disease type). Make the Figures less general and more aligned with the specific approach as referenced in the text. Improve the figures to visually demonstrate the use of realtime recurrent learning and extended Kalman filtering in the weight estimation for the recurrent neural network.
3.) There is a reference in the text to supplemental materials that demonstrate the model establishment process as well as algorithms; however, no supplemental materials are provided. These may address some of the items discussed above, but this cannot be determined given the absence of this information. Furthermore, the results should be described in more supporting detail. From the text it is hard to follow why the authors chose the output scale on Figure 3A and 3B to be between -2 and 2; if a hyperbolic tangent transfer function is used, it should normalize model inputs and outputs to be between -1 and 1. A justification for this should be provided.
5.) The authors should provide a better explanation of how the model determines and makes the decision (deciding threshold) to execute the early warning/alert; this was not explained especially clearly in the text and discussion section.
6.) In the discussion section, alternative modeling approaches should be suggested that may be better suited to the targeted end use case of this approach. For example, using incidence of events as a time series and forecasting future trends over N months in advance. For example, can autoregressive modeling of each incidence of event be incorporated just as effectively, or time delay neural network models? If this approach is superior, an additional section in the discussion should provide significant justification as to why.
7.) The proposed modeling approach uses a rather limited set of input features to derive its prediction; the authors should acknowledge this as another limitation and suggest other potential model input features that could support improved model performance in the future. The authors do acknowledge seasonal factors as a potential factor and that the model compensates for this in its various state/input vectors spanning a year. A major benefit of ANNs such as this is the ability to accommodate multiple model input factors and predictors; there are likely other model predictors that would provide more predictive accuracy, and the authors should identify them as future pursuits.
8.) The last paragraph on 15 that extends to 16 needs to be rephrased to be more concise and to enhance clarity, as it is difficult to follow as it relates to the non-specific nature of the alerts. This should also be referenced in the Results section, and Figure 4 should be improved to highlight the alerts being attributed to the increased incidence of influenza and mumps cases by circling or marking this in the Figure.
The editors assigned to your paper ("Early warning of some notifiable infectious diseases in China by the artificial neural network") have now received comments from reviewers. We would like you to revise your paper in accordance with the referee and Associate Editor suggestions which can be found below (not including confidential reports to the Editor). Please note this decision does not guarantee eventual acceptance.
Please submit a copy of your revised paper before 30-Oct-2019. Please note that the revision deadline will expire at 00.00am on this date. If we do not hear from you within this time then it will be assumed that the paper has been withdrawn. In exceptional circumstances, extensions may be possible if agreed with the Editorial Office in advance. We do not allow multiple rounds of revision so we urge you to make every effort to fully address all of the comments at this stage. If deemed necessary by the Editors, your manuscript will be sent back to the original reviewer for assessment --indeed, the Editors have indicated that inviting a second reviewer is likely. If the original reviewers are not available, we may invite new reviewers.
To revise your manuscript, log into http://mc.manuscriptcentral.com/rsos and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision. Revise your manuscript and upload a new version through your Author Centre.
When submitting your revised manuscript, you must respond to the comments made by the referees and upload a file "Response to Referees" in "Section 6 -File Upload". Please use this to document how you have responded to the comments, and the adjustments you have made. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response.
In addition to addressing all of the reviewers' and editor's comments please also ensure that your revised manuscript contains the following sections as appropriate before the reference list:
• Ethics statement (if applicable) If your study uses humans or animals please include details of the ethical approval received, including the name of the committee that granted approval. For human studies please also detail whether informed consent was obtained. For field studies on animals please include details of all permissions, licences and/or approvals granted to carry out the fieldwork.
• Data accessibility It is a condition of publication that all supporting data are made available either as supplementary information or preferably in a suitable permanent repository. The data accessibility section should state where the article's supporting data can be accessed. This section should also include details, where possible, of where other relevant research materials such as statistical tools, protocols, software etc. can be accessed. If the data have been deposited in an external repository this section should list the database, accession number and link to the DOI for all data from the article that have been made publicly available. Data sets that have been deposited in an external repository and have a DOI should also be appropriately cited in the manuscript and included in the reference list.
If you wish to submit your supporting data or code to Dryad (http://datadryad.org/), or modify your current submission to dryad, please use the following link: http://datadryad.org/submit?journalID=RSOS&manu=RSOS-191420
• Competing interests Please declare any financial or non-financial competing interests, or state that you have no competing interests.
• Authors' contributions All submissions, other than those with a single author, must include an Authors' Contributions section which individually lists the specific contribution of each author. The list of Authors should meet all of the following criteria; 1) substantial contributions to conception and design, or acquisition of data, or analysis and interpretation of data; 2) drafting the article or revising it critically for important intellectual content; and 3) final approval of the version to be published.
All contributors who do not meet all of these criteria should be included in the acknowledgements.
We suggest the following format: AB carried out the molecular lab work, participated in data analysis, carried out sequence alignments, participated in the design of the study and drafted the manuscript; CD carried out the statistical analyses; EF collected field data; GH conceived of the study, designed the study, coordinated the study and helped draft the manuscript. All authors gave final approval for publication.
• Acknowledgements Please acknowledge anyone who contributed to the study but did not meet the authorship criteria.
• Funding statement Please list the source of funding for each author.
Once again, thank you for submitting your manuscript to Royal Society Open Science and I look forward to receiving your revision. If you have any questions at all, please do not hesitate to get in touch.

Recommendation?
Accept with minor revision (please list in comments)

Comments to the Author(s)
Thank you to the authors for addressing previous comments. I believe the paper's readability has been significantly improved by the additional information and description in the manuscript as well as the modifications to the Figures.
There are still some minor issues with language and grammar that are present in the manuscript. I'd recommend reading through and addressing some of the issues such as the example below.
On Page 12 under Early warning models, the first sentence ends abruptly and doesn't finish. It reads "After 168 iterations of the network, we determined the synaptic weights of the network and then input the test data for 2018 into the network which yielded the following results." This sentence should either use a colon "yielded the following results:" or be broken up into multiple sentences. For example, "The model was trained after 168 epochs, and data from 2018 was used to validate model performance.". The authors can then provide their summary and description of the results as is.
On page 19 the authors state: "We should continue to train the network by adding new cases in the future to optimize early warning performance". I believe there should be an expansion of this statement and the authors should highlight that this manuscript only references a new modeling approach and was limited in its scope in terms of validation. Future work will be focused on gathering more data and cases and providing a more comprehensive model validation. Results in this paper should be interpreted cautiously; however, the approach should be investigated in future work.

20-Dec-2019
Dear Dr Guo:
On behalf of the Editors, I am pleased to inform you that your Manuscript RSOS-191420.R1 entitled "Early warning of some notifiable infectious diseases in China by the artificial neural network" has been accepted for publication in Royal Society Open Science subject to minor revision in accordance with the referee suggestions. Please find the referees' comments at the end of this email.
The reviewers and Subject Editor have recommended publication, but also suggest some minor revisions to your manuscript. Therefore, I invite you to respond to the comments and revise your manuscript.
• Ethics statement If your study uses humans or animals please include details of the ethical approval received, including the name of the committee that granted approval. For human studies please also detail whether informed consent was obtained. For field studies on animals please include details of all permissions, licences and/or approvals granted to carry out the fieldwork.
• Data accessibility It is a condition of publication that all supporting data are made available either as supplementary information or preferably in a suitable permanent repository. The data accessibility section should state where the article's supporting data can be accessed. This section should also include details, where possible, of where other relevant research materials such as statistical tools, protocols, software etc. can be accessed. If the data have been deposited in an external repository this section should list the database, accession number and link to the DOI for all data from the article that have been made publicly available. Data sets that have been deposited in an external repository and have a DOI should also be appropriately cited in the manuscript and included in the reference list.
If you wish to submit your supporting data or code to Dryad (http://datadryad.org/), or modify your current submission to dryad, please use the following link: http://datadryad.org/submit?journalID=RSOS&manu=RSOS-191420.R1
• Competing interests Please declare any financial or non-financial competing interests, or state that you have no competing interests.
• Authors' contributions All submissions, other than those with a single author, must include an Authors' Contributions section which individually lists the specific contribution of each author. The list of Authors should meet all of the following criteria; 1) substantial contributions to conception and design, or acquisition of data, or analysis and interpretation of data; 2) drafting the article or revising it critically for important intellectual content; and 3) final approval of the version to be published.
All contributors who do not meet all of these criteria should be included in the acknowledgements.
We suggest the following format: AB carried out the molecular lab work, participated in data analysis, carried out sequence alignments, participated in the design of the study and drafted the manuscript; CD carried out the statistical analyses; EF collected field data; GH conceived of the study, designed the study, coordinated the study and helped draft the manuscript. All authors gave final approval for publication.
• Acknowledgements Please acknowledge anyone who contributed to the study but did not meet the authorship criteria.
• Funding statement Please list the source of funding for each author.
Please note that we cannot publish your manuscript without these end statements included. We have included a screenshot example of the end statements for reference. If you feel that a given heading is not relevant to your paper, please nevertheless include the heading and explicitly state that it is not relevant to your work.
Because the schedule for publication is very tight, it is a condition of publication that you submit the revised version of your manuscript before 29-Dec-2019. Please note that the revision deadline will expire at 00.00am on this date. If you do not think you will be able to meet this date please let me know immediately.
To revise your manuscript, log into https://mc.manuscriptcentral.com/rsos and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions". Under "Actions," click on "Create a Revision." You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript and upload a new version through your Author Centre.
When submitting your revised manuscript, you will be able to respond to the comments made by the referees and upload a file "Response to Referees" in "Section 6 -File Upload". You can use this to document any changes you make to the original manuscript. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response to the referees.
When uploading your revised files please make sure that you have:
1) A text file of the manuscript (tex, txt, rtf, docx or doc), references, tables (including captions) and figure captions. Do not upload a PDF as your "Main Document".
2) A separate electronic file of each figure (EPS or print-quality PDF preferred (either format should be produced directly from original creation package), or original software format)
3) Included a 100 word media summary of your paper when requested at submission. Please ensure you have entered correct contact details (email, institution and telephone) in your user account
4) Included the raw data to support the claims made in your paper. You can either include your data as electronic supplementary material or upload to a repository and include the relevant doi within your manuscript
5) All supplementary materials accompanying an accepted article will be treated as in their final form. Note that the Royal Society will neither edit nor typeset supplementary material and it will be hosted as provided. Please ensure that the supplementary material includes the paper details where possible (authors, article title, journal name).
Supplementary files will be published alongside the paper on the journal website and posted on the online figshare repository (https://figshare.com). The heading and legend provided for each supplementary file during the submission process will be used to create the figshare page, so please ensure these are accurate and informative so that your files can be found in searches. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI.
Once again, thank you for submitting your manuscript to Royal Society Open Science and I look forward to receiving your revision. If you have any questions at all, please do not hesitate to get in touch.

21-Jan-2020
Dear Dr Guo,
It is a pleasure to accept your manuscript entitled "Early warning of some notifiable infectious diseases in China by the artificial neural network" in its current form for publication in Royal Society Open Science. The comments of the reviewer(s) who reviewed your manuscript are included at the foot of this letter.
Please ensure that you send to the editorial office an editable version of your accepted manuscript, and individual files for each figure and table included in your manuscript. You can send these in a zip folder if more convenient. Failure to provide these files may delay the processing of your proof. You may disregard this request if you have already provided these files to the editorial office.
You can expect to receive a proof of your article in the near future. Please contact the editorial office (openscience_proofs@royalsociety.org) and the production office (openscience@royalsociety.org) to let us know if you are likely to be away from e-mail contact --if you are going to be away, please nominate a co-author (if available) to manage the proofing process, and ensure they are copied into your email to the journal.
Due to rapid publication and an extremely tight schedule, if comments are not received, your paper may experience a delay in publication. Royal Society Open Science operates under a continuous publication model. Your article will be published straight into the next open issue and this will be the final version of the paper. As such, it can be cited immediately by other researchers. As the issue version of your paper will be the only version to be published I would advise you to check your proofs thoroughly as changes cannot be made once the paper is published.
Please see the Royal Society Publishing guidance on how you may share your accepted author manuscript at https://royalsociety.org/journals/ethics-policies/media-embargo/.

Thank you for the recent review of our manuscript "Early warning of some notifiable infectious diseases in China by the artificial neural network". We appreciate the diligent efforts of the editor and reviewers to help improve our manuscript. We have revised our manuscript based on the reviewers' comments and the format requirements of the journal and wish to resubmit it for your consideration. Changes made in response to the concerns raised by the reviewers are indicated using the track changes feature in the revised manuscript. We have invited professional language experts at American Journal Experts to review the manuscript and revise the language.
All authors reviewed the final version of the revised manuscript and have approved it for publication. The manuscript has not been and will not be sent elsewhere for possible publication as long as it is under consideration by Royal Society Open Science. We hope that the revisions are acceptable and look forward to hearing from you soon.

Thank you very much for your letter and advice regarding our manuscript entitled "Early warning of some notifiable infectious diseases in China by the artificial neural network". We have revised our manuscript based on the comments and wish to resubmit it for your consideration. Changes made in response to the concerns are indicated using the track changes feature in the revised manuscript. We have invited professional language experts at American Journal Experts to revise the language.
Sincerely,

Zuiyuan Guo
Email: zuiyuanguo@163.com

We would like to express our sincere gratitude to the reviewers for their constructive comments.
Replies to the Reviewers
1. Make synaptic weights plural in the last sentence of the first paragraph in the section of Real-time recurrent learning.
Answer: I apologize for this mistake, which I have corrected (manuscript with tracked changes, page 6, line 12).
3. There is reference in the text to supplemental materials that demonstrate the model establishment process as well as algorithms, however, no supplemental materials are provided. These may address some of the items discussed above, however, it cannot be determined given the absence of this information.
Answer: I apologize for not providing the supplemental material, which I have submitted to the journal.

5. The authors should provide a better explanation on how the model determines and makes the decision (deciding threshold) to execute the early warning/alert this wasn't explained especially clearly in the text and discussion section.
Answer: We have explained how the model decides to execute an early warning in detail in the methods section (page 7, lines 6-18) and in the results section (page 12, lines 2-4).
6. In the discussion section, alternative modeling approaches should be suggested that may be perhaps better suited to the targeted end use case of this approach. For example, using incidence of events as a time series and forecasting future trends over N months in advance. For example, can autoregressive modeling of each incidence of event be incorporated just as effectively, or time delay neural network models? If this approach is superior additional section in the discussion section should provide significant justification as to why.
Answer: Some alternative modeling approaches have been referenced in the first paragraph of the discussion section (page 15, lines 13-16), and we have described some unique advantages of our model at the end of this paragraph (page 15, lines 22-24; page 16, lines 1-3).

Answer: We have moved some contents to the methods section for clarification (page 16, lines 18-23; page 17, lines 2-4). Further, we have added a description of the iterations to the methods section (page 12, lines 2-4). In addition, Figure 4 has been improved to highlight the alerts with grey circles (page 14, Figure 4).
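For readers who want a concrete point of comparison, the autoregressive alternative raised in comment 6 could be sketched roughly as below. This is a minimal illustration only, not part of the manuscript or its supplemental material: the function names `fit_ar_lag` and `forecast_next`, the single-lag model form, and the toy series are all assumptions made for the example.

```python
from statistics import mean

def fit_ar_lag(series, lag=1):
    """Least-squares fit of x[t] = a + b * x[t - lag]; returns (a, b)."""
    xs = series[:-lag]   # predictor: value `lag` steps earlier
    ys = series[lag:]    # response: current value
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b

def forecast_next(series, a, b, lag=1):
    """One-step-ahead forecast from the fitted coefficients."""
    return a + b * series[-lag]

# Toy series (hypothetical) that follows x[t] = 2 + 0.5 * x[t - 1] exactly.
toy = [10.0, 7.0, 5.5, 4.75, 4.375]
a, b = fit_ar_lag(toy, lag=1)
next_value = forecast_next(toy, a, b, lag=1)
```

With monthly case counts, setting `lag=12` would regress each month on the same month of the previous year, which is one simple way to account for seasonality in such a baseline.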