OpenWorm: overview and recent advances in integrative biological simulation of Caenorhabditis elegans

The adoption of powerful software tools and computational methods from the software industry by the scientific research community has resulted in a renewed interest in integrative, large-scale biological simulations. These typically involve the development of computational platforms to combine diverse, process-specific models into a coherent whole. The OpenWorm Foundation is an independent research organization working towards an integrative simulation of the nematode Caenorhabditis elegans, with the aim of providing a powerful new tool to understand how the organism's behaviour arises from its fundamental biology. In this perspective, we give an overview of the history and philosophy of OpenWorm, descriptions of the constituent sub-projects and corresponding open-science management practices, and discuss current achievements of the project and future directions. This article is part of a discussion meeting issue ‘Connectome to behaviour: modelling C. elegans at cellular resolution’.


Introduction
In 2011, the OpenWorm project was launched with the mission of building the world's first detailed biophysical simulation of the nematode Caenorhabditis elegans [1,2]. In addition to the ambitious scientific goals, a unique aspect of the project is the fully open science, distributed research framework in which the work would take place. In this article, we look at the past, present and future of OpenWorm. What has it achieved in the period since its foundation, what are the important next steps and what can others learn from this experience?
A unifying principle underpinning OpenWorm is the application of an engineering approach to the challenge of managing biological complexity [3]. Modern software engineering has given us the tools to keep track of the hundreds of thousands of details of which complex physical systems are composed. The synergy between human and machine in computer-assisted modelling can allow for deeper reasoning than either a human or computer alone. In industrial manufacturing, for example, advances in engineering software have enabled materials simulations that allow mechanical engineers to test many different mechanisms in silico before the manufacturing process [4]. While the fields of computational biology and computational neuroscience have made significant advances over their multi-decade history, simulations have only had a limited impact on the biological thinking process when compared with other disciplines in the physical and engineering sciences [5].
What level of complexity should our model aim for? Our perspective is the following: an integrative model need not incorporate any more detail than the individual models the research community has already produced. In other words, we take a holistic approach in which individual models, which may operate at multiple scales, are thoughtfully integrated into a unified computational platform, providing a global view of the entire organism. As we will discuss below, the lowest level of biological detail that the OpenWorm project incorporates is that of ion channel models which underpin membrane potential dynamics. Examples such as the whole cell model of Karr et al. [6] demonstrate that this kind of 'holistic biology' can lead to valuable insights into underlying biological function [6]. The purpose of this work is integrative and allows us to extract even greater value from the knowledge the scientific community has already produced. Nowhere is the need for models that encompass multiple scales more evident than in the hermaphrodite nematode's network of 302 neurons, where simple crawling and swimming behaviours remain unexplained [7]. Despite decades of effort, we struggle to describe how individual neurons give rise to such diverse organismal behaviour. Our belief is that a computational platform in which an organism's behaviour arises from lower-level biological models will come to play a significant role in advancing the field.
In the field of C. elegans biology, there has been significant effort to collect comprehensive anatomical and other structural data about the nervous system, ranging from the electrical and synaptic connectome [8], to cholinergic and GABAergic neurons [9,10], to the extrasynaptic connectome of neuropeptides [11]. The purpose of generating these 'map'-like datasets is to communicate the relationships between biological entities. Unfortunately, the complexity of such datasets places severe limitations on their intelligibility. This problem is not unrelated to modern genomics, where the many-tangled webs of relationships between hundreds of thousands of genes and gene products demand computational tools to assist in their understanding. It is with this complexity in mind that the OpenWorm project has taken upon itself to integrate the disparate and heterogeneous physiological maps and related datasets generated by the C. elegans community into a coherent software framework. Efforts such as PyOpenWorm (described below) are one such example in OpenWorm where publicly available data are assembled into a graph database and Python application programming interface, enabling users to query multiple datasets about C. elegans neuronal structure. By creating an open, shared repository and query tool for these data, the fruits of collective labour become integrated into a shared structure that amplifies the impact of the entire community's research output. Moreover, the need to arrive at a global view of relevant datasets has allowed us to identify key areas where new data should be collected, potentially taking advantage of novel experimental apparatus such as robotic patch-clamp set-ups. Ultimately, we expect that unified platforms for data integration will dovetail with other contemporary efforts in the life sciences to increase the robustness and exchangeability of datasets and models [12][13][14].
In the scientific community, assembling datasets solely for the purpose of consolidation has often led to the emergence of multiple, redundant standards. In the OpenWorm project, our fundamental aim is to curate datasets and mathematical models in a manner that facilitates dynamic simulations of biological function. Theoretical biophysicists have produced a rich literature of quantitative models of C. elegans physiology, ranging from membrane potential dynamics, to neuromuscular coupling, to the fluid dynamics of body movement. Integrating these individual models into a global, composite simulation creates an additional check on the underlying datasets themselves. In addition, the simulation enables the construction of complex hypotheses which researchers can further investigate through theoretical or experimental means [5].
In deciding on the level of biological detail we wish to incorporate, we have agreed upon an approach that incorporates biomechanics as a critical component of understanding the nematode in the context of its environment [15]. In addition to the biological implications, maintaining biological realism may have implications well beyond understanding C. elegans. Indeed, researchers in the artificial intelligence community have posited that sensorimotor feedback may play a role in allowing future AI systems to learn from experience more efficiently than current data-hungry systems based on deep learning [16]. As such, we have unified a biomechanical model of C. elegans, Sibernetic [17,18], that incorporates interactions with a fluid or gel environment, with a modelling infrastructure for complex neuronal networks, c302 [19].
Our ultimate vision for OpenWorm is to provide a computational platform that allows for simulations to become seamlessly integrated into biological thought. Rather than replacing existing theoretical or experimental methods, our vision is to take advantage of the powerful tools of modern software engineering to maximally enable the research community and leverage long-standing intellectual traditions and biological insights [5]. We can imagine a number of possible applications for such a platform. Because we have complete control over all details of the simulation, we can effortlessly create knockouts, where, for example, all synaptic connections to or from a specific cell can be removed. We can simulate known mutants that have ion channels with different properties and observe their behaviour. We can simulate the effects of drugs by modelling their impact on ion channels, potentially paving the way to using simulations as a way to generate hypotheses for new uses of existing pharmacological agents and for discovering new ones. If successful in the C. elegans community, we would hope this approach could assist in the understanding of [...]

OpenWorm is organized into a number of sub-projects, several of which are described in more detail in this issue. In this section, we will give a condensed overview of the core of the platform. A 'simulation stack' refers to the set of integrated software tools that are used to run a simulation. It is called a 'stack' because each tool can be thought of as existing at a certain level in a hierarchy of abstraction and information flow. For instance, at the lowest level, we have ion channel models and connectomes. At the next level, we have models of neuromuscular coupling. And finally, the output generated by the connectome can be fed into a simulation of the body movement and environment.
While each element of this simulation stack could form the basis for an independent research project, our aim is to use best practices from the software industry to integrate these tools into a single software framework. Figure 1 shows the different components of the OpenWorm simulation stack and their relationships. Figure 1a shows a breakdown of the contents of the OpenWorm simulation stack described as components of software. Inputs and outputs to the software components are depicted with arrows showing how they relate to the core modules. For the PyOpenWorm and ChannelWorm software projects, inputs include anatomical and structural data from the worm's nervous system and knowledge about ion channels, respectively. These data are fed into the c302 software component, which constructs systems of equations that are used to simulate the membrane dynamics of the nervous system at multiple levels of detail ranging from simple integrate-and-fire neurons to multicompartment neuronal models [19]. The outputs of the c302 simulation include muscle activation signals which form the inputs to the Sibernetic system. We are planning on incorporating feedback from Sibernetic to c302 that represents sensory signals generated from the worm body's posture as well as interactions with the environment. Additionally, Sibernetic takes as an input structural and biophysical data about the worm. The output from Sibernetic, the outline of the worm's body as it bends and moves over time, can be fed into the movement validation software system, where comparisons with videos of real worms are used to assess the global model's biological validity. These two systems will be incorporated into a web-based graphical user interface framework that provides a visual interface to the end user via WormSim/Geppetto [20].
An optimization block in the diagram indicates where the free parameters in the models can be filled in by tuning model parameters of single neurons to match experimental data [21,22].
In figure 1b, we show a simplified schematic that breaks down the integrated c302/Sibernetic system into mathematical components. The system is initialized with relationships, parameters and structure derived from databases that have been populated with information about C. elegans physiology. In the current version of the simulation, we begin with neuron membrane potential dynamics (N_I) that are set manually. From those dynamics, the electrical activities of the body wall muscle cells (N_M), i.e. those muscles receiving direct synaptic input from neurons, are calculated. This activation also results in dynamical changes in the muscles' internal calcium concentration (M_c). These first components of the simulation are carried out in the c302 framework, which executes a NeuroML-based model in the NEURON simulation engine [19]. The calcium dynamics of the muscle cells calculated by c302 (M_c) are passed into Sibernetic as activation signals. These activation signals are converted into forces that cause activated muscle cells lining the body model to contract (M_s). The combination of the contraction states of all the muscles leads to the state of the simulated body model as a whole (S), calculated via the predictive-corrective incompressible smoothed particle hydrodynamics (PCISPH) algorithm. This can then be compared against the movement of real worms once brought into a comparable format [23]. While there is currently only a uni-directional flow of information from c302 to Sibernetic via muscle activation signals, we are developing a reverse step where forces on the skin of the worm body model lead to activation signals of sensory neurons.
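The uni-directional flow described above (neuron dynamics, then muscle electrical activity, then muscle calcium, then contraction forces) can be illustrated with a toy pipeline. This is only a sketch under strong assumptions: the function names, the sinusoidal neuron drive, the identity weight matrix, the first-order calcium relaxation and all numerical constants are hypothetical stand-ins, and none of this is the c302 or Sibernetic code. The final step, computing the body state S with PCISPH, is deliberately left out.

```python
import math


def neuron_drive(t, n_neurons=4, freq_hz=1.0):
    """Stage 1: manually set neuron activity (assumed: phase-shifted sinusoids)."""
    return [math.sin(2 * math.pi * freq_hz * t - i * math.pi / 2)
            for i in range(n_neurons)]


def muscle_activity(neuron_values, weights):
    """Stage 2: each muscle sums its weighted synaptic input, rectified at zero."""
    return [max(0.0, sum(w * v for w, v in zip(row, neuron_values)))
            for row in weights]


def update_calcium(calcium, activity, dt, tau=0.1):
    """Stage 3: calcium relaxes toward the muscle's activity (assumed first-order)."""
    return [c + dt * (a - c) / tau for c, a in zip(calcium, activity)]


def contraction_force(calcium, max_force=1.0):
    """Stage 4: map calcium to a bounded contraction force."""
    return [max_force * min(1.0, max(0.0, c)) for c in calcium]


def run(steps=200, dt=0.005, n=4):
    """Step the four stages forward in time and record the forces per muscle."""
    # Hypothetical wiring: neuron i drives muscle i only.
    weights = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    calcium = [0.0] * n
    trace = []
    for k in range(steps):
        activity = muscle_activity(neuron_drive(k * dt, n), weights)
        calcium = update_calcium(calcium, activity, dt)
        # In the real stack these forces would be handed to the body simulation.
        trace.append(contraction_force(calcium))
    return trace


if __name__ == "__main__":
    forces = run()
    print(len(forces), "time steps,", len(forces[0]), "muscles")
```

Keeping each stage as a separate function mirrors the point made in the text: information currently flows one way, so a future sensory feedback step would amount to passing a body-derived signal back into the first stage.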

(ii) PyOpenWorm
Biological data are often weakly structured and heterogeneous, which creates fundamental problems for computational platforms that rely on these data. In addition, discrepancies that are frequently seen between database formats and term definitions create even further difficulties for end users. The challenges in making use of biological data are common across all subfields of computational biology, with C. elegans being no exception. PyOpenWorm (https://github.com/openworm/pyopenworm) is a Python package intended to simplify access to a range of structured data on C. elegans anatomy and physiology. It is a data access layer for C. elegans information, where users can query data across multiple scales of the worm's biology. The heterogeneous nature of C. elegans biology requires that different underlying technologies be used to store different types of data. For instance, an RDF semantic graph representation is useful for representing neuronal structural properties such as ion channel expression and the density and type of neurotransmitter receptors, whereas a NeuroML representation is most appropriate for storing model morphology and simulation parameters [24]. PyOpenWorm abstracts away the underlying technologies, so the user can query the system in a manner that is intuitive for researchers who are already familiar with the worm's biology. The resulting data can be used directly or as part of a multistage software pipeline. The software project is implemented in the Python programming language and the code is available on GitHub along with the other sub-projects of OpenWorm. Data from reliable external sources, most often published journal articles, are collected into a single directory in the PyOpenWorm repository. These datasets can take the form of structured spreadsheet files or even other relational databases. For quality control, we only consider data that have an original source associated with them.
Currently, data are collected from the literature and other secondary sources, such as WormBase [25] and WormAtlas [26]. When a user or program connects with PyOpenWorm's database, they have access to all of the data through a simple Python library. Table 1 lists current data sources that are part of PyOpenWorm.
Although PyOpenWorm's primary current use cases are for storing static data and models, its fundamental architecture anticipates future needs once members of the research community begin to make use of the OpenWorm tool stack as part of their daily research. In particular, ongoing development of PyOpenWorm is aimed at ensuring that the system can store metadata and simulation results, so that this output can subsequently be interrogated and analysed as part of the research process.
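To make the idea of a source-tracked data access layer concrete, the following sketch shows one way such a layer could refuse facts that lack an original source and answer simple queries. It is not the PyOpenWorm API: the class names `Fact` and `WormDataStore`, their methods, and the placeholder neuron name and source string are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One statement about the worm, always paired with where it came from."""
    subject: str    # e.g. a neuron name
    predicate: str  # e.g. "neurotransmitter"
    value: str
    source: str     # citation or identifier of the original source


class WormDataStore:
    """Hypothetical in-memory store that hides the storage backend from the user."""

    def __init__(self):
        self._facts = []

    def add(self, fact):
        # Quality control: reject anything without an original source.
        if not fact.source.strip():
            raise ValueError("every fact needs an original source")
        self._facts.append(fact)

    def query(self, subject=None, predicate=None):
        """Return all facts matching the given fields (None matches anything)."""
        return [f for f in self._facts
                if (subject is None or f.subject == subject)
                and (predicate is None or f.predicate == predicate)]


if __name__ == "__main__":
    store = WormDataStore()
    # Placeholder values, not real biological data.
    store.add(Fact("NEURON_A", "neurotransmitter", "PLACEHOLDER", "placeholder-citation"))
    for fact in store.query(subject="NEURON_A"):
        print(fact.predicate, "=", fact.value, "(source:", fact.source + ")")
```

In a fuller design the same `query` interface could sit in front of an RDF graph for structural facts and NeuroML files for morphology, which is the separation of concerns the text describes.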

(iii) ChannelWorm
As we discussed above, ion channels represent the most granular level of biological detail that the OpenWorm simulation incorporates. Ion channels are pore-forming proteins, found in the membranes of all cells. They are responsible for many known cellular functions including shaping action potentials and gating the flow of ions across the cell membrane. Remarkably, most nematode ion channels are conserved across vertebrate species [28]. Because of their widespread relevance for biology, many electrophysiological experiments have been focused on ion channels and transporter functional genomics in C. elegans [29-34]. Although much of this work is experimental, computational work has also been directed at integrating ion channel models into larger-scale simulations [35-40]. One such example (outside C. elegans biology) is the Blue Brain Project, which recently unveiled a detailed simulation of a rat cortical microcolumn [41], taking advantage of an extensive repository of curated data and models of ion channels [42].
We have chosen ion channel models as the most granular level of detail with which to simulate the nematode for several reasons. For instance, insights into drug development would not be possible without an understanding of the action of the major neurotransmitter species on Na+, K+ and Ca2+ currents. Fortunately, incorporating ion channel models is a tractable approach and there is no need to limit ourselves to simulations of more abstract neurons. Moreover, the specific dynamics of ion channels themselves are key components of the models of neuromuscular coupling that we use. And as we argued above, biomechanics is a central component of our scientific roadmap.
As part of the OpenWorm project, we created ChannelWorm (https://github.com/openworm/channelworm and https://chopen.herokuapp.com) in order to (i) integrate and structure data related to ion channels in C. elegans, (ii) digitize and curate electrophysiological data from publications, (iii) develop application programming interfaces for accessing these data and (iv) build ion channel models based on experimental data. As the project has progressed, we have found ourselves in the unique position of attempting to develop a global view of the current state of C. elegans ion channel modelling. One of the major lessons we have learned is that patch-clamp data are only available for a small minority of ion channels expressed in the nematode. Consequently, a significant undertaking is to build Hodgkin-Huxley models for ion channels that lack these data based on homologous channel types from other organisms. After a manual curation process in which contributors digitize electrophysiological plots, kinetic parameters are derived from these data using genetic algorithms [22,43,44] and related techniques such as particle swarm optimization. Ultimately, these ion channel models are translated into the NeuroML markup language [45,46], which allows for consistent representation of neuronal biophysics, anatomy and network architecture for use in subsequent computational simulations.

(iv) Software testing and model validation
As a software project, OpenWorm shares many commonalities with any large-scale software engineering endeavour in industry.
Unit testing is a key element of modern software engineering which uses semi-automated checklists to ensure the correctness of software. For instance, a company developing a word processor might have a test that verifies that whenever the mouse clicks a specific region in the upper left-hand corner of the screen, the 'File' menu opens and not the 'Edit' menu. Likewise, other tests might verify that files can be appropriately written to disk or that connectivity with printers and other network devices is working. From its inception, OpenWorm has incorporated best practices from the software industry, including unit testing, across all of the diverse sub-projects, especially PyOpenWorm [47]. Examples of unit tests used by OpenWorm include verifying that entries can be added to and removed from the PyOpenWorm database, that every biological fact, such as an ion channel parameter, has an associated PubMed identifier and that functions implement error handling correctly. As a scientific research project that incorporates dynamic models, another class of tests crucial to our effort are model validation tests. In contrast to simple unit tests, which verify that a discrete piece of code has the correct behaviour, model validation tests verify that the output of an entire dynamic model corresponds to known behaviour from the academic literature. For instance, alongside the ion channel curation and parameter extraction tasks in ChannelWorm, a parallel effort is aimed at implementing validation tests for each of these models using the Python library SciUnit [48]. The validation process uses curated datasets of ion channel behaviour to instantiate statistical tests analogous to those a researcher would use when developing such a model. By incorporating this process into the software development workflow, we can ensure that developers and researchers are alerted if any of the models at any level of abstraction are not in correspondence with known behaviour determined by experimentalists [47,49,50].
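The difference between a unit test and a model validation test can be illustrated with a small sketch. It does not use the SciUnit API. Instead, it assumes a Boltzmann steady-state activation curve as the channel model, a root-mean-square error as the score, and a tolerance of 0.05 as the pass criterion. The 'observations' are synthetic points generated from a reference curve purely for illustration and are not experimental data, and all parameter values are hypothetical.

```python
import math


def boltzmann(v_mv, v_half_mv, slope_mv):
    """Steady-state activation as a function of membrane potential (mV)."""
    return 1.0 / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))


def validate(model, observations, tolerance):
    """Compare model output with (voltage, activation) pairs.

    Returns (passed, rmse), where passed is True when the root-mean-square
    error does not exceed the tolerance.
    """
    squared = [(model(v) - y) ** 2 for v, y in observations]
    rmse = math.sqrt(sum(squared) / len(squared))
    return rmse <= tolerance, rmse


# Synthetic stand-in for a digitized activation curve (NOT real data).
observations = [(v, boltzmann(v, -20.0, 8.0)) for v in range(-80, 41, 20)]


def candidate(v):
    """A candidate model that matches the reference curve."""
    return boltzmann(v, -20.0, 8.0)


def shifted(v):
    """A candidate model whose half-activation voltage is badly off."""
    return boltzmann(v, 10.0, 8.0)


if __name__ == "__main__":
    for name, model in [("candidate", candidate), ("shifted", shifted)]:
        passed, rmse = validate(model, observations, tolerance=0.05)
        print(name, "passed" if passed else "failed", "rmse=%.3f" % rmse)
```

Run as part of a continuous integration workflow, a check of this kind would flag a model whose behaviour drifts away from the curated data even when every individual function still passes its unit tests.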

(b) Outreach, education and sister projects

(i) Web-based visualization of OpenWorm models
We recognize that many motivated and talented citizen scientists are not experienced in software engineering and data science. Consequently, to make the OpenWorm model as accessible as possible, we have worked to create simple and intuitive applications that can be used for exploratory purposes and which can serve as a fun and compelling entry point to the project. Initial work to accomplish this was the development of the WormSim (http://wormsim.org) prototype. WormSim was launched via a successful Kickstarter campaign in 2014, but this has been superseded by more advanced approaches to visualizing these models. Recent developments with the Geppetto platform (http://geppetto.org) [20] for multi-scale biological simulation, which was the underlying platform for WormSim, have enabled users to visualize the C. elegans connectome within the body of the worm itself, and visualize and explore changing dynamics in the connectome to see the effect on swimming and crawling. This version of the visualization is currently being incorporated into the OpenWorm simulation stack above in order to allow users to examine intermediate levels of the simulation (see [20], this issue, for visual examples).

(ii) Robotics
Because the scientific vision of OpenWorm places a key emphasis on biomechanics, we have multiple outlets for how the virtual nervous system simulation interfaces with the world. One is through a fully virtual body embedded in a virtual physical environment. Another is for the nervous system simulation to interact directly with a robotic body, a platform that provides a unique educational opportunity for newcomers to engage with the project. Figure 2a,c shows a top and side view of a prototype OpenWorm robot (https://github.com/openworm/robots) with major components denoted. The robot consists of nine articulated segments, each segment mounted on a pair of wheels. Locomotion is achieved, as it is in C. elegans, by moving in a snake-like manner that relies on surface friction. The wheels are not powered and exist solely to provide a suitable contact surface with the ground. Each segment is a three-dimensional printed component (figure 2b) that articulates with its neighbours via servos (figure 2d). The electronic components, consisting of the Raspberry Pi Zero microprocessor with wireless communication capabilities, are mounted on platforms fastened to several of the segments. A pulse-width modulation board distributes power and control signals from the Raspberry Pi Zero to the servos. Each servo is capable of maintaining a specified angular position that translates to inter-segment angular positioning. Figure 2b shows the designs for the three three-dimensional printed parts. These parts are specified in a common .stl file format that is editable and portable to most three-dimensional printers. On the left is the segment part. In the centre is the head that is envisaged to be mounted with sensors for food foraging and touch. On the right is one of the platforms for mounting the electronic components. Figure 2d shows how the segments are articulated.
A servo is mounted on the front top of the segment with its geared shaft extending into an aperture in the next forward segment. An arm secured to the gear allows for gear motion to drive angular movement between segments.
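As an illustration of how inter-segment servo angles could produce snake-like motion, the sketch below computes target angles for a travelling sine wave along the body. It is not taken from the OpenWorm robots repository. It assumes that nine segments imply eight servo-driven joints, that each servo is centred at 90 degrees, and that amplitude, frequency and wavelength take the hypothetical default values shown. Sending the angles to a pulse-width modulation board is hardware-specific and is omitted.

```python
import math


def joint_angles(t, n_segments=9, amplitude_deg=30.0, freq_hz=0.5,
                 wavelengths=1.0, centre_deg=90.0):
    """Target servo angles (degrees) at time t (seconds) for a travelling wave.

    Each joint follows the same sinusoid with a phase lag proportional to its
    position along the body, so the bend propagates from head to tail.
    """
    n_joints = n_segments - 1  # assumption: one servo between adjacent segments
    angles = []
    for j in range(n_joints):
        phase = 2 * math.pi * (freq_hz * t - wavelengths * j / n_joints)
        angles.append(centre_deg + amplitude_deg * math.sin(phase))
    return angles


if __name__ == "__main__":
    # Print the joint targets at a few time points.
    for step in range(3):
        t = step * 0.25
        print("t=%.2f s:" % t, ["%.1f" % a for a in joint_angles(t)])
```

Because the controller only needs one angle per joint per time step, the same function could later be driven by simulated muscle activation signals instead of a fixed sinusoid.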
Like WormSim, the robotics sub-project of OpenWorm is a key element of our education and outreach efforts. The accessibility and low cost of single-board computers like the Raspberry Pi make this an attractive and compelling introduction to the project for students of all ages, which exposes them to cutting-edge concepts at the intersection of software and robotics. Ongoing work in the robotics sub-project is aimed at incorporating models for food foraging and touch response, developing a new system-on-a-board processor that will also perform power and control distribution, utilizing laser-cut segments and providing a programming interface via Jupyter notebooks.

(iii) DevoWorm
Much of what we have described above pertains to simulations of the adult nematode. A complementary goal for computational research, with direct relevance for many members of the C. elegans experimental community, is to simulate embryogenesis and development in C. elegans. Given the knowledge of the embryonic cell lineage in C. elegans [51], one of the goals of DevoWorm, an ongoing sister project to OpenWorm with a parallel set of approaches (https://github.com/devoworm), is to apply a similar modelling technique of transforming datasets into computable forms and evolving their progression over time using mathematical models of biophysical developmental processes. It is currently divided into three loosely knit sub-projects: Developmental Dynamics, Cybernetics and Digital Morphogenesis, and Reproduction and Developmental Plasticity.
Developmental Dynamics currently involves using secondary data collected from embryos [52,53] along with bioinformatic and data science techniques to answer questions regarding the process of early embryogenesis and the timing of later morphogenesis. Cybernetics and Digital Morphogenesis has involved using cellular automata [54] or finite-element approaches [55] to model physical interactions during embryogenesis and morphogenesis. DevoWorm has also explored the use of cybernetic models and concepts to better understand the general process of embryogenesis [56]. Reproduction and Developmental Plasticity involves an evolutionary developmental biology approach [57] to understand C. elegans more generally. DevoWorm's existing datasets and papers include a focus on larval development and life-history processes. Taken together, these focus areas are beginning to draw additional interest into simulated embryogenesis and morphogenesis of C. elegans.

(c) Community management support

(i) Distributed scientific collaboration
A citizen science consortium with over 90 contributors from 16 different countries and no central source of funding, OpenWorm has been an organizational experiment in coordinating a distributed, international research effort with a highly fluid base of contributors. Freely available software tools have played a key role in project management and coordination. The focal point of much of our work is the diverse functionality of the GitHub platform [58], which allows us to use sophisticated, industrial-scale management tools for versioning the OpenWorm codebase as well as data from our university-based research partners.
Other platforms such as the Google Docs platform with spreadsheets, drawings, slides and forms have also been critical for the creation and distribution of shared materials.
Teleconferencing systems like Google Hangouts have enabled building trust, camaraderie and working relationships among contributors living in many time zones across the globe. Google Calendar has been invaluable for scheduling, as has the Doodle poll tool for coordinating meeting times. The functionality of the Slack chat platform has played a crucial role in managing the many asynchronous conversations related to software development and the scientific roadmap. Given the volume of high-quality tools such as Amazon Web Services, Docker, Slack and many others that are available for use in the modern era of software engineering, the challenges we faced in the initial stages of building the organization often amounted to making the right choices about which tools to use on the basis of the cohesiveness of their relationships with one another. Consequently, the integration points between these different systems have been one of the concrete deliverables of OpenWorm for other organizations interested in distributed community management.
Because of its open source and volunteer-based nature, timelines for task completion are often fluid. Coordinating the project requires the discovery of synergy among collaborators based on individual interests and research goals. Managing the project requires creating the potential for others to contribute and build, connecting that potential to the right individuals at the right time and ensuring that there is sufficient flexibility in the high-level vision so that the project can make progress even if all directions are not advancing at a given moment.
The 'long memory' of online resources is helpful in this regard. Issues that are captured in GitHub may sit inactive for months before the right person comes along who has the skill set and motivation to solve them. Consequently, the tolerance of contributors to uncertainty is an important component of working well within an open community. We have taken inspiration from the open source programming movement that follows a similar philosophy. In open source software development, new volunteers are encouraged to take personal responsibility and leadership for creating new directions that excite them. A unique aspect of research and development in an open community is the rate at which volunteers enter the project eager to learn and to contribute their time and energy to a shared effort that is larger than any one individual [59].

(ii) Mentorship and training through badges
Open-science projects face a very different set of management challenges when compared with university or industry-based research initiatives. In particular, mechanisms are needed to assist new contributors to develop relevant technical skills and build familiarity with the project. To facilitate this process, OpenWorm has taken advantage of a free service called BadgeList, which allows for the creation of digital 'micro-credentials' certifying that an individual is able to complete a focused set of tasks (http://badgelist.com/openworm). Upon successfully learning and answering a set of test questions, a user can earn a badge, indicating that they have acquired a specific skill set. Example badges currently used by the project include basic and advanced GitHub/version control, Hodgkin-Huxley equation basics and literature mining. The collection of badges has been growing over the past several years, and many new contributors have found the system to be a valuable entry point to the project.

(iii) Volunteer composition and project leadership
We have been fortunate that OpenWorm has attracted an incredibly diverse set of volunteers with respect to nationality and intellectual background. As we mentioned above, we have over 90 volunteers from 16 different countries who have made substantive contributions to the project. Moreover, the contributors have come from a variety of academic backgrounds, including theoretical and experimental biology, physics and computer science, to name just a few. In addition to several core members who are tenure-track faculty at major universities, many of our volunteers are professional software engineers. One area where we are keen to make more progress is the gender diversity within the project. We have recognized this as a priority and have advertised on social media our active commitment to providing a safe and welcoming space for all individuals. We welcome any input on how we might go about achieving a more equitable gender balance.
rstb.royalsocietypublishing.org Phil. Trans. R. Soc. B 373: 20170382
We are frequently asked about project leadership, decision-making and conflict resolution. Like many open-source projects, our list of contributors has a long tail, with a few core contributors assuming leadership roles and many others making periodic, smaller contributions [60,61]. To date, we have had no formal process for assigning roles, and we have found that experienced and enthusiastic volunteers often establish themselves as leaders without any prompting. Subgroups dedicated to topics ranging from engineering, to basic science, to community outreach organize via dedicated channels on Slack, and new volunteers have the opportunity to contribute to whichever efforts resonate with them the most. We have actively worked to ensure a culture where open deliberation takes place with all contributors receiving a voice. With the formal incorporation of the OpenWorm Foundation as an independent, non-profit research organization, we have formed an official scientific advisory board that is responsible for establishing the scientific direction of the effort. Thus far, we have found that input from the scientific advisory board has organically filtered into the project in an effective manner. As the project grows, we may consider formalizing the roles of full-time staff and primary collaborators.

Recent progress
What progress has the OpenWorm project made since the publication of our first overview paper [1]? The number of contributors has grown substantially, and the codebase has matured sufficiently that new volunteers can join and begin to contribute by tackling open issues on GitHub. Building on several years of experience managing an open-science project, as well as our collective experience building software in commercial and academic settings, we have refined many of our management practices to better serve the needs of a fluidly shifting base of contributors. The badge system described above has been used by several dozen new members, and we have been holding weekly 'office hours' on Slack where senior contributors are available to answer questions.
With regard to more traditional academic metrics, the special issue in which this article appears will include the publication of several new articles featuring foundational modelling, simulation, data management and data presentation technologies developed as a result of OpenWorm-led collaborations [19,20,23,48]. Before this special issue, we had published a handful of papers on several different facets of the project in a spectrum of journals focused on the computational biological sciences. We have been involved with multiple academic conferences and have built university-based collaborations with six different research laboratories in four countries. Equally important, we have formally been incorporated as an independent non-profit research organization, the OpenWorm Foundation. This foundation has allowed us to assemble an accomplished scientific advisory board that is helping to guide us through this critical infrastructure-building phase of the project.
A question frequently asked of the project is 'when can we turn the simulation on?' The simple answer is that prototype code to do this already exists and is available online on GitHub (http://github.com/openworm/openworm). The C. elegans connectome contained in c302 is able to drive body movement in the Sibernetic platform for fluid dynamic simulations. However, the level of detail that we have incorporated to date is inadequate for biological research. A key remaining task is to complete the curation and parameter extraction of Hodgkin-Huxley models for ion channels to produce realistic dynamics in neurons and muscles. Once this task is complete, we expect that the platform will incorporate a sufficiently granular level of detail to be of interest to researchers in the field. Table 2 summarizes our accomplishments.
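As a rough illustration of what a Hodgkin-Huxley description of a single ion channel involves, the following Python sketch simulates the current through one hypothetical voltage-gated channel during a voltage-clamp step. The Boltzmann steady-state curve, the single voltage-independent time constant, the function names and every parameter value are assumptions chosen for illustration; none of them is taken from ChannelWorm or from any C. elegans recording.

```python
import math

# Hypothetical parameters for an illustrative potassium-like channel.
# These values are placeholders, not curated measurements.
HYPOTHETICAL_CHANNEL = {
    "g_max_ns": 10.0,    # maximal conductance (nS)
    "e_rev_mv": -80.0,   # reversal potential (mV)
    "v_half_mv": -20.0,  # half-activation voltage (mV)
    "slope_mv": 10.0,    # Boltzmann slope factor (mV)
    "tau_ms": 5.0,       # activation time constant (ms), assumed constant
    "power": 1,          # exponent applied to the gating variable
}


def m_inf(v_mv, v_half_mv, slope_mv):
    """Boltzmann steady-state activation of a single gate at voltage v_mv."""
    return 1.0 / (1.0 + math.exp((v_half_mv - v_mv) / slope_mv))


def clamp_current(v_hold_mv, v_step_mv, params, t_end_ms=50.0, dt_ms=0.01):
    """Simulate a step from v_hold_mv to v_step_mv at t = 0.

    Returns a list of (time_ms, current_pa) pairs. The gate starts at its
    steady state for the holding voltage and relaxes exponentially towards
    its steady state for the step voltage.
    """
    m = m_inf(v_hold_mv, params["v_half_mv"], params["slope_mv"])
    target = m_inf(v_step_mv, params["v_half_mv"], params["slope_mv"])
    # For a constant clamp voltage, dm/dt = (target - m) / tau has an
    # exact exponential solution, so no numerical integrator is needed.
    decay = math.exp(-dt_ms / params["tau_ms"])
    driving_force_mv = v_step_mv - params["e_rev_mv"]
    n_steps = int(round(t_end_ms / dt_ms))
    trace = []
    for i in range(n_steps + 1):
        # nS * mV = pA
        current_pa = params["g_max_ns"] * m ** params["power"] * driving_force_mv
        trace.append((i * dt_ms, current_pa))
        m = target + (m - target) * decay
    return trace
```

In this simplified picture, parameter extraction means choosing values such as `v_half_mv`, `slope_mv` and `tau_ms` so that simulated traces of this kind match published recordings for each channel.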

Discussion
By organizing the research output of an entire community into a shared structure, integrative simulations have the potential to advance biological thinking significantly. Rather than being replacements for existing theoretical or experimental techniques, these composite simulations should be viewed as powerful tools to augment the thought process and technical toolbox of scientists. Figure 3 summarizes how integrative simulations can be an organic part of the research process. The same observations that researchers use to form mental models and hypotheses are first organized into databases such as PyOpenWorm and ChannelWorm (arrow a). Researchers benefit from these databases directly, for example, by having on-demand access to useful facts about C. elegans physiology (arrows b and c). Subsequently, these datasets are formalized into mathematical models, a process that is itself intrinsically valuable to researchers as part of hypothesis generation (arrow e). Most significant are the final steps of this sequence, in which the datasets and mathematical models are integrated into a larger, composite simulation. By studying the outputs of simulations, analogous to the outputs of experiments, researchers are able to augment their intuition and mental models about biological function in ways that would not be possible through experimentation alone (arrows f and g). While these agendas are in their nascent stages, many of the key components of the OpenWorm simulation platform will only need to be built once and can then be re-used community-wide in day-to-day research. At the OpenWorm Foundation, we are assembling the necessary technical and organizational infrastructure to build the world's first integrative biological simulation of the nematode C. elegans. 
We hope that subsequent efforts will benefit from our experience, and, in the future, we hope to see the vision of integrative biological simulations extend to many other model organisms and have a widespread scientific impact.

(a) Future directions
Looking forward, there are two thrusts for the project: the first is primarily scientific, and the second is primarily concerned with engineering and tool-building.
The tap withdrawal circuit is a well-studied experimental system that we are currently investigating, and it has focused our transition from infrastructure development to actively using the platform for scientific research [62,63]. Simulating this behaviour will require closing the loop between sensation, motor output and environmental activity. In addition, the nervous system model must be able to transform an external input into a switch of behaviour from crawling forwards to backwards. As a prerequisite, we must also implement a version of forward and backward locomotion based on the activity of motor neurons driving the muscles of the model. We are incorporating SciUnit-based model validation tests, which allow us to constrain the simulation to match experimental data at different scales and modalities. A critical component of this research direction will be efficient optimization algorithms to help fill in data gaps of free parameters that are currently unknown within the biological community. Once a working prototype of tap withdrawal is completed, we can look at perturbing the model in ways that are consistent with mutations known to have an impact on neuronal or other cellular activity. This will be a valuable test of the ability of the OpenWorm integrated model to capture essential dynamics despite significant biological variation.
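The combination of validation tests and parameter optimization described above can be sketched in a few lines of Python. This is a minimal illustration in the spirit of such tests and does not use the SciUnit API; the class and function names, the toy model and all numbers are hypothetical placeholders, not experimental data or OpenWorm code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass(frozen=True)
class Observation:
    """Summary of an experimental measurement (illustrative values only)."""
    mean: float
    std: float


def z_score(prediction: float, observation: Observation) -> float:
    """Distance of a model prediction from the observed mean, in units of std."""
    if observation.std <= 0:
        raise ValueError("observation.std must be positive")
    return (prediction - observation.mean) / observation.std


def fit_free_parameter(
    model: Callable[[float], float],
    candidates: Iterable[float],
    observation: Observation,
) -> Tuple[float, float]:
    """Exhaustive search: return the candidate whose prediction has the
    smallest absolute z-score, together with that z-score."""
    best = min(candidates, key=lambda p: abs(z_score(model(p), observation)))
    return best, z_score(model(best), observation)


def toy_model(gain: float) -> float:
    """Hypothetical stand-in for a full simulation run: maps an unknown
    synaptic gain to a predicted response latency in seconds."""
    return 2.0 / gain


# Made-up target values used purely to exercise the code.
observed_latency = Observation(mean=0.5, std=0.1)
best_gain, score = fit_free_parameter(toy_model, [1.0, 2.0, 4.0, 8.0], observed_latency)
```

A real workflow would replace `toy_model` with a full simulation run and the exhaustive search with a more efficient optimizer, but the structure of scoring each prediction against an observation and keeping the best-scoring parameter stays the same.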
Our engineering aim at present is to reach a steady state where the fundamental infrastructure of OpenWorm has stabilized and can be used for scientific research. Active ongoing infrastructural development in the project includes expanding the functionality of PyOpenWorm to store metadata and provenance of simulations, using this framework to build a database of simulation results, completing the ion channel curation and parameter extraction tasks in ChannelWorm, building an automated system for identifying new publications on C. elegans relevant for OpenWorm and expanding the automated framework for verifying the correctness of curated scientific models, to name just a few. More information about making a contribution is available on our website and via our volunteer contribution form (http://bit.ly/OpenWormVolunteer).
Data accessibility. All data and code associated with OpenWorm are available through our GitHub repository at https://github.com/OpenWorm.