The evolution of multicellular complexity: the role of relatedness and environmental constraints

A major challenge in evolutionary biology has been to explain the variation in multicellularity across the many independently evolved multicellular lineages, from slime moulds to vertebrates. Social evolution theory has highlighted the key role of relatedness in determining multicellular complexity and obligateness; however, there is a need to extend this to a broader perspective incorporating the role of the environment. In this paper, we formally test Bonner's 1998 hypothesis that the environment is crucial in determining the course of multicellular evolution, with aggregative multicellularity evolving more frequently on land and clonal multicellularity more frequently in water. Using a combination of scaling theory and phylogenetic comparative analyses, we describe multicellular organizational complexity across 139 species spanning 14 independent transitions to multicellularity and investigate the role of the environment in determining multicellular group formation and in imposing constraints on multicellular evolution. Our results, showing that the physical environment has impacted the way in which multicellular groups form, highlight that environmental conditions might have affected the major evolutionary transition to obligate multicellularity.

Your manuscript has now been peer reviewed and the reviews have been assessed by an Associate Editor. As you will see, the reviewers and the AE have raised some concerns with your manuscript and we would like to invite you to revise your manuscript to address them. These are outlined nicely below, although I would add to the AE's comments that I would like you to address reviewer 2 and 3's comments about the degree to which we can rely on the older data sets (i.e., Bell and Mooers 1997). Note that I am not recommending that you do not use these data, but comment on the degree to which these were limited by the techniques available at the time they were published. Of course, please also address each of the reviewer's other comments. These comments (not including confidential comments to the Editor) and the comments from the Associate Editor are included at the end of this email for your reference.
We do not allow multiple rounds of revision so we urge you to make every effort to fully address all of the comments at this stage. If deemed necessary by the Associate Editor, your manuscript will be sent back to one or more of the original reviewers for assessment. If the original reviewers are not available we may invite new reviewers. Please note that we cannot guarantee eventual acceptance of your manuscript at this stage.
To submit your revision please log into http://mc.manuscriptcentral.com/prsb and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions", click on "Create a Revision". Your manuscript number has been appended to denote a revision.
When submitting your revision please upload a file under "Response to Referees" -in the "File Upload" section. This should document, point by point, how you have responded to the reviewers' and Editors' comments, and the adjustments you have made to the manuscript. We require a copy of the manuscript with revisions made since the previous version marked as 'tracked changes' to be included in the 'response to referees' document.
Your main manuscript should be submitted as a text file (doc, txt, rtf or tex), not a PDF. Your figures should be submitted as separate files and not included within the main manuscript file.
When revising your manuscript you should also ensure that it adheres to our editorial policies (https://royalsociety.org/journals/ethics-policies/). You should pay particular attention to the following: Research ethics: If your study contains research on humans please ensure that you detail in the methods section whether you obtained ethical approval from your local research ethics committee and gained informed consent to participate from each of the participants.
Use of animals and field studies: If your study uses animals please include details in the methods section of any approval and licences given to carry out the study and include full details of how animal welfare standards were ensured. Field studies should be conducted in accordance with local legislation; please include details of the appropriate permission and licences that you obtained to carry out the field work.
Data accessibility and data citation: It is a condition of publication that you make available the data and research materials supporting the results in the article. Datasets should be deposited in an appropriate publicly available repository and details of the associated accession number, link or DOI to the datasets must be included in the Data Accessibility section of the article (https://royalsociety.org/journals/ethics-policies/data-sharing-mining/). Reference(s) to datasets should also be included in the reference list of the article with DOIs (where available).
In order to ensure effective and robust dissemination and appropriate credit to authors the dataset(s) used should also be fully cited and listed in the references.
If you wish to submit your data to Dryad (http://datadryad.org/) and have not already done so you can submit your data via this link http://datadryad.org/submit?journalID=RSPB&manu=(Document not available), which will take you to your unique entry in the Dryad repository.
If you have already submitted your data to dryad you can make any necessary revisions to your dataset by following the above link.
For more information please see our open data policy http://royalsocietypublishing.org/datasharing.
Electronic supplementary material: All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI. Please try to submit all supplementary material as a single file.
Online supplementary material will also carry the title and description provided during submission, so please ensure these are accurate and informative. Note that the Royal Society will not edit or typeset supplementary material and it will be hosted as provided. Please ensure that the supplementary material includes the paper details (authors, title, journal name, article DOI). Your article DOI will be 10.1098/rspb.[paper ID in form xxxx.xxxx e.g. 10.1098/rspb.2016.0049].
Please submit a copy of your revised paper within three weeks. If we do not hear from you within this time your manuscript will be rejected. If you are unable to meet this deadline please let us know as soon as possible, as we may be able to grant a short extension.
Thank you for submitting your manuscript to Proceedings B; we look forward to receiving your revision. If you have any questions at all, please do not hesitate to get in touch.
Best wishes,
Dr Sarah Brosnan
Editor, Proceedings B
mailto: proceedingsb@royalsociety.org

Associate Editor Comments to Author:
All three peer reviewers agree that the paper makes an important contribution to the understanding of the transition to multicellularity, particularly in its incorporation of scaling relationships. However, this manuscript requires revision to be appropriate for publication in Proc B. Specifically, multiple reviewers have concerns about the selection of the breakpoint between "large" and "small" organisms, and the statistical support for a model that estimates different power law regimes for each of these categories, as opposed to a single unified model. Please address this concern as well as the other comments and suggestions made by the individual reviewers. Thank you,

Reviewer(s)' Comments to Author:
Comments to the Author(s)
The manuscript probes a previously published data set for a series of questions regarding multicellularity evolution. This data set spans a phylogenetically diverse set of eukaryotes, representative of the majority of known eukaryotic multicellular lineages.
Firstly, the authors use scaling theory to quantify the association between the organism size (measured by the mean number of cells of an organism) and its complexity. As a proxy for complexity, they rely on the number of cell types in each species. They conclude that the relationship between the number of cell types and the total number of cells in an organism follows two different regimes: one for small and another for large organisms.
The second question analyzed is the effect of the environment of origin in the process of group formation (clonal or non-clonal). Their phylogenetically informed approach corroborates Bonner's observation that clonal group formation is more common when the transition happens in an aquatic environment.
Finally, the authors look into the complexity level of multicellular organisms on land vs in water. They find that the complexity is in general higher for land organisms, a result that remains strong even after controlling for phylogenetic factors.
The manuscript is clear and engaging to read, and the issue investigated is relevant. I particularly like the strong statistical confirmation that clonal group formation is more frequent for aquatic transitions. Nevertheless, I have some concerns regarding the scaling relations obtained.

Major comments
The authors conclude that two different power-law regimes exist, corresponding to small and large organisms. However, in my view, the evidence presented does not support this conclusion. The R^2 for the fitting to the full data is larger than the ones obtained for partial data in each subset, suggesting that a fit to the full data retains a relatively high explanatory power without requiring additional hypotheses.
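One way to make this comparison concrete is to fit both models to the same log-transformed data and compare them with a penalised criterion, rather than comparing R^2 values computed on different subsets. The sketch below is illustrative only: the data are synthetic, the breakpoint is fixed by assumption rather than estimated, ordinary least squares stands in for whatever line-fitting method the manuscript uses, and the names `ols_fit`, `aic` and `compare_models` are hypothetical.

```python
import math

def ols_fit(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept, rss)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, rss

def aic(rss, n, k):
    """Gaussian AIC up to an additive constant: n * ln(rss / n) + 2k."""
    return n * math.log(rss / n) + 2 * k

def compare_models(log_cells, log_types, break_x):
    """Compare one power law against two regimes split at break_x."""
    n = len(log_cells)
    rss_single = ols_fit(log_cells, log_types)[2]
    small = [(x, y) for x, y in zip(log_cells, log_types) if x < break_x]
    large = [(x, y) for x, y in zip(log_cells, log_types) if x >= break_x]
    rss_two = ols_fit(*zip(*small))[2] + ols_fit(*zip(*large))[2]
    # Parameter counts: slope + intercept + variance = 3 for one line;
    # two slopes + two intercepts + variance = 5 for two regimes
    # (6 if the breakpoint itself were estimated from the data).
    return {"aic_single": aic(rss_single, n, 3), "aic_two": aic(rss_two, n, 5)}

# Synthetic demonstration data (not from the manuscript): a kink at log10(cells) = 4.
log_cells = [0.5 * i for i in range(27)]
log_types = [
    (0.1 * x if x < 4 else 0.4 + 0.3 * (x - 4)) + 0.05 * (-1) ** i
    for i, x in enumerate(log_cells)
]
result = compare_models(log_cells, log_types, break_x=4.0)
```

A lower AIC for the two-regime model would indicate that the extra parameters are justified; on these synthetic data the kink is built in, so the two-regime fit is favoured by construction.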
Furthermore, the authors state that they used an R package to test for the existence of the breakpoint, but provide no test result.
- The authors use reduced major axis regression with the argument that both X and Y variables have associated errors. It is not clear to me that this is the best approach to take (please see, e.g., RJ Smith, Use and misuse of the reduced major axis for line-fitting, 2009).
- I would like to see a comment on why the authors have only considered eukaryotes in this study, given that the original data set (Fisher et al., 2013) included species from all three domains of life.

Referee: 2
Comments to the Author(s)
Fisher et al. test an old hypothesis by Bonner that postulates that the environment determines the type and stability of multicellular associations that emerge. I found the paper very interesting, concise and the analyses robust. I have minor suggestions to the authors that they may take or leave.
The aspect that made me think the most is how reliable the input data are that the authors took from Bell and Mooers 1997. Looking at the table in this publication, I have the feeling that it's both taxonomically biased and inaccurate in many cases. I'm not questioning Bell and Mooers's efforts in compiling those data, but the state of the art at the time might not have allowed them to reach high accuracy in many cases. I can only assess fungi, where I feel there are serious limitations. I'm not sure what the authors could do to update these figures or to mitigate their potential bias on their results, but at least some discussion would be useful.
As a general suggestion I think the data used in the analyses could be made more accessible by including it in the main text. Consider moving Figure S1 to the main text and displaying the examined traits next to taxon names (like in a circos plot).
Specific comments:
l34 - wild type yeast under normal conditions does not form aggregates (it forms biofilms, but that's not an aggregation).
l55 - as a mycologist I argue that ref10 has outdated numbers for fungal cell type diversity, which can reach 30 (which is based only on simple morphological observations, Kues 2015 Fungal Biol Rev., Bistis & Read 2003 Fungal Genet Biol.)
l93-95 - the sentence reads like it has a bit of circularity in logic.
l234 - As an alternative explanation, I would also argue that water currents distribute nutrients to all the cells of a clonal colony (in a sponge for example), whereas goods are more patchy on land and therefore need to be actively foraged for.
l237 - please mention the containing lineages, not example genera.
l239 - yes, but fungi with motile cells are connected to water. I would argue that multicellular fungi don't have a motile cell type, because fast apical growth of hyphae enables foraging for nutrients.
l246 - Kiss et al (Nat Comms 2019) argues in a different way for facultative multicellularity in yeasts, the conservation of multicellularity-related genes and facultative hypha formation.
l250 - I don't see a direct link between being obligate and being big. Please rephrase.
l270 - there is some discussion on that in the below mentioned papers and in Nagy et al Microb Spectrum 2017. "Hyphal multicellularity" indeed seems to be an exception from many rules, and it's not clearly clonal in the classic sense (clonal at the level of nuclei, not at cells).
l279 - possibly?
l301 - finishing the discussion with a broad conclusion on multicellularity would probably read better to me than a somewhat forced generalization to completely unrelated organismal traits.
l322 - where was data for the ancestral environment of lineages taken? Ideally, this attribute could be inferred by ancestral state reconstructions.
l341 - see Kiss et al Nat Comms 2019 and Nagy 2018 Biol Reviews on an update on fungi. The transition to simple multicellularity likely occurred once, whereas complex multicellularity emerged several times.

Laszlo G. Nagy
Referee: 3 Comments to the Author(s) Report on "The evolution of multicellular complexity: the role of relatedness and environmental constraints" by RM Fisher et al.
The paper has a compelling history, a fair amount of data analysis and meta-analysis. The authors test the hypothesis (by John T. Bonner) that the evolution of multicellularity is affected by where such a transition occurs: in an aquatic vs. terrestrial environment. In particular, what is interesting is that clonality vs. aggregativity differ significantly in the number of cell types (and organismal size), which could be well explained by how the environment favored one or the other type of transition to multicellularity. By doing an extensive meta-analysis of the literature, the authors concluded that the theory of social evolution (relatedness, in particular) cannot completely account for these differences, and that the environment has played a major role. I believe that the paper makes an important contribution to the understanding of the transition to multicellularity, and it also brings forth a novel perspective in scaling relationships that (so far) have been seldom studied in the literature on multicellular evolution. I would suggest the following discussion/clarification points:
1. Why is the isometric scaling the frame of reference (the null hypothesis)? It seems unrealistic that any organism would ever evolve one cell type directly proportional to its cell count. Thus, a null model that cannot ever exist(?), even in principle, might not be that useful.
2. Is there a conflict in the way different authors in the papers cited conceive (and account for) a "cell type"? If so, how is this addressed in a way that makes the conclusions robust to such discrepancies?
3. Considering the wide range of organisms studied, it also seems sensible to discuss how the developmental stage at which the cell types and cell number were "counted" could affect the conclusions. Are these cell types the total number of types present throughout the life cycle? Are they the types at a "fully developed" organism? How can these temporal/developmental boundaries be compared across such diverse groups of organisms?
4. I understand that one can break a regression into multiple sections, but why were two groupings chosen (with the break at 10^4 cells)? Even though it gives a 'nice' discussion between "small" vs. "big" organisms, it seems a bit arbitrary. Was it sensitive to the sampling distribution (the small organisms seem to be "overrepresented" at ~4 cell types)? At the boundary, are there organisms that are qualitatively distinct (i.e. close taxa grouped in one or the other group)?

5. Even though there is a discussion of sampling bias at the end, could the authors test the robustness of the main conclusions to sampling biases? (i.e. some bootstrap method on the datasets, weighted perhaps by taxa?).
6. Some of the data, especially on the number of independent transitions to multicellularity, is not current with the latest phylogenetic developments. Specifically, the 'red algae' are described as having 1+ transitions to multicellularity in Figure 1 (citing Andy Knoll's 2011 review paper); these two papers say either 2 or 3 times: Cock
Perhaps more importantly, the fungi have recently come into focus much more than before. The old estimate was 2 origins, but Laszlo Nagy's groundbreaking 2018 paper showed it was likely between 8 and 11.
Nagy LG, Kovács GM, Krizsán K (2018) Complex multicellularity in fungi: evolutionary convergence, single origin, or both? Biological Reviews 93(4):1778-1794

Small observations:
1. In Fig. 1, the X-axis legend in the left panel reads "Total number of cells" and the right panel reads "Total number of cells (log)". I'd keep the legends consistent.
2. X-axis legend of Fig. 3 (right panel) seems to have formatting errors (with 10^11 and 10^13).
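
The robustness check raised in point 5 could be sketched as a lineage-balanced bootstrap. This is only one possible reading of "weighted by taxa", shown under stated assumptions: the data are invented, the statistic is an ordinary least-squares slope on log-transformed values, and the names `slope` and `taxon_bootstrap` are hypothetical rather than taken from the manuscript.

```python
import random

def slope(points):
    """OLS slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    return sxy / sxx

def taxon_bootstrap(data, n_boot=1000, seed=0):
    """data maps lineage name -> list of (log_cells, log_types) pairs.

    Each replicate draws lineages with replacement, then one species
    from each drawn lineage, so species-rich lineages do not dominate.
    Returns the (2.5%, 97.5%) percentile interval of the slope.
    """
    rng = random.Random(seed)
    lineages = sorted(data)
    slopes = []
    while len(slopes) < n_boot:
        sample = [rng.choice(data[rng.choice(lineages)]) for _ in lineages]
        if len({x for x, _ in sample}) < 2:
            continue  # slope is undefined when all x values coincide
        slopes.append(slope(sample))
    slopes.sort()
    return slopes[int(0.025 * n_boot)], slopes[int(0.975 * n_boot) - 1]

# Invented demonstration data (not from the manuscript).
demo = {
    "lineage_A": [(1.0, 0.3), (1.5, 0.4), (2.0, 0.5)],
    "lineage_B": [(4.0, 0.9)],
    "lineage_C": [(6.0, 1.1), (7.0, 1.5)],
    "lineage_D": [(10.0, 2.1)],
}
interval = taxon_bootstrap(demo)
```

If the interval obtained this way still excludes the null slope, the conclusion is less likely to be driven by a few heavily sampled lineages.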

08-Jun-2020
Dear Dr Fisher,
I am pleased to inform you that your manuscript RSPB-2019-2963.R1 entitled "The evolution of multicellular complexity: the role of relatedness and environmental constraints" has been accepted for publication in Proceedings B pending minor revisions. These are listed below, in the comments from the Associate Editor and the reviewers. I invite you to respond to the referee(s)' comments and revise your manuscript. Because the schedule for publication is very tight, it is a condition of publication that you submit the revised version of your manuscript within 7 days. If you do not think you will be able to meet this date please let us know.
To revise your manuscript, log into https://mc.manuscriptcentral.com/prsb and enter your Author Centre, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision. You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript and upload a new version through your Author Centre.
When submitting your revised manuscript, you will be able to respond to the comments made by the referee(s) and upload a file "Response to Referees". You can use this to document any changes you make to the original manuscript. We require a copy of the manuscript with revisions made since the previous version marked as 'tracked changes' to be included in the 'response to referees' document.
Before uploading your revised files please make sure that you have: 1) A text file of the manuscript (doc, txt, rtf or tex), including the references, tables (including captions) and figure captions. Please remove any tracked changes from the text before submission. PDF files are not an accepted format for the "Main Document".
2) A separate electronic file of each figure (tiff, EPS or print-quality PDF preferred). The format should be produced directly from original creation package, or original software format. PowerPoint files are not accepted.
3) Electronic supplementary material: this should be contained in a separate file and where possible, all ESM should be combined into a single file. All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI.
Online supplementary material will also carry the title and description provided during submission, so please ensure these are accurate and informative. Note that the Royal Society will not edit or typeset supplementary material and it will be hosted as provided. Please ensure that the supplementary material includes the paper details (authors, title, journal name, article DOI). Your article DOI will be 10.1098/rspb.[paper ID in form xxxx.xxxx e.g. 10.1098/rspb.2016.0049].
4) A media summary: a short non-technical summary (up to 100 words) of the key findings/importance of your manuscript.

5) Data accessibility section and data citation
It is a condition of publication that data supporting your paper are made available either in the electronic supplementary material or through an appropriate repository.
In order to ensure effective and robust dissemination and appropriate credit to authors the dataset(s) used should be fully cited. To ensure archived data are available to readers, authors should include a 'data accessibility' section immediately after the acknowledgements section. This should list the database and accession number for all data from the article that has been made publicly available, for instance:
• DNA sequences: Genbank accessions F234391-F234402
• Phylogenetic data: TreeBASE accession number S9123
• Final DNA sequence assembly uploaded as online supplemental material
• Climate data and MaxEnt input files: Dryad doi:10.5521/dryad.12311
NB. From April 1 2013, peer reviewed articles based on research funded wholly or partly by RCUK must include, if applicable, a statement on how the underlying research materials - such as data, samples or models - can be accessed. This statement should be included in the data accessibility section.
If you wish to submit your data to Dryad (http://datadryad.org/) and have not already done so you can submit your data via this link http://datadryad.org/submit?journalID=RSPB&manu=(Document not available) which will take you to your unique entry in the Dryad repository. If you have already submitted your data to dryad you can make any necessary revisions to your dataset by following the above link. Please see https://royalsociety.org/journals/ethics-policies/data-sharing-mining/ for more details.
6) For more information on our Licence to Publish, Open Access, Cover images and Media summaries, please visit https://royalsociety.org/journals/authors/author-guidelines/.
Once again, thank you for submitting your manuscript to Proceedings B and I look forward to receiving your revision. If you have any questions at all, please do not hesitate to get in touch.

I am pleased to inform you that your manuscript entitled "The evolution of multicellular complexity: the role of relatedness and environmental constraints" has been accepted for publication in Proceedings B.
You can expect to receive a proof of your article from our Production office in due course, please check your spam filter if you do not receive it. PLEASE NOTE: you will be given the exact page length of your paper which may be different from the estimation from Editorial and you may be asked to reduce your paper if it goes over the 10 page limit.
If you are likely to be away from e-mail contact please let us know. Due to rapid publication and an extremely tight schedule, if comments are not received, we may publish the paper as it stands.
If you have any queries regarding the production of your final article or the publication date please contact procb_proofs@royalsociety.org.
Your article has been estimated as being 9 pages long. Our Production Office will be able to confirm the exact length at proof stage.
Open Access
You are invited to opt for Open Access, making your article freely available to all as soon as it is ready for publication under a CCBY licence. Our article processing charge for Open Access is £1700. Corresponding authors from member institutions (http://royalsocietypublishing.org/site/librarians/allmembers.xhtml) receive a 25% discount to these charges. For more information please visit http://royalsocietypublishing.org/open-access.
Paper charges
An e-mail request for payment of any related charges will be sent out shortly. The preferred payment method is by credit card; however, other payment options are available.
Electronic supplementary material: All supplementary materials accompanying an accepted article will be treated as in their final form. They will be published alongside the paper on the journal website and posted on the online figshare repository. Files on figshare will be made available approximately one week before the accompanying article so that the supplementary material can be attributed a unique DOI.
You are allowed to post any version of your manuscript on a personal website, repository or preprint server. However, the work remains under media embargo and you should not discuss it with the press until the date of publication. Please visit https://royalsociety.org/journals/ethics-policies/media-embargo for more information.
Thank you for your fine contribution. On behalf of the Editors of the Proceedings B, we look forward to your continued contributions to the Journal.

Scientific importance: Is the manuscript an original and important contribution to its field? Good
General interest: Is the paper of sufficient general interest? Good
Quality of the paper: Is the overall quality of the paper suitable? Good

Do you have any concerns about statistical analyses in this paper? If so, please specify them explicitly in your report. No
It is a condition of publication that authors make their supporting data, code and materials available -either as supplementary material or hosted in an external repository. Please rate, if applicable, the supporting data on the following criteria.

Comments to the Author
The authors have substantially revised the manuscript. It seems that the statistical analyses reveal a trend that is inherent to the data and not imposed by the analyses, which was a major concern raised in the first round of reviews. However, we still feel that some of the problems associated with the classification of cell types in the dataset were not sufficiently addressed. In particular, we would like to see a bit more discussion on how the different authors of the datasets used in this paper classified cell types across different multicellular lineages, and how this affects the results in this paper. This is important because the classification in itself could have implicit biases in the number of cell types counted, especially because a substantial portion of the data is now fairly old (i.e., Bell 1997), and modern molecular techniques have changed our views of what counts as a cell type. Otherwise, the paper is nice!

Review form: Reviewer 2
Recommendation
Accept as is

Scientific importance: Is the manuscript an original and important contribution to its field? Good
General interest: Is the paper of sufficient general interest? Excellent
Quality of the paper: Is the overall quality of the paper suitable? Good
Is the length of the paper justified? Yes

Do you have any concerns about statistical analyses in this paper? If so, please specify them explicitly in your report. No
It is a condition of publication that authors make their supporting data, code and materials available -either as supplementary material or hosted in an external repository. Please rate, if applicable, the supporting data on the following criteria.

Comments to the Author
The authors adequately addressed the referees' concerns in their revision. In particular, they clarified the statistical methods used, which answered my main concerns. As such, I believe the manuscript is suitable for publication in the current form. As a note, in line 265 the authors refer to "prokaryotes and archaea". I think they mean "bacteria and archaea" (or otherwise simply "prokaryotes") since archaea are prokaryotes also.

Response to Reviewers comments
Associate Editor

Comments to Author:
All three peer reviewers agree that the paper makes an important contribution to the understanding of the transition to multicellularity, particularly in its incorporation of scaling relationships. However, this manuscript requires revision to be appropriate for publication in Proc B. Specifically, multiple reviewers have concerns about the selection of the breakpoint between "large" and "small" organisms, and the statistical support for a model that estimates different power law regimes for each of these categories, as opposed to a single unified model. Please address this concern as well as the other comments and suggestions made by the individual reviewers. Thank you,

We are grateful for the chance to submit a revised version of this manuscript in which we carefully responded to the helpful comments from reviewers. On advice from the reviewers, we have improved the following aspects of the manuscript:

1.

Comments to the Author(s)
The manuscript probes a previously published data set for a series of questions regarding multicellularity evolution. This data set spans a phylogenetically diverse set of eukaryotes, representative of the majority of known eukaryotic multicellular lineages.

Firstly, the authors use scaling theory to quantify the association between the organism size (measured by the mean number of cells of an organism) and its complexity. As a proxy for complexity, they rely on the number of cell types in each species. They conclude that the relationship between the number of cell types and the total number of cells in an organism follows two different regimes: one for small and another for large organisms.

The second question analyzed is the effect of the environment of origin in the process of group formation (clonal or non-clonal). Their phylogenetically informed approach corroborates Bonner's observation that clonal group formation is more common when the transition happens in an aquatic environment.

Appendix A

Finally, the authors look into the complexity level of multicellular organisms on land vs in water. They find that the complexity is in general higher for land organisms, a result that remains strong even after controlling for phylogenetic factors.

The manuscript is clear and engaging to read, and the issue investigated is relevant. I particularly like the strong statistical confirmation that clonal group formation is more frequent for aquatic transitions. Nevertheless, I have some concerns regarding the scaling relations obtained.

Major comments
The authors conclude that two different power-law regimes exist, corresponding to small and large organisms. However, in my view, the evidence presented does not support this conclusion. The R^2 for the fitting to the full data is larger than the ones obtained for partial data in each subset, suggesting that a fit to the full data retains a relatively high explanatory power without requiring additional hypotheses.

Furthermore, the authors state that they used an R package to test for the existence of the

has its own drawbacks. A major concern is that the data is not up-to-date and is biased towards particular taxa. We acknowledge this concern and agree that the data is not always current. However, it is worth bearing in mind that for a species to be included in our full analysis, where we include data on complexity (Table S5

- As a general suggestion I think the data used in the analyses could be made more accessible by including it in the main text. Consider moving Figure S1 to the main text and displaying the examined traits next to taxon names (like in a circos plot).

We have edited and moved the phylogeny, which is now Figure 4. It now includes more information than the previous phylogeny in the Supplementary

Comments to the Author(s)
Report on "The evolution of multicellular complexity: the role of relatedness and environmental constraints" by RM Fisher et al.

The paper has a compelling history, a fair amount of data analysis and meta-analysis. The authors test the hypothesis (by John T. Bonner) that the evolution of multicellularity is affected by where such a transition occurs: in an aquatic vs. terrestrial environment. In particular, what is interesting is that clonality vs. aggregativity differ significantly in the number of cell types (and organismal size), which could be well explained by how the environment favored one or the other type of transition to multicellularity. By doing an extensive meta-analysis of the literature, the authors concluded that the theory of social evolution (relatedness, in particular) cannot completely account for these differences, and that the environment has played a major role. I believe that the paper makes an important contribution to the understanding of the transition to multicellularity, and it also brings forth a novel perspective in scaling relationships that (so far) have been seldom studied in the literature on multicellular evolution.

I would suggest the following discussion/clarification points:

1. Why is the isometric scaling the frame of reference (the null hypothesis)? It seems unrealistic that any organism would ever evolve one cell type directly proportional to its cell count. Thus, a null model that cannot ever exist(?), even in principle, might not be that useful.

We

6. Some of the data, especially on the number of independent transitions to multicellularity, is not current with the latest phylogenetic developments. Specifically, the 'red algae' are described as having 1+ transitions to multicellularity in Figure 1 (

1. In Fig. 1, the X-axis legend in the left panel reads "Total number of cells" and the right panel reads "Total number of cells (log)". I'd keep the legends consistent.
2. X-axis legend of Fig. 3 (right panel) seems to have formatting errors (with 10^11 and 10^13).

Have corrected these formatting errors.

We next found that there was a difference in the scaling relationship between number of cell types and total number of cells for small versus large species (Table S1,

(Table S2).

Table S1. N total = 126 species where we had data for both number of cell types and total number of cells.

The origins of multicellularity in different environments
Our results show that the physical environment (whether ancestral lineages lived in the water or on land) has had a major impact on both the origins and subsequent elaborations of multicellularity, both in determining how multicellular groups originally formed and how organisational complexity subsequently evolved.

We found that lineages in aquatic environments were significantly more likely to form multicellular groups through daughter cells remaining attached to mother cells after division

The only multicellular lineage that has evolved obligate multicellularity on land is the Fungi. This is consistent with this lineage also being a rare example of clonal group formation that originated on land, as the resulting clonal relatedness between cells is significantly associated with the transition to obligate multicellularity [4].

Multicellular organisational complexity on land versus in water
We found that the number of cell types of multicellular species currently found on land was significantly higher than those currently found in aquatic environments (Figure 3

organisms evolved larger body sizes and that the specific scaling relationship is different for small versus large species. Our comparative analyses also show that the environment (aquatic or terrestrial) has a crucial impact on the trajectory of multicellular evolution. Firstly, we found that clonal group formation giving rise to obligate multicellularity is significantly more common in lineages that evolved in aquatic environments. Secondly, we showed that current environmental conditions have an impact on multicellular evolution, with species living on land having a higher number of cell types compared to species found in aquatic environments.

cerevisiae has a variety of multicellular phenotypes [32] and yet is still often not recognised as being facultatively multicellular [33]. Other lineages, notably the green algae, have evolved facultative multicellularity multiple times [21] and there are likely new examples to be found on land as well. However, it is unlikely that we have underestimated the number of lineages with obligate multicellularity. This is because there is some evidence that these species tend to be bigger (Fisher et al 2013) and therefore more visible and complex [4] and potentially better studied [18,34]. Furthermore, there is no obvious reason to assume that under- or overestimation would be biased towards terrestrial or aquatic species. By inflating the number of facultative lineages, we would therefore not alter the result that obligate multicellularity has evolved much more often in water than on land.
Secondly, the dataset underlying this study comes from a limited set of studies

Not only does the environment affect how multicellular groups form, but we show that it also has a major impact on the scaling relationships between size and complexity. Species that live on land tend to be more complex for their size compared to species that live in water (i.e., with a higher slope, Figure 4) and this could be for several reasons. Land-dwelling organisms need more support structures than their aquatic counterparts; this is because water provides natural support through buoyancy whereas air does not. Organisms living on land therefore needed to increasingly invest in stems and skeletons to 'hold themselves' up as their body size increased (i.e. skeleton mass ~ M^b with b > 1, [12]), possibly leading to greater diversification of cell types and tissues than required by organisms in the sea. This scaling logic can further be extended to resource allocation dynamics within organisms (e.g., vascular networks), although systematic effects on cell diversity and differences between land and water remain to be elucidated. There are also other potentially confounding physiological parameters, for example the possibility that autotrophic lineages compete for light, both on land and in the water, whereas heterotrophic lineages do not. Assuming such competition has selected for support tissues with specialized cell types, these are more costly to maintain on land (e.g. rain forest trees versus kelp forest).

Our study reveals intriguing outliers that deserve further consideration. The Fungi are a strikingly unusual (mostly) multicellular lineage that present particular challenges for this study for several reasons. Our dataset includes only three fungal species with an average of X cell types, which do not accurately reflect the true level of fungal complexity in nature (e.g. Neurospora crassa, which has 28 distinct cell types (Bistis et al 2003)).

would not affect our phylogenetic analyses (that focus on independent contrasts between lineages, not species (Figure 2, Table 1)) so we can be confident in our conclusion that clonality is associated with aquatic ancestral environments (Figure 2a).

clonal or aggregative, due to the way in which multi-nucleate fungal hyphae form, and this is a potential limitation of our study where we had to classify every species in these two discrete categories. Finally, it is also difficult to confidently class fungi as either terrestrial or aquatic, as they live and have evolved mostly at the air-water interface. Perhaps a closer look at the Fungi as putative 'exceptions to the rule' could help to unravel the generality of the relationship between the environment and multicellular complexity that we uncovered.

Data collection
The data used in this study were originally published in Fisher et al. (2013)

Our full dataset can be found in Table S6. Information on both the number of cell types and total number of cells (allowing us to estimate 'complexity') was essential for analyses where we included complexity (Table S5) and we therefore focused our data collection effort on species where information on both these traits was available.

In this study, we expanded on the eukaryotic species in the original dataset by adding information on the ancestral and current environment of each species. We considered any species found on land as terrestrial and any species found in freshwater, brackish or marine environments as aquatic. We found information about the current environment of a species by searching Google Scholar for publications and also taxa-specific websites, such as AlgaeBase and WoRMS. Where there was only information about ancestral or current environment at a higher taxonomic level (i.e. at the family level but no generic or species information) we assumed it was the same environment for the species in our dataset. We found information on the ancestral environment of each species through broad reviews on the origins of multicellularity including Bonner 1998 & Umen 2014. It is important to stress that we were interested in the ancestral environment when multicellularity evolved and therefore that was not always the same as the ancestral

Independent transitions to multicellularity
Using information from published papers, we identified that within the eukaryotes there have been at least 14 independent transitions to multicellularity (both facultative and obligate) (Table 1, Figure 3). However, we have most likely underestimated the number of transitions in several groups due to uncertainty about the number of independent transitions within them. For example, it is thought that there have been at least 2 transitions to obligate multicellularity within the Fungi [27,28] and multiple transitions to facultative multicellularity in the green algae [21] and in the red algae [18]. Therefore, our analyses are conservative and assumed just 1 transition within each group.

in the scaling of log10(cell type) against log10(cell number) across all data. We used OLS regression rather than RMA regression even though our X variables contained measurement error, based on the X-Y symmetry principle of Smith (2009). We note, however, that OLS and RMA approaches yielded very similar results, with RMA giving steeper slopes in all cases. We then used the package 'segmented' in R [43] to test if there is a 'breakpoint' in the regression, the point at which the shape of the relationship changes abruptly. We then used the OLS approach described above to evaluate scaling relationships on either side of the breakpoint.
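As an illustration of the scaling analysis described above, the following is a minimal Python sketch, not the authors' code. It fits an OLS line to log10(cell types) against log10(cell number) and then locates a breakpoint by a brute-force grid search over candidate splits, which is a simpler stand-in for the iterative method in the R package 'segmented'. The data are synthetic, and the names `ols` and `best_breakpoint` are hypothetical.

```python
import math

def ols(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.
    Returns (intercept, slope, residual sum of squares)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return intercept, slope, rss

def best_breakpoint(xs, ys, min_points=3):
    """Grid search (an assumption, not the 'segmented' algorithm):
    split the x-sorted data at every admissible position, fit OLS on
    each side, and return the split location with the lowest total RSS."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_points, len(pairs) - min_points + 1):
        left, right = pairs[:i], pairs[i:]
        rss_left = ols(*zip(*left))[2]
        rss_right = ols(*zip(*right))[2]
        total = rss_left + rss_right
        if best is None or total < best[0]:
            # Report the midpoint between the two neighbouring x values.
            best = (total, (left[-1][0] + right[0][0]) / 2)
    return best[1]

# Synthetic example: slope 0.3 up to 10^5 cells, slope 0.1 beyond that.
cells = [10 ** k for k in range(12)]
types = [10 ** (0.3 * k) if k <= 5 else 10 ** (1.0 + 0.1 * k) for k in range(12)]

log_cells = [math.log10(c) for c in cells]
log_types = [math.log10(t) for t in types]

intercept, slope, _ = ols(log_cells, log_types)   # single power law, all data
bp = best_breakpoint(log_cells, log_types)        # estimated log10 breakpoint
print(f"overall slope b = {slope:.3f}, breakpoint near 10^{bp:.1f} cells")
```

A grid search only returns breakpoints midway between observed x values, so it is coarser than a fitted breakpoint with a confidence interval; it is meant only to make the two-regime idea concrete.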

We also noted that the scatter plots producing average scaling relationships appeared triangular and thus hypothesized that they reflect a constraint function such that total number of cells is necessary, but not sufficient to explain variation in number of cell types [44,45]. To test this hypothesis, we used least quantile regressions to describe scaling for the upper ninetieth quantiles of the overall plot and separately for scaling relationships on either side of the breakpoint [46,47].
As a first step in analyzing the data we began with a least square regression to estimate a and b in the scaling equation log10(y) = log10(a) + b log10(M) and describe nature of the dependence

Commented [RF5]: This is what we should refer to in the reviewers' comments when they say that red algae is at least 2 or whatever. Say that we are trying to be conservative or something and that it won't change our results?
of the number of cell types on the total number of cells. Using the R package 'lmodel2', we used reduced major axis (RMA) regression to estimate the intercept and slope in the

We also noted that the scaling relationships appeared triangular and thus hypothesized that they reflect a constraint function such that total number of cells is necessary, but not sufficient to explain variation in number of cell types [44,45]. To test this hypothesis, we used least absolute deviation regression to describe scaling for the upper ninetieth quantiles of the overall plot and separately for the small and large taxa plots [46,47].

Bayesian analyses
We used the statistical package MCMCglmm [48] to run Bayesian general linear models with Markov Chain Monte Carlo (MCMC) estimation. We fitted three models. Firstly, we tested whether the environment affected the way in which multicellular groups form by fitting a model with group formation as a categorical response variable and the ancestral environment as a categorical explanatory variable (Table S3). Secondly, we tested whether the environment affected the likelihood of obligate or facultative multicellularity by fitting a model with obligate/facultative as a categorical response variable and the ancestral environment as a categorical explanatory variable (Table S4).

(Table S4). This allowed us to use both number of cell types and the total number of cells as a combined measure of multicellular complexity, rather than having to run several analyses using different response variables. We fitted several categorical fixed effects: the current environment (aquatic or terrestrial), whether the species is obligately or facultatively multicellular, and the mode of group formation (non-clonal or clonal) to control for the known effects of group formation and obligateness on multicellular complexity [4].

In the first two models, we used uninformative inverse-gamma priors because we had a categorical response variable. We also fixed the residual variance to 1 and specified family = categorical. In the final model, we used uninformative priors because we had a multi-response model with both Poisson and Gaussian response variables and categorical explanatory variables. We ran the models for 6000000 iterations, with a burn-in of 1000000 and a thinning interval of 1000. These were the values that optimised the chain length whilst also allowing our models to converge, which we assessed visually using VCV traceplots. We then ran each model three times and used the Gelman-Rubin diagnostic to quantitatively check for convergence. We assumed that our models had converged when the PSR was < 1.1.
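To make the convergence check concrete, here is a minimal Python sketch of the basic Gelman-Rubin potential scale reduction statistic for several chains of one parameter, together with the number of posterior samples implied by the stated iteration, burn-in and thinning settings. It is an illustration under assumptions: the chains are simulated, the function names are hypothetical, and the degrees-of-freedom correction applied by some R implementations is omitted.

```python
import math
import random
import statistics

def retained_samples(iterations, burn_in, thin):
    """Number of stored posterior samples after burn-in and thinning."""
    return (iterations - burn_in) // thin

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains.
    Compares between-chain and within-chain variance; values close to 1
    suggest the chains are sampling the same distribution."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

# Settings quoted in the text: 6,000,000 iterations, 1,000,000 burn-in, thin 1000.
n_kept = retained_samples(6_000_000, 1_000_000, 1000)

# Simulated stand-ins for three independent runs of one model parameter.
rng = random.Random(1)
mixed = [[rng.gauss(0.0, 1.0) for _ in range(n_kept)] for _ in range(3)]
stuck = [[rng.gauss(shift, 1.0) for _ in range(n_kept)] for shift in (0.0, 3.0, 6.0)]

psr_mixed = gelman_rubin(mixed)   # chains agree: expected to fall below 1.1
psr_stuck = gelman_rubin(stuck)   # chains disagree: expected to exceed 1.1
print(n_kept, round(psr_mixed, 3), round(psr_stuck, 3))
```

Under this sketch, three runs would be accepted as converged only if the statistic for every monitored parameter falls below the 1.1 threshold used in the text.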

Phylogeny construction
We built the phylogeny for this study using the Open Tree of Life (opentreeoflife.org), which creates synthetic trees built from published phylogenies and taxonomic information. We then used the R package 'rotl' that interacts with the online database and constructs phylogenies (https://cran.r-project.org/web/packages/rotl/index.html). For the majority of species in our dataset, the exact species was also present in a published phylogeny and so we could use phylogenetic information about that species. However, for a few species that were not present in the Open Tree of Life dataset, we had to assign instead a closely related species in the same genus or use a family-level classification. Because most species in our dataset represent phylogenetically distant groups on the eukaryotic tree and our phylogeny does not include branch lengths, we are confident that this compromise did not affect our statistical analysis. The phylogeny presented in Figure 3 was created for visual purposes using Anvi (anvi-server.org).

cell types, as well as reviewer 1's comment on prokaryotes.

As well, before publication we also require data and code deposition for reproducibility. While the authors link to a well documented repository containing the data published in a previous analysis, and the updated data is provided in the supplement, the data deposition guidelines are: "To allow others to verify and build on the work published in Royal Society journals, it is a condition of publication that authors make available the data, code and research materials supporting the results in the article." In order to make the work from this publication reproducible, their expanded categorized data set, inferred phylogeny, and associated analysis code should be shared in a Dryad repository for this paper.

life (citations to the taxonomy, datastore, and synthesis would be appropriate).

We have added these missing citations.

Minor figure comments:
The resizing of the axes between panel a and b in figure 1 makes the match between them harder to see. Also, consider using different symbols in addition to different colors in panel b and Figure 4, to make the figure more comprehensible in B/W and for people with limited color vision.

We have edited the figures so they are more 'in line' and comparable. We accept the comment regarding accessibility with Figure 1b,

Figure 3. Could make use of the extra space in the key to provide more visual clarification about the bar charts, and label the axis values for the bar charts.

We have added the cell type values to the legend, as these would be too small to add to the figure and still be legible (lines 449-450).

Consider B/W printing and color challenges in the phylogeny figure as well, e.g. the dividing line between ciliates and red algae is hard to see even with color.