Evolution of cultural traits occurs at similar relative rates in different world regions

A fundamental issue in understanding human diversity is whether or not there are regular patterns and processes involved in cultural change. Theoretical and mathematical models of cultural evolution have been developed and are increasingly being used and assessed in empirical analyses. Here, we test the hypothesis that the rates of change of features of human socio-cultural organization are governed by general rules. One prediction of this hypothesis is that different cultural traits will tend to evolve at similar relative rates in different world regions, despite the unique historical backgrounds of groups inhabiting these regions. We used phylogenetic comparative methods and systematic cross-cultural data to assess how different socio-cultural traits changed in (i) island southeast Asia and the Pacific, and (ii) sub-Saharan Africa. The relative rates of change in these two regions are significantly correlated. Furthermore, cultural traits that are more directly related to external environmental conditions evolve more slowly than traits related to social structures. This is consistent with the idea that a form of purifying selection is acting with greater strength on these more environmentally linked traits. These results suggest that despite contingent historical events and the role of humans as active agents in the historical process, culture does indeed evolve in ways that can be predicted from general principles.


Coding of ethnographic data
Not all variables in the Ethnographic Atlas were suitable for this analysis. Reasons for not including variables were: 1) variables on Sex Differences and Task Specialization rely on the practice being present to be meaningful, 2) variables which are overtly continuously distributed (e.g. variable 31, mean size of local communities) would result in arbitrary categorization, 3) redundancy between variables (e.g. variable 43, Descent, summarizes three other variables, which are therefore not included individually), 4) characters are invariable in at least one of the language families (meaning the rate of change is not measurable), 5) variable relates to a secondary rather than a primary form.
Where possible the coding of ethnographic variables into categories that was employed by Murdock was retained. However, some variables were recoded into a smaller number of categories to enable phylogenetic comparative analyses. Variable 9, Marital Composition: Monogamy and Polygamy, was condensed from 7 categories to 3: Monogamy (categories 1 and 2), Polygamy (categories 3-6), and Polyandry (category 7). This was done in order to more fully distinguish variable 9 from variable 8 (Domestic organization). Variable 34, High Gods, was recategorized into two categories reflecting simple presence versus absence, rather than more fine-grained subjective distinctions about the role of such gods within a society. Variable 37, Male Genital Mutilations, had categories based on rather arbitrary age classifications (10-20, 30-40 etc.) at which such mutilations occur. Therefore this variable was recategorized to reflect the presence or absence of male genital mutilations (present: categories 2-10; absent: category 1). Similarly, variable 38, Segregation of Adolescent Boys, was recategorized into presence versus absence (present: categories 2-5; absent: category 1), as there were several categories reflecting different levels of segregation. Variable 39, Animals and Plow Cultivation, was also recategorized into presence versus absence (present: categories 2 and 3; absent: category 1), as the "present" categories only distinguish whether or not the use of the plow was aboriginal prior to contact. For variable 42, Subsistence economy, the existence of the category "agriculture, type unknown" meant that the most sensible categorization for the present purposes was to have three categories (Foraging: categories 1-3; Pastoralism: category 4; Agriculture: categories 5-9). Variable 70, Type of Slavery, has a category "reported, but type not identified", meaning that it had to be recategorized as presence/absence (present: categories 2-4; absent: category 1).
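The recoding rules above can be written down as a simple lookup. The following is a minimal sketch in Python, assuming the category ranges for variables 9 and 42 are 3-6 and 5-9 respectively; the dictionary layout and the function name recode are illustrative assumptions, not part of the original coding scheme. Variable 34 is omitted because its original category numbers are not listed here.

```python
# Condensed categories for the recoded Ethnographic Atlas variables.
# Keys are variable numbers; values map a condensed label to the
# original category codes it absorbs (layout is an assumption).
RECODING = {
    9: {"Monogamy": range(1, 3), "Polygamy": range(3, 7), "Polyandry": range(7, 8)},
    37: {"Absent": range(1, 2), "Present": range(2, 11)},
    38: {"Absent": range(1, 2), "Present": range(2, 6)},
    39: {"Absent": range(1, 2), "Present": range(2, 4)},
    42: {"Foraging": range(1, 4), "Pastoralism": range(4, 5), "Agriculture": range(5, 10)},
    70: {"Absent": range(1, 2), "Present": range(2, 5)},
}


def recode(variable, category):
    """Return the condensed label, or None if the code is missing or unlisted."""
    for label, codes in RECODING.get(variable, {}).items():
        if category in codes:
            return label
    return None
```

Returning None for unlisted codes keeps missing data distinct from any condensed state, so such societies can be dropped for that trait.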
A copy of Murdock's ethnographic data for the Austronesian and Bantu societies listed in Section 2 can be found at doi:10.5061/dryad.pv84f, with variable coding adjusted in the manner outlined above.

Ecological versus Social variables
For the purposes of the present study a judgment was made by the authors about whether variables should be classified as "ecological" (i.e. relating to direct interactions with the external environment, such as subsistence, physical dwelling, and settlement variables), or "social" (i.e. relating to norms and institutions of social organization). Again we stress that this distinction is not designed to suggest that social traits are completely divorced from external environmental conditions. Ideally, for future research it would be desirable to have some objective measure of the degree to which such variables relate to external environmental conditions.
Interestingly, in a previous study Guglielmino and colleagues [3] argue that traits from the Ethnographic Atlas that broadly match our "ecological" variables showed the strongest associations with a measure of ecological similarity in terms of habitat type, while traits relating to kinship and social organization were less strongly associated, with all traits showing some degree of association with linguistic similarity. This is broadly consistent with our argument in this study. However, it should be noted that this earlier study has a number of methodological shortcomings, including rather coarse-grained measures of linguistic and ecological similarity (the latter of which may be more or less suitable for different traits), and the non-independence of the units of analysis [4]. Furthermore, this previous study was concerned with addressing whether language, ecology, or geographic distance (as a proxy for borrowing) was the best predictor of cultural trait diversity in Africa, which is a somewhat different question from that of the present study (see section 8).

Inclusion of inheritance variables
The inheritance rule (Table S2) and inheritance distribution (Table S3) variables for real property and for movable property are closely associated (i.e. cultures that practice primogeniture for real property also tend to practice primogeniture for movable property). It was therefore decided that only one inheritance rule variable and one inheritance distribution variable should be used. The variables relating to movable property were chosen as they have data for a slightly higher number of taxa in both AN and BN.

Non-parametric analyses
There is a moderately strong and significant correlation between the rank order of the estimated number of changes in Austronesian and Bantu societies under Maximum Parsimony (Spearman's rho_28 = 0.65, p<0.001). To control for potential confounds in a manner similar to partial correlation, two linear regression analyses were performed, with the number of AN societies for each trait, the number of BN societies for each trait, and the number of Ethnographic Atlas categories as predictors of the number of changes in AN and in BN respectively. The rank correlation of the unstandardized residuals from these analyses was then calculated. There is a moderately strong and significant correlation between the rank order of the residuals (rho_28 = 0.651, p<0.001), indicating that the rates of change in these two groups of societies follow approximately the same order (i.e. those traits that evolve fastest in AN also evolve fastest in BN). Mann-Whitney U tests (the non-parametric equivalent of the t-test) show that the residual scores of the "Social" variables are ranked significantly higher than the residual scores of the "Ecological" variables in both language families (Bantu: U_26 = 40, Z = -2.40, p = 0.017; Austronesian: U_26 = 20, Z = -3.356, p = 0.001).
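The residual-based control described above can be sketched as follows. This is a minimal illustration in plain Python, not the software used for the reported analyses: the function names are hypothetical, and the toy values for eight traits are invented purely to make the example run.

```python
def ols_residuals(y, predictors):
    """Unstandardized residuals from OLS of y on the predictors plus an intercept."""
    rows = [[1.0] + [float(p[i]) for p in predictors] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal equations (X'X | X'y), solved by Gauss-Jordan elimination.
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * z for x, z in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for m in range(i, j + 1):
            ranks[order[m]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical toy values for eight traits (NOT the study's data).
n_an = [60, 55, 70, 40, 65, 58, 62, 47]        # AN societies with data
n_bn = [50, 45, 52, 30, 48, 51, 44, 39]        # BN societies with data
n_cat = [2, 3, 2, 5, 3, 4, 2, 6]               # number of categories
changes_an = [12, 20, 9, 31, 18, 22, 11, 35]   # inferred changes in AN
changes_bn = [10, 17, 8, 25, 16, 21, 9, 30]    # inferred changes in BN

controls = [n_an, n_bn, n_cat]
rho_raw = spearman_rho(changes_an, changes_bn)
rho_resid = spearman_rho(ols_residuals(changes_an, controls),
                         ols_residuals(changes_bn, controls))
```

Both regressions use the same three predictors, so the residuals for each family are the part of the change counts that sample size and number of categories do not explain linearly.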

Alternative ways of modelling trait evolution in phylogenetic comparative methods
Phylogenetic comparative methods cover a wide range of statistical approaches to tracing the evolutionary history of traits over a phylogenetic tree. These approaches provide flexibility and the ability to address a wide range of evolutionary hypotheses, for both continuously and discretely distributed characters. Because they can involve very different assumptions about how evolutionary change is modelled, it can be important to assess whether different methods or statistical approaches produce the same results. Broadly speaking, for discrete traits (such as those used in this study) these methods can be divided into approaches that attempt to trace character changes with reference to some kind of optimality criterion, and approaches that propose a statistical model of evolutionary change and infer the parameters of this model [5].
In this study we use both approaches in employing two methods: 1) Maximum Parsimony (MP), and 2) Stochastic Character Mapping (SCM).
Under MP we estimate the minimum number of trait changes required to produce the observed distribution of trait states over the phylogenetic tree. The advantage of this approach is that it is intuitive and produces output (i.e. the number of inferred trait changes) that is straightforward to interpret for our purposes. Another practical consideration is that it is computationally efficient and produces results relatively quickly (an important point when the number of traits to be analysed is large, as is the case in this study). A legitimate criticism of parsimony is that some of its assumptions are not explicit, i.e. it does not employ an explicit model of character evolution, and implicitly assumes that the rate of evolution is relatively slow with respect to the phylogeny. Another criticism is that it does not make use of the information contained in the branch lengths of the phylogenetic tree (one feature of this method is that only a single change can occur along each branch of the tree).
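To make the parsimony count concrete, the following is a minimal sketch of Fitch's algorithm for unordered states on a rooted binary tree, which is one standard way of obtaining the minimum number of changes. It is offered as an illustration only and is not necessarily the procedure used by the software in this study; the nested-tuple tree encoding and the taxon names are assumptions made for the example.

```python
def fitch_changes(tree, states):
    """Minimum number of state changes on a rooted binary tree (Fitch's algorithm).

    tree   : a leaf name (str) or a pair (left_subtree, right_subtree)
    states : dict mapping each leaf name to its observed trait state
    """
    def visit(node):
        # A leaf contributes its observed state and no changes.
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_cost = visit(node[0])
        right_set, right_cost = visit(node[1])
        shared = left_set & right_set
        if shared:
            # The children can agree on a state: no extra change is needed.
            return shared, left_cost + right_cost
        # The children disagree: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical five-taxon tree and presence/absence trait (illustrative only).
example_tree = ((("A", "B"), "C"), ("D", "E"))
example_states = {"A": 0, "B": 0, "C": 1, "D": 1, "E": 1}
min_changes = fitch_changes(example_tree, example_states)
```

Note that branch lengths do not appear anywhere in the calculation, which is the limitation of parsimony discussed above.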
SCM was chosen as a model-based approach that produces output (i.e. inferred number of changes) that is directly comparable to that of MP. SCM (as implemented in the Mesquite program) infers the rate parameters of a statistical model of trait evolution using Maximum Likelihood (ML). It then uses this information to propose possible character changes over the phylogeny (i.e. trait histories, or character maps) that are consistent with this rate of evolution and the probabilities of different trait states at internal nodes in the phylogeny. It allows multiple changes along the branches of the phylogeny and makes use of information about branch lengths, such that more changes are likely along longer branches. Thus the estimated number of changes under SCM will be equal to or greater than that under MP.

SCM and other approaches (such as the DISCRETE and MULTISTATE algorithms in BayesTraits, and the ace command in the R package ape) are based on a Markov-chain model of character change, in which traits are modelled as switching between different states over an infinitesimally small interval of time. The parameters of these models are the instantaneous transition rates between different states, and are calculated with reference to the branch lengths of the phylogenetic tree (which are in units of linguistic change rather than time), i.e. they are not directly interpretable as numbers of changes, or changes per unit of time. Different models of evolution are possible by specifying whether these rate parameters should be estimated separately, whether certain parameters should be set to be equal to each other, or whether some parameters should take a value of zero (meaning that certain changes from one state to another cannot occur). For traits that take more than two states, the Mesquite program used here only allows a model of evolution with a single rate parameter for all possible transitions, i.e. a trait can change from one state to any other at the same rate. This model is obviously an oversimplification, and interesting evolutionary hypotheses can be tested by examining competing models of trait evolution (e.g. [6,7]). However, there may not always be enough information in the data to justify fitting a more complex model with differing rate parameters [8]. The number of possible models of evolution increases dramatically as the number of states a trait can take increases [9]. Ideally, assessing the appropriate model of evolution should be done with care and with reference to relevant existing theory and other lines of data. Such a task is beyond the scope of this study, given that the particular model of evolution is not our focus per se and that we are examining a large number of traits.
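As an illustration of why an instantaneous rate is not itself a count of changes, the sketch below gives the standard closed-form transition probabilities for a k-state Markov model in which every transition shares one rate q, together with the expected number of changes along a branch of length t. The function names are hypothetical, and the sketch is not taken from Mesquite or BayesTraits.

```python
import math


def equal_rates_transition_probs(q, t, k):
    """Transition probabilities over branch length t for a k-state model
    in which every state-to-state transition has instantaneous rate q.

    Returns (probability of ending in the starting state,
             probability of ending in one particular other state).
    """
    decay = math.exp(-k * q * t)
    p_same = 1.0 / k + (k - 1.0) / k * decay
    p_diff = 1.0 / k - decay / k
    return p_same, p_diff


def expected_changes(q, t, k):
    """Expected number of changes along a branch of length t.

    Each state is left at total rate (k - 1) * q, so the expectation grows
    linearly with branch length and with the number of states.
    """
    return (k - 1) * q * t
```

Because the expected number of changes depends on q, the branch length, and the number of states together, the same rate value can imply very different amounts of change on trees with different branch-length units, which is one reason the rate parameters are hard to compare directly.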
Even if models with more than one rate parameter were to be estimated, it is unclear how these values could be combined meaningfully to produce an overall rate of change for the trait. Given the complexities associated with interpreting these instantaneous rate parameters, we have chosen to focus on the estimated number of trait changes produced by SCM under the single-rate model of evolution. Given that we control for the number of states a trait can take, there is no reason to suspect that this measure leads to the observed correlation between relative rates of change in Bantu and Austronesian, even if it is ultimately based on a rather simplified model of evolution. For completeness, we include the instantaneous rate parameter values under ML as a further point of comparison. As the next section demonstrates, the inferences made under these different comparative methods are broadly comparable, meaning that our finding that the relative rates of change in these traits are correlated across the two language families is robust to the different assumptions made by different methods.

Comparison of inferred number and rates of change under MP, SCM, and ML analyses
As Figure S1 indicates, both MP and SCM make comparable inferences about the relative number of changes in both Austronesian and Bantu. The figure also confirms that SCM estimates are always equal to or higher than estimates under MP. Spearman's rank correlation analyses show that these estimates are highly correlated (Austronesian: rho=0.985, p<0.001; Bantu: rho=0.972, p<0.001). The non-parametric correlation is given here because, for variable 15 (Community Marriage Organization) in Bantu, SCM analyses produced an extremely high number of mean estimated changes due to a high estimated instantaneous rate of change. The mean number of changes estimated was more than 9000, which is around two orders of magnitude greater than for the other traits, and is a substantial outlier. If this variable is not included, the mean number of trait changes inferred under MP and SCM is highly correlated in both Austronesian (Pearson's r = 0.971, p<0.001) and Bantu (r=0.972, p<0.001). In subsequent analyses using SCM or ML results, this variable has been omitted due to its status as a substantial outlier. The ML analyses were conducted in the program BayesTraits (http://www.evolution.reading.ac.uk/BayesTraits.html) using the MULTISTATE algorithm and specifying a model of evolution in which all rate parameters take the same estimated value [10]. ML analyses of variable 38 (Segregation of Adolescent Boys) in Austronesian societies initially indicated a mean rate value of 2.73, which was an order of magnitude greater than any of the other rate estimates. Closer inspection of the output revealed that this inflated value was driven by the estimates from two of the trees in the posterior sample. These trees produced rate values of 136 and 115, while all other trees produced rate values around 0.22 (standard deviation = 0.006). Therefore, for the ML analyses presented below, the mean rate value for this variable was calculated excluding these two outlying values, giving a mean of 0.22.
The value this variable takes is not crucial to the overall results; excluding this variable from the correlational analyses presented in Table S6 does not substantially affect the results (e.g. the correlation between Bantu and Austronesian ML values is 0.477 (p=0.012) including v38, and 0.495 (p=0.01) excluding it).
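The influence of the two outlying trees on the mean rate for variable 38 can be illustrated with a small calculation. The size of the posterior sample is not stated in this section, so the sketch below assumes, purely for illustration, a sample of 100 trees in which 98 yield exactly 0.22; the cut-off used to drop outliers is likewise an arbitrary illustrative choice.

```python
from statistics import mean

# Hypothetical posterior sample: 98 trees at 0.22 plus the two outlying
# rate estimates reported in the text. The sample size is an assumption.
rates = [0.22] * 98 + [136.0, 115.0]


def mean_excluding(values, threshold):
    """Mean of the values that do not exceed the threshold."""
    kept = [v for v in values if v <= threshold]
    return mean(kept)


mean_all = mean(rates)                               # all 100 assumed trees
mean_trimmed = mean_excluding(rates, threshold=1.0)  # outliers dropped
```

Under these assumed values the mean over all trees is about 2.73 and the mean without the two outliers is 0.22, in line with the figures reported above.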
The tables below show the correlations between the inferred numbers and rates of change and the potentially confounding factors of number of taxa and number of categories, for both the Austronesian and Bantu results. In Austronesian (Table S3) parsimony scores are highly correlated with SCM scores, but neither parsimony nor SCM scores show a significant correlation with ML values. In Bantu (Table S4)  The main text presents the results from the correlations between parsimony scores for the Austronesian and the Bantu datasets. The following tables demonstrate that these overall findings are robust to the method used to estimate the amount or rate of character change. Parsimony, SCM, and ML analyses all show significant positive correlations between the estimates from the Austronesian and the Bantu data. The correlations between the estimated number of changes in the Bantu and Austronesian datasets are robust to different combinations of control variables. The ML analyses still show a correlation when controlling for number of categories. This is not the case when controlling for the number of taxa; however, Tables S3 and S4 show an inconsistent relationship between number of taxa and the ML rate value in Bantu and Austronesian (in Bantu there is a significant negative relationship, while in Austronesian there is no significant relationship). This suggests that number of taxa may not need to be controlled for in the ML analyses. More generally, the results from the parsimony and SCM analyses show a consistent pattern, whereas the results from the ML analyses are a little more complicated. This may be due to the difficulty of accurately estimating rate values in ML, particularly if the likelihood surface is quite flat, which will introduce error into the rate estimates. This error in estimating the rate values is likely to obscure any real relationship, rather than systematically introducing a bias in favour of finding a relationship that does not really exist.
Furthermore, as we mentioned above, the interpretation of the instantaneous rate parameters from ML is by no means straightforward. Interestingly, Dediu [11,12] also compared the results of using different phylogenetic comparative methods in analyses of the stability of linguistic structural features. He used the evolutionary-model-based program BayesPhylogenies (which is usually used for inferring phylogenetic relationships) and the custom-built program BayesLang (which employs a Bayesian algorithm for inferring the number of trait changes in a manner similar to maximum parsimony) and found that both methods gave broadly comparable results.

Evolutionary rates and phylogenetic signal
Some previous studies have examined rates of change in linguistic data partially with the aim of identifying features that could be used to examine deeper connections between languages and language families [11,13]. We should stress that this is not our goal in this study. Indeed, the kind of ethnographic data used here is not well suited to phylogenetic inference because it is coded into a limited number of states, with a high possibility that distantly related societies will converge on the same state. In this respect it is more similar to structural linguistic data than to lexical data [11]. Furthermore, although we are using phylogenies to examine relative amounts of change in different traits, we are not explicitly assessing the degree to which different traits are predicted by linguistic relationships, i.e. we are not assessing the "phylogenetic signal" in the data. This is an interesting issue and has been the focus of a number of studies [3,14-18]; however, it is beyond the scope of the present study. As Revell and colleagues [19] point out, there is a complicated relationship between evolutionary rates and processes on the one hand, and phylogenetic signal on the other.