Demographic expansion of an African opportunistic carnivore during the Neolithic revolution

The diffusion of Neolithic technology together with the Holocene Climatic Optimum fostered the spread of human settlements and pastoral activities in North Africa, resulting in profound and enduring consequences for the dynamics of species, communities and landscapes. Here, we investigate the demographic history of the African wolf (Canis lupaster), a recently recognized canid species, to understand if demographic trends of this generalist and opportunistic carnivore reflect the increase in food availability that emerged after the arrival of the Neolithic economy in North Africa. We screened nuclear and mitochondrial DNA in samples collected throughout Algeria and Tunisia, and implemented coalescent approaches to estimate the variation of effective population sizes from present to ancestral time. We have found consistent evidence supporting the hypothesis that the African wolf population experienced a meaningful expansion concurring with a period of rapid population expansion of domesticates linked to the advent of agricultural practices.

of 10 μl. Reactions were performed in a BioRad T100 Thermal Cycler (for thermoprofile see electronic supplementary material, table S2). A negative control was included in each PCR to monitor possible contaminations. PCR products were purified using ExoSap IT® (Affymetrix) following manufacturer instructions, and then sequenced using DLH primer using the Big-Dye Terminator v3.1 Cycle Sequencing protocol (Applied Biosystems).

(c) Microsatellites genotyping and individual identification
A set of 47 microsatellite loci was amplified in five multiplex reactions for tissue samples following the methodology proposed by [3] and [4] (for details on markers see electronic supplementary material, table S3). For scat samples, we genotyped a subset of 13 microsatellites previously optimized in three pools by [5] following the methodology of these authors. Four PCR replicas of each marker were accomplished per non-invasive sample. Negative controls were included in all PCR amplifications to monitor possible DNA contamination. PCRs were performed in a BioRad T100 Thermal Cycler in final volume reactions of 10 μl including 5 μl of QIAGEN Multiplex PCR Kit, 1 μl of primer multiplex, 3 µl of H2O and 1 µl of DNA (2.5 µl of DNA for non-invasive samples). PCR profile was specific for each multiplex and according to previously published information referred to above. Amplification products were separated and detected on the ABI 3130xl Genetic Analyser (AB Applied Biosystems) and alleles were scored by comparison to the GeneScan™ 500 LIZ size standard using GENEMAPPER 4.1 (Applied Biosystems), and manually checked to control automatic binning. Identical genotypes corresponding to the same individual were grouped using GIMLET 1.3.3 [6] and excluded from subsequent analysis.

(d) Diversity and genetic structure
Mitochondrial diversity was assessed using sequences generated in this study (n=22), and then together with 46 sequences from Algeria and Tunisia, respectively, retrieved from previous works [5,7; supplementary material, table S5]. Diversity indices were assessed using DnaSP 5 [8]. Intraspecific genetic distances were estimated in MEGA 7 [9] using p-distance model. Phylogeographic relationships among the different mtDNA haplotypes were estimated using the Median-joining (MJ) network algorithm [10] implemented in PopArt [11].
The 47 microsatellite dataset was evaluated for deviations from Hardy-Weinberg equilibrium (HWE) using GENALEX 6.5 [12], and loci with significant departure from expectations after Bonferroni correction were excluded from the subsequent analysis.
Genetic diversity was estimated separately for the dataset in Algeria (n=18), and for the subset of 13 microsatellites in Algerian samples including 2 additional genotypes obtained from non-invasive samples from Algeria, and 27 genotypes from Tunisia [5] generated previously in our lab. Diversity measures were calculated using GENALEX 6.5 [12]. Population structure was tested using the Bayesian clustering approach implemented in STRUCTURE 2.3.4 [13]. Analyses were performed independently 5 times for 10 6 iterations after a burn-in period of 5x10 5 iterations, using the admixture model with correlated allele frequencies among populations. We tested 1 to 10 clusters (K) without prior population information. Structure Harvester [14] was used to summarize the posterior probabilities of each K over all runs [15]. We carried out a Principal Components Analysis (PCA) using the Adegenet package in R [16].
Isolation by Distance was evaluated through Mantel tests for mitochondrial and microsatellite loci separately. Three matrices were built including: i) pairwise genetic distance between individuals for each molecular markers estimated in GENALEX 6.5 and, ii) pairwise geographic distance in kilometers from the latitude and longitude of the sampling sites calculated using Geographic Distance Matrix Generator [17]. The Mantel tests were performed in GENALEX 6.5, with significance determined via 999 permutation tests. The same software was used to test population structure between the two sampling areas (Algeria and Tunisia) through an Analysis of Molecular Variance (AMOVA).

(e) Demographic analysis
Demographic history of the African wolf was inferred using mitochondrial and microsatellite loci separately, compiling data from Algeria and Tunisia in a single dataset.
For mtDNA we estimated mismatch distributions and Harpending's raggedness statistics [18], and tested deviation from neutrality through Tajima's D [19] and Fu's Fs [20] statistics, using DnaSP 5 [8]. Coalescence simulations with 1,000 replicates were applied to determine the p-value of each statistic. Smooth and unimodal mismatch distributions, non-significant Harpending's raggedness statistics [18], and significant negative values (p-value <0.05) of Tajima's D and Fu's Fs were taken as evidencing a scenario of demographic expansion. Past population dynamics was inferred using Extended Bayesian Skyline Plot (EBSP) implemented in BEAST 2.3.2 [21]. We used the strict clock, an evolutionary rate of 5.48% per million years estimated for canids [22] and previously used in this species [7], and HKI+G as best model of nucleotide substitution as selected in MrModeltest2.3 [23]. Two independent runs of 10 8 generations each and sampled every 10 4 generations were performed. Tracer 1.6 [24] was used to check convergence of the MCMC chains. The Extended Bayesian skyline plot (EBSP) was constructed in R platform (R Core Team, 2018).
For microsatellite loci we estimated the variation of effective population sizes (Ne) from present to ancestral time with a coalescent approach using the method VarEff [25] implemented in a R package. The method uses an approximate likelihood of the distribution of distance frequencies between alleles in a Monte Carlo Markov Chain framework [25]. After several trial runs, the final analyses were conducted using the two phase mutation model assuming a proportion of 0.22 for multi-step mutations [26], a mutation rate of 3.5x10 -3 [27] and allowing three population size changes (JMAX = 3).
Prior for current Ne was set according to estimation on trial runs (NBAR = 1,600). Prior for the number of generations since the origin of the population (GBAR) was set to 8000 generations (equivalent to 32 kya, based on a generation time of 4 years for wolves from [28]) to encompass timing of Neolithic expansion in North Africa. Final run was carried out using 10,000 batches with a length of 10, saved every 10 batches in the MCMC chain and with a burn-in period of 10,000 batches.

(f) Verification of demographic inference results
Demographic inference can be affected by population structure, non-random sampling, lack of information in molecular markers and natural selection [29][30][31]. In order to rule out obvious confounders and possible batch effect, we performed additional demographic analyses for microsatellites using subsets of our samples per country.
We implemented the same coalescent approach using the method VarEff with the same priors as described above, except for the current Ne which was set to 1,600 and        Figure S1. Mean log-likelihood distribution for each cluster (K) and standard deviation bars as summarized through Structure Harvester [14] for (a) Algerian dataset using 38 loci, and (b) Algerian and Tunisian datasets using 13 loci.