Telomeres as integrative markers of exposure to stress and adversity: a systematic review and meta-analysis

Telomeres have been proposed as a biomarker that integrates the impacts of different kinds of stress and adversity into a common currency. There has as yet been no overall comparison of how different classes of exposure associate with telomeres. We present a meta-analysis of the literature relating telomere measures to stresses and adversities in humans. The analysed dataset contained 543 associations from 138 studies involving 402 116 people. Overall, there was a weak association between telomere variables and exposures (greater adversity, shorter telomeres: r = −0.15, 95% CI −0.18 to −0.11). This was not driven by any one type of exposure, because significant associations were found separately for physical diseases, environmental hazards, nutrition, psychiatric illness, smoking, physical activity, psychosocial and socioeconomic exposures. Methodological features of the studies did not explain any substantial proportion of the heterogeneity in association strength. There was, however, evidence consistent with publication bias, with unexpectedly strong negative associations reported by studies with small samples. Restricting analysis to sample sizes greater than 100 attenuated the overall association substantially (r = −0.09, 95% CI −0.13 to −0.05). Most studies were underpowered to detect the typical association magnitude. The literature is dominated by cross-sectional and correlational studies which makes causal interpretation problematic.


Introduction
Exposure to stress and adversity across the lifespan is associated with increased morbidity and mortality from many causes. This implies that stress and adversity have a lasting impact on general physiological processes 'under the skin'. However, until recently, there were few candidate markers of this accumulation of physiological damage. In the last 15 years, the idea that telomeres might serve such a role has rapidly gained in scientific popularity. In particular, telomeres offer a potential 'psychobiomarker' that integrates the organism's experience of psychological states, social and environmental contexts, as well as physical damage, into a common currency [1]. Telomeres are DNA -protein complexes that form protective caps on the ends of chromosomes, and are thought to play a key role in preserving chromosomal stability. At the cellular level, critically short telomere length leads to replicative senescence. At the whole organism level, average telomere length reduces with age. Thus, telomere length or attrition is a biomarker of ageing. As the impact of stress and adversity may be to increase the individual's biological age (as opposed to chronological age), telomere measures offer a metric with which to assess Hans Selye's famous contention that: 'Every stress leaves an indelible scar, and the organism pays for its survival after a stressful situation by becoming a little older' [2].
Interest in using telomeres as a 'psychobiomarker' has grown rapidly, not just in human epidemiology, but also in animal ecology [3] and animal welfare [4]. In the human literature, telomeres have been studied in association with a wide range of exposure variables, including psychological stress [5], psychiatric illness [6], socioeconomic status [7], environmental pollutants [8], nutrition [9], smoking [10] and physical activity [11]. In several of these cases, the number of studies is sufficient that meta-analyses have appeared [12][13][14][15][16][17][18][19][20], often finding that telomere length is associated with the exposure, though weakly and variably. Reviewing the associations between telomeres and different exposures separately is appropriate to answer questions about that particular exposure. However, it loses sight of the most exciting promise of telomeres as a 'psychobiomarker': namely their potential to integrate the consequences of quite different kinds of stress and adversity into a common currency. Here, we set out to simultaneously review relationships of telomere length and attrition with all the different kinds of stress and adversity that are being studied.
Having a single integrated dataset allows several possibilities not available in separate, specialist meta-analyses. First, it offers a synopsis of the whole burgeoning field of telomere epidemiology. Second, it allows explicit comparison of different association strengths on the same scale (is the association of telomeres with psychological stress generally weaker or stronger than the association with physical disease, or exposure to pollution?). Third, it offers potential to address unanswered questions about telomere dynamics, such as whether, overall, early-life stressors are more strongly associated with telomere shortening than stressors experienced in adult life, as has been suggested [21]. Fourth, it leads to a large dataset within which some methodological issues of broad relevance can be examined. These include whether different tissue types produce different patterns, and whether the popular telomere length measurement method, quantitative polymerase chain reaction (qPCR) [22], leads to generally weaker associations than other methods. This should be expected, because measurement error is generally found to be higher in qPCR than more intensive methods [23], and measurement error attenuates observed correlations.
With these objectives in mind, we carried out a systematic review and meta-analysis of the published literature on telomeres in relation to stress and adversity in vivo, to May 2016. Our searches involved the terms 'stress' and 'adversity' in combination with 'telomere'. We extracted all associations with telomeres reported in the papers, not just the associations of primary interest to the original authors. Our search strategy was not intended to find the whole of the literature on telomeres and any particular exposure variable. Authors may not always have used the descriptors we searched, and may have been more likely to do so for some exposure variables than others. Nonetheless, our searches did produce the largest telomere literature dataset assembled to date, and we believe that though not exhaustive, it constitutes a good transect through the field of telomere epidemiology.

Methods
Our methods are described in detail in our protocol, which was registered via the Open Science Framework prior to data extraction [24]. Raw data files and data analysis scripts are freely available in an online archive on the Zenodo repository [25].

Search strategy and inclusion criteria
A PRISMA diagram for our study is available as electronic supplementary material. We searched the Scopus and PubMed databases for papers including the words 'stress' or 'adversity', and 'telomere'. All records up to the date of the search (11 May 2016) were screened (n ¼ 3647). We removed rsos.royalsocietypublishing.org R. Soc. open sci. 5: 180744 duplicates and then screened the remaining papers based on their titles and abstracts. In summary, this involved removing any papers that: (i) were not complete original research papers available electronically and in the English language; (ii) used study organisms from outside the animal kingdom; (iii) did not study whole organisms; (iv) used genetically modified organisms; (v) experimentally applied nonnaturalistic exposures in captive animals; (vi) examined telomere length in transplanted tissues or organs; (vii) were presented as concerning the physiological consequences of telomere length, rather than the correlates of exposures; (viii) examined intergenerational questions (e.g. the effects of paternal infection status on offspring telomere length); or (ix) used the same dataset, or participants reported in a previously recorded paper, to address an exposure-telomere relationship we had already recorded (where this occurred, the first-recorded association was the one used). This led to a candidate set of 286 papers.
Although our searches were based on the terms 'stress' and 'adversity', we extracted all reported associations with potential exposure variables found in the papers returned by our search, whether or not they were the focus of the study's stated objectives. This included control variables and covariates as long as sufficient detail was provided. Thus, our search strategy consisted of finding papers on the subjects of stress/adversity and telomeres, and sampling the full variety of exposures that fell out of the papers captured by the search.

Association format
Associations could only be used if convertible into a correlation coefficient, the common association metric that we chose based on initial scoping and piloting. Standard conversion formulae were used [26][27][28][29], and the conversion algorithms are provided in the data repository. Usable statistics comprised: correlation coefficients, standardised bs from regression models; unstandardized bs where standard deviations for the independent and dependent variables were provided; unstandardized bs from regressions with dichotomous independent variables where the standard deviation of the dependent variable was provided; F-ratios from ANOVAs comparing two groups; t-statistics; Cohen's d or standardized mean difference statistics; group means with standard errors; and group means with standard deviations. Where several alternative analyses were presented, we chose unadjusted in preference to adjusted analyses; and from several adjusted analyses where no unadjusted data were available, we chose the analysis that adjusted for the fewest variables. This was to maximize comparability between studies that used different sets of control variables in multivariate models. For longitudinal studies, associations were between change in telomere length (i.e. difference between follow-up and baseline) and the exposure variable. For 125 papers, the reported information was not sufficient to create a usable correlation coefficient.

Data extraction
Data from 218 associations (30%) were extracted independently by both G.V.P. and D.N. Any differences were identified and resolved as part of the process of refining our data extraction methods. The remaining associations were extracted by either G.V.P. or D.N., with G.V.P. checking and correcting the whole dataset after extraction. As well as sample size and other statistics necessary for association conversion, we extracted bibliographic information and a series of classificatory variables as described in table 1. The life stage prior to birth was classed as embryonic, and the life stage prior to sexual maturity (4745 days for human females/5110 days for males [30]) was defined as childhood. We also identified any associations that could be considered subparts of others, for example, separate-by-sex associations where the combined association was also reported, or associations between telomeres and subscales where the association with the main scale was also reported. We did not include these subscale and subgroup associations in our final analyses, though they are included in the unprocessed dataset, in case they are of interest to others.

Final dataset
The extracted data are the 'unprocessed data' file in the data archive. We categorized exposure variables a posteriori. We created both broad categories (11 categories plus 'Other', as specified in table 1), and fine ones, for example, using specific diseases rather than 'physical disease', and specific types of psychosocial or socioeconomic measures (35 fine categories, plus 'Other'). No categories (broad or fine) were created if the number of independent associations (i.e. from different papers) was less than 3 prior to the exclusions described below. The final dataset analysed here ('processed data' file) differs from the unprocessed data in a number of regards. We excluded 27 associations from studies of non-human animals, because we deemed these too few (typically one study per species) for further analysis. We then excluded 11 associations that appeared to be duplicates of those reported in other papers using the same data, and 14 associations where the exposure variable was a medical treatment-the designs of these studies generally confounded effects of the treatment on telomeres with effects of the disease the treatment was for. Finally, we excluded 129 associations based on subscales and subgroups of other associations reported in the same studies.
For analysis, we reversed 149 correlations in sign to align all correlations into the same direction; that is, so that a negative correlation indicates that greater stress and adversity is associated with short (or shortening), rather than long (or lengthening), telomeres. For example, we reversed correlations where the exposure variable represented higher socioeconomic status, better sleep or more parental care, to align them with the more common case where a higher value of the exposure indicates more adversity (e.g. disease, psychosocial stress, pollution exposure). The case of nutritional variables was challenging, because it was often unclear if higher consumption was predicted to be positive or negative in effect. We therefore reversed the direction of all nutritional variables, so that a negative correlation means a deficit in consumption is associated with short telomeres. This had little impact on the results, because the overall associations for many nutritional categories were null. The categories with the strongest associations-fruit, legumes and vegetables, and vitamins-are cases that clearly conform to our assumed 'more is better' principle.

Data analysis
Data were meta-analysed in R [31] using the 'metafor' package [32]. Estimation was by REML. As the dataset includes multiple associations from the same studies, we used multilevel models containing nested random effects of association and study. Meta-regression was used to examine differences in association strength for different types of exposure, and different methodological features. Additional analyses to detect outliers and account for possible publication bias patterns were implemented using R packages 'metaplus' [33] and 'weight' [34] and are reported in the electronic supplementary material, supplementary analysis. The full data analysis script is included in the data archive.

Description of dataset
The final dataset consisted of 543 associations from 138 unique studies of human participants (associations per study 1 -43, mean 3.93). One hundred and sixty-eight associations were reported by the study authors as being statistically significant, 349 as null and 26 were not reported as either null or significant. Two hundred and ninety-four associations (54%) were completely unadjusted; the remaining 249 (46%) featured some degree of statistical adjustment (for example, for age, though the exact specification of the adjustment varied from study to study). Table 1 describes the associations and studies included. Typically, they used qPCR to measure telomeres; did so in leucocytes or whole blood; were correlational rather than experimental; and were cross-sectional rather than longitudinal. This meant that the telomere variable was overwhelmingly a single measure of average length, rather than the rate of attrition. Associations with both length and attrition are included in our main analyses, though we test whether study design moderates observed correlations. The studies were mostly of adults, and mostly related to sources of stress and adversity that were experienced in adulthood.

Overall association and publication bias
The distribution of correlations between exposure measures and telomeres is shown in figure 1a. Though the modal correlation was close to zero, the distribution was asymmetric, with 399 correlations less than zero (indicating that greater adversity was associated with shorter telomeres), and 144 greater than zero (indicating that greater adversity was associated with longer telomeres). The majority of correlations fell into what is conventionally defined as a 'small' effect size (20.2 , r , 0.2), with 294 small negative correlations and 131 small positive ones. There were 105 instances of a moderate or large negative association, against just 13 of a moderate or large positive association. In a simple meta-analytic model with no moderators, the overall estimate of association was conventionally small and significantly negative (r ¼ 20.15, 95% CI 20.18 to 20.11, p , 0.001). Thus, greater adversity was correlated with shorter telomeres. There was substantial heterogeneity between associations (t ¼ 0.21, Q 542 ¼ 12 742.54, p , 0.001), and most of the variability resided at the between-study level rather than between associations from the same study (r represents the intra-class correlation coefficient between the associations from the same study; r ¼ 0.87). In the electronic supplementary material ( §2), we present evidence of over-dispersion of associations relative to a normal distribution: 12 studies were classed as outliers, 10 reporting very strong negative associations and two reporting moderately strong positive associations. The estimated association may have been affected by publication bias. The funnel plot of correlation coefficient against sample size showed the inevitably broader range of observed correlations at smaller sample size. However, the funnel was asymmetric, with strongly negative correlations appearing at small sample sizes, but rather few of the strongly positive correlations that ought also to be expected, given that true effects were weak and the precision of estimation low in small studies (figure 1b; see also electronic supplementary material, §3). To further explore this, we divided sample sizes into four bins (fewer than 100, 101 -250, 251-1000, more than 1000; these bins represent approximate quartiles of sample size). We then added sample size bin to the meta-analytic model as a moderator (this is the conceptual equivalent of the Egger test for the multilevel model situation). Sample size bin explained a significant amount of variability (Q 3 ¼ 30.27, p , 0.001), though substantial heterogeneity remained (t ¼ 0.19). In particular, associations with sample sizes of 'fewer than 100' were significantly more negative than the reference category of more than 1000 (B ¼ 20.17, 95% CI 20.24 to 20.10, p , 0.001; figure 1c; see also electronic supplementary material, §2). The other sample size bins did not differ significantly from the reference category of 'more than 1000'. Because the trim and fill methodology for imputing the associations assumed to be missing is not defined for the multilevel situation, we performed all subsequent analyses both on the whole dataset and, in parallel, on only the 370 correlations from 82 studies where the sample size was greater than 100 (henceforth the 'reduced dataset'). The central estimate from the reduced dataset was considerably weaker than the full dataset (r ¼ 20.09 compared to r ¼ 20.15, 95% CI 20.13 to 20.05, p , 0.001), with substantial heterogeneity (t ¼ 0.18, Q 369 ¼ 4512.37, p , 0.001), and again, most of the variation residing between studies, rather than between associations within studies (r ¼ 0.88).

Categories of exposure
We divided our exposure variables into 11 broad categories plus 'other' and added category of exposure to the meta-analytic model as a moderator. Exposure category did not explain a significant amount of the heterogeneity (whole dataset: Q 11 ¼ 17.25, p ¼ 0.10; reduced dataset: Q 11 ¼ 14.70, p ¼ 0.20), suggesting that the type of exposure studied, at this coarse level, does not explain the variation in association between telomeres and exposure variables. We also fitted separate meta-analytic models to the correlations in each of the 12 broad exposure categories (figure 2). In the whole dataset, the central We also created a finer 36-category classification of exposures. For example, we considered each physical disease, psychiatric condition or psychosocial construct for which multiple independent data points were available separately. The fine categories explained a significant amount of heterogeneity in both the full (Q 35 ¼ 59.28, p , 0.01; t ¼ 0.20) and reduced datasets (Q 32 ¼ 58.03, p , 0.01; t ¼ 0.16; note only 33 of the 36 fine categories were represented in the reduced dataset). We took smoking as the reference category as this association is estimated with good precision due to a large number of studies. Compared to smoking, in the full dataset, we found significantly stronger negative In the reduced dataset, the significant differences from smoking for environmental hazards and schizophrenia became non-significant; the significant differences persisted for HIV and AIDS (B ¼ 20.14, 95% CI 20.25 to 20.03, p ¼ 0.02) and Parkinson's (B ¼ 0.41, 95% CI 0.21 -0.60, p , 0.001); and two new significant differences were found: poor parental care gave significantly stronger negative correlations than smoking (B ¼ 20.14, 95% CI 20.27 to 20.01, p ¼ 0.04), and low carbohydrate consumption gave significantly weaker ones (B ¼ 0.09, 95% CI 0.00 -0.19, p ¼ 0.04). We also considered whether the associations in the 36 fine categories differed significantly from zero when considered separately (figure 3). In the full dataset, anxiety, cardiovascular disease, depression, diabetes, lower education, environmental hazards, lower fruit, legume and vegetable consumption, lower income, lower meat, fish and egg consumption, lower physical activity, post-traumatic stress disorder (PTSD), schizophrenia, smoking, stress, traumatic experience and lower vitamin consumption were all significantly correlated with shorter telomeres. Restricting consideration to the reduced dataset, the associations remaining significantly less than zero were: anxiety; cardiovascular disease; diabetes; education; lower fruit, legume and vegetable consumption; lower meat, fish and egg consumption; lower physical activity; and smoking. In addition, the parental care correlation (receiving poorer care associated with shorter telomeres), which had not been significantly different from zero in the full dataset, became so in the reduced dataset.

Other moderators
We tested whether a series of different methodological features explained any significant amount of heterogeneity between associations (table 2; we were unable to simultaneously include exposure category and all the methodological variables in a single model for reasons of statistical power). There was no strong evidence that the study design (longitudinal versus cross-sectional, or experimental versus correlational), the life stage of the participants (either at exposure or telomere measurement), the type of tissue or the sex of the participants explained a significant amount of the heterogeneity, either in the full or reduced datasets.
There was some evidence for variation in association strength by telomere measurement technique in the full dataset (table 2). Fluorescent in situ hybridization (FISH) techniques produced significantly stronger negative associations than the dominant qPCR technique. Southern blot and TelSeq associations did not differ significantly from qPCR, though TelSeq was represented in just one study. However, measurement technique was confounded with sample size in the data: FISH and Southern blot were used in relatively small sample studies (medians 49 and 56), TelSeq in one very large study (sample size 11 670) and qPCR in a range of sample sizes (median 285). We have already established that correlations were weaker in larger samples, and the order of central correlation estimates for the four techniques (FISH: r ¼ 20. 29

Discussion
Telomere length or attrition has been proposed as a common currency 'psychobiomarker' of the impact of many different types of stress and adversity on the individual. Here, we meta-analysed an exceptionally large and diverse dataset consisting of 543 associations from 138 studies featuring over 400 000 human participants. The results confirm that, in the published literature, telomere length is indeed significantly associated, in the predicted direction, with a wide range of exposures including environmental hazards, smoking, psychiatric illness, psychosocial factors, socioeconomic factors, parental care, poor nutrition and physical activity. The large heterogeneity estimate even after  3  3  9  8  11  8  3  3  5  3  5  5  8  3  3  2  3  3  5  3  6  4  3  3  3  3  3  2  2    controlling for exposure type suggests that there is more variation in results between different studies of the same exposure than between different types of exposure. We emphasize that because our search strategy was based on 'stress' and 'adversity', our dataset is neither exhaustive nor a representative sample of all the work being carried out in human telomere epidemiology. We are likely to have captured almost all work on psychosocial stress, which necessarily involves one of our search terms, but only some of the studies of smoking or physical diseases. Thus, the relative abundance of exposure types in the dataset should not be interpreted as informative: for example, our dataset contains more studies of psychosocial exposures than of smoking, whereas the true abundances in the literature may well be the opposite way around (see [17,18]). However, the dataset is very large, and there are substantial numbers of studies in every broad category. Thus, it is still useful for comparing the typical strength of association of different types of exposure with telomere measures, as well as exploring cross-cutting issues. For several of the exposure variables in our dataset, there are published specialist meta-analyses covering just that exposure type. In many cases, these have appeared since we began data collection for this paper. Where such a specialist meta-analysis exists and we had more than five associations in our dataset, we compared the by-category results from our figure 3 with the key results of the corresponding specialist meta-analyses (table 3). There was, overall, a high degree of agreement. We view this good agreement between the specialist reviews and subsets of our dataset as confirmation that the search strategy we used yielded a sufficiently robust transect of the telomere epidemiology literature for the comparisons we have presented to be meaningful.
In detecting significant negative associations between telomere variables and a wide variety of different exposures, our findings appear to support the contention that telomeres are a useful integrative 'psychobiomarker' [1,4]. Nonetheless, they bring to the fore a number of important caveats. The first caveat is that the observed correlations are in the range that would conventionally be considered weak or small [39]. This has methodological implications for the use of telomere measures in research. A correlation coefficient of r ¼ 20.15 (our central estimate from the whole dataset) requires a sample size of 359 to be detected as significant (p , 0.05) with 80% power. Only 38% of the associations had this sample size. Using the central correlation estimate from the reduced dataset (r ¼ 20.09), the required sample size rises to 1007 (which was met by 21% of associations). Thus, significance tests from individual small-n studies should not be taken as strong evidence that a given exposure variable does or does not associate with telomere length, or associates differently from other variables. Moreover, with correlations typically of small magnitude, telomere length is likely to have limited value as an indicator of adversity exposure in individual people.
The weakness of observed associations may be related to the fact that the extant literature relies almost entirely (93% of associations) on cross-sectional studies using telomere length measured at a single point in time. Where there are true environmental effects on telomere attrition, measured associations between telomere length and environmental factors in cross-sectional studies are likely to be weak, because the individual variation in telomere length at birth, which is substantially heritable, dwarfs the amount by which telomeres shorten over the life-course [40,41]. Thus, any environmental signal in cross-sectional telomere studies will be diluted by a large component of irrelevant individual variability. Longitudinal studies that examine telomere attrition, rather than telomere length, thus effectively controlling for different individual starting telomere lengths, are potentially much more powerful for detecting possible environmental influences (see [42] for discussion and [43,44] for   [45]. The second caveat is that the literature may be affected by publication bias, a conclusion that echoes those of some narrower reviews [18]. We found evidence of stronger negative correlations in published studies with small samples, and removing the small samples nearly halved the strength of the overall association between exposure variables and telomeres. (An alternative approach to correcting for publication bias reported in the electronic supplementary material suggests an even greater degree of attenuation.) Directionally stronger associations in small samples are usually taken as evidence that small studies with results contrary to prediction are being selectively withheld or rejected. Publication bias is not the only possible interpretation of this pattern, though. It could be that smaller studies measure stresses and adversities with greater precision, or use more selected participant samples, so that the variation in exposure is greater, and as such genuinely detect stronger negative correlations with telomeres. Nonetheless, many of the most striking claims, for example, regarding psychosocial associations with telomeres, are based on small-n findings that are atypically strong. The problem of selective appearance of associations in the literature may be worse than our findings suggest. For example, large-n epidemiological studies are likely to be published whatever the results, but authors often have degrees of freedom concerning which of many available predictor variables they report, and in how much detail. Even a slight bias towards including or providing detailed results preferentially for those measures that produce patterns conforming to expectation would suffice to distort the meta-analytic conclusions considerably.
The third caveat is that it is hard, from the present literature, to make inferences about causality in the relationships between telomere variables and exposures to stress and adversity. This is because of the overwhelming reliance on cross-sectional and correlational designs. Several of the specialist metaanalyses have concluded with calls for more longitudinal research [18,19]. It is disappointing to note that in the course of this review, we have recurrently encountered correlational findings described as if they were causal (e.g. [7,19,46,47]), and cross-sectional findings described as if they were longitudinal (e.g. [9,16,38,46]) in article titles, abstracts and discussions.
A cross-sectional correlation between telomere length and an exposure could arise for three reasons: the exposure causes telomeres to shorten (causality); short telomeres cause the exposure (reverse causality); or some third variable is causally related to both telomeres and the exposure (third variable). Causality should not be assumed without further evidence. Reverse causality is plausible for many physical diseases. In some cases, this is supported by longitudinal evidence (e.g. [45,48,49]) and Mendelian randomization studies [50,51]. Reverse causality may be possible for psychological and behavioural variables too, because short telomere length can change patterns of gene expression [52], with possible consequences for brain function. Third-variable explanations are plausible for many of the correlations described here. Childhood adversity, for example, is a third variable of potential general importance [21]. Childhood adversity is a known risk factor for a number of the variables considered here as exposures, such as poor physical and psychiatric health, smoking and low socioeconomic status. Childhood adversity may also accelerate telomere shortening [53]. As the highest rate of telomere shortening occurs early in life [54,55], it is perhaps more plausible that developmental conditions affect both the risk of the adult exposures and adult telomere length, than the adult exposures affecting adult telomere length directly. However, we did not find evidence in the present dataset that exposures during childhood produce significantly stronger correlations with telomere length than exposures during adulthood (though see [56] for a Cohen's d effect size of 20.35 in a specialist meta-analysis of early-life adversity and telomere length). In relation to our objective of understanding methodological sources of variation in measured associations, we were not able to reach any strong conclusions. Most of the methodological variables we recorded did not explain any significant fraction of the observed heterogeneity, but we cannot infer that they make no systematic difference. This is because many of the non-standard methodological choices in the dataset (e.g. longitudinal design, tissue other than blood, measurement technique other than qPCR) were rare. Moreover, the different features of the methodology did not vary independently of one another, or of exposure type. We found some evidence suggesting FISH might produce stronger correlations with predictor variables than other measurement techniques. However, the FISH studies also featured small samples, and small sample size was associated with stronger correlations. After controlling for sample size, the moderating effect of measurement technique was attenuated. To make progress on methodological questions such as whether, for example, qPCR produces weaker associations than other techniques due to greater measurement error [23,57,58], it will be necessary to take more homogeneous sets of studies, all focussing on the same relationship, in order to isolate the consequences of this single methodological factor. For example, in recent meta-analyses of telomeres and sex [59], and telomeres and depression [36], stronger associations were found by Southern blot and/or FISH than by qPCR.
We conclude with a plea to the field. We had to exclude 125 papers because of failure to describe data in enough detail; this is nearly as many as we were able to include (138). Common omissions were simple, such as not providing means and standard deviations per group, not providing sufficient detail of regression models, or providing only a p-value for the key result. Moreover, there may have been cases where researchers measured more variables than those they reported. These failings could easily be addressed by more careful reporting of statistics, better refereeing and, above all, fostering a culture in which all raw data are made freely available. Given the subtlety of any associations between telomere dynamics and environmental exposures, it will be necessary to pool our collective evidence in order to understand them. It is a great waste if much of that evidence is not usable for meta-analysis.
Ethics. This study was based on secondary analysis of study-level data already in the public domain, and as such required no ethical approval.
Data accessibility. Raw data and the R scripts for converting and analysing them are freely available via the Zenodo repository: http://doi.org/10.5281/zenodo.1189538.