Spreading of healthy mood in adolescent social networks

Depression is a major public health concern worldwide. There is evidence that social support and befriending influence mental health, and an improved understanding of the social processes that drive depression has the potential to bring significant public health benefits. We investigate transmission of mood on a social network of adolescents, allowing flexibility in our model by making no prior assumption as to whether it is low mood or healthy mood that spreads. Here, we show that while depression does not spread, healthy mood among friends is associated with significantly reduced risk of developing and increased chance of recovering from depression. We found that this spreading of healthy mood can be captured using a non-linear complex contagion model. Having sufficient friends with healthy mood can halve the probability of developing, or double the probability of recovering from, depression over a 6–12-month period on an adolescent social network. Our results suggest that promotion of friendship between adolescents can reduce both incidence and prevalence of depression.


Formal definitions
Throughout, we write N for healthy mood and D for depression. Letters $A, B, \ldots \in \{N, D\}$ denote states, and overlining denotes the opposite state, so that $\bar{N} = D$ and $\bar{D} = N$. Let individuals be labelled with indices $i, j, \ldots \in \{1, \ldots, n\}$. At (discrete) time $t$, individual $i$ has state $X^t_i \in \{N, D\}$. Individuals are connected on a network with adjacency matrix $G$, with elements $G_{ij} = 1$ if individual $i$ named $j$ as a friend and $G_{ij} = 0$ otherwise.
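Our general N-transmits model is a discrete-time Markov chain $X^t = (X^t_i)$ with transition probabilities

$$\Pr[X^{t+1}_i = D \mid X^t_i = N, X^t] = p_{k_i}, \qquad \Pr[X^{t+1}_i = N \mid X^t_i = D, X^t] = q_{k_i}, \qquad k_i = \sum_j G_{ij} I\{X^t_j = N\}, \qquad (3)$$

where $I$ is an indicator function, $I\{\omega\} = 1$ if $\omega$ is true and 0 otherwise.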
In the no-transmission model, $p_k$ and $q_k$ are independent of $k$, and in the D-transmits model we exchange D for N on the right-hand side of (3). We will often be interested in just two time points, which we will write as $t$ and $t + 1$ in general.
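As a minimal sketch of a single update of this chain, the Python function below assumes that $p_k$ is the probability of moving from N to D and $q_k$ the probability of moving from D to N for an individual with $k$ named friends in state N. The names `step`, `states` and `friends`, and the pooling of large $k$ into the last tabulated entry, are illustrative assumptions rather than part of the model definition.

```python
import random


def step(states, friends, p, q, rng):
    """Advance every individual's state by one discrete time step.

    states  : list of "N" / "D", one entry per individual
    friends : friends[i] lists the individuals that i named (row i of G)
    p, q    : p[k] = Pr(N -> D), q[k] = Pr(D -> N) given k named friends in state N
    rng     : a random.Random instance
    """
    new_states = []
    for i, s in enumerate(states):
        # Number of named friends currently in the healthy state N.
        k = sum(1 for j in friends[i] if states[j] == "N")
        # Assumption: counts beyond the last tabulated k share the last entry.
        k = min(k, len(p) - 1, len(q) - 1)
        if s == "N":
            new_states.append("D" if rng.random() < p[k] else "N")
        else:
            new_states.append("N" if rng.random() < q[k] else "D")
    return new_states
```

All individuals are updated synchronously from the states at time $t$. Setting every entry of `p` (and of `q`) to the same value gives the no-transmission model under the same assumptions.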

Simulation methods
The Markov chain defined by (3) can be simulated using standard Monte Carlo methods. Since we consider a situation where $0 < p_k, q_k < 1$ for all $k$, we can get from any state to any other in a finite number of steps (the chain is irreducible) and the expected time to return to any state is finite (all states are non-null persistent). Hence, by e.g. Theorems (6.4.3) and (6.4.17) of Grimmett and Stirzaker (2001), there is a unique stationary distribution $\pi$ that describes the behaviour of the chain at large times.
To sample from this distribution, we perform discrete-time Monte Carlo simulation of the models that are specified by values of $p_k$, $q_k$ on a directed network of named friends constructed from the $n = 3084$ individuals in the dataset satisfying our inclusion criteria at the first time point (wave 1). Depending on the simulated output required, we took $10^4$ time-separated samples of either network pairs at a single time point or temporally adjacent node-level state transitions from the stationary distribution for each model.

Figure: Number of D → D edges for the stationary distributions of the models versus real data. Asterisks above a plot denote a significant statistical difference at the 5% level, corresponding to p < 0.01 using the Bonferroni method to account for multiple testing. Observed data could be plausibly generated by both transmission (p = 0.59) and no transmission (p = 0.60) models.
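The sampling of network pairs described above can be sketched as follows. This is an illustration under stated assumptions, not the code used for the analysis: the burn-in length, the thinning interval, the random initial state and all function names are hypothetical, and `p` and `q` are assumed to have the same length, with larger values of $k$ pooled into the last entry.

```python
import random


def count_dd_edges(states, friends):
    """Number of directed friendship links from a D individual to a D individual."""
    return sum(1 for i, named in enumerate(friends) if states[i] == "D"
               for j in named if states[j] == "D")


def sample_dd_counts(friends, p, q, n_samples, burn_in, thin, seed=0):
    """Collect time-separated samples of the D -> D edge count.

    The chain is run for burn_in steps, then one sample is kept every
    thin steps until n_samples have been collected.
    """
    rng = random.Random(seed)
    states = [rng.choice("ND") for _ in friends]  # arbitrary starting state
    counts = []
    for t in range(burn_in + n_samples * thin):
        new_states = []
        for s, named in zip(states, friends):
            k = min(sum(states[j] == "N" for j in named), len(p) - 1)
            if s == "N":
                new_states.append("D" if rng.random() < p[k] else "N")
            else:
                new_states.append("N" if rng.random() < q[k] else "D")
        states = new_states
        if t >= burn_in and (t - burn_in + 1) % thin == 0:
            counts.append(count_dd_edges(states, friends))
    return counts
```

Comparing the observed D → D count with the spread of the returned list is one way to produce the kind of model-versus-data comparison summarised in the figure caption.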

Dependence on degree
Figure, panel a: Out-degree for depressive symptom individuals.

Goodness-of-fit

Residual error calculation
For logistic regression, a standard approach to assessing goodness-of-fit is the Hosmer-Lemeshow (HL) test, which is based on the distribution of residual errors (Hosmer and Lemeshow, 2005), i.e. the differences between the observed and the modelled values. Our model is not a standard regression, and so we test goodness-of-fit in a similar manner to the HL test but with assumptions more appropriate for our model. In particular, we define a residual error function $E_A$ stratified by number of friends, where $Y^{A \to \bar{A}}_k$ is the observed number of state transitions from $A$ to the other state $\bar{A}$ among individuals with $k$ friends in state N, and $X^{A \to \bar{A}}_k(\theta)$ is the modelled number of such events given parameters $\theta$. The quantity $E_A$ is positive definite and will tend to zero for a model that perfectly captures the data.
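To make the idea of a stratified residual error concrete, the sketch below uses one simple choice with the stated properties (non-negative, and zero when model and data agree in every stratum): a sum of squared differences over $k$. This form is assumed purely for illustration and is not necessarily the function used in this analysis. The name `residual_error` is hypothetical.

```python
def residual_error(observed, modelled):
    """Sum over k of squared differences between observed and modelled counts.

    observed[k] : observed number of transitions out of state A among
                  individuals with k friends in state N
    modelled[k] : the corresponding number under the fitted model
    Assumption: a plain sum of squares, chosen only as an illustration.
    """
    if len(observed) != len(modelled):
        raise ValueError("observed and modelled must cover the same values of k")
    return sum((y - x) ** 2 for y, x in zip(observed, modelled))
```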

Simulations
The distribution of $E_A$ is not analytically available, and so we use a parametric bootstrap approach, simulating from the model once it has been fitted to observed data by maximum-likelihood estimation (MLE), giving the MLE parameter estimate $\hat{\theta}$.
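The final step of a parametric bootstrap of this kind can be sketched as below, assuming that the residual error has already been computed for the observed data and for many data sets simulated from the fitted model. The p value is taken to be the fraction of simulated errors at least as large as the observed one, which is a common convention and an assumption here. The function name is hypothetical.

```python
def bootstrap_p_value(observed_error, simulated_errors):
    """Fraction of simulated residual errors at least as large as the observed one.

    A large value means the observed error is typical of data generated
    by the fitted model; a small value indicates a poor fit.
    """
    if not simulated_errors:
        raise ValueError("at least one simulated error is required")
    exceed = sum(1 for e in simulated_errors if e >= observed_error)
    return exceed / len(simulated_errors)
```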
We performed simulations as detailed in §2 (Simulation methods) above, extracting the proportion of individuals who recover from or gain depressive symptoms within a year, dependent on the number of friends in different states they had at the initial time point. As for other bootstrap methods, this sampling process was repeated many times to obtain an accurate estimate of the distributions of $E_D$ and $E_N$. Note that in this case the simple residual error summary statistic does not have any asymptotic properties that suggest it should be used in model selection in the way that AIC is. Special attention should therefore not be paid to any particular threshold of p value; rather, a larger p value simply denotes a better fit. These results therefore support our broad conclusion that N-transmits should be preferred to no transmission.

Analysis of confounding

Setup
Our aim here is to state in mathematical language what is meant by transmission of mood, and how confounding is and is not possible. We will do this using pairwise model notation, writing $[A]$ for the number of nodes in state $A$ and $[A \to B]$ for the number of pairs in which an individual in state $A$ names an individual in state $B$, at a given time point that we will normally omit; formally,

$$[A] = \sum_i I\{X_i = A\}, \qquad [A \to B] = \sum_{i,j} G_{ij} \, I\{X_i = A\} \, I\{X_j = B\}.$$

We are going to consider how to calculate relevant quantities for both a transmission model and a model with homophily relating to some unobserved property $\xi$.
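The pairwise counts can be computed directly from the states and the lists of named friends. The sketch below interprets $[A \to B]$ as the number of directed friendship links from an individual in state $A$ to an individual in state $B$. The function names and the list-of-lists network representation are assumptions made for illustration.

```python
def node_count(states, a):
    """[A]: number of individuals in state a."""
    return sum(1 for s in states if s == a)


def pair_count(states, friends, a, b):
    """[A -> B]: number of links where an individual in state a names one in state b.

    friends[i] lists the individuals named by i, i.e. the nonzero
    entries of row i of the adjacency matrix G.
    """
    return sum(1 for i, named in enumerate(friends) if states[i] == a
               for j in named if states[j] == b)
```

For any configuration, the four pair counts sum to the total number of named-friend links, and $[N] + [D] = n$.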

Homophily model
Suppose that individuals have a property (or vector of properties), for example age, socio-economic status, or spatial location. We label these properties with $\xi$ and write $[\xi]$ for the number of nodes with property $\xi$, and so on.
Now consider a relatively general model in which the probability of changing state, for an individual in state $A$ with property $\xi$, is $\rho^A_\xi$. We can then write down equilibrium values for the expected number of pairs under the stationary distribution $\pi$. It is clear that by tuning the propensity of different property types to name each other as friends, together with the transition probabilities, arbitrary pair structures can be created. For the transitions, however, the equilibrium expressions do not depend on $k$. Overall, therefore, this model cannot be falsified from observations of numbers of pairs $[A \to B]$, but can be falsified from observations of transitions stratified by $k$, $Y^{A \to B}_k$.

Transmission model
In general, therefore, given the freedom to choose $p_k$, $q_k$ for a given network configuration, it is possible to tune the expected values of these transitions to whatever value is required. The probabilities assigned to different network configurations under the invariant distribution $\pi$ do not in general have an analytic closed-form solution. In the event where $p_k$ and $q_k$ do not depend on $k$, equations of the form (10) will hold, with every individual having the same property $\xi$.
In the event where the population has size $n$ and there are on average $m$ friends per individual, basic combinatorial considerations mean that there are only three independent parameters: $[N]$, $[N \to D]$ and $[D \to N]$. Now suppose that $p_k$ is monotone decreasing with $k$ and $q_k$ is monotone increasing with $k$; this will lead to fewer $[D \to N]$ pairs than equations of the form (10) would suggest, due to transmission of N.
It is, of course, possible to combine elements of the transmission and homophily models in various ways. We take the philosophical position that anything more complex than the homophily model above will constitute a mechanism for the phenomenon of social contagion rather than an alternative to it.

Parameter Identifiability
We now turn to the question of how accurately model parameters can be inferred from data. To do this, we performed simulations as in §2 (Simulation methods) above. The model that had generated each set of simulated data was then fitted to that data set using MLE.
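A simulate-and-refit check of this kind can be sketched in a much simplified setting. The code below assumes a free parameter $p_k$ for each $k$ and an equal, fixed number of at-risk individuals in each stratum, so that the MLE of each $p_k$ is simply the observed fraction of transitions. The parameter values, sample sizes and function names are hypothetical, and the model actually fitted may constrain $p_k$ and $q_k$ differently.

```python
import random


def simulate_counts(p_true, n_per_k, rng):
    """counts[k] = number of N -> D transitions among n_per_k individuals
    who each have k friends in state N, drawn with probability p_true[k]."""
    return [sum(1 for _ in range(n_per_k) if rng.random() < pk) for pk in p_true]


def fit_mle(counts, n_per_k):
    """Binomial maximum-likelihood estimate of each p_k: the observed fraction."""
    return [c / n_per_k for c in counts]


def refit_experiment(p_true, n_per_k, n_replicates, seed=0):
    """Simulate n_replicates data sets, refit each, and return the mean estimate per k."""
    rng = random.Random(seed)
    fits = [fit_mle(simulate_counts(p_true, n_per_k, rng), n_per_k)
            for _ in range(n_replicates)]
    return [sum(f[k] for f in fits) / n_replicates for k in range(len(p_true))]


# Hypothetical parameter values, chosen only to illustrate the procedure.
p_true = [0.2, 0.1, 0.05]
mean_estimates = refit_experiment(p_true, n_per_k=1000, n_replicates=100)
```

If the parameters are identifiable, the refitted values should scatter around `p_true`, and the spread of the individual fits indicates how accurately each parameter can be recovered from data of that size.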