Abstract
Understanding the nature of human contact patterns is crucial for predicting the impact of future pandemics and devising effective control measures. However, few studies provide a quantitative description of the aspects of social interactions that are most relevant to disease transmission. Here, we present the results from a detailed diary-based survey of casual (conversational) and close contact (physical) encounters made by a small peer group of 49 adults who recorded 8661 encounters with 3528 different individuals over 14 non-consecutive days. We find that the stability of interactions depends on the intimacy of contact and social context. Casual contact encounters mostly occur in the workplace and are predominantly irregular, while close contact encounters mostly occur at home or in social situations and tend to be more stable. Simulated epidemics of casual contact transmission involve a large number of non-repeated encounters, and the social network is well captured by a random mixing model. However, the stability of the social network should be taken into account for close contact infections. Our findings have implications for the modelling of human epidemics and planning pandemic control policies based on social distancing methods.
1. Introduction
The pattern of human interactions has important implications for the spread and management of infectious diseases (Hethcote & Yorke 1984; Garnett et al. 1996; Wallinga et al. 1999; Keeling & Eames 2005). The identification of ‘core groups’ of individuals with large numbers of interactions forms the basis of many sexually transmitted disease control policies (Hethcote & Yorke 1984; Macke & Maher 1999). Following the emergence of HIV/AIDS in the 1980s, many studies have attempted to quantify the structure of sexual interactions (Klovdahl et al. 1994; Liljeros et al. 2001; Wylie & Jolly 2001). Encounters between individuals that facilitate the transmission of airborne or close contact infections are harder to define and occur at greater frequency than sexual contacts (Eames & Keeling 2002). Social networks have been documented in the sociological literature, but are generally inappropriate for epidemiological purposes. Definitions of contacts that do not correlate closely with transmission opportunities, such as relationship-based definitions, inclusion of remote (letter, telephone or e-mail) interactions or the measurement of a particular subset of social contacts, render many such studies unsuitable from the epidemiological perspective (de Sola Pool & Kochen 1978; Bernard et al. 1990; Wasserman & Faust 1994; Dunbar & Spoors 1995; Beutels et al. 2006). As a consequence, there is relatively little available information about the patterns of human social interactions relevant to the transmission of many infectious diseases (Edmunds et al. 2006). Here, for the first time, we present the results of a detailed longitudinal survey that, although limited by its small sample size, was undertaken specifically to elucidate the structure of human social interactions that could permit the transmission of airborne and close contact infections.
2. The contact survey
A diary-based survey was conducted among a small convenience sample of 49 adult volunteers (see appendix A) over 14 non-consecutive days. Participants (egos) were instructed to record all face-to-face conversational encounters with other people (alters), whether each encounter included direct skin-to-skin physical contact and the social context of the encounter. As far as possible, alters were recorded by names or unique identifiers, allowing repeat encounters between the same ego–alter pair and encounters between a single alter and several different egos to be identified. A total of 8661 encounters involving 3528 different individuals were recorded. The interaction network recorded by the participants is illustrated in figure 1a.
Simple metrics of social interactions can be powerful determinants of epidemic progression and are central to many individual-based predictive models of human pandemics (Anderson & May 1991; Ferguson et al. 2003, 2006; Longini et al. 2005; Germann et al. 2006). The number of encounters an infectious individual makes with susceptibles, for example, sets an upper limit to the number of secondary cases (Wallinga et al. 1999), while high transitivity, or clustering, of contacts reduces the rate at which an infection can spread through a network (Watts & Strogatz 1998; Keeling 1999). As participants recorded two different levels of contact intimacy, we can compare the contact patterns that confront airborne diseases, requiring casual proximity of hosts to transmit, with infections that spread via closer contact between hosts. We assume that all encounters, whether including physical contact or not, permit casual contact transmission, whereas close contact transmission can only occur when an encounter includes physical contact. Self-reported contact diaries of this sort have been demonstrated to explain observed transmission patterns of an airborne infection (Wallinga et al. 2006).
Of the encounters reported, 14.6% included physical contact and 85.4% were conversational only. There is a marked difference between the daily frequency of these two contact types (figure 1b); individuals have approximately seven times more conversational contacts than physical contacts per day, and there is a much longer right-hand tail in the distribution of conversational contacts. Most encounters (57.56%; where the 95% binomial CI is 56.51–58.60) occur in the workplace and almost all of these (95.97%; 95.37–96.49) are conversational, while encounters involving physical contact mainly occur at home or in social contexts (figure 1c); the social network among colleagues has a high degree of clustering (see the electronic supplementary material).
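The binomial confidence intervals quoted above can be approximated with a short calculation. The text does not state which interval method was used, so the sketch below assumes a Wilson score interval; the count of 4985 workplace encounters is a hypothetical value back-calculated from 57.56% of 8661 and is not taken from the survey data.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half

# Hypothetical count: 4985 of 8661 encounters in the workplace (about 57.56%).
low, high = wilson_interval(4985, 8661)
print(f"{100 * low:.2f}% to {100 * high:.2f}%")
```

An exact (Clopper-Pearson) interval would be an equally reasonable choice; with a sample of this size the methods give very similar limits.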
Human social networks change over time: we typically do not meet exactly the same individuals every day. When considering disease transmission, it is not sufficient only to measure the number of contacts made by an infectious ego; it is also necessary to know how often each alter is encountered during the infectious period, i.e. how regular the interaction is. We find that the majority of encounters (76.70%; 75.26–78.07) occur with individuals never again encountered by the participant during the 14 days of the survey, irrespective of social context and/or intimacy of contact (figure 2, see the electronic supplementary material). Some of these non-repeated alters may have been encountered again if the survey had been extended. However, for an acute airborne infection with a short infectious period, these infrequent contacts may be considered as ‘one-offs’; this suggests that there is a strong random element to transmission routes for such infections (as represented by conventional mean-field models; see Anderson & May 1991). Repeated encounters between individuals are more likely at home or work than when socializing, shopping or travelling (see the electronic supplementary material). While it is surprising that home contacts include many infrequently encountered individuals, this can be explained by the type of housing reported by participants (figure 2d). The participants within large households (typically student accommodation within this study) have a greater proportion of irregular contacts, while the participants with family-type households have a more stable home contact structure. Pairs of individuals who have physical contact are far more likely to interact regularly within the home, but repeated physical contacts are uncommon among encounters made in a social context (figure 2c).
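To illustrate how contact regularity can be tallied from diary records, the following sketch counts the number of distinct survey days on which each ego–alter pair met. The record layout `(ego, alter, survey_day)` and the toy data are assumptions made for illustration. The function reports the fraction of pairs seen on only one day, a pair-level measure that is related to, but not the same as, the encounter-level percentage reported above.

```python
from collections import defaultdict

# Hypothetical diary records: (ego, alter, survey_day).
records = [
    ("ego1", "anna", 1), ("ego1", "anna", 2), ("ego1", "bob", 1),
    ("ego1", "carl", 3), ("ego2", "anna", 1), ("ego2", "dee", 2),
    ("ego2", "dee", 2),  # two encounters on the same day count as one day
]

def days_met(records):
    """Map each (ego, alter) pair to the set of distinct days on which they met."""
    days = defaultdict(set)
    for ego, alter, day in records:
        days[(ego, alter)].add(day)
    return days

def one_off_fraction(records):
    """Fraction of ego-alter pairs seen on exactly one survey day."""
    days = days_met(records)
    one_offs = sum(1 for d in days.values() if len(d) == 1)
    return one_offs / len(days)

print(one_off_fraction(records))
```

Keying on the ego–alter pair rather than on the alter alone keeps an alter who is named by several egos from being counted as a repeat contact.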
3. Modelling epidemics in social networks
We can use the social mixing data uncovered by this survey to help understand the spread of infectious diseases through human populations. We use the contact information in the survey to parametrize a weighted social network consisting of individuals with the same range of behavioural characteristics as those who completed the survey (see appendix A). Each individual in the network has the same social mixing characteristics as one of the survey participants: the same number and the same regularity of contacts. Each edge of the network has an associated weight representing the frequency of encounters between the two linked individuals; transmission across a link is proportional to the link weight. Having formed the network, we simulate a stochastic epidemic (Bartlett 1960; Anderson & May 1991; Keeling & Rohani 2007) spreading through the population (see appendix A).
To assess the influence of network effects and contact regularity, we compare the predictions from the detailed network model, consisting of fixed contacts of known weights, to several simplified alternatives. First, we retain the network but replace all the link weights with the average link weight. Second, we retain the variation in the weights of contacts but remove the network structure: each individual maintains his/her reported rate of encounters, but each is modelled as a random encounter within the whole population. Finally, we remove the variation in both contact weights and the network structure. We apply this for both conversational and physical interactions.
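The first simplification, retaining the network but replacing every link weight with the population average, can be sketched as follows. The edge representation (a dictionary from node pairs to weights) and the example weights are assumptions made for illustration.

```python
def average_weight_network(edges):
    """Return a copy of the network in which every link carries the mean weight."""
    mean_weight = sum(edges.values()) / len(edges)
    return {pair: mean_weight for pair in edges}

# Hypothetical weighted links: (node_a, node_b) -> days met during the survey.
weighted = {(0, 1): 1, (0, 2): 5, (1, 2): 1, (2, 3): 13}
unweighted = average_weight_network(weighted)
print(unweighted)
```

Because the total weight is preserved, the two networks offer the same overall amount of contact and differ only in how it is distributed across links.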
There is little observable difference between the epidemic models for infections that spread through casual (conversational) contacts (figure 3a); neither the placing of interactions on a network nor the variation in interaction weights significantly affects the average final size of the epidemic. Since the great majority of conversational contacts are only encountered once (figure 2a, see the electronic supplementary material), the impact of repeated encounters is low. By contrast, if the infection requires close (physical) contact to transmit, then the regularity of encounters and the weight of interactions are both important (figure 3b). In the case of physical interactions, there are fewer encounters and a higher proportion of contacts is encountered more than once (figure 2c), so these repeated interactions have a greater influence on an epidemic. The mean neighbourhood size (number of alters encountered by an ego) is 97.6 for casual contacts and 15.2 for close contacts (both excluding individuals with fewer than nine survey days), though not all ego–alter pairs meet with equal weight. In the simulations shown in figure 3, the basic reproductive rate, R0, for the infection in the unweighted mean-field model (Anderson & May 1991) reaches up to approximately 3 in the casual contact model and up to approximately 5 in the close contact model. The closer R0 is to the neighbourhood size, the more impact the network structure has on the progression and size of an epidemic (Riley & Ferguson 2006).
We can investigate the significance of encounters made within a particular context by simulating epidemics upon networks with the appropriate interaction context absent (see appendix A). We find that the work environment is the most important one for the spread of casual contact infections, whereas home and social settings are far more important for close contact transmission (figure 3c,d). This suggests that a control measure such as closing workplaces would effectively reduce the spread of casual contact infections but would have little impact on the spread of close contact infections. Our simulations suggest that restricting social gatherings would only have a significant impact for diseases that require close contact for transmission.
Our conclusions are unaltered by changing the initial infected seed size although, as expected, stochastic fade-outs are more likely when the seed size is small. The seed size also influences the variation between realizations (greatest when transmission rate is close to the critical value for an epidemic to take off, R0≈1), but the model assumption (mean field versus network; weighted versus unweighted) does not.
4. Discussion
Different diseases require different levels of contact between individuals to effect transmission (Beutels et al. 2006). Meningitis and smallpox, for example, are thought to normally require very close contact between individuals to cause infection, while influenza and measles are thought to transmit more easily via airborne droplets and therefore may only require conversational proximity between individuals to transmit. The study presented here cannot inform upon all potentially important routes of transmission, such as indirect fomite transmission from shared objects (e.g. contaminated door handles) or exposure that does not involve conversation or touch (e.g. a sneezing bus passenger). Self-reported conversational and physical contact diaries provide a useful way to collect data that, while not perfect, give a straightforward method of classifying the intensity of encounters. We find striking differences between the structural properties of the potential transmission networks for these two contact types. Casual contacts are more numerous and less regular than close contacts. A large proportion of encounters occur at work, and encounters between colleagues are predominantly casual contacts. The potential for transmission in the workplace, however, is curtailed by a high degree of clustering that reduces the transmission potential at the population scale (see the electronic supplementary material). Close contacts generally occur within social or home contexts, where clustering may well be even greater than among colleagues, especially among members of the same household.
We do not find very long tails on the distributions of daily contacts (figure 1b), suggesting that the distribution of social contacts does not follow a power law, as has been claimed for sexual contacts (Liljeros et al. 2001). The survey population does not contain any individual who met more than 61 individuals in a single day; only 5% of day reports returned more than 28 alters. The observed distribution may be biased by sampling error and the limited demographic scope of the survey. It is worth bearing in mind that the participants, staff and students, were all drawn from the same university environment and the resulting social network structure may not be representative of different peer groups, such as those from other or differently sized institutions or workplaces. It would, therefore, be valuable to study the contact patterns of individuals from a wide range of occupations, particularly individuals, such as service workers, whose activities necessarily involve making many unique contacts during a day, and who could contribute to ‘super-spreading’ phenomena. However, as the number of contacts per day increases, handling time per contact, and therefore transmission probability, is expected to decrease (Dunbar & Spoors 1995; Feld & Carter 2002). Therefore, individuals with many social contacts may not prove to be as important in the transmission of casual or close contact infection as their number of contacts may suggest. In our simulations, we assume that all encounters present an equal opportunity for disease transmission. In reality, not all encounters are equal. Experience suggests that each encounter between individuals who meet frequently is likely to be of a closer intimacy and of a longer duration than ‘one-off’ encounters. Further studies are required to explore this aspect of social interactions and its impact on disease transmission.
The small size of the survey reported here limits the generality of our findings; a larger survey with a more representative study population would help to better understand population behaviour. Ideally, we would like to know everything about social mixing patterns: who interacts with whom; how often; for how long; and how this relates to transmission risk. However, a full understanding of dynamic social networks requires a vast amount of information about each study participant and their interactions. Self-reported contact diaries such as those presented here already burden participants with a considerable amount of work. More complicated, detailed surveys may well have to rely on automated data collection methods rather than participant recall, which in turn may limit the size and demographic breadth of possible studies.
Increasing concerns about an influenza pandemic, and the perceived threat of bio-terrorism using infectious pathogens, have prompted many researchers to develop predictive models of national-scale epidemics (Ferguson et al. 2003, 2006; Eubank et al. 2004; Longini et al. 2005; Germann et al. 2006). Such models demand a detailed understanding of host mixing patterns and movement of individuals, and necessarily make a variety of behavioural assumptions. Interactions between individuals, however, are typically assumed to follow simple mass action principles, essentially random encounters, albeit restricted to a sub-population determined by social context. There is, therefore, a need to determine whether such mixing approximations are valid for the types of infection modelled. The work described here provides a detailed description of the number of daily contacts, their social context or location and the intimacy of the contact. Most encounters happen at work or home, and the networks in each location have different properties including the number and stability of interactions. We find that frequent repeated interactions are important for some infections, suggesting that control policies using social distancing measures should be tailored to the particular transmission mode of infection. For instance, the low level of physical contact within the workplace, if replicated more widely, suggests that there is little to be gained by closing office-type workplaces for diseases that spread through close contact. Predictive models of pandemic infections would be improved if they could incorporate this information.
A.1 Contact survey
Participants were asked to record all interactions with other people on 14 non-consecutive sample days, occurring every 10 days between 14 October 1997 and 7 March 1998, with one interval of 24 days between 13 December 1997 and 6 January 1998. A day was defined as lasting from waking until going to sleep. Encounters were defined as any face-to-face conversation or skin-to-skin physical contact (such as a handshake or kiss). The participants were asked to record encounters as either conversational or physical, and also to record the social context or location of the encounter, choosing from the following classes: ‘Home’, ‘Work or College’ (termed Work in our analyses), ‘Shopping’, ‘Travel’, ‘Social’ and ‘Other’ (amalgamated into Social in our analyses). The participants were also instructed to attribute a unique identifier (a name or a description) to each individual encountered and to use that identifier every time they recorded an encounter with the same individual. Completed questionnaire forms were collected the day following each sample day (or the Monday following a weekend sample day). Informed consent was obtained from all participants. There were 27 male and 22 female participants, and all participants were staff or students at the University of Warwick during the period of the survey. The identities of individuals, both participants and their contacts, were anonymized prior to analysis. Survey data are available upon request to W.J.E.
A.2 Analysis of repeat encounters
Not all 49 participants completed the questionnaire for the full 14 days (see the electronic supplementary material), which makes combining information on repeat contacts problematic. Although most participants reported contacts for 13 or more days, this heterogeneity in reporting could bias the analysis if handled without care. To take account of this, we compute a binomial likelihood distribution for the observed number of repeat encounters between each pair of individuals. Thus, where participant i records for n days, during which they report encountering individual j on m different days, we can compute a likelihood probability distribution
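The expression for this distribution is not reproduced in the text above. As an illustrative sketch only, and not the authors' stated formula, suppose that each recorded day is an independent trial in which i meets j with a fixed daily probability p. The binomial likelihood of observing m meeting days out of n can then be evaluated on a grid of p values and normalized. The grid resolution and the example values m = 3, n = 13 are arbitrary choices.

```python
from math import comb

def binomial_likelihood(m, n, p):
    """Probability of m meeting days in n recorded days, given daily probability p."""
    return comb(n, m) * p**m * (1 - p) ** (n - m)

def normalised_likelihood(m, n, grid_size=101):
    """Evaluate the likelihood on an evenly spaced grid of p and normalise to sum to 1."""
    grid = [k / (grid_size - 1) for k in range(grid_size)]
    like = [binomial_likelihood(m, n, p) for p in grid]
    total = sum(like)
    return grid, [v / total for v in like]

# Example: a participant who recorded 13 days and met a given alter on 3 of them.
grid, dist = normalised_likelihood(m=3, n=13)
best_p = grid[dist.index(max(dist))]
print(best_p)
```

Under this assumption, participants with different numbers of recorded days can be compared on the common scale of p rather than by raw counts of meeting days.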
A.3 Network formation
Including survey information on the frequency of encounters between ego–alter pairs enables us to develop weighted network models: each link has an associated weight that represents the frequency of contact, and modifies the rate of transmission between individuals (see below). Weighted networks containing individuals with the appropriate interaction characteristics on which to simulate epidemics are formed as follows: the population consists of a number, N, of copies of the survey participants. For each participant, we know the number of contacts met once during the survey, the number met twice and so on, up to a maximum of 14 encounters. Only survey participants with records for 9 or more days were used, excluding two individuals. For each individual, each of their contacts is treated as an unconnected ‘stub’, with an associated weight given by the frequency of meeting. Network formation requires the joining up of these stubs: this is achieved by choosing, for each stub, a randomly selected stub of the same weight from elsewhere in the population and joining the individuals together if no prior link exists between them. Thus, two individuals can only form a contact if they each have at least one interaction of the same weight. This method allows networks to be formed using the data from all contact events, or from contact events that only include physical contact, or contact events restricted to a particular setting or a set of settings, such as all contacts except work. Unweighted networks are formed using this method, as for fully weighted networks, but once the network is generated all weights are set to the average weight of links in the population.
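The stub-joining procedure can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation. Each participant profile is assumed to be a dictionary mapping a weight (number of days met) to the number of contacts with that weight. Stubs of equal weight are shuffled and paired off, and pairs that would create a self-link or duplicate an existing link are simply discarded rather than redrawn. The function name, profile format and example profiles are all hypothetical.

```python
import random
from collections import defaultdict

def build_weighted_network(profiles, copies, rng):
    """Join equal-weight stubs at random; returns {(node_a, node_b): weight}."""
    # One node per copy of each participant profile.
    nodes = [profile for profile in profiles for _ in range(copies)]
    # Collect stubs (node indices) grouped by link weight.
    stubs = defaultdict(list)
    for node, profile in enumerate(nodes):
        for weight, count in profile.items():
            stubs[weight].extend([node] * count)
    edges = {}
    for weight, pool in stubs.items():
        rng.shuffle(pool)
        # Pair consecutive stubs; skip self-links and already linked pairs.
        for a, b in zip(pool[0::2], pool[1::2]):
            key = (min(a, b), max(a, b))
            if a != b and key not in edges:
                edges[key] = weight
    return edges

# Hypothetical profiles: weight -> number of contacts met on that many days.
profiles = [{1: 3, 2: 1}, {1: 2, 2: 1}]
edges = build_weighted_network(profiles, copies=10, rng=random.Random(0))
print(len(edges), "links formed")
```

Discarding rejected pairs leaves a few stubs unmatched, so realized degrees can fall slightly below the reported values; redrawing a partner for each rejected stub would follow the description in the text more closely.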
A.4 Epidemic models
We simulate epidemics upon networks generated as described previously. Once generated, the network is fixed; dynamic social interactions are represented by weighted transmission across edges: the rate at which transmission takes place between an ego–alter pair is proportional to the frequency with which they interact. To explore the importance of repeated encounters and heterogeneity in contact frequency, we also make three alternative assumptions about how individuals interact. We consider an infection that generates lifelong immunity on recovery, so individuals can be susceptible, infected or recovered (and immune). All epidemics are seeded by infecting 1% of hosts selected at random, with the remainder being susceptible (simulations with a seed size of 0.1% of the population, five initial individuals, give the same qualitative results). Epidemics are iterated stochastically and updated after every time interval T. At each updating, each susceptible may become infected, and each infected individual can recover with probability 1−exp(−gT) to represent exponentially distributed infectious periods with recovery rate g. The probability of susceptible individual j becoming infected is given by 1−exp(−τFT), where τ is the transmission parameter and F, which depends on the model used, is a measure of the amount of interaction with infected individuals. For the models considered, F is given by
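The expressions for F are not reproduced in the text above. As a minimal sketch of the update rule for the weighted network model only, the code below assumes that F for a susceptible individual is the summed weight of links to currently infected neighbours; this choice of F, the function names and the ring-shaped example network are illustrative assumptions rather than the definitions used in the paper.

```python
import math
import random

def step(state, neighbours, tau, g, T, rng):
    """One synchronous update. state: node -> 'S'/'I'/'R'; neighbours: node -> {nbr: weight}."""
    p_recover = 1 - math.exp(-g * T)
    new_state = dict(state)
    for node, status in state.items():
        if status == "S":
            # Assumed F: summed weight of links to currently infected neighbours.
            F = sum(w for nbr, w in neighbours[node].items() if state[nbr] == "I")
            if rng.random() < 1 - math.exp(-tau * F * T):
                new_state[node] = "I"
        elif status == "I" and rng.random() < p_recover:
            new_state[node] = "R"
    return new_state

def run_epidemic(neighbours, tau, g, T, seed_fraction, rng, max_steps=10_000):
    """Seed a random fraction of nodes and iterate until no infection remains."""
    nodes = list(neighbours)
    state = {n: "S" for n in nodes}
    for n in rng.sample(nodes, max(1, round(seed_fraction * len(nodes)))):
        state[n] = "I"
    for _ in range(max_steps):
        if "I" not in state.values():
            break
        state = step(state, neighbours, tau, g, T, rng)
    # Final size: fraction of the population ever infected.
    return sum(1 for s in state.values() if s != "S") / len(nodes)

# Hypothetical example: 100 individuals on a ring, every link with weight 1.
ring = {i: {(i - 1) % 100: 1.0, (i + 1) % 100: 1.0} for i in range(100)}
final_size = run_epidemic(ring, tau=0.5, g=0.2, T=0.1, seed_fraction=0.01,
                          rng=random.Random(1))
print(final_size)
```

Averaging the final size over many realizations and a range of τ values would give curves comparable in kind to those in figure 3, and other definitions of F can be substituted to represent the alternative model assumptions.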
The authors would like to thank the participants of the survey for providing their time and information, and the volunteers who collected the data, in particular Chris Bauch, Ben Cooper, Joel Mossong and James Nokes. Also, thanks go to Roger Bowers, Nina Fefferman, Mark Handcock, Matt Keeling and Martina Morris for their helpful advice and comments, and we are grateful to two anonymous referees for their thoughtful comments. This work was supported by the National Institutes of Health (J.M.R.), EPSRC and Emmanuel College, Cambridge (K.T.D.E.).
Footnotes
Electronic supplementary material is available at http://dx.doi.org/10.1098/rsif.2008.0013 or via http://journals.royalsociety.org.