Simon Cauchemez; Peter Horby; Annette Fox; Le Quynh Mai; Le Thi Thanh; Pham Quang Thai; Le Nguyen Minh Hoa; Nguyen Tran Hien; Neil M. Ferguson

doi:10.1371/journal.ppat.1003061

Abstract

Serological studies are the gold standard method to estimate influenza infection attack rates (ARs) in human populations. In a common protocol, blood samples are collected before and after the epidemic in a cohort of individuals, and a rise in haemagglutination-inhibition (HI) antibody titers during the epidemic is considered a marker of infection. Because of inherent measurement errors, a 2-fold rise is usually considered insufficient evidence for infection and seroconversion is therefore typically defined as a 4-fold rise or more. Here, we revisit this widely accepted 70-year-old criterion. We develop a Markov chain Monte Carlo data augmentation model to quantify measurement errors and reconstruct the distribution of latent true serological status in a Vietnamese 3-year serological cohort, in which replicate measurements were available. We estimate that the 1-sided probability of a 2-fold error is 9.3% (95% Credible Interval, CI: 3.3%, 17.6%) when antibody titer is below 10 but is 20.2% (95% CI: 15.9%, 24.0%) otherwise. After correction for measurement errors, we find that the proportion of individuals with 2-fold rises in antibody titers was too large to be explained by measurement errors alone. Estimates of ARs vary greatly depending on whether those individuals are included in the definition of the infected population. A simulation study shows that our method is unbiased. The 4-fold rise case definition is relevant when aiming at a specific diagnostic for individual cases, but the justification is less obvious when the objective is to estimate ARs. In particular, it may lead to large underestimates of ARs. Determining which biological phenomenon contributes most to 2-fold rises in antibody titers is essential to assess bias with the traditional case definition and offer improved estimates of influenza ARs.

Author Summary

Each year, seasonal influenza is responsible for about three to five million severe illnesses and about 250,000 to 500,000 deaths worldwide. In order to assess the burden of disease and guide control policies, it is important to quantify the proportion of people infected by an influenza virus each year. Since infection usually leaves a “signature” in the blood of infected individuals (namely a rise in antibodies), a standard protocol consists of collecting blood samples in a cohort of subjects and determining the proportion of those who experienced such a rise. However, because of inherent measurement errors, only large rises are accounted for in the standard 4-fold rise case definition. Here, we revisit this 70-year-old, widely accepted and widely applied criterion. We present innovative statistical techniques to better capture the impact of measurement errors and improve our interpretation of the data. Our analysis suggests that the number of people infected by an influenza virus each year might be substantially larger than previously thought, with important implications for our understanding of the transmission and evolution of influenza – and the nature of infection.

Citation: Cauchemez S, Horby P, Fox A, Mai LQ, Thanh LT, et al. (2012) Influenza Infection Rates, Measurement Errors and the Interpretation of Paired Serology. PLoS Pathog 8(12): e1003061. doi:10.1371/journal.ppat.1003061

Editor: Ron A. M. Fouchier, Erasmus Medical Center, Netherlands

Received: August 9, 2012; Accepted: October 14, 2012; Published: December 13, 2012

Copyright: © 2012 Cauchemez et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by research grants from the Wellcome Trust (grants 081613/Z/06/Z and 077078/Z/05/Z), the NIH MIDAS program, EU FP7 EMPERIE and PREDEMICS projects and the MRC. SC also thanks Research Council UK. SC received consulting fees from Sanofi Pasteur MSD for a project on the modelling of varicella zoster virus transmission. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: SC received consulting fees from Sanofi Pasteur MSD for a project on the modelling of the transmission of varicella zoster virus (i.e. different subject than submission). This does not alter our adherence to all PLOS Pathogens policies on sharing data and materials.

Introduction

Each year, seasonal influenza is responsible for about three to five million severe illnesses and about 250,000 to 500,000 deaths worldwide [1]. These epidemics can generate substantial economic losses due to high levels of worker absenteeism as well as a saturation of emergency services at the peak of the epidemic [1]. In addition, avian or swine influenza viruses occasionally adapt to humans and generate influenza pandemics, as in 1918, 1957, 1968 and 2009, sometimes with catastrophic consequences, as in 1918, when 20 to 50 million people died worldwide.

Appropriate assessment of the epidemiological characteristics of the influenza virus is important to guide control policies. In particular, this requires being able to track the number of influenza cases with severe clinical outcomes (i.e. the tip of the severity pyramid) as well as the total number of people infected by an influenza virus (i.e. the base of the severity pyramid). For example, the case fatality ratio (proportion of influenza cases who die) is a key measure of severity that informs decision making during influenza pandemics, and takes the number of influenza-related deaths as numerator and the number of influenza cases as denominator. Estimates of infection attack rates are also essential for characterizing the spread of the virus in human populations in order to predict epidemic trajectory, the potential impact of control measures such as social distancing measures, and the likelihood and magnitude of subsequent epidemics arising from continued circulation of the same virus [2], [3].

Although it is usually possible to estimate the number of severe influenza cases from sentinel surveillance (e.g. based on data collected at medical practices, clinics or hospitals), it is much harder to estimate the total number of people infected by an influenza virus. First, a substantial proportion of influenza infections are asymptomatic [4], [5]. Second, among those with symptoms, only a proportion seek healthcare; and this proportion may vary from season to season or even during the course of an epidemic. Last, Influenza-Like-Illness (ILI) symptoms are not specific to influenza. So, a substantial proportion of patients consulting for ILI may not have been infected by an influenza virus.

Serological studies have become the gold standard approach for estimating influenza infection attack rates due to the difficulty of estimating infection rates by other means. Although cross-sectional serological surveys can provide valuable and timely information, collecting paired blood samples before and after an epidemic in a cohort of individuals is the optimal approach for precisely assessing infection rates. The haemagglutination-inhibition (HI) assay remains the most commonly used approach for detecting serological evidence of recent influenza infection [6]–[12]. The assay detects the presence of antibodies that prevent the haemagglutinin protein of the influenza virus from agglutinating red blood cells [13], [14]. For each serum sample, antibody titers are expressed as the reciprocal of the highest serum dilution that can still prevent a fixed concentration of virus from agglutinating red blood cells. A rise in antibody titers between the first and second blood samples is taken as a marker of infection. However, because the procedure is susceptible to measurement errors, a 2-fold rise (that is, a 1-dilution increase) is usually considered insufficient evidence for infection. Seroconversion is therefore typically defined as a 4-fold rise (i.e. a 2-dilution increase) or more in antibody titers. This ad hoc rule became established when these methods were first developed and is now widely adopted [15], [16]. In the meantime, however, statistical methods for addressing measurement errors have made substantial progress. In particular, there is now an extensive body of literature on methods to ensure that the presence of measurement errors does not bias estimates of key parameters of interest. Given these developments, it is timely to revisit the way serological data are interpreted.
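The dilution arithmetic behind these definitions can be illustrated with a minimal Python sketch. It converts paired titers to dilution steps on a log2 scale and classifies each pair. The coding of undetectable titers (below 10) as 5, the function names and the example pairs are assumptions made purely for illustration; they are not taken from the study.

```python
import math

def titer_to_step(titer: float) -> int:
    """Convert an HI titer to a dilution step on the log2 scale.

    Assumed coding (illustrative): an undetectable titer (<10) is recorded
    as 5 and maps to step 0, a titer of 10 to step 1, 20 to step 2, etc.
    """
    return round(math.log2(titer / 5))

def classify(pre: float, post: float) -> str:
    """Classify a (baseline, post-epidemic) titer pair by dilution change."""
    change = titer_to_step(post) - titer_to_step(pre)
    if change >= 2:
        return "4-fold rise or more"
    if change == 1:
        return "2-fold rise"
    if change == 0:
        return "no change"
    return "decay"

# Hypothetical paired titers (baseline, post-epidemic); not study data.
pairs = [(5, 5), (5, 10), (10, 40), (20, 20), (40, 20), (20, 160)]
labels = [classify(pre, post) for pre, post in pairs]

# Proportion meeting the traditional definition (4-fold rise or more).
ar_4fold = sum(l == "4-fold rise or more" for l in labels) / len(labels)
# Proportion with any rise (2-fold rise or more).
ar_2fold = sum(l in ("2-fold rise", "4-fold rise or more") for l in labels) / len(labels)
```

In this toy example, the proportion with a rise depends on whether the single 2-fold rise is counted, which is the sensitivity of the case definition at issue here.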

Central to the traditional approach to analyzing serological data is the belief that data about 2-fold rises provide no information since such increases can be caused by frequent measurement errors. This concern about measurement errors is certainly relevant when trying to make specific diagnoses for individual cases. For example, one may be averse to the risk of false positives; but less so to the risk of false negatives. However, estimating infection attack rates at the population level is a very different aim from setting up a specific diagnostic tool, and may benefit from a different use of the data.

First, it is important to note that estimating infection attack rates is not just a matter of specificity (i.e. ensuring that subjects satisfying the diagnostic definition of infection were indeed infected by an influenza virus) but also a matter of sensitivity (i.e. ensuring that all subjects infected are diagnosed as such). An approach that favours specificity over sensitivity may lead to underestimating infection attack rates.

A second important observation is that, even in a context of frequent 2-fold errors, data about 2-fold rises may still be informative. Consider for example a situation where all individuals exhibit a 2-fold rise during the season: such a pattern cannot be explained by measurement error alone since measurement errors are made both at baseline and post-epidemic and should be about equally distributed provided the sample size is sufficiently large.
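This symmetry argument can be written out in a short derivation. The notation is introduced here for illustration and is not taken from the paper's model; the sketch assumes only that the measurement errors at the two time points are independent and identically distributed on the log2 (dilution) scale.

```latex
% Illustrative notation (assumed, not the paper's):
%   T_1, T_2 : true log2 titers at baseline and post-epidemic
%   \varepsilon_1, \varepsilon_2 : i.i.d. measurement errors (in dilutions)
%   D : observed change in titer (in dilutions)
\begin{align*}
D &= (T_2 + \varepsilon_2) - (T_1 + \varepsilon_1), \\
D &= \varepsilon_2 - \varepsilon_1 \quad \text{when } T_2 = T_1, \\
\Pr(D = +1 \mid T_2 = T_1) &= \Pr(D = -1 \mid T_2 = T_1) \le \tfrac{1}{2}.
\end{align*}
```

Because the difference of two independent, identically distributed errors is symmetric about zero, apparent 1-dilution rises produced by error alone should be matched by about as many apparent 1-dilution decays. An excess of rises over decays therefore carries information about true changes in titer.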

Here, we explore how modern statistics for the analysis of data with measurement errors can change and improve our interpretation of serology. We present a new method to quantify errors in the measurement of antibody titers and to estimate the true distribution of paired serological measurements corrected for measurement errors. The methodology is applied to data collected in a cohort study conducted in Vietnam between 2007 and 2009.

Results

Measurement errors

We estimate that the 1-sided probability of a 2-fold error was 9.3% (95% CI: 3.3%, 17.6%) when the true antibody titer was below detection levels, rising to 20.2% (95% CI: 15.9%, 24.0%) otherwise (posterior probability that the latter is larger than the former: 98.7%). The model fitted the replicate measurement data satisfactorily (Figure 1). The model where measurement errors were independent of true antibody titers failed to fit the data (Figure S2 and Supplementary Material).
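To give a sense of scale for these estimates, the Python sketch below computes how often an individual whose true titer did not change would appear to show a rise or a decay through measurement error alone. It assumes, for illustration only, that errors at baseline and post-epidemic are independent, are at most one dilution in size, and occur with the same 1-sided probability in each direction. The paper's full model is not reproduced here, and the function names are hypothetical.

```python
from itertools import product

def error_pmf(p: float) -> dict:
    """Simplified error model (assumption): -1, 0 or +1 dilution,
    with 1-sided probability p in each direction."""
    return {-1: p, 0: 1 - 2 * p, +1: p}

def apparent_change_pmf(p: float) -> dict:
    """Distribution of the observed change in dilutions (post minus
    baseline) when the true titer is unchanged and the two measurement
    errors are independent."""
    pmf = error_pmf(p)
    out = {}
    for (e_pre, q_pre), (e_post, q_post) in product(pmf.items(), repeat=2):
        d = e_post - e_pre
        out[d] = out.get(d, 0.0) + q_pre * q_post
    return out

# Point estimate reported above for detectable true titers.
p = 0.202
change = apparent_change_pmf(p)

for d in sorted(change):
    print(f"apparent change of {d:+d} dilution(s): {change[d]:.3f}")
```

Under these simplifying assumptions, a 1-sided error probability of 20.2% gives an apparent 2-fold rise in roughly 24% of unchanged individuals and an apparent 4-fold rise in about 4%, with apparent decays equally frequent.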

Figure 1. Fit of the model to data on replicate measurements.

Observed (red point) and expected (mean: blue point/95% CI: blue bar) number of pairs (observed antibody titer (AT) level, replicate AT level). Pairs are sorted by panel according to the difference, in number of dilutions, between the observed and the replicate measurement.

doi:10.1371/journal.ppat.1003061.g001

Distribution of true paired serology

Figure 2 summarizes the distribution of paired serology, corrected for measurement errors for the different seasons (2008, Spring 2009, Autumn 2009) and subtypes (H1N1, H3N2 and B). A range of observations can be made.

Figure 2. Distribution of paired serology, corrected for measurement errors as a function of season (2008, Spring 2009, Autumn 2009) and subtype (H1N1, H3N2 and B) (in Autumn 2009, subtyping was only conducted for H1N1pdm09).

In each panel, individuals are sorted by baseline AT levels on the y-axis. For a given baseline, the grey bar indicates the expected proportion of individuals with post AT level equal to baseline AT level; the yellow bar indicates the proportion with a 2 fold rise (2f.r.); the red bar indicates the proportion with a 4 fold rise or more (4f.r.+); the green bar indicates the proportion with a decay. The black thin lines give the 95% CI. The legend gives the mean [95% CI]. A: H1N1, 2008. B: H3N2, 2008. C: B, 2008. D: H1N1, Spring 2009. E: H3N2, Spring 2009. F: B, Spring 2009. G: H1N1pdm09, Autumn 2009.

doi:10.1371/journal.ppat.1003061.g002

The first observation concerns 2-fold rises in antibody titers between baseline and post serology (yellow bars). Such increases are usually ignored in analyses because 2-fold errors are common. In some instances, for example subtypes H3N2 and B in 2008 and H1N1pdm09 in Autumn 2009, 2-fold rises appeared negligible and at levels that could be generated by measurement errors alone, since 0 was within the 95% CI of the estimated proportion of subjects having a 2-fold rise (Figures 2B, 2C, 2G). In other instances, however, the proportion of individuals experiencing a 2-fold rise ranged from 20% to 33% with lower bounds of the 95% CIs above 0 (range: 7%–23%), indicating that these rises cannot be solely explained by measurement errors. Assuming that most of these 2-fold rises were due to infection, our estimates of infection attack rates for H1N1 in 2008 and H1N1, H3N2 and B in Spring 2009 would be dramatically higher than the traditional estimates based on 4-fold rises or more (Figure 3A). So, even if only a proportion of the 2-fold rises were due to influenza infections, the traditional estimate might still represent a substantial underestimate of the true infection attack rates.

Figure 3. Increases in antibody titers.

A: Posterior distribution of the percentage of subjects with a 4 fold rise or more in AT (pink) and with a 2 fold rise or more in AT (blue) for the different subtypes and the different seasons (2008 (08), Spring 2009 (S09), Autumn 2009 (A09)). B: Posterior distribution of the percentage of subjects with a 2 fold rise in AT among those with a rise in AT. Boxplots give percentiles 2.5%, 25%, 50%, 75%, 97.5% of the distribution.

doi:10.1371/journal.ppat.1003061.g003

The fact that the proportion of subjects with a 4-fold rise or more and the proportion with a 2-fold rise or more were very similar for H3N2 and B in 2008 and virtually identical for H1N1pdm09 in Autumn 2009 (Figure 3A) highlights important heterogeneities in the way antibody titers increase by season/subtype (Figure 3B). For example, for H1N1pdm09 in Autumn 2009, almost all those experiencing a rise in antibody titers exhibited a 4-fold rise or more; but for H1N1 in 2008, most of those experiencing a rise only had a 2-fold increase. The absence of a simple linear relationship between the proportion of 4-fold rises or more and the proportion of 2-fold rises suggests that the standard approach of inflating the former by a fixed proportion (generally equal to the proportion of PCR positive cases who do not seroconvert; around 10–20%) to get corrected estimates of infection attack rates may be inappropriate. Rather, corrections might have to be applied on a season-to-season and subtype-to-subtype basis.

The last notable observation is that antibody titers decayed in some individuals. For example, 30% (95% CI: 22%, 36%) of individuals exhibited a decay for subtype H3N2 in 2008.

PCR positive cases

Figure 4 shows the observed rise in antibody titers for PCR positive cases. Twenty seven percent of these cases experienced no rise or only a 2-fold rise in titer during the sea