 Research article
 Open Access
The usefulness of “corrected” body mass index vs. self-reported body mass index: comparing the population distributions, sensitivity, specificity, and predictive utility of three correction equations using Canadian population-based data
BMC Public Health volume 14, Article number: 430 (2014)
Abstract
Background
National data on body mass index (BMI), computed from self-reported height and weight, is readily available for many populations including the Canadian population. Because self-reported weight is found to be systematically under-reported, it has been proposed that the bias in self-reported BMI can be corrected using equations derived from data sets which include both self-reported and measured height and weight. Such correction equations have been developed and adopted. We aim to evaluate the usefulness (i.e., distributional similarity; sensitivity and specificity; and predictive utility vis-à-vis disease outcomes) of existing and new correction equations in population-based research.
Methods
The Canadian Community Health Surveys from 2005 and 2008 include both measured and self-reported values of height and weight, which allows for construction and evaluation of correction equations. We focused on adults aged 18–65, and compared three correction equations (two correcting weight only, and one correcting BMI) against self-reported and measured BMI. We first compared population distributions of BMI. Second, we compared the sensitivity and specificity of self-reported BMI and corrected BMI against measured BMI. Third, we compared the self-reported and corrected BMI in terms of association with health outcomes using logistic regression.
Results
All corrections outperformed self-report when estimating the full BMI distribution; the weight-only correction outperformed the BMI-only correction for females in the 23–28 kg/m² BMI range. In terms of sensitivity/specificity, when estimating obesity prevalence, corrected values of BMI (from any equation) were superior to self-report. In terms of modelling BMI-disease outcome associations, findings were mixed, with no correction proving consistently superior to self-report.
Conclusions
If researchers are interested in modelling the full population distribution of BMI, or estimating the prevalence of obesity in a population, then a correction of any kind included in this study is recommended. If the researcher is interested in using BMI as a predictor variable for modelling disease, then both self-reported and corrected BMI result in biased estimates of association.
Background
Obesity’s rise in prevalence over the past 30 years [1], coupled with knowledge of its public health burden [2–6], has opened debate over the best way to measure adiposity in populations. The body mass index (BMI; weight (kg)/height² (m²)) is a common measure in population-based surveys: it is relatively inexpensive, simple, and non-intrusive. Notwithstanding the limitations of BMI as a measure of adiposity [7–10], it continues to be recommended by the World Health Organization [10] as the appropriate criterion to assess obesity status in populations due to the high correlation of high BMI with excess body fat and poor health outcomes [1].
The practice of gathering self-reported height and weight data from survey respondents has raised concerns about the inaccuracy of self-reported data. Where comparison to measured weight is possible, studies have demonstrated misreporting (both under- and over-reporting) of weight, which varies by sex, age, race/ethnicity, and BMI [11–19]. The consequence of this misreporting is that the potential exists for large bias in estimates of prevalence and measures of association in studies that use self-reported BMI [7].
By using population-based datasets which contain both measured and self-reported values of BMI for the same individuals, it is possible to develop statistical adjustments that bring self-reported values of BMI closer to measured values. These correction methods can then be applied to datasets that only contain self-reported height and weight, resulting in a corrected height and weight and thus improving the usability of those datasets. Attempts to develop correction methods for use with Canadian data have resulted in two important papers. Connor Gorber et al. [20] used data from the 2005 Canadian Community Health Survey (CCHS) and, after considering many potential covariates, concluded that the most parsimonious and effective correction equations for men and women rely on one variable: self-reported BMI. Shields et al. [21] used data from the 2007–2009 Canadian Health Measures Survey to develop a correction equation, which was then used to correct self-reported BMI measures from the 2008 CCHS. This second paper likewise concluded that a correction equation using only self-reported BMI is appropriate for correcting BMI values. In both studies, additional covariates did not add enough predictive power to the models to justify the added complexity of including them in the correction equations.
The practice of correcting based on BMI alone (i.e., no other covariates) is appealing in its simplicity. Other authors have included covariates in an effort to increase the accuracy of their correction equations, including age [22], leisure-time physical activity, self-reported health [14], education level [23], and ethnicity [24], among others. These studies find that including additional covariates in a correction equation can increase the correction equation’s accuracy when adjusting BMI or categorizing individuals by obesity status (e.g., BMI > 30). In this paper we did not include additional covariates, in order to maximize comparability with the existing Canadian correction equations.
A separate issue is whether corrections should be based on BMI, or weight alone. Studies from the United States [11, 13], as well as from Sweden [14] and France [15], verify that average misreporting increases as measured weight and/or BMI increases, suggesting that correcting on either BMI or weight may be acceptable. Correcting on weight only raises the question of whether or how to deal with height. It is well-documented that height is subject to bias in the elderly, i.e., those over age 60 [11, 22], and that bias in height has been shown to be substantial [25, 26]. Furthermore, international evidence shows that height bias also exists among those under age 60, is strongest for the shortest males and for females, and that bias might be changing over time [23, 24, 27]. Correction equations developed by other authors have directly corrected for either BMI as a whole [14, 20, 21, 23, 26], or height and weight separately and calculated a corrected BMI from those values [12, 23, 24] to incorporate both weight and height. However, there is evidence to suggest that average bias in self-reported height among adults as a whole is quite small: based on five national surveys conducted in Canada and the United States on males and females aged 18 to 74, average bias in self-reported height ranged from 0.2 cm to 1.4 cm [28]. Because the bias in working-age adults is almost wholly from bias in self-reported weight [11–17, 26], it is worth considering the value of correcting on weight only rather than overall BMI for the general population. This paper will consider such a correction.
The purpose of this study is to evaluate the usefulness of existing and new correction equations for BMI in population-based research. To accomplish this, we have three objectives: 1) compare the self-reported and corrected BMI distributions; 2) compare self-reported and corrected BMI to measured BMI based on sensitivity and specificity of measured obesity; and 3) compare self-reported and corrected BMI to measured BMI in regression models of various health conditions, in terms of statistical significance, coefficient magnitude, and direction of the coefficient (above or below the measured coefficient). We compare three correction equations: first, an existing Canadian correction equation [20] (a correction that uses self-reported BMI, and so will be referred to as the “BMI-only” correction); second, a new correction equation developed here which corrects values of weight only; and third, another correction equation developed here which is a computationally simpler version of the weight-only correction. The term “weight-only” means that we use a corrected value for weight but self-reported height to correct overall BMI.
Methods
Data
We used data from two cycles (2005 and 2008) of Statistics Canada’s Canadian Community Health Survey (CCHS). The CCHS is a repeated cross-sectional survey that provides sociodemographic and health information for individuals living in the ten Canadian provinces. The CCHS uses a multistage cluster sampling procedure to derive a sample that is representative of the Canadian population, excluding those that live in institutions, on First Nations reserves, on Canadian Forces bases, and in certain remote areas; the CCHS is representative of approximately 98% of the Canadian population over age 12 [29]. The overall household response rates were 87.0% for the 2005 CCHS and 85% for the 2008 CCHS [21].
In the 2005 and 2008 iterations of the survey, a random subsample of individuals was asked to self-report their height and weight; those values were subsequently measured by the interviewer. The respondents were not told they would be measured when they self-reported their height and weight. The response rate for the subsample was 64.2% for the 2005 CCHS and 59.7% for the 2008 CCHS; no information is available on reasons for refusal to be measured [21]. We focused on individuals aged 18 to 65 for whom both self-reported and measured BMI data were available. We excluded adults over age 65 because of observed over-reporting of height in the over-65 age group [11, 25, 26].
We used the master file versions of the CCHS, accessed through the Research Data Centres (secure data laboratories) program in Canada. Access was granted by Statistics Canada via the Canadian Research Data Centre Network (CRDCN) through a standardized application process. All analyses incorporated sampling weights as directed by Statistics Canada and were conducted in Stata 11.2. Ethics approval for this project was obtained from the University of Calgary’s Conjoint Health Ethics Research Board (Ethics ID: E23704).
Procedure
Below, we first describe the development of the weight-only correction equations. Then, we describe the procedure for addressing our three objectives, which compare self-reported and corrected BMI against measured BMI. All analyses, including those involved in developing the correction equations, are conducted for males and females separately.
The justification for modelling misreporting based on weight only is best shown graphically. Additional file 1 shows three quantile-quantile plots comparing measured BMI to three other BMI measures: self-reported BMI (graph a); a BMI constructed from self-reported weight and measured height (graph b); and a BMI constructed from measured weight and self-reported height (graph c), for males and females. The graphs show the average BMI for each percentile of measured BMI against the average BMI for each percentile of the BMI measures containing at least one self-reported value. Note that the quantile-quantile plots in graphs a and b look similar; that is, there is very little improvement in modelling measured BMI by replacing self-reported height with measured height. Graph c shows that there is a large improvement in modelling measured BMI using only measured weight, which indicates that the majority of the measurement error of the distribution of self-reported BMI comes from self-reported weight, not height, in our sample. Thus, the weight-only correction should be addressing the main source of measurement error in the sample of working-age individuals.
a) Development and estimation of weight-only correction equation
We can model the self-reported value of BMI as being a function of an individual’s measured (true) BMI multiplied by a misreporting term:
That is, an individual’s self-reported BMI (their self-reported weight, $W_{SR}$, divided by their self-reported height squared, $h_{SR}^{2}$) is equal to their measured BMI multiplied by a misreporting term that is made up of random noise, $\epsilon$, and measured weight, $W_{M}$. Equation (1) was chosen because, with the right parameters^{a}, we can mimic the nonlinear relationship between measured and self-reported BMI described in the literature, whereby the discrepancy increases across measured BMI at an increasing rate due to increases in weight. By including weight in the exponential term, we allow the difference between self-reported BMI and measured BMI to grow at an increasing rate as measured weight increases, a relationship that is supported by published literature [11–17]. A linear error term would not accurately capture that association.
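As a sketch of the structure just described (a reconstruction from the prose, not the typeset equation; the symbol $h_{M}$ for measured height and the unit coefficients in the exponent are assumptions), equation (1) can be read as:

```latex
\begin{equation}
\frac{W_{SR}}{h_{SR}^{2}} = \frac{W_{M}}{h_{M}^{2}} \, e^{\,W_{M} + \epsilon}
\tag{1}
\end{equation}
```

In this reading the exponent carries a coefficient of 1 on both $W_{M}$ and $\epsilon$, which matches the remark in the endnote that the fitted parameters would differ from 1.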
Next, we can take the natural logarithm of both sides of the equation to reduce the BMI relationship into its constituent parts, which can then be rearranged into the following equation:
where $\ln(W_{SR})$, the natural logarithm of an individual’s self-reported weight, is a function of measured weight and the ratio of self-reported to measured height (the relative misreport in height). This is an equation for which we can estimate regression parameters, using a sample of individuals with both measured and self-reported height and weight. The estimated regression equation is:
where $i$ denotes individual values. If the ratio of self-reported to measured height is not related to self-reported weight on average (which we would expect from the literature on non-seniors), then $\hat{\beta}_{3}$ would be statistically equal to zero, leaving us with the equation:
The restriction necessary for equation (3) to be an appropriate step, that $\hat{\beta}_{3}$ is statistically equal to 0, was tested using our dataset during the model building exercise and is not an assumption. Specifically, using a t-test, we could not reject the null hypothesis that the coefficient $\hat{\beta}_{3}$ was equal to 0 (p-value of 0.609 in males and 0.559 in females). Removal of the ratio of self-reported to measured height did not impact the values of the other coefficients in the model.
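The log-linear steps described above can be traced symbolically. The block below is a reconstruction from the prose rather than the original typesetting: it assumes a multiplicative starting form $W_{SR}/h_{SR}^{2} = (W_{M}/h_{M}^{2})\,e^{W_{M}+\epsilon}$, with $h_{M}$ denoting measured height, and the coefficient names $\hat{\beta}_{0}$ and $\hat{\beta}_{1}$ are assumed labels (only $\hat{\beta}_{2}$ and $\hat{\beta}_{3}$ are named in the text). The first line is the unnumbered relation obtained by taking logs, followed by equations (2) and (3).

```latex
\begin{align}
\ln(W_{SR}) &= \ln(W_{M}) + W_{M} + 2\ln\!\left(\frac{h_{SR}}{h_{M}}\right) + \epsilon \notag \\
\ln(W_{SR,i}) &= \hat{\beta}_{0} + \hat{\beta}_{1}\ln(W_{M,i}) + \hat{\beta}_{2} W_{M,i}
                 + \hat{\beta}_{3}\ln\!\left(\frac{h_{SR,i}}{h_{M,i}}\right) + \hat{\epsilon}_{i} \tag{2} \\
\ln(W_{SR,i}) &= \hat{\beta}_{0} + \hat{\beta}_{1}\ln(W_{M,i}) + \hat{\beta}_{2} W_{M,i} + \hat{\epsilon}_{i} \tag{3}
\end{align}
```

The factor of 2 on the height ratio follows from moving $-2\ln(h_{SR})$ and $-2\ln(h_{M})$ to the same side; in the estimated equation it is absorbed into $\hat{\beta}_{3}$.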
Equation (3) can be rearranged to put measured weight in terms of self-reported weight and estimated parameters (constants):
One can solve this equation numerically using a dataset that contains both measured and self-reported weight. Using self-reported values, we back-solved for measured weight by iteratively substituting in candidate values until the equality held to within a predetermined tolerance (in our case, 0.001), yielding a corrected weight in place of measured weight.
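The back-solving step can be illustrated with a short root-finding routine. This is a minimal sketch, assuming the implicit relation takes the form $\ln(W_{SR}) = \hat{\beta}_{0} + \hat{\beta}_{1}\ln(W) + \hat{\beta}_{2}W$ and that its right-hand side is increasing in $W$ over the search bracket. Bisection is a swapped-in technique (the text does not name the iteration scheme used), the tolerance here is applied to the bracket width rather than to the equality itself, and the coefficient values are made up for illustration, not the calibrated estimates.

```python
import math


def corrected_weight(w_sr, b0, b1, b2, lo=20.0, hi=400.0, tol=0.001):
    """Solve ln(w_sr) = b0 + b1*ln(w) + b2*w for w (kg) by bisection.

    lo/hi bracket the plausible range of adult weights (an assumption);
    iteration stops when the bracket is narrower than tol.
    """
    target = math.log(w_sr)

    def gap(w):
        # Difference between the model's predicted ln(self-report) and the observed one.
        return b0 + b1 * math.log(w) + b2 * w - target

    g_lo, g_hi = gap(lo), gap(hi)
    if g_lo * g_hi > 0:
        raise ValueError("no sign change in bracket; widen lo/hi")

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        g_mid = gap(mid)
        if g_lo * g_mid <= 0:
            hi = mid                 # root lies in the lower half
        else:
            lo, g_lo = mid, g_mid    # root lies in the upper half
    return (lo + hi) / 2.0


# Hypothetical coefficients, chosen only so that heavier people under-report more.
B0, B1, B2 = 0.10, 0.98, -0.0004
print(corrected_weight(78.0, B0, B1, B2))
```

Any bracketing or Newton-type solver would serve equally well; bisection is used here only because it needs no derivative and cannot leave the bracket.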
The parameter associated with the measured weight term in misreporting, $\hat{\beta}_{2}$ in equation (2), could be quite small in practice, because it is the coefficient of a variable that is measured in kilograms while the other terms in the equation are measured in the natural log of kilograms. Thus, measured weight, the variable that $\hat{\beta}_{2}$ is associated with, need only have a small effect to make a large impact on the natural log of self-reported weight. It might be the case that, for the ranges of BMI exhibited by the majority of the population, $\hat{\beta}_{2}$ is essentially zero. To accommodate this possibility, we also consider an alternative form of equation (1):
Equation (5) assumes that the natural log of self-reported weight depends only on the natural log of measured weight and on a constant term that captures the average effect of unobserved variables. This equation leads to a much simpler regression equation:
and correction equation if we follow the same steps as outlined for equation (1):
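One set of forms consistent with the description of equations (5) to (7) is sketched below. These are reconstructions from the prose, not the typeset equations: the exact form of equation (5), the coefficient names $\hat{\alpha}_{0}$ and $\hat{\alpha}_{1}$, and the symbol $W_{C}$ for corrected weight are all assumptions.

```latex
\begin{align}
\frac{W_{SR}}{h_{SR}^{2}} &= \frac{W_{M}}{h_{M}^{2}} \, e^{\epsilon} \tag{5} \\
\ln(W_{SR,i}) &= \hat{\alpha}_{0} + \hat{\alpha}_{1}\ln(W_{M,i}) + \hat{\epsilon}_{i} \tag{6} \\
W_{C,i} &= \exp\!\left(\frac{\ln(W_{SR,i}) - \hat{\alpha}_{0}}{\hat{\alpha}_{1}}\right) \tag{7}
\end{align}
```

Because measured weight enters only through its logarithm, the inversion in (7) is available in closed form and needs no iterative solution.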
For the remainder of this paper, equation (4) will be referred to as the “weight-only correction”, and equation (7) as the “simple weight-only correction”. Equation (6) is similar to the equation developed by Connor Gorber et al. [20] in the sense that it does not include covariates other than the measured and self-reported versions of the variable being corrected. The evaluation process below aims to test correction equations that would be widely usable due to their simplicity. Connor Gorber et al. [20] showed that the inclusion of other covariates such as age, perception of one’s own weight, life dissatisfaction, ethnicity, and activity limitations did not importantly improve the accuracy of their models. Thus, we do not include any other variables in the correction equations, to facilitate comparison with the existing recommended Canadian correction equations.
We first defined the full sample and identified outliers. The full sample consists of the pooled cross-section of the 2005 and 2008 CCHS respondents who provided measured and self-reported height and weight (n = 6294: 3208 female, 3086 male), restricted to working-age individuals (age 18 to 65) and non-breastfeeding, non-pregnant women. Outliers (n = 145) were defined as those for whom the discrepancy between self-reported and measured height or weight exceeded three standard deviations from the sex- and cycle-specific mean discrepancy. The identification and removal of outliers served to remove their potentially undue influence during correction generation.
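The outlier rule above can be sketched as follows. This is an illustrative, unweighted version: the record layout (dictionary keys `sex`, `cycle`, `w_sr`, `w_m`, `h_sr`, `h_m`) and the function name are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_outliers(records, n_sd=3.0):
    """Return a list of booleans, True where a record is an outlier.

    A record is flagged when its self-reported minus measured discrepancy,
    for weight or for height, lies more than n_sd standard deviations from
    the mean discrepancy of its own sex-by-cycle group.
    """
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[(rec["sex"], rec["cycle"])].append(idx)

    flags = [False] * len(records)
    for idxs in groups.values():
        for sr_key, m_key in (("w_sr", "w_m"), ("h_sr", "h_m")):
            diffs = [records[i][sr_key] - records[i][m_key] for i in idxs]
            if len(diffs) < 2:
                continue  # standard deviation is undefined for one observation
            mu, sd = mean(diffs), stdev(diffs)
            for i, d in zip(idxs, diffs):
                if sd > 0 and abs(d - mu) > n_sd * sd:
                    flags[i] = True
    return flags
```

Grouping before computing the mean and standard deviation is what makes the threshold sex- and cycle-specific rather than pooled.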
The full sample of non-outliers was split randomly in half. One half, randomly selected, was used to calibrate the weight-only correction model parameters (the “model generating group”, n = 3084). The other half (“test group”, n = 3210) was used to test the model parameters to see how well the adjusted BMI values compared to measured BMI. The previously excluded outliers were included in the test group to simulate a real dataset where outliers may appear to have reasonable values of self-reported height or weight and may be impossible to identify and exclude. The regression equations of interest, (3) and (6), were run on the model generating group for males and females separately to obtain the necessary parameters for the correction equations. Male- and female-specific correction equations were developed separately to allow for known different trends in misreporting weight for males and females [12, 13], and to match the convention used in other Canadian corrections [20, 21] and international corrections [14, 23, 24]. This stratification by sex was maintained for all analyses.
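A minimal sketch of this sample split, assuming respondent identifiers and a parallel list of outlier flags are already available; the function name, the fixed seed, and the exact halving rule are illustrative choices rather than the procedure actually used.

```python
import random


def split_sample(ids, outlier_flags, seed=0):
    """Split non-outliers randomly in two and add the outliers to the test group.

    Returns (model_group, test_group) as lists of ids.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    clean = [i for i, bad in zip(ids, outlier_flags) if not bad]
    outliers = [i for i, bad in zip(ids, outlier_flags) if bad]

    rng.shuffle(clean)
    half = len(clean) // 2
    model_group = clean[:half]
    # Outliers are kept out of calibration but returned to the test group,
    # mimicking a dataset in which they cannot be identified.
    test_group = clean[half:] + outliers
    return model_group, test_group
```

Holding the outliers out of calibration while testing on them gives a more conservative picture of how a correction behaves on data with no measured values.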
The model-generating group consisted of only working-age adults (i.e., 18–65), but the correction equations are appropriate to apply to adults over age 65. This was confirmed by a Chow test for equation (3), which was run separately for males and females. The mutually exclusive groups under consideration were 1) working-age adults and 2) adults over age 65. The null hypothesis that the models have the same coefficients for both working-age adults and adults over age 65 could not be rejected (for males the p-value was 0.167; for females the p-value was 0.292).
After obtaining estimates for the parameters, we applied correction equations (4) and (7) to the test group. This model solves for corrected weight, so to make a corrected value of BMI with the weight-only model we used corrected weight with self-reported height. For the test group, we report weight-only as well as simple weight-only corrections, along with BMI-only corrections using the Connor Gorber et al. model [20].
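To make the last step concrete, the sketch below inverts a log-log regression of the assumed form $\ln(W_{SR}) = a_{0} + a_{1}\ln(W_{M})$ in closed form and then combines the corrected weight with self-reported height. The function names and coefficient values are hypothetical; the calibrated estimates are not reproduced here.

```python
import math


def simple_corrected_weight(w_sr, a0, a1):
    """Invert ln(w_sr) = a0 + a1*ln(w_m) to recover a corrected weight (kg)."""
    return math.exp((math.log(w_sr) - a0) / a1)


def corrected_bmi(w_corrected, h_sr):
    """BMI from corrected weight (kg) and self-reported height (m)."""
    return w_corrected / (h_sr ** 2)


# Made-up coefficients for illustration only.
A0, A1 = 0.12, 0.97
w_c = simple_corrected_weight(82.0, A0, A1)
print(corrected_bmi(w_c, 1.78))
```

Height enters unchanged, which is what distinguishes a weight-only correction from one that adjusts BMI as a whole.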
b) Evaluation of correction equations: comparison of BMI distributions; estimation of sensitivity and specificity for weight categories; and prediction of health outcomes
The different weight-only correction equations (regular and simple), which are calibrated versions of equations (4) and (7), for males and females, are given below:
Males (Weight-only Correction):
Females (Weight-only Correction):
Males (Simple weight-only Correction):
Females (Simple weight-only Correction):
We evaluate the correction equations for usefulness in three ways. First, we compare self-reported, corrected, and measured BMI distributions. Second, we compare self-reported and corrected BMI to measured BMI by estimating the sensitivity/specificity of the equations vis-à-vis BMI categories of normal weight (18 < BMI < 25), overweight (25 ≤ BMI < 30), and obese (BMI ≥ 30). Lastly, we compare self-reported and corrected BMI to measured BMI by association with selected health outcomes in regression models. Assessment of association with health outcomes entailed comparing coefficients (statistical significance, magnitude, and direction above or below the measured coefficient) across the different corrected BMI values in regression equations modelling arthritis, heart disease, diabetes, high blood pressure, self-reported health, and activity limitation, following Connor Gorber et al. [20]. We use the BMI categories of normal weight (18 < BMI < 25), overweight (25 ≤ BMI < 30), obese (30 ≤ BMI < 35), and obese class II or higher (BMI ≥ 35) for this part of the analysis. We further restrict this part of the analysis to individuals aged 40 or older, to follow the convention set by Connor Gorber et al. [20]; however, we also test the disease association models with the full age range (18–64). Throughout our analysis we do not report results for underweight individuals because there were so few underweight individuals by measured BMI.
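The sensitivity and specificity comparison can be sketched as below, treating measured BMI as the reference standard. This is an unweighted illustration (the analysis in the text incorporates sampling weights), the function names are hypothetical, and the category cut-points are copied from the three-category definitions above.

```python
def categorize(bmi):
    """Three-category scheme from the text; values at or below 18 return None."""
    if bmi <= 18:
        return None
    if bmi < 25:
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def sens_spec(measured, candidate, category):
    """Sensitivity and specificity of a candidate BMI for one category.

    measured  : reference BMI values
    candidate : self-reported or corrected BMI values, in the same order
    """
    tp = fn = tn = fp = 0
    for m, c in zip(measured, candidate):
        truth = categorize(m) == category
        pred = categorize(c) == category
        if truth and pred:
            tp += 1
        elif truth:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec
```

Calling `sens_spec` once with self-reported values and once with each corrected series, for each category and sex, yields the kind of grid of estimates that is compared in the Results.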
Results
Distribution of BMI
Figures 1 and 2 show the BMI distributions estimated from the measured, self-reported, and corrected BMI values. For males the corrected distributions all trend together, and all are closer to the measured distribution than to the self-reported distribution. For females, the weight-only corrections resemble the measured BMI distribution more closely than the BMI-only correction does, most notably between BMI 23 kg/m² and 28 kg/m².
Sensitivity and specificity of corrected and self-reported BMI
Table 1 displays the sensitivity and specificity estimates based on the self-reported and corrected values for BMI for males and females. Focusing on those instances with at least a 5 percentage point difference in sensitivity or specificity within sex-BMI category groups, we observed the following patterns: All corrections were similar to one another and superior to self-report in specificity of normal weight among women, sensitivity of overweight among both men and women, and sensitivity of obese among both men and women. There were two instances in which self-report was superior to corrected values: sensitivity of normal weight among both men and women. Overall, any corrected BMI was superior to self-reported BMI in estimating prevalence within weight categories.
Predictive utility of corrected BMI in health condition models
Tables 2 and 3 show the results of models regressing six health conditions on BMI (measured; self-report; BMI-only correction; weight-only corrections) for men and women, controlling for age. Focusing on differences in statistical significance (i.e., presence/absence) between coefficients in the measured BMI model and coefficients in each of the other models, the following is apparent: For men, of the 18 coefficients in each column, 16 in the self-reported BMI column had the same statistical significance status as measured BMI. The numbers for the BMI-only, weight-only, and simple weight-only corrections were 15/18, 16/18, and 15/18, respectively. For women, of the 18 coefficients in each column, 13 in the self-reported BMI column had the same statistical significance status as measured BMI. The numbers for the BMI-only, weight-only, and simple weight-only corrections were 15/18, 15/18, and 14/18, respectively. From this preliminary consideration of statistical significance, the correction equations and self-reported BMI appear to perform similarly for men, while for women the correction equations appear to perform similarly to each other and better than self-reported BMI.
In terms of the magnitude and direction of the coefficients for the relationships between BMI (self-reported and corrected) and disease outcomes (Tables 2 and 3), the self-reported and corrected BMI measures did not exhibit a consistent pattern for males or females across diseases. For males, the corrected odds ratios were higher than the measured odds ratios in almost all comparisons; exceptions were heart disease and diabetes (obese class I), high blood pressure (obese class II+), and self-reported health (overweight). All of the corrected odds ratios for activity limitation were lower than the measured estimates. For females, the corrected odds ratios were higher than the measured odds ratios for heart disease, diabetes, high blood pressure, and self-reported health (except for the simple weight-only correction for obese class I). The corrected odds ratios for activity limitation were all lower than the measured odds ratios, and for arthritis the weight-only corrections were lower than the measured odds ratios for overweight (simple correction only) and obese class II+. Thus, the magnitudes and directions of the estimated coefficients in the health condition models do not clearly point to a superior correction equation^{b}.
Furthermore, with respect to estimates in Tables 2 and 3, the corrected odds ratios tended to have wider confidence intervals (suggesting less precision of estimate) than the corresponding measured odds ratios, but not in every case. Considering only obese class II+ as an example, for males the corrected confidence intervals were wider than the measured confidence intervals for arthritis, heart disease, diabetes, and self-reported health. For females the corrected confidence intervals were wider than the measured confidence intervals in all cases but one (simple weight-only correction for arthritis).
Discussion
Distribution of BMI
A corrected BMI distribution, regardless of whether BMI-only or weight-only, was found to be more accurate than the self-reported BMI distribution (see Figures 1 and 2). However, this only refers to the ability of the correction equations to bring the distribution of self-reported BMI into line with the distribution of measured BMI, unconditional on any other variables or restrictions. In terms of which correction is better, for males, it is not obvious that there is a superior corrected distribution. For females, the weight-only corrections follow the measured BMI distribution more closely than the BMI-only correction, making the weight-only corrections the best overall candidate correction for simply constructing the BMI distribution.
Sensitivity and specificity
Findings from the sensitivity and specificity analyses were mixed, with the weight-only and BMI-only corrections performing similarly well in some cases (specificity for normal weight women, sensitivity for overweight men and women, and sensitivity for obese men and women), and self-report best in others (sensitivity for normal weight men and women). The high sensitivity for self-reported BMI in the normal categories for females is consistent with past observations that normal BMI females, on average, under- or over-report weight to a lower degree than other BMI groups [20, 21]. In the absence of a superior correction equation, researchers must trade off sensitivity and specificity when choosing a correction equation. The outcome of these trade-offs depends on the situation: for example, if a researcher were interested in studying a sample of predominantly normal weight males, where the weight-only corrections result in losing 10 percentage points in sensitivity in exchange for 6 percentage points in specificity, it is not clear that a weight-only correction is superior to self-reported BMI. Simply stating that one percentage point gain offsets another is inappropriate, especially when specificity may be more important for conditions like obesity, where the majority of the population is not obese [7].
Predictive utility of corrected BMI in health condition models
Taking the population distribution and the sensitivity/specificity findings as a whole, our results suggest that if a researcher is interested in BMI statistics across a population, including estimating prevalence within weight categories, any correction presented here will be preferable to self-reported BMI. However, findings from the analysis of associations with health outcomes tell a different story: namely, findings were mixed, such that no correction was uniformly superior and in some cases the self-reported data outperformed the corrected data. None of the correction equations were able to consistently provide coefficient estimates closer to the measured BMI estimates than those provided by the self-reported estimates. The implication of this finding is that in a disease modelling context correction equations are not necessarily more useful than self-reported BMI.
Our findings differ from other Canadian correction equation studies that have shown that the BMI-only correction consistently provides coefficient estimates closer to the measured values [20, 21]. In statistics, it is recognized that including a variable exhibiting measurement error on the right-hand side of the regression equation can provide a biased estimate of effect [30]. The magnitude of this bias depends on the variance of the true value of the mismeasured variable (in our case, measured BMI) and the variance of the measurement error itself (in our case, the difference between measured and self-reported BMI). While the corrections presented here adequately correct the average BMI at percentiles of interest and can be used to estimate the distribution of BMI or the prevalence of obesity, they are unable to correct the variance of self-reported BMI. As a consequence, corrected BMI measures provide biased and inconsistent estimates of association when used as regressors, just as any mismeasured variable would. The key issue in our case is whether the magnitude of the bias is substantial (i.e., clinically or socially significant). We suggest that the magnitude of this bias is too large, and the direction of the bias too unpredictable, for corrected BMI variables to be used in this context of modelling health outcomes.
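The dependence of the bias on these two variances can be illustrated with the textbook case of classical measurement error in a simple linear regression, where the estimated slope converges to $\beta\,\sigma_{x}^{2}/(\sigma_{x}^{2}+\sigma_{u}^{2})$, with $\sigma_{x}^{2}$ the variance of the true regressor and $\sigma_{u}^{2}$ the variance of the error. The simulation below is only a sketch of that simplified case: the setting analysed in this paper (logistic models, categorical BMI, and misreporting that grows with weight) is not classical, which is one reason the direction of bias there is less predictable. All names and parameter values are invented for illustration.

```python
import random


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=0.5, sd_x=5.0, sd_u=2.5, seed=1):
    """Return (slope using the true regressor, slope using the noisy regressor)."""
    rng = random.Random(seed)
    x_true = [rng.gauss(27.0, sd_x) for _ in range(n)]       # stand-in for measured BMI
    y = [beta * x + rng.gauss(0.0, 1.0) for x in x_true]     # continuous outcome
    x_noisy = [x + rng.gauss(0.0, sd_u) for x in x_true]     # additive reporting error
    return ols_slope(x_true, y), ols_slope(x_noisy, y)


b_true, b_noisy = simulate()
# Under these settings the attenuation factor is 25 / (25 + 6.25) = 0.8.
print(b_true, b_noisy)
```

Shifting the mean of the noisy regressor, as a correction equation does, would leave this attenuation untouched, because the slope depends on the variances and not on the mean.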
The health outcome association analysis shows that, given a disease in a logit model and a categorical BMI variable (a very common modelling convention), researchers should not necessarily use a correction equation in an effort to improve self-reported BMI. Further, if a researcher, seeking a conservative range of estimates, chose to report both a corrected and a self-reported estimate, it is not clear that the difference between the two would be meaningful in this regression framework: when the corrected estimate is larger than the self-reported coefficient, it does not conform to the idea that the self-reported BMI estimate serves as an upper bound; when the corrected estimate is smaller than the self-reported coefficient, there is no guarantee that the corrected estimate is closer to the measured BMI estimate or even on the correct side of the null value. Reverting to self-reported BMI is going to provide biased estimates of association between BMI and the dependent variable, as well as biased and inconsistent estimates of association for all the other regressor variables in the model [30]. In short, our results suggest that if a researcher is interested in using BMI as a predictor variable for modelling disease, then both self-reported and corrected BMI result in biased estimates of association.
Limitations
It is important to acknowledge that, because correction equations are based on particular populations, which change over time and place, the equations themselves are likewise somewhat time- and place-dependent and should be updated over time. Changing misreporting patterns over time have been shown using data from the United States [27] and Ireland [31], and may reflect, in part, changing social attitudes about obesity. The fact that correction equations can change across time is important for modelling the BMI distribution, but updating a correction equation will not fix the issue of a corrected BMI providing biased and inconsistent regression results unless the update somehow deals with the variance of the misreported variable.
Although the response rate for the CCHS surveys as a whole were reasonably high (87.0% for the 2005 CCHS was and 85% for the 2008 CCHS), another limitation of the study is the lower response rate for the subsamples among which both selfreported and measured data were available and which constituted the basis for this study. The overall response rate for the subsample that provided measured height and weight was 55.9% for 2005 CCHS and 50.7% for the 2008 CCHS, most of the nonresponse was from refusal to be measured [21]. It is unlikely that those who refuse to be measured are randomly distributed across the BMI distribution, so if the individuals who are heavier are refusing to be measured, any correction equation based on this dataset will be inaccurate, including others that have been developed [20, 21]. Strategies to improve response rate across the population are thus desirable.
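The concern about selective refusal can be sketched with a simulation. Everything numeric below is an assumption made for illustration (the BMI distribution, the misreporting rule, and a probability of agreeing to be measured that falls with measured BMI and equals 0.5 at BMI 27); none of it is estimated from the CCHS, and `simulate_refusal` is a hypothetical name. The sketch fits a correction equation only among simulated respondents who agree to be measured, then applies it to everyone's self-reported BMI.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate_refusal(n=100_000, seed=2):
    """Fit a correction equation on a subsample with BMI-related refusal."""
    rng = random.Random(seed)
    # Assumed population values (illustrative only).
    measured = [rng.gauss(27.0, 5.0) for _ in range(n)]
    self_reported = [1.0 + 0.93 * m + rng.gauss(0.0, 1.2) for m in measured]
    # Assumption: heavier respondents are less likely to agree to be measured.
    agreed = []
    for m, s in zip(measured, self_reported):
        p_agree = 1.0 / (1.0 + math.exp(0.3 * (m - 27.0)))
        if rng.random() < p_agree:
            agreed.append((s, m))
    # Correction equation estimated only from those who were measured.
    a, b = fit_line([s for s, _ in agreed], [m for _, m in agreed])
    corrected = [a + b * s for s in self_reported]
    return {
        "response_rate": len(agreed) / n,
        "mean_measured": sum(measured) / n,
        "mean_corrected": sum(corrected) / n,
    }


if __name__ == "__main__":
    for name, value in simulate_refusal().items():
        print(f"{name}: {value:.3f}")
```

In this setup the correction equation is fit on a subsample that is lighter, at any given self-reported BMI, than the full population, so the corrected mean BMI for the whole sample falls below the true measured mean. The size of the gap depends entirely on the assumed refusal pattern.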
Conclusions
The BMI-only correction has been applied to Canadian data extensively (e.g., Orpana et al. [32], Janssen et al. [33], and Barberio and McLaren [34]), attesting to the value of such corrections in the literature. Our findings support the use of BMI-only corrections if the researcher is interested in reporting the distribution of BMI, the prevalence of those above and below the obesity threshold of BMI 30 kg/m^{2}, or any other cutpoint. On the other hand, if the researcher is interested in estimating the effect of BMI on a health condition, then our findings suggest that corrected BMI, using any of the methods examined here, does not represent an improvement over self-reported data.
Endnotes
^{a}“The right parameters” implies that the correct equation would not have a coefficient of 1 on every term, as it stands in equation (1). For instance, a data generating process with the coefficients 0.01 on the measured weight term in the error and 0.01 on the noise term would generate an exaggerated version of the relationship we observe in the literature. So it is just a matter of finding the right parameters to match the data.
^{b}The disease models were repeated for the entire sample (age 18 to 65). The results of those models (not reported) are consistent with those from the truncated sample; namely, they do not clearly point to any correction equation being superior.
Abbreviations
BMI: Body mass index
CCHS: Canadian Community Health Survey
References
1. Must A, Evans EW: The Epidemiology of Obesity. The Oxford Handbook of the Social Science of Obesity. Edited by: Cawley J. 2011, New York: Oxford University Press, 9
2. Calle EE, Rodriguez C, Walker-Thurmond K, Thun MJ: Overweight, obesity, and mortality from cancer in a prospectively studied cohort of U.S. adults. N Engl J Med. 2003, 348 (17): 1625-1638. 10.1056/NEJMoa021423.
3. Mokdad AH, Ford ES, Bowman BA, Dietz WH, Vinicor F, Bales VS, Marks JS: Prevalence of obesity, diabetes, and obesity-related health risk factors, 2001. JAMA. 2003, 289 (1): 76-79.
4. Field AE, Coakley EH, Must A, Spadano JL, Laird N, Dietz WH, Rimm E, Colditz GA: Impact of overweight on the risk of developing common chronic diseases during a 10-year period. Arch Intern Med. 2001, 161 (13): 1581-1586. 10.1001/archinte.161.13.1581.
5. Visscher TL, Seidell JC: The public health impact of obesity. Annu Rev Public Health. 2001, 22: 355-375. 10.1146/annurev.publhealth.22.1.355.
6. Anis AH, Zhang W, Bansback N, Guh DP, Amarsi Z, Birmingham CL: Obesity and overweight in Canada: an updated cost-of-illness study. Obes Rev. 2009, 11 (1): 31-40.
7. Rothman KJ: BMI-related errors in the measurement of obesity. Int J Obes (Lond). 2008, 32 (Suppl 3): S56-S59.
8. Elgar FJ, Stewart JM: Validity of self-report screening for overweight and obesity. Evidence from the Canadian Community Health Survey. Can J Public Health. 2008, 99 (5): 423-427.
9. Gorber SC, Tremblay M, Moher D, Gorber B: A comparison of direct vs. self-report measures for assessing height, weight and body mass index: a systematic review. Obes Rev. 2007, 8 (4): 307-326. 10.1111/j.1467-789X.2007.00347.x.
10. World Health Organization: Obesity and Overweight Fact Sheet. [http://www.who.int/mediacentre/factsheets/fs311/en/index.html]
11. Kuczmarski MF, Kuczmarski RJ, Najjar M: Effects of age on validity of self-reported height, weight, and body mass index: findings from the Third National Health and Nutrition Examination Survey, 1988–1994. J Am Diet Assoc. 2001, 101 (1): 28-34. 10.1016/S0002-8223(01)00008-6.
12. Rowland ML: Self-reported weight and height. Am J Clin Nutr. 1990, 52 (6): 1125-1133.
13. Villanueva EV: The validity of self-reported weight in US adults: a population based cross-sectional study. BMC Public Health. 2001, 1: 11. 10.1186/1471-2458-1-11.
14. Nyholm M, Gullberg B, Merlo J, Lundqvist-Persson C, Rastam L, Lindblad U: The validity of obesity based on self-reported weight and height: Implications for population studies. Obesity (Silver Spring). 2007, 15 (1): 197-208. 10.1038/oby.2007.536.
15. Niedhammer I, Bugel I, Bonenfant S, Goldberg M, Leclerc A: Validity of self-reported weight and height in the French GAZEL cohort. Int J Obes Relat Metab Disord. 2000, 24 (9): 1111-1118. 10.1038/sj.ijo.0801375.
16. Engstrom JL, Paterson SA, Doherty A, Trabulsi M, Speer KL: Accuracy of self-reported height and weight in women: an integrative review of the literature. J Midwifery Womens Health. 2003, 48 (5): 338-345. 10.1016/S1526-9523(03)00281-2.
17. Ezzati M, Martin H, Skjold S, Vander Hoorn S, Murray CJ: Trends in national and state-level obesity in the USA after correction for self-report bias: analysis of health surveys. J R Soc Med. 2006, 99 (5): 250-257. 10.1258/jrsm.99.5.250.
18. Yun S, Zhu BP, Black W, Brownson RC: A comparison of national estimates of obesity prevalence from the Behavioral Risk Factor Surveillance System and the National Health and Nutrition Examination Survey. Int J Obes (Lond). 2006, 30 (1): 164-170. 10.1038/sj.ijo.0803125.
19. Chang VW, Christakis NA: Extent and determinants of discrepancy between self-evaluations of weight status and clinical standards. J Gen Intern Med. 2001, 16 (8): 538-543. 10.1046/j.1525-1497.2001.016008538.x.
20. Connor Gorber S, Shields M, Tremblay MS, McDowell I: The feasibility of establishing correction factors to adjust self-reported estimates of obesity. Health Rep. 2008, 19 (3): 71-82.
21. Shields M, Gorber SC, Janssen I, Tremblay MS: Bias in self-reported estimates of obesity in Canadian health surveys: an update on correction equations for adults. Health Rep. 2011, 22 (3): 35-45.
22. Kuskowska-Wolk A, Bergstrom R, Bostrom G: Relationship between questionnaire data and medical records of height, weight and body mass index. Int J Obes Relat Metab Disord. 1992, 16 (1): 1-9.
23. Hayes AJ, Clarke PM, Lung TW: Change in bias in self-reported body mass index in Australia between 1995 and 2008 and the evaluation of correction equations. Popul Health Metr. 2011, 9: 53. 10.1186/1478-7954-9-53.
24. Jain RB: Regression models to predict corrected weight, height and obesity prevalence from self-reported data: data from BRFSS 1999–2007. Int J Obes (Lond). 2010, 34 (11): 1655-1664. 10.1038/ijo.2010.80.
25. Spencer EA, Appleby PN, Davey GK, Key TJ: Validity of self-reported height and weight in 4808 EPIC-Oxford participants. Public Health Nutr. 2002, 5 (4): 561-565. 10.1079/PHN2001322.
26. Stommel M, Schoenborn CA: Accuracy and usefulness of BMI measures based on self-reported weight and height: findings from the NHANES & NHIS 2001–2006. BMC Public Health. 2009, 9: 421. 10.1186/1471-2458-9-421.
27. Stommel M, Osier N: Temporal changes in bias of body mass index scores based on self-reported height and weight. Int J Obes (Lond). 2013, 37 (3): 461-467. 10.1038/ijo.2012.67.
28. Gorber SC, Tremblay MS: The bias in self-reported obesity from 1976 to 2005: a Canada-US comparison. Obesity (Silver Spring). 2010, 18 (2): 354-361. 10.1038/oby.2009.206.
29. Statistics Canada, Health Statistics Division: Canadian Community Health Survey, User Guide for the Public Use Microdata File. 2005
30. Durbin J: Errors in variables. Revue de l’Institut International de Statistique/Rev Int Stat Inst. 1954, 22: 23-32. 10.2307/1401917.
31. Shiely F, Hayes K, Perry IJ, Kelleher CC: Height and weight bias: the influence of time. PLoS One. 2013, 8 (1): e54386. 10.1371/journal.pone.0054386.
32. Orpana HM, Berthelot JM, Kaplan MS, Feeny DH, McFarland B, Ross NA: BMI and mortality: results from a national longitudinal study of Canadian adults. Obesity (Silver Spring). 2010, 18 (1): 214-218. 10.1038/oby.2009.191.
33. Janssen I, Bacon E, Pickett W: Obesity and its relationship with occupational injury in the Canadian workforce. J Obes. 2011, 2011: 531403.
34. Barberio A, McLaren L: Occupational physical activity and body mass index (BMI) among Canadian adults: does physical activity at work help to explain the socioeconomic patterning of body weight?. Can J Public Health. 2011, 102 (3): 169-173.
Prepublication history
The prepublication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2458/14/430/prepub
Acknowledgements
We thank J. C. Herbert Emery for helpful feedback on the manuscript.
DJD is supported by a Doctoral Traineeship in Population Health Intervention Research from the Canadian Population Health Intervention Research Network (PHIRNET). LM is supported by a Population Health Investigator Award from Alberta Innovates – Health Solutions.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
DJD conceptualized the study and led the analysis. LM contributed to study conceptualization and interpretation of results. DJD and LM contributed equally to the writing of the paper. Both authors read and approved the final draft.
Daniel J Dutton and Lindsay McLaren contributed equally to this work.
Rights and permissions
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Dutton, D.J., McLaren, L. The usefulness of “corrected” body mass index vs. self-reported body mass index: comparing the population distributions, sensitivity, specificity, and predictive utility of three correction equations using Canadian population-based data. BMC Public Health 14, 430 (2014). https://doi.org/10.1186/1471-2458-14-430
Keywords
 Body mass index
 Measurement error
 Obesity
 Overweight
 Self-report
 Bias
 Correction