Challenging the role of social norms regarding body weight as an explanation for weight, height, and BMI misreporting biases: Development and application of a new approach to examining misreporting and misclassification bias in surveys

Background Cultural pressures to be thin and tall are postulated to cause people to misreport their body weight and height towards more socially normative (i.e., desirable) values, but little direct evidence supports this idea. We developed a novel non-linear approach to examining weight, height, and BMI misreporting biases and used this approach to examine the association between socially non-normative weight and misreporting biases in adults.
Methods The Survey of Lifestyle, Attitudes and Nutrition 2007 (SLÁN 2007), a nationally representative survey of the Republic of Ireland (N = 1942 analyzed), was used. Self-reported weight (height) was classified as under-reported by ≥2.0 kg (2.0 cm), over-reported by ≥2.0 kg (2.0 cm), or accurately reported within 2.0 kg (2.0 cm) to account for technical errors of measurement and short-term fluctuations in measured weight (height). A simulation strategy was used to define self-report-based BMI as under-estimated by more than 1.40 kg/m2, over-estimated by more than 1.40 kg/m2, or accurately estimated within 1.40 kg/m2. Patterns of biases in self-reported weight, height, and BMI were explored. Logistic regression was used to identify factors associated with mis-estimated BMI and to calculate adjusted odds ratios (AOR) and 99% confidence intervals (99%CI).
Results The patterns of bias contributing the most to BMI mis-estimation were consistently, in decreasing order of influence, (1) under-reported weight combined with over-reported height, (2) under-reported weight with accurately reported height, and (3) accurately reported weight with over-reported height. Average bias in self-report-based BMI was -1.34 kg/m2 overall and -0.49, -1.33, and -2.66 kg/m2 in normal, overweight, and obese categories, respectively. Despite the increasing degree of bias with progressively higher BMI categories, persons describing themselves as too heavy were, within any given BMI category, less likely to have under-estimated BMI (AOR 0.5, 99%CI: 0.3-0.8, P < 0.001), to be misclassified in a lower BMI category (AOR 0.3, 99%CI: 0.2-0.5, P < 0.001), to under-report weight (AOR 0.5, 99%CI: 0.3-0.7, P < 0.001), and to over-report height (OR 0.7, 99%CI: 0.6-1.0, P = 0.007).
Conclusions A novel non-linear approach to examining weight, height, and BMI misreporting biases was developed. Perceiving oneself as too heavy appears to reduce rather than exacerbate weight, height, and BMI misreporting biases.


Background
One of the most commonly used proxy measures of obesity in large cross-sectional surveys is elevated body mass index (BMI). A cost-effective, practical approach for obtaining BMI values in large numbers of individuals is to collect self-reported weight and height data, but these parameters are liable to response and recall bias. Many studies in adults have indicated that self-reported weight and height tend to be under- and over-reported, respectively, in surveys in the United States [1-4], England [5,6], Germany [7], France [8], Spain [9], Italy [10], Sweden [11,12], Finland [13], New Zealand [14], the United Kingdom [15], and Ireland [16].
It has been postulated that cultural pressures to be thin and tall cause people to intentionally or unintentionally misreport their body weight and height towards more socially normative (i.e., desirable) values [15,17,18]. This theory, proposed by Zeibland et al. (1996), has been supported mainly by indirect evidence.
Three studies have directly examined the relationship between social desirability and biases in self-reported weight and height in adults, and the findings from these studies are conflicting [4,17,18]. In one small study of 56 non-obese individuals, stronger desires to conform to social norms, as measured using the Marlowe-Crowne Social Desirability Scale, correlated with greater magnitudes of self-reported weight bias in females [17]. However, in another much larger study, the degree of bias in self-reported weight was negatively related to the difference between measured body weight and "socially ideal" body weight, defined as the mean self-reported weight of the sample [18]. Consistent with the second study, a third reported that those who consider themselves to be too heavy are half as likely to have a discrepancy between self-reported and measured weight as those who consider themselves to be about the right weight [4]. The association in this latter study, however, likely suffers from significant bias because the questionnaire did not reference the participants to an appropriate comparison group (e.g., those of the same age), a factor that is known to influence what constitutes a social norm [21,22].
Collectively, the evidence supporting the influence of social desirability on misreporting biases of anthropometric parameters is limited and controversial. Considering that social desirability and self-reported weight bias seem to be related non-linearly [18], further examination of this topic would benefit from a non-linear methodology, of which only one has been reported in the context of self-reported weight bias [4]. That methodology does not, however, take into account technical errors of measurement (TEM) and short-term variations in weight or height, nor does it enable the examination of biases in self-report-based BMI, a parameter that better approximates adiposity. Therefore, the purposes of the present study were (1) to develop a non-linear approach to examining biased estimates of self-report-based BMI that accounts for TEM and short-term variations in weight and height; (2) to use this approach to characterize patterns of self-reported weight, height, and BMI biases in the Republic of Ireland using a nationally representative survey; and (3) to test directly the hypothesis that considering oneself to be heavier than the socially normative weight, while taking into consideration one's age and height, is positively associated with biases in self-report-based estimates of BMI.

Setting and population
The Survey of Lifestyle, Attitudes and Nutrition (SLÁN) 2007 is a nationally representative cross-sectional study of the adult (18+ years) population residing in the Republic of Ireland. Detailed methodology is found in the SLÁN 2007 Main Report [23]. In brief, trained interviewers conducted face-to-face interviews with 10,364 subjects selected using a multistage area probability sampling procedure. In addition, SLÁN 2007 incorporated two sub-studies for (1) the measurement of anthropometric characteristics (N = 967, age 18-44 years) and (2) physical examination with clinical laboratory tests (N = 1207, age 45+ years). In the main survey, self-reported height and weight were obtained and used to calculate self-report-based BMI (BMI SR), and in both sub-studies trained interviewers measured body height and weight for the calculation of measured BMI (BMI M).

Inclusion criteria
Subjects who completed the SLÁN 2007 main survey and one of its sub-studies were selected for the present analyses, resulting in an initial sample size of 2174 (age 18+ years). Subjects were excluded if the interviewer deemed the height or weight measurements to be unreliable or slightly unreliable (N = 18) or if data were missing for BMI M (N = 3), self-reported height (N = 75), self-reported weight (N = 48), or any of the analysis variables (N = 68). The differences between self-reported and measured height or weight were determined, and the five most extreme cases in either direction were excluded because these cases were considered to have a high probability of data recording or entry error, as opposed to reporting error (N = 20). The total number of cases excluded was 232, resulting in a final sample size of 1942.

Definitions of under- and over-reported weight, height, and BMI
Self-reported weight or height was deemed acceptably accurate if within ±2.0 kg or ±2.0 cm of measured weight or height, respectively. These cut-off points were determined to allow for various factors that might explain differences between self-reported and measured height and weight that can occur even in the absence of any intentional or unintentional misreporting. One factor is the technical error of measurement of weight (negligible) and height (0.5 cm), and another is temporal variation in weight and height depending on food/liquid consumption, water balance, and postural differences. In addition, height and weight in an individual can vary with age, and subjects may have meticulously reported their weight and height based on a measurement that was taken months or years ago. Finally, most subjects (97%) reported their height in terms of feet and inches, which entails additional errors. The combined effects of errors in self-reported height and weight on BMI bias were simulated, leading us to conclude that self-report-based BMI should be considered accurate if its difference from measured BMI is within ±1.40 kg/m2, under-estimated if the difference is < -1.40 kg/m2, or over-estimated if the difference is > +1.40 kg/m2 (see Additional file 1: Method of BMI error simulation to establish a BMI accuracy definition). This simulation method can be readily applied in samples of other populations, as it uses the mean measured weight and height of the sample.
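The accuracy definitions above can be illustrated with a minimal sketch. The function names, the two example subjects, and the choice to treat an error exactly equal to a cut-off as accurate are assumptions made for illustration; they are not taken from the study's analysis files, and the simulation in Additional file 1 is not reproduced here.

```python
# Tolerances taken from the definitions in the text.
WEIGHT_TOL_KG = 2.0
HEIGHT_TOL_CM = 2.0
BMI_TOL = 1.40  # kg/m2


def bmi(weight_kg, height_cm):
    """Body mass index in kg/m2 from weight in kg and height in cm."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def classify_error(error, tolerance):
    """Label a (self-reported minus measured) error.

    Assumption: an error exactly equal to the tolerance counts as accurate.
    """
    if error < -tolerance:
        return "under"
    if error > tolerance:
        return "over"
    return "accurate"


def classify_subject(sr_weight_kg, sr_height_cm, m_weight_kg, m_height_cm):
    """Classify one subject's weight, height, and BMI reporting."""
    bmi_d = bmi(sr_weight_kg, sr_height_cm) - bmi(m_weight_kg, m_height_cm)
    return {
        "weight": classify_error(sr_weight_kg - m_weight_kg, WEIGHT_TOL_KG),
        "height": classify_error(sr_height_cm - m_height_cm, HEIGHT_TOL_CM),
        "bmi": classify_error(bmi_d, BMI_TOL),
        "bmi_d": bmi_d,
    }


# Hypothetical subject whose reports both fall inside the tolerances.
small_errors = classify_subject(78.5, 176.5, 80.0, 175.0)
# Hypothetical subject: weight under-reported by 4 kg, height over-reported by 3 cm.
large_errors = classify_subject(76.0, 178.0, 80.0, 175.0)
print(small_errors)
print(large_errors)
```

In the first hypothetical case, errors of 1.5 kg and 1.5 cm in opposite directions shift BMI by less than 1.40 kg/m2, so all three measures count as accurate; in the second, the combined errors push the BMI difference beyond the cut-off.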

Patterns of bias in height, weight, and BMI
Measured weight (kg) and height (cm) were subtracted from their corresponding self-reported values to determine errors in self-reported weight and height. Degrees of error were categorized in 1.0 cm or 1.0 kg intervals, and the frequency of error within each interval was determined and plotted in a frequency map to demonstrate the distribution of errors in the study population (Figure 2). Using the definitions of accurately reported, under-reported, and over-reported weight and height, we assigned subjects to one of nine possible types of error (visually demarcated in Figure 2, listed in Table 1). The number and proportion of subjects for each error type were determined. The difference between self-reported and measured BMI (BMI D), an estimate of the degree and direction of bias, was calculated for the total population undergoing analysis and for each error type. To express the relative contribution of each error type to BMI D, the following calculation was used: where b = -1 when BMI D is negative or b = +1 when BMI D is positive for a given error type. The factor b accounts for the error type's direction of bias. These analyses were carried out in the total sample and the normal, overweight, and obese BMI ranges. Underweight subjects (n = 15) were included in the total population analysis but were not analyzed separately in this manner because the number of observations was not large enough for stratification by error type. Interval estimates and statistical comparisons were not performed because the error types are not fully stochastic.
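As a rough sketch of the assignment step, the nine error types can be formed by crossing the three weight classes with the three height classes and then tabulating counts, proportions, and mean BMI D per type. The helper names and the four subjects below are hypothetical, and the relative-contribution calculation involving the factor b is deliberately left out of this sketch.

```python
CLASSES = ("under", "accurate", "over")


def classify_error(error, tolerance=2.0):
    """Label a (self-reported minus measured) error; ties count as accurate (assumption)."""
    if error < -tolerance:
        return "under"
    if error > tolerance:
        return "over"
    return "accurate"


def bmi(weight_kg, height_cm):
    """Body mass index in kg/m2."""
    return weight_kg / (height_cm / 100.0) ** 2


def summarize_error_types(subjects):
    """Tabulate the nine (weight class, height class) error types.

    subjects: iterable of (sr_weight_kg, sr_height_cm, m_weight_kg, m_height_cm).
    Returns {(weight_class, height_class): {"n", "proportion", "mean_bmi_d"}}.
    """
    groups = {(w, h): [] for w in CLASSES for h in CLASSES}
    for sr_w, sr_h, m_w, m_h in subjects:
        key = (classify_error(sr_w - m_w), classify_error(sr_h - m_h))
        groups[key].append(bmi(sr_w, sr_h) - bmi(m_w, m_h))
    total = sum(len(diffs) for diffs in groups.values())
    summary = {}
    for key, diffs in groups.items():
        n = len(diffs)
        summary[key] = {
            "n": n,
            "proportion": n / total if total else 0.0,
            "mean_bmi_d": sum(diffs) / n if n else None,
        }
    return summary


# Hypothetical subjects, one per commonly discussed error type.
subjects = [
    (76.0, 178.0, 80.0, 175.0),  # under-reported weight, over-reported height
    (70.0, 170.0, 73.0, 170.5),  # under-reported weight, accurate height
    (65.0, 168.0, 65.5, 165.0),  # accurate weight, over-reported height
    (60.0, 160.0, 60.0, 160.0),  # accurate weight, accurate height
]
summary = summarize_error_types(subjects)
for key, stats in summary.items():
    if stats["n"]:
        print(key, stats)
```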

Identification of factors associated with BMI bias and misclassification
Pearson χ2 tests were used to compare the proportions of subjects in covariate sub-categories according to the definitions for under-estimated, accurately estimated, or over-estimated BMI (Table 2). Similar comparisons were performed for subjects whose self-report-based and measured BMI categories were negatively discordant (BMI SR category < BMI M category), concordant (BMI SR = BMI M), or positively discordant (BMI SR > BMI M) (not shown). Univariate and multivariate logistic regression were used to determine the associations (odds ratios [OR] and 99% confidence intervals [99%CI]) of self-described weight status or attempted weight management with the following binary outcomes: (1) under-estimated vs accurately estimated BMI, (2) over-estimated vs accurately estimated BMI, (3) negative discordance vs concordance, or (4) positive discordance vs concordance. Associations were also determined for under- or over-reported weight and height.
Figure 1 Proportion of subjects with self-report-based BMI beyond or within ±1.40 kg/m2 of measured BMI (A, top) and whose self-report-based and measured BMI resulted in concordant or discordant BMI category assignments (B, bottom).
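The notion of category discordance can be sketched as follows. The text does not state the BMI cut-offs used, so the conventional WHO thresholds (18.5, 25.0, and 30.0 kg/m2) are assumed here, and the function names are hypothetical.

```python
import bisect

# Assumption: conventional WHO cut-offs; the source does not list the thresholds used.
BMI_CUTOFFS = (18.5, 25.0, 30.0)
BMI_LABELS = ("underweight", "normal", "overweight", "obese")


def bmi_category_index(bmi_value):
    """Index into BMI_LABELS; a value equal to a cut-off falls in the higher category."""
    return bisect.bisect_right(BMI_CUTOFFS, bmi_value)


def discordance(bmi_sr, bmi_m):
    """Compare the self-report-based BMI category with the measured BMI category."""
    sr_index = bmi_category_index(bmi_sr)
    m_index = bmi_category_index(bmi_m)
    if sr_index < m_index:
        return "negative"  # classified in a lower category than measured
    if sr_index > m_index:
        return "positive"  # classified in a higher category than measured
    return "concordant"


print(discordance(24.6, 25.8))  # self-report normal, measured overweight
print(discordance(27.0, 28.9))  # both overweight
print(discordance(30.2, 29.7))  # self-report obese, measured overweight
```

Note that a subject can be concordant while still having under-estimated BMI (for example, 27.0 vs 28.9 kg/m2), which is why the two outcomes are analyzed separately.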
To determine self-described weight, subjects were asked: "Given your age and height, would you say that you are about the right weight, too heavy, too light, or not sure?" Attempted weight management was determined by asking subjects: "Are you actively trying to manage your weight?" If a respondent answered affirmatively, they were asked: "Is it to lose, gain, or maintain weight?" In a first analysis (Table 3, Model 1), associations were adjusted for age, sex, social class, ethnicity, marital status, highest level of education, physical activity level, current smoking status, and alcohol consumption. A subsequent analysis (Table 3, Model 2) included the same covariates as in Model 1 with BMI category as an additional covariate. A final model (Table 4) was constructed with all covariates, including both self-described weight and attempted weight management. All analyses were conducted using PASW Statistics v18.0 (Macintosh). Significance was set at P ≤ 0.01.
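For readers unfamiliar with how an odds ratio and a 99% confidence interval are obtained, the sketch below computes an unadjusted odds ratio with a Wald interval from a 2×2 table, which corresponds to a univariate logistic regression with a single binary covariate. It does not reproduce the adjusted models fitted in PASW, and the counts are hypothetical.

```python
import math
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, confidence=0.99):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    All four counts must be greater than zero.
    """
    estimate = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(estimate) - z * se_log_or)
    upper = math.exp(math.log(estimate) + z * se_log_or)
    return estimate, lower, upper


# Hypothetical counts: exposure = describing oneself as too heavy,
# outcome = under-estimated BMI (vs accurately estimated BMI).
estimate, lower, upper = odds_ratio_ci(40, 160, 90, 210)
print(f"OR {estimate:.2f}, 99%CI: {lower:.2f}-{upper:.2f}")
```

A 99% interval is wider than the more common 95% interval, which matches the stricter significance threshold of P ≤ 0.01 used in the analyses.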

Self-reported weight, height, and BMI bias
As BMI M category increased, the prevalence of under-estimated BMI SR and of negative discordance between BMI SR and BMI M categories increased (Figure 1A and 1B, respectively, both P < 0.001). Opposite relationships were observed for the prevalence of over-estimated BMI SR and positive discordance, where in both cases the prevalence decreased with increasing BMI M category (Figure 1A and 1B).
To determine whether the trend for negative bias in BMI SR was due mainly to misreporting of weight or height, we assigned subjects to one of nine error types that correspond to the nine regions depicted in Figure 2 and listed in Table 1. Figure 2 visualizes the distribution of errors in self-reported weight and height and suggests a trend towards under-reported weight and over-reported height. Table 1 indicates that the degree of under-estimation in BMI SR increases as weight category increases. The overall BMI bias was -1.34 kg/m2 and was -0.49, -1.33, and -2.66 kg/m2 in the normal, overweight, and obese BMI ranges, respectively. Underweight subjects (n = 15) were excluded from this analysis because numbers in each group were too small for meaningful analysis. Although subjects in the normal BMI range exhibited a weak negative bias (-0.49 kg/m2), this degree of bias remained well within the acceptably accurate range of -1.40 to +1.40 kg/m2. The error types that contributed the most to bias in the total study population and in each of the BMI categories analyzed were consistently, in decreasing order of influence: (1) under-reported weight combined with over-reported height, (2) under-reported weight with accurately reported height, and (3) accurately reported weight with over-reported height.
Figure 2 Map of the frequencies of error in height and weight. Measured height (m) and weight (kg) were subtracted from their corresponding self-reported values to yield Error in Self-Reported Height (columns) and Error in Self-Reported Weight (rows). In both cases, negative values indicate that the self-reported value was less than the measured value. The degree of error was categorized in 1.0 cm or 1.0 kg intervals, and the frequency of error within each interval was determined. Solid lines indicate the cut-off values for accurately reported height (-0.02 m to +0.02 m) and accurately reported weight (-2.0 kg to +2.0 kg) to take into account technical errors of measurement and short-term fluctuations in weight and height. These lines form nine regions that correspond to the nine error pattern categories listed in Table 1. Counts are shown in each cell. Blank cells represent a count of zero, and each progressively darker shade represents the next seven-count level.

Identification of factors associated with BMI bias and misclassification
The proportions of subjects for each covariate category used in logistic regression analyses are shown in Table 2. Multivariate logistic regression without adjustment for measured BMI category (Table 3, Model 1) suggested that describing oneself as too heavy and actively attempting to lose weight were significantly associated with a greater likelihood of under-estimating BMI SR and exhibiting negative discordance. These covariates were also associated with an increased likelihood of under-reporting weight (OR 1.4, 99%CI: 1.1-1.8 for both covariates, P < 0.01) but not of over-reporting height (data not otherwise shown). However, when Model 1 was further adjusted for measured BMI category, the associations described above were reversed (Table 3, Model 2). Specifically, in Model 2, describing oneself as too heavy and attempting to lose weight were, within any given BMI category, significantly associated with a lower likelihood of under-estimating BMI SR and exhibiting negative discordance. Describing oneself as too heavy was also associated with a lower likelihood of under-reporting weight (OR 0.5, 99%CI: 0.3-0.7, P < 0.001) and over-reporting height (OR 0.7, 99%CI: 0.6-1.0, P = 0.007). Similarly, attempting to lose weight was associated with a lower likelihood of under-reporting weight (OR 0.7, 99%CI: 0.5-1.0, P = 0.011) and over-reporting height (OR 0.7, 99%CI: 0.5-1.0, P = 0.008) (data not otherwise shown).
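The reversal of an association after adjustment for BMI category can be illustrated with entirely hypothetical counts. The sketch below uses the Mantel-Haenszel pooled odds ratio as a simple stand-in for adjustment; the study itself used multivariate logistic regression, and none of the numbers below come from SLÁN 2007.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table (a, b = exposed; c, d = unexposed)."""
    return (a * d) / (b * c)


def mantel_haenszel_or(tables):
    """Mantel-Haenszel pooled odds ratio across strata of 2x2 tables."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator


# Hypothetical counts per measured BMI category, in the order:
# (too heavy & under-estimated, too heavy & accurate,
#  not too heavy & under-estimated, not too heavy & accurate)
strata = {
    "normal": (5, 45, 60, 340),
    "obese": (120, 180, 25, 25),
}

# Collapsing the strata ignores BMI category entirely.
crude_table = tuple(sum(column) for column in zip(*strata.values()))
crude_or = odds_ratio(*crude_table)
stratum_ors = {name: odds_ratio(*table) for name, table in strata.items()}
pooled_or = mantel_haenszel_or(strata.values())

print(f"crude OR: {crude_or:.2f}")
for name, value in stratum_ors.items():
    print(f"{name} OR: {value:.2f}")
print(f"Mantel-Haenszel OR: {pooled_or:.2f}")
```

In this constructed example, describing oneself as too heavy is more common in the obese stratum, where under-estimation is also more common, so the crude odds ratio exceeds 1 even though the odds ratio within each stratum is below 1.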
The associations described above in Model 2 for describing oneself as too heavy held when both covariates (self-described body weight and attempting weight loss) were entered simultaneously (Table 4). Attempting to lose weight was partially confounded by self-described weight. Four additional factors emerged as being significantly associated with BMI mis-estimation: sex, age, measured BMI category, and marital status. The strongest factor associated with all outcomes shown in Table 4 was BMI category.

Discussion
In the present study, we developed a new methodological approach for the examination of bias in self-report-based BMI in any given sample of any given population. Based on this method, a consistent pattern of weight and height misreporting biases emerged. Specifically, in the total population and in the BMI categories analyzed, the combinations of misreporting biases that contributed the most to overall BMI mis-estimation were consistently, in decreasing order of influence: (1) under-reported weight combined with over-reported height, (2) under-reported weight with accurately reported height, and (3) accurately reported weight with over-reported height. Further examination of patterns of bias in BMI estimation revealed that subjects in the normal BMI range exhibited a slight negative bias overall but that the degree of bias was well within the acceptably accurate range. As BMI category increases beyond the normal range, both the prevalence and magnitude of self-report-based BMI mis-estimation increase. These latter findings regarding BMI categories are consistent with a large number of previous studies on weight and height misreporting biases in adults, suggesting that the method described herein has good reliability [2,5,6,8,10-13,15,16,18-20].
It is widely believed that research participants misreport anthropometric characteristics to portray a more socially desirable weight and height. This view has been echoed in the literature [16,24] despite limited evidence [4,17,18]. We provide evidence that challenges the role of social desirability as an explanation for misreported weight and height. Within any given BMI category, it appears that those who describe themselves as too heavy are less likely to under-report their weight, to over-report their height, to have under-estimated BMI, and to be misclassified in a lower BMI category. Conversely, considering oneself to be too heavy increases the likelihood of over-reporting weight and having over-estimated BMI. These findings are consistent with those reported by Gil and Mora (2010) and by Villanueva (2001) [4,18]. Thus, although overweight and obese participants tend to under-report their weight, it seems that social norms concerning weight tend to reduce rather than exacerbate misreporting bias of this parameter.
A plausible explanation of this finding is that persons who describe themselves as too heavy are more weight-conscious. They may measure their weight more frequently and therefore report their anthropometric characteristics more accurately. We are unaware of data demonstrating a link between perceived weight status and weight consciousness or weighing frequency; however, in the present study those who were attempting to lose weight, a behavior that is associated with greater self-weighing frequency [25,26], were, within any given BMI category, less likely to under-report their weight or to have under-estimated BMI. Rather, they were more likely to over-report their weight or to have over-estimated BMI. Further research is warranted to understand how social desirability influences self-reported weight and height and to explain why the degree of self-report-based BMI mis-estimation increases with progressively higher BMI categories. Indeed, in our final model presented in Table 4, BMI category was by far the strongest factor associated with the mis-estimation of BMI. The reasons for this association remain elusive.
The present study has important research implications, as the described methods can be optimized for any given sample from any given population and may provide novel insights into the factors associated with weight and height misreporting biases and self-report-based BMI mis-estimation. A particular advantage of this methodology is that it accounts for normal short-term fluctuations in weight and height and for technical errors of measurement, both of which may lead to apparent rather than true misreporting bias. Factors that influence short-term weight and height variation may include food consumption (i.e., fasted vs post-prandial measurements), shifts in water balance/volume status, and temporal postural changes. Other factors influencing self-report accuracy may include reporting weight or height using a US standard scale rather than a metric one and reporting correct weight or height measurements taken weeks, months, or years earlier. Collectively, these factors may explain differences between self-reported and measured values, even in the absence of any intentional or unintentional misreporting.
A major weakness of this study is its cross-sectional design, which limits any causal inferences from our logistic regression analyses. In addition, the study population was mostly white, thereby precluding the study of ethnicity, a factor that some studies have shown to be associated with weight and height misreporting biases [4,17,27,28]. Another weakness is that some combinations of misreported weight and height lead to accurate BMI estimates. However, instances of accurate BMI estimates in the context of misreported weight and height were relatively uncommon, and the expected effect would be to bias logistic regression analyses towards the null hypothesis. Future studies utilizing the methodology reported herein should consider these limitations.

Conclusions
A new methodological approach was developed for the examination of weight and height misreporting as well as BMI mis-estimation. This approach is useful for the exploration of patterns of such biases and for the analysis of factors potentially associated with misreporting bias. Using this new approach, we demonstrate that although BMI category is seemingly the most important factor associated with the misreporting of weight and height data, social norms concerning body weight appear to counteract such biases.

Additional material
Additional file 1: Method of BMI error simulation to establish a BMI accuracy definition. This file describes how to simulate the effects of various combinations of weight and height misreporting on BMI estimation in any given sample. This method is used to define the limits of acceptably accurate BMI estimation. A supplementary figure showing the application of this method in the present study is provided.