In this study we found that instrument error due to uncalibrated scales and stadiometers, combined with cut-off-based classification systems, can lead to a minor but systematic overestimation of the prevalence of overweight and obesity in a nationally representative sample.
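The direction of this bias can be illustrated with a simplified worked equation. It is a sketch under assumptions that are not taken from the study: the true BMI $X$ is treated as normally distributed (real BMI distributions are right-skewed), the instrument-induced error $E$ is treated as additive, independent of $X$, and centred on zero, and $c$ denotes a single cut-off lying above the population mean $\mu$.

```latex
\[
X \sim \mathcal{N}(\mu, \sigma^2), \qquad
E \sim \mathcal{N}(0, \sigma_e^2), \qquad
Y = X + E \sim \mathcal{N}(\mu, \sigma^2 + \sigma_e^2)
\]
\[
P(Y \ge c) = 1 - \Phi\!\left( \frac{c - \mu}{\sqrt{\sigma^2 + \sigma_e^2}} \right)
\]
```

Because $c > \mu$, the argument of $\Phi$ shrinks as $\sigma_e$ grows, so the estimated proportion above the cut-off increases even though the error has mean zero. More children are pushed above the cut-off from the dense region just below it than are pulled below it from the sparser region above.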
To the best of our knowledge, this is the first study to show how increased variance of anthropometric measurements can affect outcome measures in obesity surveillance. Previous studies have considered the accuracy of anthropometric measurements, but did not evaluate the impact on the estimated prevalence of overweight including obesity (BMI ≥25 kg/m²). It has been reported that scales in healthcare settings were no more accurate than scales in fitness or weight-loss centres. Furthermore, beam-balance scales have been found to be more accurate than scales with electronic mechanisms, whilst bathroom-type scales with a spring mechanism were the least accurate. These findings indicate that accuracy can be increased by ensuring that the school health service has good-quality instruments.
Our findings are particularly relevant for population-based studies and surveillance programs that maximise the use of existing resources, such as instruments in the school health service. It is important to balance accuracy against feasibility. We therefore developed a simple yet effective procedure for the NCG-study in order to obtain correction values for the instruments at each school. The need for this was clearly demonstrated by the wide range of instrument error (4.7 kg and 7.5 cm) observed in this study from 127 scales and stadiometers. Valid data were collected at limited cost. We also found that, on average, instruments in the NCG-study slightly underestimated both weight (mean: –0.14 kg) and height (mean: –0.07 cm). However, this information was not used in the current analyses, since the aim of this study was solely to assess the effect of variation in instrument error. The mean of the error terms was therefore set to zero in the simulations. Had the mean error term instead been set to the values observed in the NCG-study, the entire distribution would have shifted to the left because of the negative means, and the effect of “heavy tails” demonstrated in this study would have been offset or attenuated.
A possible limitation of the approach adopted in the NCG-study, i.e. adjusting for the correction value, is that the correction may only be valid for the specific point on the measuring-scale that corresponds to the reference weight or length. To ensure that the correction value for weighing scales was collected on the appropriate part of the measuring-scale, a reference weight was chosen within the range of values of our target population. For stadiometers corrected at one point, measurements are likely to be correct along the whole measuring-scale.
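The single-point limitation can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the correction is taken to be a simple additive offset (reference value minus instrument reading), `hypothetical_scale` is an invented instrument with both an offset and a small proportional (gain) error, and the 30 kg reference weight is a placeholder rather than the value used in the NCG-study.

```python
def correction_value(reference_true, reference_reading):
    """Additive correction: what must be added to a reading to recover the reference."""
    return reference_true - reference_reading


def apply_correction(reading, correction):
    """Apply a previously obtained additive correction to a raw reading."""
    return reading + correction


def hypothetical_scale(true_kg, offset=-0.5, gain=1.01):
    """Invented instrument: constant offset plus a 1% proportional error."""
    return offset + gain * true_kg


REFERENCE_KG = 30.0  # placeholder reference weight, not the study's value

correction = correction_value(REFERENCE_KG, hypothetical_scale(REFERENCE_KG))

# Residual error after correction at, below, and above the reference point.
for true_kg in (20.0, 30.0, 40.0):
    corrected = apply_correction(hypothetical_scale(true_kg), correction)
    print(f"true {true_kg:.1f} kg -> residual {corrected - true_kg:+.2f} kg")
```

Under these assumptions the correction is exact at the reference weight, and the residual grows with distance from it whenever the instrument has a proportional error. This is why a reference weight inside the range of the target population keeps the remaining error small.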
According to the procedures, the plastic containers used to collect correction values were only measured once. Duplicate measurements would have increased the reliability, but optimisation and feasibility must be considered. Overly complicated procedures would undoubtedly increase the risk of dropout. In the present study, no schools dropped out. At the majority of schools, the data collection was completed within one day.
Only a single average cut-off value for overweight and obesity was used for the entire sample of 8–9-year-old girls and boys in the simulations, rather than the age- and sex-specific cut-off values recommended by the IOTF. This was done deliberately, in order to simplify the analyses. The expressed aim of the study was to explore the phenomenon of increased variance rather than to present correct estimates of the prevalence of overweight including obesity. For the same reason, the two-stage sampling methodology was not taken into account in the simulation analysis.
The rationale behind running the simulations according to two different models was to illustrate, in the first model, that an increasing instrument error will systematically impact upon the estimated proportion of overweight and obesity in surveys. The second model contained the actual CVs of instrument error for scales and stadiometers derived from the NCG-study and serves as a realistic example.
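The logic of the first model can be sketched in a few lines of Python. This is not the simulation code used in the study, and none of its numbers come from the NCG-study: the synthetic population (height roughly normal, BMI log-normal), the single cut-off of 18.8 kg/m², the pairs of weight and height CVs, and the choice to draw an independent zero-mean relative error for every measurement are all assumptions made for illustration.

```python
import math
import random

CUTOFF = 18.8  # hypothetical single BMI cut-off (kg/m^2), illustrative only


def make_population(n, seed=1):
    """Synthetic 'true' (weight_kg, height_cm) pairs; all parameters are assumed."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        height_cm = rng.gauss(132.0, 6.0)
        bmi = rng.lognormvariate(math.log(16.3), 0.13)
        people.append((bmi * (height_cm / 100.0) ** 2, height_cm))
    return people


def prevalence(people, cv_weight, cv_height, seed=2):
    """Share with measured BMI >= CUTOFF after adding zero-mean relative error."""
    rng = random.Random(seed)
    above = 0
    for weight_kg, height_cm in people:
        measured_w = weight_kg * (1.0 + rng.gauss(0.0, cv_weight))
        measured_h = height_cm * (1.0 + rng.gauss(0.0, cv_height))
        if measured_w / (measured_h / 100.0) ** 2 >= CUTOFF:
            above += 1
    return above / len(people)


population = make_population(50_000)

# Increasing error CVs (weight, height); the pairs are arbitrary examples.
for cv_w, cv_h in [(0.0, 0.0), (0.01, 0.002), (0.02, 0.005), (0.05, 0.01)]:
    estimate = prevalence(population, cv_w, cv_h)
    print(f"CV weight {cv_w:.3f}, CV height {cv_h:.3f}: prevalence {estimate:.2%}")
```

With zero error the function returns the prevalence in the synthetic population itself, so any difference at higher CVs is attributable to the added variance alone. A version closer to the second model would replace the arbitrary CV pairs with the CVs derived from the instruments, and could draw one error per instrument rather than per child.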
In the literature it is often stated that instruments were calibrated prior to data collection, without the calibration procedures being described in detail. It has been claimed that once instruments are installed and calibrated, error due to the instruments is negligible. Our findings suggest the contrary. Indeed, the impetus for this paper was the finding that old flooring had been removed from a school health office without the wall-mounted stadiometer being adjusted, resulting in inaccurate height measurements. This shows that even instruments that are calibrated upon acquisition and installation can drift out of calibration, underlining the need for procedures for regular maintenance of anthropometric instruments. Measuring instruments are generally calibrated when purchased but, owing to a lack of awareness, may never be maintained or recalibrated, even after years of use.